Blog by Sumana Harihareswara, Changeset founder

22 Mar 2021, 21:09 p.m.

A Spec For A Sandboxed Open Source Project Environment

I'm writing a book on maintaining legacy open source projects to help teach people vital skills. Right now, as far as I know, there's no textbook or course you can work through to learn skills like assessing an open source project systematically, triaging bugs, noticing quiet-but-promising contributors to promote, improving code review processes, writing a grant proposal, and finding your successors. Or, more accurately, there are courses and guides that cover different software leadership areas, but there's nothing that covers the whole toolbox.

When I'm teaching skills, I want to give learners exercises they can use to develop and practice those skills. And if I turn the book into a course, I'll want to be able to assign those exercises and review their homework. So I started thinking: how nice it would be if I could snapshot or composite together a sample legacy FLOSS project, complete with messy old issues, docs, and chat and list archives, and replicate it in self-serve sandbox instances for exercises!

I've been thinking about this for a while. Today I wrote up a spec. I don't know what to call it - "Maintainer Sandbox", "Snowglobe Factory", and "Diorama" all sound good. The spec below assumes that I'd be leading a cohort of learners through a semisynchronous online course; it would work fine for an in-person class as well, but I'd have to adapt the access levels part for a completely self-paced and self-driven course.


As instructor, I would create a snapshot of a sample open source project (comprising materials listed below). The hosting platform would replicate it in self-serve instances, usable over the web, for user exercises. Upon signing up for a course, a user would get access to a freshly provisioned instance, complete with project history.

Materials: Each instance would include the following artifacts, all browseable and searchable via the web browser:

  • Bugtracker-type artifacts and history that would generally be available as part of a GitLab or GitHub project: a git repository for code history, past and present issues and patches/pull requests/merge requests (complete with tags/labels, milestones, assignees, and similar metadata), past release announcements, and a wiki
  • Project documentation and overview information for users and for developers, as would generally be available on a project's website (even if documentation source is available in the git repository, the snapshot we provide should include the docs as rendered into HTML)
  • Archives of the project's mailing lists, as would generally be available as a Mailman, Majordomo, or similar instance, browseable and searchable via a web interface like this web view of python-dev (a Mailman 3 list)
  • Archives of the project's chat conversations, using a public Zulip chat history like this public archive of Rust's Zulip chat or a browseable history of Internet Relay Chat conversations via a web interface like this web view of Freenode's #pypa logs

(Olivier Lafleur conversed with me on Twitter about how to do some portion of this using GitLab. Also, the Perceval project may be a good tool to consider for mailing list and chat archives.)

Access privileges: The learner would not only interact with the example project materials as a reader but also as a participant, moving through the three access levels described below. The instructor would be able to view a learner's instance, with administrator (Level 3) access, to assess the learner's actions.

  • Level 1: The learner starts with the same user privileges that a new user would ordinarily have -- they can read all the public mailing lists, chat histories, wiki, and bugtracker items, and file bugs.
  • Level 2: The instructor can promote the learner to the second access level, at which point they can (for instance) triage, label, and close bugs and pull requests/patches, edit the wiki, and post to public mailing lists.
  • Level 3: The instructor can promote the learner to the third (administrative) access level, at which point they can (for instance) browse and post to private mailing lists and chat channels, and use maintainer powers on the git repository (such as merging pull requests/patches).

Authentication: I imagine that dealing with authentication within the application could get sticky. My preferred approach:

  • As a signed-in platform user, the learner is automatically logged in to all the web applications within the instance, and cannot log out.
  • It is not possible to view an instance unless you are either the learner or instructor associated with that instance.

Multi-user access: Ideally, it would also be possible for an instructor to expand access so that a project can have multiple users (in other words, give Learner A access to Learner B's project instance), so that learners can engage in peer learning and group exercises. However, I expect that would lead to a much larger range of headaches, including Code of Conduct problems in interactions between learners, so -- in earlier versions of this tool -- I am fine with not having this functionality, and instead advising pairs and small groups to use screen sharing for group exercises.

End of life: At the end of the course, each instance would turn read-only for two weeks (to allow the learner to make notes, and to make local copies of any work they had done), and then the platform would delete the instance.

Thoughts? In particular, if you know of software that already does more than half of what I want, I'd like to know about it.


23 Mar 2021, 20:57 p.m.

Hmmm. Not sure how easy it would be to pre-populate, but I found Gitea ( ) very easy to spin up, and they appear to have Docker instructions (in which case you could probably set up once and then snapshot the docker containers). Solves some of the individual instancing/EOL stuff, at least!