Automated GitHub archiving with the Stanford Digital Repository

The Stanford Digital Repository (SDR) team is pleased to announce a new automated workflow for archiving releases of GitHub repositories. The new workflow helps researchers meet best practices for research software preservation by automatically depositing GitHub repository releases in the SDR and assigning them DOIs for citation. After the initial setup, the depositor does not need to take any further action to archive future releases.
Any public GitHub repository can be set up via the new workflow for automated archiving in the SDR. We collect information about the GitHub repository and pre-select options in other metadata fields to streamline the process. Once setup is complete, we check nightly for new releases of the GitHub repository and deposit any that we find as new versions of the same work with the same DOI. The user need never return to the SDR web application to deposit future releases. Watch our short video of the process.
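For readers curious about the logic behind the nightly check, here is a minimal Python sketch. It is an illustration only, not the SDR's actual implementation. The names ArchivedWork, find_new_releases, and nightly_check are hypothetical, and the DOI is a placeholder. The release dictionaries use the tag_name, published_at, and draft fields that the GitHub REST API returns when listing a repository's releases. The sketch skips the network call and takes the release list as input.

```python
from dataclasses import dataclass, field


@dataclass
class ArchivedWork:
    """Hypothetical stand-in for one archived work: one DOI, many versions."""
    doi: str
    versions: list = field(default_factory=list)  # release tags, in deposit order


def find_new_releases(releases, archived_tags):
    """Return published releases whose tag is not archived yet, oldest first."""
    fresh = [
        r for r in releases
        if not r.get("draft") and r["tag_name"] not in archived_tags
    ]
    # ISO 8601 UTC timestamps sort correctly as plain strings.
    return sorted(fresh, key=lambda r: r["published_at"])


def nightly_check(work, releases):
    """Record each new release as a new version of the same work.

    The work's DOI is never reassigned; only the version list grows.
    Returns the tags that were added on this run.
    """
    new = find_new_releases(releases, set(work.versions))
    for release in new:
        work.versions.append(release["tag_name"])
    return [r["tag_name"] for r in new]


# Example data shaped like the GitHub "list releases" response (values invented).
work = ArchivedWork(doi="10.0000/placeholder", versions=["v1.0.0"])
releases = [
    {"tag_name": "v1.2.0", "published_at": "2024-03-01T12:00:00Z", "draft": False},
    {"tag_name": "v1.0.0", "published_at": "2024-01-01T12:00:00Z", "draft": False},
    {"tag_name": "v2.0.0", "published_at": None, "draft": True},
    {"tag_name": "v1.1.0", "published_at": "2024-02-01T12:00:00Z", "draft": False},
]
added = nightly_check(work, releases)
```

Running the check a second time on the same release list adds nothing, which mirrors the idea that the depositor never has to intervene: releases that are already archived are skipped, and new ones are attached to the existing DOI.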
“Most of our research projects generate large experimental datasets that depend on substantial custom analysis code,” said Colin Ophus, Associate Professor of Materials Science and Engineering and Center Fellow at the Precourt Institute. “The Stanford Digital Repository, with GitHub integration, gives us a practical way to archive and share the data and software together in a single location.” Deshan Perera, a Postdoctoral Scholar in Biology, has already used the new GitHub workflow and was pleased with the straightforward process. “It streamlines archiving while also adding credibility to our projects by making them accessible via the Stanford Digital Repository.”
Members of the Stanford community provided invaluable feedback during the development of this new workflow. We incorporated community needs and requirements into the design to produce a streamlined process that addresses several issues with the archiving methods campus researchers currently use. The new SDR workflow provides depositors with the DOI before the GitHub repository is archived, so that it can be included in the documentation files that are part of the release. SDR depositors can also correct metadata without generating a new DOI; a changed DOI can cause problems for persistent citation. The SDR also allows depositors to provide complete metadata, including authors along with their ORCID iDs and institutional affiliations.
Read about our other new workflow for depositing previously published articles.
If you’re a Stanford author interested in using our new automated GitHub workflow, please contact us via our web form.