Stanford Digital Repository



In honor of the useR! 2016 Conference taking place this week, we wanted to outline ways researchers can use the Stanford Digital Repository to power their R visualizations.

The Stanford Digital Repository allows Stanford researchers and affiliates to deposit research data for preservation, access, and discovery. Data deposited in the repository is citable, and the original content can be downloaded from it. The data is then made available through open, standards-based web services for consumption. For example, images in the repository are delivered by a IIIF-compatible service, geospatial data are served out as Web Map Services (WMS) and Web Feature Services (WFS), and generic files are all served over HTTP.

R users can take advantage of these web services and the data being served out.
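
As a rough illustration of what consuming these services looks like, here is a minimal sketch of how request URLs for a IIIF image service and a WMS endpoint are typically put together. It is written in Python, but the resulting URLs are plain HTTP addresses that any client, R included, could fetch. The host names, image identifier, and layer name are made-up placeholders, not actual Stanford Digital Repository endpoints. The URL layouts follow the published IIIF Image API and OGC WMS 1.1.1 conventions.

```python
from urllib.parse import urlencode, quote


def iiif_image_url(base, identifier, region="full", size="400,",
                   rotation=0, quality="default", fmt="jpg"):
    """Build a IIIF Image API URL:
    {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    """
    ident = quote(identifier, safe="")  # identifiers must be URL-encoded
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


def wms_getmap_url(base, layer, bbox, width, height,
                   srs="EPSG:4326", fmt="image/png"):
    """Build a WMS 1.1.1 GetMap URL for a single layer.

    bbox is (min_x, min_y, max_x, max_y) in the units of `srs`.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": fmt,
    }
    return f"{base}?{urlencode(params)}"


# Hypothetical endpoints and identifiers, for illustration only.
image_url = iiif_image_url("https://iiif.example.org/image", "abc123")
map_url = wms_getmap_url("https://geo.example.org/wms", "example:layer",
                         bbox=(22.0, -18.0, 34.0, -8.0), width=600, height=500)

print(image_url)
print(map_url)
```

Either URL returns an ordinary image over HTTP, so it can be handed to whatever download or plotting function your environment provides.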

Water jet with x-ray pulse

When Stanford Digital Repository staff found out someone was depositing research data about using x-ray lasers to explode jets of liquid, I have to admit there was a bit of excitement. Researching explosions (even on a small scale) sounds like an immense amount of fun. But Stanford researcher Claudiu Stan and his colleagues were doing way more important things out at SLAC than just having fun. They were performing serious research into fluid dynamics.

Revolution annuelle de la terre autour du soleil. Compose et dessine par H Nicollet. Le texte de les fig. suppl. par E. Soulier. Paris, publie par J. Andriveau-Goujon, Rue du Bac, no. 17, 1850.

About this series

Over the next few weeks I will post a series of brief step-by-step "how-to" tutorials on making use of digital resources from the David Rumsey Map Center and Collection. They are based on my "Hacking Rumsey" talk, given at the opening events for The David Rumsey Map Center at Stanford University Library.

We're starting small, with the easiest tools that appeal to the most people (like the David Rumsey Map Collection MapTab Chrome Browser Plug-in, which I covered in a previous post). Eventually we will work our way up through more complex uses of the collections and tools available from The Stanford University Library.

PURL page screenshot for Nick Eubank's Zambian 2006 to 2010 Constituency and Ward Boundaries

Inquiry from a hot zone

In late March of 2016, Frederic Ham, a geospatial analyst for Medecins Sans Frontieres (MSF, also known as Doctors Without Borders), contacted Stanford University Libraries (SUL) looking for information. He needed data to help him create maps so that MSF could better plan their response to a current cholera outbreak in Zambia. He’d found what he wanted via SUL’s geospatial data portal, Earthworks, but wasn’t able to access it due to licensing restrictions. Was there any way we could help?

Robert Schumann, Drei zweistimmige Lieder (detail)

Rare Music Materials at Stanford is a Spotlight instance that presents materials from the Stanford University Libraries' collections that have been digitized in response to research requests, or were produced for small projects. Items and their downloadable images may also be found in SearchWorks, Stanford's library catalog.

Image of maps created with the use of the Stanford Education Data Archive

Educational opportunity is an important issue in a democratic society. In the United States, measuring educational achievement and opportunity is complex because the public education system is diffuse. Funding for public education depends on a combination of local, state, and federal governing bodies. The variations in funding and community-level support for public education and standardized testing make comparisons and analysis across the U.S. an arduous task.

This is why the Stanford Digital Repository (SDR) deposit of the week is critically important to note. Stanford University Professor Sean Reardon and his colleagues have just deposited the Stanford Education Data Archive (SEDA) into the SDR for long-term preservation. The data set includes 215 million test scores and tackles the difficulty of comparing test score data from every public elementary and middle school in the United States over a period of 5 years (2009-2013). What's brilliant about this collection of data is that Reardon and his team developed a method to equate the scores across states for comparison, enabling a whole new set of questions on educational opportunity to be answered, new stories to be told, and new questions to be raised.

Screenshot of Disputed Boundaries data set

It's one thing to talk about an area of land under dispute, and it's another thing entirely to see it on a map. Professor of Political Science Kenneth Schultz demonstrates the validity of this statement with his recent work, "Mapping Interstate Territorial Conflict," which was published in December in the Journal of Conflict Resolution.

Hatef Monajemi

Many scientists are making the reproducibility of their research a much higher priority these days than they used to. But it's a time-consuming task, which means that many are searching for tools and workflows to help facilitate their efforts.

Hatef Monajemi, a PhD student in Civil and Environmental Engineering, and his PhD advisor, Professor David L. Donoho, have developed a new piece of software that can make reproducibility an easier goal to achieve. The software is called Clusterjob (CJ). It can be used to develop reproducible computational packages and make the generation of data for a research study fully reproducible. CJ is open-source software available on GitHub.