Library info; guides & content by subject specialists
Results include
  1. Hopkins Marine Station Student Papers

    From 1963 to 2011, Hopkins Marine Station offered Biology courses 175H or 176H. Students in these courses developed and conducted research projects in the area around the station, and the culmination of each of their efforts was a final paper. Copies of these papers were deposited in the station’s library, and we now have over 750 undergraduate research papers in our collection. These student research reports contain observations of environmental conditions, species and populations recorded over a span of nearly 60 years, and provide an extremely valuable corpus for conducting historical ecology research.

    Exploring computational methods for extracting biodiversity data from student papers

    Computational methods for text analysis are rapidly evolving, and initial testing on a subset of the student papers corpus demonstrates significant potential in this area. Almost without exception, the student reports were typed, which supports effective optical character recognition on digital surrogates. Plain-text versions of the reports can then be analyzed with existing Natural Language Processing (NLP) tools. For example, spaCy is a Python library for NLP that “excels at large-scale information extraction tasks.” Our partners in Stanford’s Center for Interdisciplinary Digital Research used spaCy to automate identification of genus and species names in student reports using a process called named entity recognition (NER). This process compared the complete World Register of Marine Species (a list of nearly 600,000 species) to the text of student papers to identify named species by date and location (e.g., Anthopleura elegantissima (an anemone) at Hopkins Marine Station, June 1959). An observation of an organism at a given place on a given date is called a “species occurrence.” Species occurrence data forms the foundation for biodiversity research.

    The critical nature of species occurrence records is evinced by the Global Biodiversity Information Facility (GBIF), an international research network which currently holds nearly 1.4 billion occurrence records in an open database. While the size of the GBIF database is remarkable, a deeper examination of the data reveals a serious limitation. When GBIF species occurrence records are viewed over time, a rapid decrease in the number of observations is apparent as you go backward. Sources of observational data that can fill these gaps in the record are indispensable. This is one area where the student research papers in our collections are poised to make an important contribution, if we can find a way to extract the relevant data from the physical corpus.

    If you are interested in working with the Hopkins Marine Station student papers, please contact Amanda!
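The idea of matching a species name list against report text to produce species occurrences can be sketched roughly as follows. This is only an illustrative sketch, not the project's actual pipeline: it uses plain dictionary lookup with Python's standard `re` module rather than spaCy's NER, the three-name list is a tiny hypothetical stand-in for the World Register of Marine Species, and the names `Occurrence`, `build_pattern`, and `extract_occurrences` are invented for this example. It also assumes the location and date of a report are already known and supplied by the caller.

```python
import re
from dataclasses import dataclass

# Hypothetical stand-in for the WoRMS name list, which the source
# describes as containing nearly 600,000 species.
SPECIES_NAMES = [
    "Anthopleura elegantissima",
    "Pisaster ochraceus",
    "Tegula funebralis",
]


@dataclass(frozen=True)
class Occurrence:
    """One species observed at a given place on a given date."""
    species: str
    location: str
    date: str


def build_pattern(names):
    """Compile one case-insensitive regex that matches any listed name."""
    # Longest alternatives first, so a longer name wins over a shorter prefix.
    alternatives = sorted((re.escape(n) for n in names), key=len, reverse=True)
    return re.compile(r"\b(" + "|".join(alternatives) + r")\b", re.IGNORECASE)


def extract_occurrences(text, location, date, names=SPECIES_NAMES):
    """Return one Occurrence per distinct listed species found in the text.

    Assumption: location and date come from report metadata supplied by
    the caller; this sketch does not try to parse them out of the text.
    """
    # OCR output often has irregular spacing and line breaks, so collapse
    # every run of whitespace to a single space before matching.
    normalized = " ".join(text.split())
    pattern = build_pattern(names)
    canonical = {n.lower(): n for n in names}

    found = []
    seen = set()
    for match in pattern.finditer(normalized):
        name = canonical[match.group(1).lower()]
        if name not in seen:
            seen.add(name)
            found.append(Occurrence(name, location, date))
    return found


if __name__ == "__main__":
    # Invented sample sentence, not taken from any student paper.
    sample = (
        "Anthopleura elegantissima was abundant on the upper rocks,\n"
        "while Tegula   funebralis grazed nearby."
    )
    for occurrence in extract_occurrences(sample, "Hopkins Marine Station", "June 1959"):
        print(occurrence)
```

A list-lookup approach like this only finds names exactly as they appear in the reference list, so OCR misspellings and abbreviated genus names (such as "A. elegantissima") would be missed. Handling those cases is where a fuller NLP toolkit is likely to help.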

  2. The Arabidopsis Information Resource: A curated reference resource for translational plant biology

    Arabidopsis thaliana has been the object of intensive study for more than half a century and was the first plant genome to be fully sequenced. Since 1965, approximately 56,000 research articles have been published about Arabidopsis. In recent decades, extensive suites of experimental tools (e.g. mutant stocks, sequence variation libraries and various ‘omics’ data) have been generated by the research community for probing the functions of the plant's more than 30,000 genes. Founded in 1999, the Arabidopsis Information Resource (TAIR, www.arabidopsis.org) employs trained biocurators who extract published experimental information from the literature and integrate data to present a comprehensive view of Arabidopsis gene function. With sustained support from the research community, TAIR is able to continuously add data to produce a ‘gold standard’ annotated plant genome that serves as a key reference for translational biology. Curators from TAIR will demonstrate how to use data and tools in TAIR, including integrated orthology resources and literature-based annotations of gene function and expression patterns, to infer the function of unknown genes in Arabidopsis and in species with economic and agricultural importance. There will be an opportunity after the presentation for one-on-one consulting.

  3. TAIR Workshop

    Arabidopsis thaliana has been the object of intensive study for more than half a century and was the first plant genome to be fully sequenced. Since 1965, approximately 56,000 research articles have been published about Arabidopsis. In recent decades, extensive suites of experimental tools (e.g. mutant stocks, sequence variation libraries and various ‘omics’ data) have been generated by the research community for probing the functions of the plant's more than 30,000 genes. Founded in 1999, the Arabidopsis Information Resource (TAIR, www.arabidopsis.org) employs trained biocurators who extract published experimental information from the literature and integrate data to present a comprehensive view of Arabidopsis gene function. With sustained support from the research community, TAIR is able to continuously add data to produce a ‘gold standard’ annotated plant genome that serves as a key reference for translational biology. Curators from TAIR will demonstrate how to use data and tools in TAIR, including integrated orthology resources and literature-based annotations of gene function and expression patterns, to infer the function of unknown genes in Arabidopsis and in species with economic and agricultural importance. There will be an opportunity after the presentation for one-on-one consulting.


Exhibits

Digital showcases for research and teaching.
No exhibits results found.
Geospatial content, including GIS datasets, digitized maps, and census data.
  1. Invasive Species, 2007-2008

    National Center for Ecological Analysis and Synthesis
    2007

    This raster dataset contains measurements of invasive species in the world's oceans. See the Lineage section for complete information regarding dat...

  2. Data and Scripts for manuscript "Improving predictions of range expansion for invasive species using joint species distribution models and surrogate co-occurring species"

    Briscoe Runquist, Ryan D., Lake, Thomas A., and Moeller, David A
    2020

    Click "Visit Source" to download CSV data. This data can be used to replicate the results from the manuscript "Improving predictions of range expan...

  3. Massachusetts (Priority Habitats of Rare Species, 2003)

    MassGIS (Office : Mass.)

    The Priority Habitats of Rare Species datalayer consists of polygons that represent estimations of important state-listed rare species habitats in ...
