PM: Data expo at Green Library

Social Science Data Expo

Friday, October 25, 2019
1-5 pm Jonsson Social Science Reading Room, Bing Wing, 1st Floor

Participating data providers will be on-site to demo their data and answer any data questions that you may have.

Adam Matthew Digital

Stanford Libraries subscribes to 50+ Adam Matthew databases of unique archival primary source collections drawn from archives around the world. Topics range from Gender to the Age of Exploration, UK Foreign Office files to Medieval Family Life, Victorian Popular Culture to Jewish Life in America, c1654-1954, and American Indian Newspapers to China: Culture and Society. Content is accessible through the databases as well as via an API for text and data mining projects.

Additionally, Adam Matthew creates many in-house historical datasets based on the primary source content published in their online collections. One example available at Stanford is the ‘convict database’ from Migration to ‘New Worlds’, which presents data on convicts sent from Great Britain and Ireland to New South Wales between 1780 and 1819.

Xavier Snowman, Academic Outreach and Project Development at Adam Matthew Digital, will be available to demo the content that is available to Stanford researchers and discuss how to utilize this data to create your own projects.

The U.S. Census Bureau

Armando Mendoza
Statistics in Schools Program Liaison
Customer Liaison & Marketing Services Office/Statistics in Schools Branch

The U.S. Census Bureau provides access to the world of United States census data. Its primary database covers the Decennial Census, but it also offers more targeted databases for the Economic Census, the Bureau of Economic Analysis (BEA), and the United States Statistical Abstract, along with a whole host of other data tools and apps. And don't forget the Census Academy, which offers webinars and workshops for all your data learning needs.

Mr. Mendoza has worked in a number of capacities for the Census Bureau over the past decade, including leadership roles on the 2010 Census and in early preparations for the 2020 Census. His other roles have included education efforts to facilitate broad use of Census Bureau data products, providing training, workshops, presentations and webinars to businesses, educational organizations, and community and faith-based organizations on using data tools and apps to access data from the Decennial Census, the Economic Census, and the American Community Survey.

CIDR: Center for Interdisciplinary Digital Research

Center for Interdisciplinary Digital Research Staff

The Center for Interdisciplinary Digital Research (CIDR) enables digital research and teaching to encourage and inspire innovative scholarship throughout the University. We are a team of humanists and social scientists within the Stanford Libraries who design and develop new tools and methods, and integrate technology and information resources, to promote scholarship. Our expertise in data discovery, data creation, data management and analytical tools supports the generation and dissemination of new knowledge.

CoreLogic Data

Helen McMillan, Partner, Data Strategy - Solution Engineer

Rashēd M. Ragland, Sr. Account Executive, University Data & Analytics

CoreLogic is a leading property information, analytics and solutions provider in the United States and Australia. The company's combined data from public, contributory and proprietary sources includes over 4.5 billion records spanning more than 50 years. CoreLogic serves the real estate, mortgage finance, insurance, capital markets, transportation and government sectors. The company, headquartered in Irvine, Calif., helps clients identify and manage growth opportunities, improve performance and mitigate risk. For more information, please visit

Stanford Libraries has purchased and provides access to the following CoreLogic Data Sets*:

  • Loan-Level Market Analytics (LLMA): CoreLogic LoanPerformance™ mortgage origination and performance data, contributed by servicers and spanning the life of each loan. These data include all standard loan origination metrics as well as monthly status and performance updates.
  • Deed Records: More than 925 million historical real estate transactions from over 3,000 County Clerk/Recorder offices. Content includes, but is not limited to, sales, mortgages, nominal transfers, legal lot, subdivision and developer, and document recording information. Property Address and Owner Name elements are also included.
  • Property Tax Records (current and historical): Over 150 million Residential and Commercial property records collected from 99.7% of U.S. County Tax Assessor, Collector and Treasurer offices. Content includes, but is not limited to, assessed, appraised, and/or market values and property taxes, recording and sale date, price, mortgage, lot and living square footage, bed and bath counts, and fuel and heating types. Property Address and Owner Name elements are also included.
  • Pre-Foreclosure Data: Over 29 million property records that contain data on all stages of the pre-foreclosure and foreclosure process. Find out a property's foreclosure stage (Pre-foreclosure, REO, or Auction) as well as detailed information about the foreclosure transaction.
  • Multiple Listing Service (MLS) Data: The CoreLogic MLS dataset combines data from public, contributory and proprietary sources and includes over 4.5 billion records spanning more than 50 years, providing detailed coverage of property, mortgages and other encumbrances, consumer credit, tenancy, location, hazard risk and related performance information.

* = restricted to Stanford only

Gallup Polling Data

Learn how the data from the Gallup US Daily Tracking and World Polls can uniquely enrich your research.  A variety of time series plots and cross-tabs from these two polls can be viewed via Stanford's subscription to Gallup Analytics.

The Gallup US Daily Tracking Poll, begun in 2008, is a nationally representative poll of about 1500 adults per week. Questions provide unique and detailed insights into Americans' opinions and perceptions of important political and economic issues, as well as the current events that affect the world, the U.S., and their lives. With almost 1900 variables and large sample sizes that accumulate over time, the US Daily Tracking Poll allows you to slice and dice its microdata into fairly detailed breakdowns by demographic category or geography.

The Gallup World Poll, begun in 2005, is an annual poll with more than 2600 variables, conducted in over 160 countries that together include 99% of the world's adult population. The World Poll tracks the opinions and perceptions of the adult population of these countries on the most important issues worldwide, such as food access, employment, leadership performance, and well-being.

ICPSR: Inter-university Consortium for Political and Social Research

ICPSR is a membership-based, not-for-profit organization serving colleges, universities, and other member organizations by providing access to a large archive of social and behavioral science data, training in basic and advanced techniques of quantitative analysis, and resources that facilitate the use of data in teaching. Data holdings are divided into 19 "thematic collections" (including the newest, Firearm Safety Among Children and Teens [FACT]) and the "members archive."  Their site provides a comprehensive list of data, resources, and services, with search tools to help users find what they need.

L2: Political Data

L2 provides access to a voter file for the United States. While many companies provide a version of a voter file, L2 focuses on clean and consistent data. Thus, while their overall dataset contains only about 190 million individuals, they are confident in the accuracy of their product and emphasize quality over quantity. To create this file, L2 processes registered voter data on an ongoing basis for all 50 states and the District of Columbia, refreshing the underlying state voter data typically at least every six months, and refreshing telephone numbers and running National Change of Address processing approximately every 30 to 60 days. These data are standardized and enhanced with proprietary commercial data and modeling codes to form the L2 Voter List. At Stanford, users have access to the bulk files from the states and the District of Columbia, which are updated periodically and represent snapshots in time, as well as the online VoterMapping tool, which provides access to the latest releases and allows for the creation of custom subsets and samples.