In May, the creators of a new, unique data mining tool -- Enigma -- made a presentation to a group of Political Science Department graduate students. The demonstration generated real interest and excitement. Based in large part on that response from students in the department, the Library has now arranged a long-term beta-test with Enigma for the entire Stanford community. The only other academic institution with this arrangement is the Harvard Business School.
Enigma has ingested, and continues to ingest, digitized public records from the federal government, state governments, international organizations, and some non-U.S. governments. They began amassing data by making a FOIA request for all U.S. government domain names and then scraping and ingesting all of the data available on those sites. They continue to ingest new data every week.
Enigma is very interested in engaging with Stanford scholars to get ideas for additional digital data sets to include in their database. In other words, they really want to hear from you about data that might be valuable to your research. So, as you use Enigma, please take advantage of the Chat function to make suggestions or requests. One caveat -- Enigma does not digitize data. However, they are pretty inventive in finding ways to obtain digital data that should be in the public domain. For example, they take in U.S. Customs Service data regarding all containers that pass through U.S. ports. The Customs Service makes this data available only on CDs, which Enigma uploads on a weekly basis.
Enigma can be accessed by all members of the Stanford community through this link:
Here is Enigma's own description of what they are trying to accomplish:
Enigma is a search and discovery platform for big public data that exposes billions of public records across previously siloed datasets. Petabytes of public data are created by governments, companies, and independent institutions each year. However, as many of us know, it is tedious (if not impossible in some cases) to navigate and discover connections across these disconnected resources. Enigma empowers its users to search and manipulate these hidden datasets, creating priceless information needed to gain an edge and uncover a universe of untapped knowledge. Whether you are searching for people, companies, places, social, political, and economic trends, or broader topics, Enigma offers depth and resolution into these pools of data that are currently unavailable or underutilized by traditional search portals like Google, Yahoo, and Bing.
Let us know what you think of this new and powerful data mining tool.