Digital Library Blog

What has the web redesign project been up to in the past three months?

April 20, 2012
by Stuart Snydman

Between January and March of this year, the web redesign project took a short break from full-time engineering to focus on design work, bug fixing, user testing, and analyzing user feedback. The outcomes of this work have been positive: we've learned a great deal from students, faculty, and staff about how they would use the new site and whether it will help them complete their most important tasks. Many small and large improvements to librarypreview.stanford.edu have already been made, and more continue to be added.

New Collections Added to Stanford Digital Repository in March, 2012

April 13, 2012

In March, approximately 2,100 objects representing three collections were accessioned to the Stanford Digital Repository (SDR).

  • R. Stuart Hummel collection: ~ 2,100 items
  • The Life of Saint Catherine, Codex M0381: 1 manuscript
  • Special collection requests: 1 thesis

More details, including links to sample images, are listed below.

While many of these objects are already discoverable via SearchWorks, others will get SearchWorks records in the coming months. However, all materials are currently available via the item’s PURL (a persistent URL which ensures that these materials are available from a single URL over the long term, regardless of changes in file location or application technology).

Writing for the web resources

April 12, 2012
by Ray Heigemeir

The Online Experience Group is increasing its focus on enabling content creators to author clear, concise content for the new website. While an exact date is not yet set, technical developments are progressing at a pace that will soon allow content creators to access the site, update existing content and build new pages. In the spirit of laying “fresh eyes” on current content and developing good habits for continual content review and updating, we recommend the following e-resources on writing effective web content.

New Collections Added to Stanford Digital Repository in February, 2012

March 19, 2012

In February, approximately 7,000 objects representing six collections were accessioned to the Stanford Digital Repository (SDR), bringing the total number of objects in SDR to nearly 250,000.

  1. Buckminster Fuller collection: 5,200 slides
  2. Kitai topographical maps: 1,600 maps
  3. McLaughlin Maps, California as an Island: 114 maps
  4. R. Stuart Hummel collection: 52 items
  5. Eliasaf Robinson collection addendum: 1 gazette
  6. Islamic prayer book, 1228 H: 1 manuscript

More details, including links to sample images, are listed below.

Inclusion in the Stanford Digital Repository ensures that these materials are available to researchers and scholars (while upholding appropriate access restrictions), now and in the future, through a secure, sustainable stewardship environment.

While many of these objects are already discoverable via SearchWorks, others will get SearchWorks records in the coming months. However, all materials are currently available via the item’s PURL (a persistent URL which ensures that these materials are available from a single URL over the long term, regardless of changes in file location or application technology).

Indexing MARC records for SearchWorks - navigating Open Source Software

February 16, 2012

The (meta)data underneath SearchWorks is largely based on our MARC records from Symphony. MARC records are exported from Symphony, then slurped up by an application called SolrMarc, which transforms the MARC data into an index for the Solr search engine used by SearchWorks.
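As a rough illustration of the "MARC in, Solr document out" step, here is a minimal Python sketch. It is not SolrMarc's code or configuration: the in-memory record layout, the `FIELD_MAP` table, the Solr field names (`title_display`, `author_display`), and the sample values are all assumptions made up for this example. The only MARC facts it relies on are that tag 245 holds the title statement and tag 100 the main author.

```python
# Hypothetical mapping: Solr field name -> (MARC tag, subfield codes to keep).
# These Solr field names are illustrative, not the actual SearchWorks schema.
FIELD_MAP = {
    "title_display": ("245", "ab"),
    "author_display": ("100", "a"),
}


def subfield_text(record, tag, codes):
    """Join the values of the wanted subfields for every field with this tag."""
    values = []
    for field in record.get(tag, []):
        values.extend(value for code, value in field if code in codes)
    return " ".join(values)


def to_solr_doc(record_id, record):
    """Build a flat dict (a Solr-style document) from a toy MARC record."""
    doc = {"id": record_id}
    for solr_field, (tag, codes) in FIELD_MAP.items():
        text = subfield_text(record, tag, codes)
        if text:  # skip fields the record does not have
            doc[solr_field] = text
    return doc


# Toy record: tag -> list of fields; each field is a list of (code, value) pairs.
sample = {
    "100": [[("a", "Sample, Author")]],
    "245": [[("a", "Sample title :"), ("b", "a sample subtitle")]],
}
print(to_solr_doc("1", sample))
```

A real pipeline also has to deal with indicators, repeated fields, character encodings, and local cataloging quirks, which is where most of the work lies.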

SolrMarc is open source software made available by Bob Haschart of the University of Virginia Libraries. SolrMarc is used by all(?) VuFind sites as well as most Blacklight sites built on MARC data (e.g. SearchWorks). SolrMarc has been great for us -- it gave us an enormous jump start for SearchWorks. Bob is also a great guy, and made me a "committer" almost immediately -- so I can make contributions to the open source code.

But.

Open Source Software does best when there is a critical mass of developers: group wisdom rocks, as does sharing the work. To date, SolrMarc is very much Bob's project, despite a number of committers such as myself. There are some ... interesting ... practices as to how SolrMarc is organized and how it is tested. I've even contributed a bit to some of its squirreliness. Occasionally, changes to the SolrMarc codebase break the code I've written especially for Stanford.

New Subject Guide design coming soon!

January 20, 2012
by Ray Heigemeir

The Online Experience Group has been working hard on a proposed new design for subject guides. Subject guides are envisioned as tools to help users navigate a broad or specific subject area and to identify key SULAIR specialists.

We carefully considered how the website redesign would impact the many and varied subject guides. Based on user studies and subject specialist interviews, the proposed subject guide model is intended to provide maximum flexibility for providing content within a visually consistent, branded framework; and to support maximum ease in content creation, organization, and maintenance. The guide model strives for a simple, intuitive design, with support for media, automatic feeds, and custom design within a standard framework.

Six personas (user categories) were developed, each with specific needs. The new design intends to meet the needs of each of these user types:

Stopwords in SearchWorks - to be or not to be

December 16, 2011

We've been examining whether or not to restore stopwords to the SearchWorks index. Stopwords are words ignored by a search engine when matching queries to results. Any list of terms can be a stopword list; most often the stopwords comprise the most commonly occurring words in a language, occasionally limited to certain functions (articles, prepositions vs. verbs, nouns).

The original usage of stopwords in search engines was to improve index performance (query matching time and disk usage) without degrading result relevancy (and possibly improving it!). It is common practice for search engines to employ stopwords; in fact Solr (http://lucene.apache.org/solr), the search engine behind SearchWorks, has English stopwords turned on as the default setting.

In our implementation of SearchWorks, there was no compelling reason to change most of the default Solr settings; thus, since SearchWorks's inception we have been using the following stopword list: a, an, and, are, as, at, be, but, by, for, if, in, into, is, it, no, not, of, on, or, s, such, t, that, the, their, then, there, these, they, this, to, was, will, with.
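To make the effect of that list concrete, here is a small Python sketch of stopword removal. It is only an approximation under stated assumptions: the `tokenize` function is a crude stand-in for a real Solr analyzer, and the function names are invented for this example. The word list itself is the one quoted above.

```python
import re

# The stopword list quoted in the paragraph above.
STOPWORDS = frozenset(
    "a an and are as at be but by for if in into is it no not of on or s such t "
    "that the their then there these they this to was will with".split()
)


def tokenize(text):
    """Lowercase the text and split it into runs of letters and digits.

    This is a simplification; a real search engine analyzer does more.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stopwords(tokens):
    """Drop every token that appears in the stopword list."""
    return [token for token in tokens if token not in STOPWORDS]


for query in ["to be or not to be", "The Life of Saint Catherine"]:
    print(repr(query), "->", remove_stopwords(tokenize(query)))
```

With this list, the query "to be or not to be" loses every one of its words, while "The Life of Saint Catherine" keeps only "life", "saint", and "catherine".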

What follows is an analysis of how stopwords are currently affecting SearchWorks, and what might happen if we restore stopwords to SearchWorks, making every word significant for every search.
