In early 2012, the Emerging Technologies Team evaluated pen scanners with optical character recognition (OCR) functionality. Charles Fosselman of the East Asia Library (EAL) suggested this evaluation after he had seen a Korean cataloger from Columbia University Libraries demonstrate a pen scanner for bibliographic record text entry. He wondered whether pen scanners would be useful for EAL staff in their work with non-English-language materials.
There are many pen scanners on the market that use OCR to convert images to text, but only a few are able to recognize non-Latin scripts (such as Chinese, Japanese, Korean, and Cyrillic) or languages that use diacritics (such as French, Polish, and German). We tested two such pen scanners: the PenPower WorldPenScan Pro and the IRISPen 6. Each claims extensive foreign language support and the ability to recognize 125+ languages.
Charles Fosselman (with input from four East Asia Library colleagues), Linh Chang of the Serials Receiving and Access & Maintenance Units in the Acquisitions Department, and Joanna Dyla in the Metadata Development Unit oversaw the testing and reported their findings to the Emerging Technologies Team.
Testers found that their scanning technique, including angle, speed, and pressure, improved with time and practice. Problems arose with shiny or reflective paper. Font size, color, and line spacing also affected the accuracy of scans.
With the WorldPenScan Pro, accuracy for non-Latin scripts seemed to depend on the variation in and complexity of the character sets. Of the East Asian languages, scanning Korean text produced the best results. Scanning Japanese text, which uses three distinct character sets, produced very poor results. The WorldPenScan Pro performed relatively well with Latin scripts that used diacritics.
Physical aspects of the pens also significantly affected scanning. For example, a hard plastic tip on the WorldPenScan Pro caused problems for many testers because it did not allow for a smooth glide along the page. It also ruled out testing by Mattie Taormina of Special Collections, because applying pressure to delicate materials is a serious no-no. Testers thought the design of the IRISPen 6 was superior in some ways, since it was easier to see the lines to be scanned. Installing this pen was more complex, however, and it proved more difficult to use in general.
We were not able to test right-to-left scripts, but both the WorldPenScan Pro and the IRISPen 6 offer Hebrew recognition as an optional feature at additional cost.
Testers concluded that pen scanners equipped with foreign-language OCR support may be useful for technical services tasks such as entering search text for staff who are not proficient in a language, or capturing large blocks of text that can then be edited and corrected. They may be less worthwhile, however, for inputting shorter sections of text that can be typed by hand relatively quickly.
About the Emerging Technologies Team (ETT)
The ETT is composed of SULAIR staff from various departments in the organization, and we are on a mission! Our goal is to identify, test, and assess new and emerging technologies within the academic library environment and disseminate that information to our colleagues.
Our uniquely qualified team is made up of professional technology and library staff who regularly work with new and emerging technologies. We bring this technical and professional knowledge and experience to our research, consult with library staff on new and emerging technologies, and assist them with developing test environments, providing feedback and assessment on those environments, and reporting those assessments to the SULAIR staff at large.
The ETT meets regularly to discuss the feasibility of new and emerging technologies that have been identified and to develop implementation strategies for them. We obtain information about technology from various tech websites and periodicals, and by word of mouth.
We would like to hear from you! If you have any suggestions about any new or emerging technology (hardware or software) that might work in SULAIR, write to us at email@example.com.