Unicode Help for Socrates Users
Displaying Characters | Printing/Emailing | Searching
Socrates adopted Unicode in January 2008. Unicode is a widely accepted international standard for encoding the characters used in the world’s languages and scripts. For more information about Unicode, please visit http://www.unicode.org.
This help page provides information to Socrates
users on how to obtain the best results when using catalog records that
include diacritics, special characters, and information written in
non-Latin scripts when accessing Socrates away from a Library kiosk.
A primary reason for adopting Unicode is to make Socrates a multi-script
system, thus enhancing its effectiveness as a search tool for the library’s
collections. Currently, Socrates includes records that have Arabic and
Hebrew script data. Support for Chinese, Japanese, Korean, and Cyrillic
script data is planned for the future.
Displaying Characters Correctly
In most cases, you don't need to
do anything special for non-Latin characters in a Socrates record to
display correctly in your Web browser. However, if your computer does
not have a Unicode font installed, or has one that is not comprehensive
enough, records containing some characters will be displayed with square
boxes or question marks in place of the correct characters. The solution
is to install a Unicode font that supports those characters.
Socrates’ preferred Unicode font is Microsoft’s Arial Unicode MS, which
Microsoft documentation also calls the universal font for Unicode. If you
are using Microsoft Windows XP, this font should be installed already.
If not, or if you are a Windows 2000 user, you can install it yourself
if you have Microsoft Office or Microsoft Word.
You can find instructions for that by:
- Starting an Office application
- Opening Help, and
- Searching for Arial Unicode MS
Microsoft also has a document that
describes this font and provides installation instructions.
Macintosh
users will need to install the Lucida Grande font.
Printing and E-mailing
If all characters in a Socrates record are displayed correctly in your Web browser (see section above on displaying), and your printer supports Unicode, then all characters should print correctly.
You can email records that include diacritics, special characters, and non-Latin script data from Socrates.
For correct display in email:
- You will need to set your email reader to receive incoming mail in UTF-8 character encoding.
- You will also need to select a Unicode font (e.g., Arial Unicode MS) for displaying email messages.
- If you see incorrect (and often strange-looking) characters when you open the message, the character encoding setting is incorrect (i.e., not UTF-8).
- If you see square boxes or question marks in the display, you are using a font that does not support those characters. You will need to change to a font that does.
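The "strange-looking characters" symptom can be reproduced in a few lines. The following is a minimal Python sketch, not part of Socrates: the sample word and the helper name decode_as are illustrative assumptions. It encodes a word as UTF-8, as it would travel in the message, and then decodes the same bytes with a correct and an incorrect encoding setting.

```python
def decode_as(data: bytes, encoding: str) -> str:
    """Decode raw message bytes the way a mail reader set to `encoding` would."""
    return data.decode(encoding, errors="replace")

sample = "étude"                      # hypothetical word from a catalog record
raw = sample.encode("utf-8")          # the bytes carried by a UTF-8 message

correct = decode_as(raw, "utf-8")     # mail reader set to UTF-8
garbled = decode_as(raw, "latin-1")   # mail reader set to a different encoding

print(correct)   # étude
print(garbled)   # Ã©tude
```

The second line of output shows one accented letter turning into two unrelated characters, which is the typical sign of a wrong encoding setting rather than a missing font.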
Searching
It is not necessary to include diacritics when you search Socrates for words that have them (e.g., you can type “etude” to search for the word “étude”). If you copy and paste words that include diacritics to use as your search terms, you can leave the diacritics in. A search with or without diacritics produces the same result.
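How Socrates implements this internally is not described here. One common way to get diacritic-insensitive matching, sketched below in Python under that assumption, is to decompose each character into a base letter plus combining marks and then drop the marks before comparing. The function names fold_diacritics and matches are hypothetical.

```python
import unicodedata

def fold_diacritics(text: str) -> str:
    """Return `text` lowercased with combining diacritical marks removed."""
    # NFD splits a letter such as "é" into "e" followed by a combining accent.
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() is non-zero for combining marks, so skip those.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

def matches(query: str, indexed_word: str) -> bool:
    """Compare a search term and a record word after folding both."""
    return fold_diacritics(query) == fold_diacritics(indexed_word)

print(fold_diacritics("étude"))      # etude
print(matches("etude", "étude"))     # True
```

Because both the search term and the record text pass through the same folding step, typing the word with or without the accent leads to the same comparison.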
For certain special characters, you can type a regular character as a substitute (e.g., a regular L instead of a Polish Ł, a regular O instead of a Scandinavian Ø, a regular number instead of a superscript number).
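Letters such as Ł and Ø are distinct letters rather than a base letter plus an accent, so they have no Unicode decomposition and a search system typically needs an explicit substitution table for them. Superscript digits, by contrast, do have compatibility decompositions. The Python sketch below illustrates both cases; the table SUBSTITUTES and the function fold_special are hypothetical and are not taken from Socrates.

```python
import unicodedata

# Hypothetical substitution table for letters with no Unicode decomposition.
SUBSTITUTES = str.maketrans({"Ł": "L", "ł": "l", "Ø": "O", "ø": "o"})

def fold_special(text: str) -> str:
    """Replace listed special letters, then fold compatibility characters."""
    text = text.translate(SUBSTITUTES)
    # NFKD maps a superscript digit such as "²" to the regular digit "2".
    # It also splits accented letters into base letter plus combining mark;
    # removing those marks would be a separate step.
    return unicodedata.normalize("NFKD", text)

print(fold_special("Łask"))     # Lask
print(fold_special("Ørsted"))   # Orsted
print(fold_special("x²"))       # x2
```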
To enter search terms in Arabic or Hebrew
scripts, your PC will need to be configured to support right-to-left
inputting, and have Arabic and Hebrew keyboards installed. For Windows
2000 and XP users, Microsoft provides instructions for adding these
components in the following documents:
SULAIR's public kiosk computers are already set up with
Arabic and Hebrew keyboards, and are configured for right-to-left
inputting.
Please use the Ask Us link in Socrates to submit questions and problems related to character encoding that are not covered by the information above.