OCR and derivative creation
Once the quality control operator confirms that the images for a book meet the project’s standards, the set of images is sent to a cluster of workstations that convert the page images to editable text using Optical Character Recognition (OCR) technology. The OCR process is entirely automated: no human operators correct errors in the text conversion. OCR accuracy varies greatly with the quality of the original printed page and of the scanned image. After OCR is complete, derivative files such as image-only PDF, searchable PDF, JPG, and ASCII text are created, again via a completely automatic process.
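As a rough illustration of the last step, the sketch below shows one way the ASCII text derivative and the set of derivative file names could be produced from OCR output. It is not the project's actual software: the function names `to_ascii` and `plan_derivatives`, the book identifier, and the file-name suffixes are all hypothetical, and the Unicode-folding approach is an assumption about how non-ASCII characters might be handled.

```python
import unicodedata

# Hypothetical file-name suffixes for the derivative types named in the text.
# The real project's naming scheme is not described in the source.
DERIVATIVE_SUFFIXES = {
    "image_pdf": "_image.pdf",
    "searchable_pdf": "_searchable.pdf",
    "jpg": ".jpg",
    "ascii_text": ".txt",
}


def to_ascii(ocr_text: str) -> str:
    """Reduce OCR output to plain ASCII (an assumed approach)."""
    # NFKD decomposition splits accented letters into a base letter plus a
    # combining mark and expands ligatures, so the base letters survive
    # when the non-ASCII code points are dropped in the next step.
    decomposed = unicodedata.normalize("NFKD", ocr_text)
    # Characters with no ASCII equivalent (for example curly quotes)
    # are silently discarded.
    return decomposed.encode("ascii", "ignore").decode("ascii")


def plan_derivatives(book_id: str) -> dict:
    """Map each derivative type to a file name for the given book."""
    return {kind: book_id + suffix for kind, suffix in DERIVATIVE_SUFFIXES.items()}


if __name__ == "__main__":
    print(to_ascii("naïve café"))
    for kind, name in plan_derivatives("book001").items():
        print(kind, name)
```

Because nothing in the pipeline is reviewed by a person, any character that the folding step cannot map is simply lost, which is one reason the quality of the ASCII derivative tracks the quality of the OCR so closely.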