Imaging: Scanning

The most common form of input to a document imaging system is scanned paper. A document may be a single page or many pages. Scanning converts a paper document into a series of ones and zeros that faithfully represent the original. An automatic document feeder built into the scanner moves pages past the sensor in sequence, at anywhere from 12 pages per minute to hundreds of pages per minute. Scanners are rated by speed (pages per minute), resolution (dots per inch, e.g. 200, 300, 400), format (color, grayscale, black and white) and page layout (double sided, single sided, standard, legal, tabloid or larger).

Each page is illuminated, and a sensor converts the reflected light into electronic "ones" and "zeros". The interface used to transfer them to the computer is usually SCSI but may be video as well. Several additional processes can be added to scanners or their interfaces to enhance the scanned image, including color dropout lamps, deskew, despeckle, continuous contrast adjustment, thresholding and bar code recognition.
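Of the enhancements listed above, thresholding is the simplest to illustrate. The following is a minimal sketch, not any scanner's actual firmware: it assumes the page has already been captured as rows of 8-bit grayscale values (0 is black, 255 is white), and the function name and default cutoff of 128 are illustrative choices.

```python
def threshold(gray_rows, cutoff=128):
    """Convert 8-bit grayscale rows (0 = black, 255 = white) to 1-bit rows.

    Pixels darker than `cutoff` become 1 (ink); all others become 0 (paper).
    The cutoff value is an assumption; real scanners often adjust it per page.
    """
    return [[1 if pixel < cutoff else 0 for pixel in row] for row in gray_rows]


# A tiny 2 x 4 "image": mostly light paper with a few dark ink pixels.
sample = [
    [250, 240, 30, 245],
    [248, 20, 25, 250],
]
print(threshold(sample))  # [[0, 0, 1, 0], [0, 1, 1, 0]]
```

A lower cutoff keeps only the darkest marks, while a higher one also picks up faint writing along with more background noise.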

VirtualReScan: The Next Evolution in Production Scanning

VirtualReScan (VRS) is a combination of hardware and intelligent software designed to let scanners do what they were always meant to do: produce clean images consistently. VRS dynamically corrects documents that are highly skewed and automatically compensates for poor contrast.

Source documents are scanned into queues via any TWAIN- or Kofax-compliant scanner as black and white, grayscale or color documents. Prior to scanning, the user assigns a scan format to a scan queue. Scan formats are user configurable and contain information about scanner settings (paper size, contrast, etc.), file formats (TIFF with Group IV compression, JPEG, PCX, etc.) and document setup (duplex emulation, rotate on scan, etc.), as well as indexing information. Because each format retains the settings for a particular type of source document, scanning is faster and easier. After selecting a scan format, the user scans documents into the scan queue, where they can then be verified and identified.
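The relationship between a scan format and a scan queue can be pictured as a small data structure. This is only a sketch: the class names, field names and default values are hypothetical and do not reflect the actual product's settings or API.

```python
from dataclasses import dataclass, field


@dataclass
class ScanFormat:
    """Hypothetical bundle of reusable settings for one type of source document."""
    name: str
    paper_size: str = "letter"      # scanner setting
    contrast: int = 50              # scanner setting (assumed 0-100 scale)
    file_format: str = "TIFF"       # file format
    compression: str = "Group IV"   # file format detail
    duplex: bool = False            # document setup
    rotate_degrees: int = 0         # document setup (rotate on scan)
    index_fields: tuple = ()        # indexing information


@dataclass
class ScanQueue:
    """Hypothetical queue that holds scanned pages until they are verified."""
    name: str
    scan_format: ScanFormat
    pages: list = field(default_factory=list)

    def add_page(self, image_path):
        # Every new page starts out unverified.
        self.pages.append({"path": image_path, "verified": False})

    def unverified(self):
        return [page["path"] for page in self.pages if not page["verified"]]


invoices = ScanFormat(name="invoices", duplex=True, index_fields=("invoice_no", "date"))
queue = ScanQueue(name="morning-batch", scan_format=invoices)
queue.add_page("page_0001.tif")
queue.add_page("page_0002.tif")
queue.pages[0]["verified"] = True
print(queue.unverified())  # ['page_0002.tif']
```

Keeping the settings in a named format, separate from the queue, is what lets one format be reused for every batch of the same document type.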

Scanning Equipment

Inexpensive scanners cost under $300 but are meant mainly for the home graphics market. The interface is usually a parallel port adapter, and image output is very slow at high resolution. In the imaging world, resolution starts at 200 dots per inch (DPI) and speed at 12 pages per minute (PPM). At the higher end are scanners that top 200 pages per minute on two sides (duplex). These scanners adjust for background paper color, streaks, lines, double feeds and page noise. Added features of a high-end scanner may include internal page counters and self-contained printers that print a word or control number on each page to prove it was processed before filing or shredding.

Paper documents are usually 8.5 by 11 inches. Most scanners can scan a legal size page, and many can scan a wider page. If you ever need tabloid size, be aware that file sizes and scanning speeds have to be calculated accordingly. Scanner speeds are rated for a standard 8.5 x 11 inch page scanned at 200 DPI. A scanner rated at 40 pages per minute will drop to 30 pages per minute or less as the resolution goes up.
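The effect of that drop in speed on a batch is simple arithmetic. The sketch below uses the 40 and 30 pages-per-minute figures from the paragraph above; the 1,200-page batch size and the function name are assumptions for illustration, and the estimate ignores loading time and jams.

```python
def batch_minutes(pages, pages_per_minute):
    """Estimate pure feed time for a batch, ignoring loading time and jams."""
    if pages_per_minute <= 0:
        raise ValueError("pages_per_minute must be positive")
    return pages / pages_per_minute


batch = 1200  # assumed batch size, for illustration only
print(batch_minutes(batch, 40))  # 30.0 minutes at the rated speed
print(batch_minutes(batch, 30))  # 40.0 minutes at the reduced speed
```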

Factors that affect the scan time or image quality:

• Document condition: affects on-screen image quality
• DPI of scan: detail to capture (usually 200 DPI)
• Scan depth: number of grayscale levels (usually 2)
• Document size: size of the original document
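The last three factors combine directly into the raw size of a page image: width and height in pixels follow from the document size and DPI, and each pixel takes as many bits as the scan depth. The sketch below computes uncompressed sizes only; files stored with compression such as Group IV are smaller, by an amount that depends on page content. The function name is an illustrative choice.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Raw image size in bytes for a page scanned at the given DPI and depth."""
    pixels_wide = round(width_in * dpi)
    pixels_high = round(height_in * dpi)
    total_bits = pixels_wide * pixels_high * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# Letter page (8.5 x 11 in), black and white (1 bit per pixel)
print(uncompressed_bytes(8.5, 11, 200, 1))  # 467500
print(uncompressed_bytes(8.5, 11, 300, 1))  # 1051875

# Same page in 8-bit grayscale at 200 DPI
print(uncompressed_bytes(8.5, 11, 200, 8))  # 3740000

# Tabloid page (11 x 17 in), black and white at 200 DPI
print(uncompressed_bytes(11, 17, 200, 1))   # 935000
```

Raw size grows with the square of the resolution, so going from 200 to 300 DPI multiplies it by 2.25, while going from 1-bit to 8-bit depth multiplies it by 8.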

Scanning Procedures

Scanning is usually done at a dedicated imaging workstation. Scanning is mostly done in batches, which are groups of document images. When using a sheet-fed scanner with an automatic document feeder, a batch of documents can be automatically input and scanned.

The required skill set includes loading documents into the feeder, clearing document jams and reviewing the scanned images. To avoid unnecessary complexity in keeping multi-page documents in order, each operator should be responsible for an entire batch during their shift.

Scanner jams

To prevent scanner jams, documents should be looked over for any tears or previously folded edges that may catch in the feeder mechanism.

Condition of the document

The condition of the document determines how well its image will be captured and displayed on the computer screen. Software can adjust the darkness and contrast of a document image. By comparing the information captured from each document with information from a database, low-quality documents can be flagged for rescanning at a later time. Documents of acceptable quality are those whose captured information has been verified against the existing database information.
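One way to picture the comparison step is shown below. It is a minimal sketch under stated assumptions: the captured values are taken to be already extracted from each image (for example by bar code recognition), the database is stood in for by a plain dictionary, and the function and field names are hypothetical.

```python
def flag_for_rescan(captured, reference):
    """Return IDs of documents whose captured fields disagree with the reference.

    `captured` and `reference` both map a document ID to a dict of field values.
    A document is flagged when it has no reference record or when any
    reference field is missing or different in the captured data.
    """
    flagged = []
    for doc_id, fields in captured.items():
        expected = reference.get(doc_id)
        if expected is None or any(
            fields.get(key) != value for key, value in expected.items()
        ):
            flagged.append(doc_id)
    return flagged


# Stand-in for existing database records (assumed data).
reference = {
    "A100": {"account": "55-1021", "pages": 2},
    "A101": {"account": "55-1022", "pages": 1},
}
# Values read back from the scanned images (assumed data).
captured = {
    "A100": {"account": "55-1021", "pages": 2},
    "A101": {"account": "55-I0Z2", "pages": 1},  # misread characters
}
print(flag_for_rescan(captured, reference))  # ['A101']
```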

Resolution of the document

The majority of documents are scanned at 200 DPI, which produces an acceptable image of most documents. Line art or CAD drawings may require 300 DPI or 400 DPI.

Scanning depth

A black and white document image, which is produced by scanning at a depth of 1 bit per pixel, is acceptable for most text documents. Grayscale document images are produced by using depths of 8 to 16 bits per pixel. Documents scanned at higher depths require more storage space.

Size of documents

Documents of non-standard size must be scanned on a special-format scanner suited to their size.

Handling color-coded documents

Some organizations use color in their documents to mark up or encode information. Color-coded documents should already have been dealt with in the document preparation stage.
