Digital Mailroom: How Is Indexing Used When Scanning Documents?

Many people are familiar with document scanning, whether it’s part of a backfile scanning project or for ongoing digital mailroom scanning. But how do you find these documents after they’re scanned?

The lesser-known practice of “document indexing” is the answer, but what is it and why should you care?

To begin, there are two types of indexing: metadata and full-text.

Metadata

Metadata indexing attaches keywords (aka metadata) to each scanned document. Typically, our document scanning clients provide a manifest of what needs to be scanned and what types of documents to expect. We then identify the right mix of metadata to serve as unique identifiers. Metadata indexing examples include:

Invoice, PO, waybill, and work order number
Employee name, employee number, and social security number
Student name, ID, school, and social security number
Patient name, doctor, and social security number
Date
Site ID
Any other unique identifier

Barcodes can help automate metadata indexing, eliminating manual data entry (and the mistakes that come with it).
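As a rough illustration of how a barcode can fill in index fields without anyone typing them, here is a minimal Python sketch. It assumes the barcode has already been decoded to a string and uses a made-up "KEY=value|KEY=value" layout. The field names (TYPE, NUM, DATE, SITE), the IndexRecord class, and the parse_barcode function are hypothetical, not part of any particular capture product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IndexRecord:
    """Metadata captured for one scanned document (hypothetical fields)."""
    doc_type: str
    doc_number: str
    date: Optional[str] = None
    site_id: Optional[str] = None


def parse_barcode(payload: str) -> IndexRecord:
    """Turn a decoded barcode string into an index record.

    Assumes a hypothetical 'KEY=value|KEY=value' layout; a real project
    would use whatever layout its cover sheets define.
    """
    fields = {}
    for part in payload.split("|"):
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"malformed barcode field: {part!r}")
        fields[key.strip().upper()] = value.strip()
    # TYPE and NUM are required in this sketch; the rest are optional.
    return IndexRecord(
        doc_type=fields["TYPE"],
        doc_number=fields["NUM"],
        date=fields.get("DATE"),
        site_id=fields.get("SITE"),
    )


record = parse_barcode("TYPE=invoice|NUM=INV-1001|DATE=2023-04-18|SITE=ORL-02")
print(record)
```

Because the values come straight from the barcode, the index fields match the source document exactly, which is where the savings over manual keying come from.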

Full-Text Indexing & The Role of OCR

Full-text indexing uses optical character recognition (OCR, for machine print) or intelligent character recognition (ICR, for hand print) to index all of a scanned document or only part of it (zonal).
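To make the idea concrete, here is a small Python sketch of a full-text index built from text that OCR has already produced. It is an assumption-laden illustration, not how any specific product works: the document IDs, the sample text, and the tokenize, build_index, and search functions are all invented for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of document IDs whose OCR text contains it."""
    index = defaultdict(set)
    for doc_id, text in pages.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return IDs of documents containing every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical OCR output keyed by document ID.
ocr_text = {
    "doc-001": "Inspection report for the Seminole County bridge, span 3.",
    "doc-002": "Invoice 1001 for bridge paint supplies.",
    "doc-003": "Seminole County payroll summary.",
}
index = build_index(ocr_text)
print(search(index, "Seminole County bridge"))
```

Note that this sketch matches documents containing all of the query words in any order. Exact phrase matching would also need word positions, which are left out here to keep the example short.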

OCR can happen at scan time or after scanning. OCR at scan time slows down scanning by 30%. OCR after scanning can run in the document capture system (the software that drives the scanner) or in your content/document management system, and it runs faster on a searchable PDF.

Typically, we provide a searchable PDF image for all documents scanned. Document management software like ApplicationXtender (AX) includes a full-text OCR capability that populates a database with indexing metadata and adds a pointer to each document so it can be found later.
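The "metadata plus a pointer" idea can be sketched with Python's built-in sqlite3 module. This is only an illustration of the general pattern and does not reflect ApplicationXtender's actual schema. The table name, column names, file paths, and the find_path helper are all assumptions made for the example.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE documents (
        id INTEGER PRIMARY KEY,
        doc_type TEXT NOT NULL,
        doc_number TEXT NOT NULL,
        scan_date TEXT,
        file_path TEXT NOT NULL  -- the pointer to the scanned PDF
    )
    """
)
# A database index on the lookup field keeps searches fast as the archive grows.
conn.execute("CREATE INDEX idx_doc_number ON documents (doc_number)")

# Hypothetical index rows, one per scanned document.
rows = [
    ("invoice", "INV-1001", "2023-04-18", "/archive/2023/inv-1001.pdf"),
    ("work order", "WO-5530", "2023-04-19", "/archive/2023/wo-5530.pdf"),
]
conn.executemany(
    "INSERT INTO documents (doc_type, doc_number, scan_date, file_path) "
    "VALUES (?, ?, ?, ?)",
    rows,
)


def find_path(conn, doc_number):
    """Look up a document by its number and return the stored file pointer."""
    row = conn.execute(
        "SELECT file_path FROM documents WHERE doc_number = ?", (doc_number,)
    ).fetchone()
    return row[0] if row else None


print(find_path(conn, "INV-1001"))
```

The search touches only the small metadata table, and the stored path then leads to the scanned file itself.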

Full-text OCR is only needed when searches must go beyond metadata, whether it's for other types of information like “Seminole County bridge” or for someone data mining an archive. Otherwise, OCR can add unnecessary cost.

Still Have Questions?

Give us a call at (800) 956-9000 to learn what you really need to instantly find your electronic documents after they’re scanned. We can also help you build a document manifest and think through the right metadata.
