File indexing and search are often the Achilles' heel of document and enterprise content management systems. Image-based files are easily lost: our research indicates that up to 30% of stored files are invisible to search. If you can't find a document in your repository, you aren't getting the full return on investment these systems should deliver. However, knowing you have a problem is the first step on the road to recovery.
What are the file types invisible to search?
The culprits usually turn out to be image-based documents: JPGs, TIFFs, PNGs and image PDFs. Often, they are scanned invoices or client IDs, email attachments, or documents bulk-imported as part of a merger or acquisition. Unless optical character recognition (OCR) is applied to them, these documents remain image files with no text layer, so their contents are never indexed and they become invisible to search.
Mobile technology, document ingestion, and staff workarounds have punched huge holes in OCR'ing processes and workflows. This poses significant risks to all kinds of businesses, though perhaps especially so in the legal industry.
Make your files 100% searchable
OCR software assesses each document to determine whether it is image-based and needs text recognition. contentCrawler is one such application; it adds the all-important text layer to image-based documents so that they can be indexed and found by search engines.
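To make the assessment step concrete, here is a minimal sketch of how a triage check might decide which documents need OCR. It is an illustration only, not contentCrawler's actual logic: the `Document` type, the `needs_ocr` and `invisible_share` helpers, the extension list, and the 20-character threshold are all assumptions made for this example.

```python
from dataclasses import dataclass
from pathlib import PurePath

# Assumption: these extensions are always treated as image-based.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".tif", ".tiff", ".png"}


@dataclass
class Document:
    """Hypothetical record: a file name plus any text the indexer already extracted."""
    name: str
    extracted_text: str = ""


def needs_ocr(doc: Document, min_chars: int = 20) -> bool:
    """Return True when the document looks image-based and has no usable text."""
    suffix = PurePath(doc.name).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return True
    if suffix == ".pdf":
        # A PDF with (almost) no extractable text is assumed to be a scanned image.
        return len(doc.extracted_text.strip()) < min_chars
    return False


def invisible_share(docs: list[Document]) -> float:
    """Fraction of documents that a text search could not find."""
    if not docs:
        return 0.0
    return sum(needs_ocr(d) for d in docs) / len(docs)
```

Running `invisible_share` over a sample of the repository would give a rough estimate of how many files are hidden from search, under the assumptions above.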
Knowing where in the workflow to apply an OCR framework is crucial to its success. contentCrawler runs as a backend process rather than a frontend one, which delivers huge benefits in terms of efficiency, searchability, and cost savings. A backend approach to OCR'ing ensures that all documents are made searchable once they are saved into the content repository, irrespective of the entry point.
contentCrawler works in two modes: one monitors newly profiled documents so that they are OCR'ed and made available for indexing immediately; the other OCRs all the legacy documents already in the system.
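The two modes can be pictured with a short sketch. This is a hypothetical illustration of the general pattern (a queue for new documents plus a sweep over the backlog), not contentCrawler's implementation; the `OcrPipeline` class, its method names, and the dictionary-based repository are all invented for this example.

```python
from collections import deque


class OcrPipeline:
    """Hypothetical backend OCR pipeline with a 'new documents' and a 'legacy' mode."""

    def __init__(self, repository, ocr):
        self.repository = repository  # assumed shape: doc id -> {"text": str or None}
        self.ocr = ocr                # assumed callable: doc id -> recognized text
        self.incoming = deque()       # ids of documents saved since the last run

    def on_document_saved(self, doc_id):
        """Called whenever a document lands in the repository, whatever the entry point."""
        self.incoming.append(doc_id)

    def process_new(self):
        """Mode 1: OCR newly saved documents so they can be indexed right away."""
        processed = []
        while self.incoming:
            doc_id = self.incoming.popleft()
            if self._ocr_if_needed(doc_id):
                processed.append(doc_id)
        return processed

    def process_legacy(self, batch_size=100):
        """Mode 2: sweep existing documents, handling up to batch_size per run."""
        processed = []
        for doc_id in list(self.repository):
            if len(processed) >= batch_size:
                break
            if self._ocr_if_needed(doc_id):
                processed.append(doc_id)
        return processed

    def _ocr_if_needed(self, doc_id):
        record = self.repository[doc_id]
        if record.get("text"):
            return False  # already searchable, nothing to do
        record["text"] = self.ocr(doc_id)
        return True
```

Because both modes share the same check, a document that already has text is skipped, so the legacy sweep can run repeatedly without redoing work.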
Request a free audit today to see how many invisible files are lurking in your DMS or ECM.