DocsCorp releases contentCrawler 2.1: faster processing, easier administration and automated reporting

DocsCorp (www.docscorp.com), a global leader in document productivity software for enterprise content management systems, today announced the release of contentCrawler 2.1, the newest version of its integrated analysis, processing and reporting software that gives document management professionals the peace of mind of knowing that their content is 100% searchable.

Used by a wide variety of organizations such as Marshall Dennehey, Cuatrecasas Gonçalves Pereira, Hugh James and the Law Society of British Columbia, contentCrawler’s versatile automated end-to-end process intelligently examines image-based documents in a content repository and converts them to searchable PDFs, making them available to search technologies for indexing. The contentCrawler 2.1 release includes several usability and performance enhancements.

“I upgraded directly with no problems.  Former documents were transferred to the new version with no problems at all,” reported Pedro Monteiro, Support at Truewind-Chiron.  “This new version is faster, more informative and much lighter process wise to the server.  Documents are assessed and saved 10x faster comparing to the old version we had.”

Faster processing

Multi-OCR processing - contentCrawler uses multi-threading to take advantage of 4, 8, 16 and 32 CPU cores. For example, with 4 CPU cores, contentCrawler will be able to OCR 1 page per second, or roughly 85,000 pages per day. With 16 CPU cores, it will be capable of OCR'ing 4 pages per second, or up to 350,000 pages per day. This represents a significant improvement over other OCR solutions, and contentCrawler remains unique in its ability to OCR documents already stored in a DMS.
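
The daily figures quoted above follow from the per-second rates by simple arithmetic, assuming OCR runs continuously for 24 hours. The short Python sketch below is illustrative only and is not part of contentCrawler; the function name pages_per_day is hypothetical. It shows that 1 and 4 pages per second work out to 86,400 and 345,600 pages per day, which the release rounds to 85,000 and 350,000.

```python
# Illustrative arithmetic only; not part of contentCrawler.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def pages_per_day(pages_per_second: float) -> int:
    """Upper bound on pages processed in 24 hours of continuous OCR."""
    return round(pages_per_second * SECONDS_PER_DAY)


# Rates taken from the release: 4 cores at 1 page/s, 16 cores at 4 pages/s.
for cores, rate in [(4, 1.0), (16, 4.0)]:
    print(f"{cores} CPU cores: {pages_per_day(rate):,} pages per day")
```

Real-world throughput would depend on document complexity and server load, so these values are best read as upper bounds.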

File type filters - New file type search filters give users greater control over the document types that can be processed. To decrease processing time, users can exclude certain document types from the search, including those saved as email message attachments.

Easy administration and reporting

Set up Service email notifications - Users can set up email notifications that report on the progress of the crawl and can request that the Service Statistics and Error reporting be emailed to them.

Monitor progress status - Users can instantly see the progress of individual documents being processed at the OCR stage, displayed as a percentage.

Document information display - Provides document information such as the total page count and size of the documents being processed, as well as the overall total size of documents requiring OCR.

Configurable Multilingual OCR - Users can easily configure multilingual OCR across all services. contentCrawler supports over 180 languages.

Export Report - Users can export processing reports as CSV files for analysis and review.

Configurable minimum disk space limit - Users can specify a minimum free-space threshold for the document cache directory.
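
To illustrate what a minimum free-space threshold means in practice, the Python sketch below checks whether the drive holding a cache directory still has at least a given amount of free space. It is a generic example under stated assumptions and does not reflect how contentCrawler implements this setting; the function name, the directory path and the 5 GB threshold are all hypothetical.

```python
import shutil

# Hypothetical helper; not contentCrawler's actual implementation.
def has_enough_free_space(cache_dir: str, min_free_bytes: int) -> bool:
    """Return True if the drive holding cache_dir has at least min_free_bytes free."""
    free_bytes = shutil.disk_usage(cache_dir).free
    return free_bytes >= min_free_bytes


# Example values (assumptions): current directory, 5 GB threshold.
MIN_FREE_BYTES = 5 * 1024**3
if has_enough_free_space(".", MIN_FREE_BYTES):
    print("Enough free space to continue processing.")
else:
    print("Free space is below the threshold; processing should pause.")
```

A check like this would typically run before each document is written to the cache, so that processing stops before the disk fills.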

20% of documents in content repositories are invisible to search

contentCrawler was developed to address the very real and serious issue of non-searchable content in enterprise content management systems. More than 20% of documents in a content repository are "invisible" to search technology.

These documents are often profiled as a result of ingestion of legacy or litigation documents, saving emails with attachments, mobile technology and employee workarounds that bypass the OCR'ing process. Failure to produce documents on demand impacts the bottom line, workplace efficiency, regulatory compliance, and productivity, and exposes an organization to unnecessary risks.

Download the contentCrawler 2.1 trial to see how much non-searchable content is in your content repositories. Email info@docscorp.com for more information.

contentCrawler integration

contentCrawler integrates with HP Autonomy WorkSite, HP Records Manager (formerly HP TRIM), OpenText eDOCS DM, ProLaw and MS SharePoint, as well as MS Windows file systems. Integration with OpenText Content Server and Worldox will be available soon.

Media contacts

Kerry Carroll
Global Marketing and PR
+61 (0)2 8270 8500
kerry.carroll@docscorp.com

Melody Easton
EMEA Marketing Manager
+44 20 7084 6270
melody.easton@docscorp.com

Corinne Tippett
North America Marketing Manager
+1 (503) 406 2575
corinne.tippett@docscorp.com