contentCrawler cloud For Bulk Image Conversion
contentCrawler cloud intelligently assesses documents in a document management system for bulk processing. It is a multithreaded automated solution that runs 24/7 without intervention. There is no need for any other OCR’ing or compression hardware or software.
Because processing happens directly in the document library, there is no impact on staff workflows or processes. Staff continue to upload documents into the document content repository without worrying about OCR as a process or a workflow.
contentCrawler cloud is powered by Microsoft Azure and currently integrates with iManage Cloud, Microsoft SharePoint and NetDocuments. These are also cloud-based solutions.
- Assesses and analyzes documents in a content repository for OCR and/or compression processing
- Processes image-based documents such as TIF, JPG, PNG and image PDFs
- Converts image-based documents to text-searchable PDFs adding a text layer for enhanced searching
- Reduces image-based document file size using a variety of JPEG compression standards
- Processes image-based attachments in emails
- Applies configurable compression and text thresholds to optimize processing, ignoring documents that do not meet the requirements
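As a rough illustration of how text and compression thresholds like those described above could decide which documents get processed, here is a minimal Python sketch. It is not contentCrawler's actual logic or configuration: the `Document` and `Thresholds` types, the field names, and the default threshold values are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass

# Image formats named in the feature list. PDFs are handled separately,
# because only image PDFs (little or no text layer) qualify for OCR.
IMAGE_EXTENSIONS = {"tif", "tiff", "jpg", "jpeg", "png"}


@dataclass
class Document:
    name: str
    extension: str               # lowercase, without the dot
    size_bytes: int
    text_chars_per_page: float   # hypothetical measure of the existing text layer


@dataclass
class Thresholds:
    # Both defaults are assumptions made for this sketch.
    max_text_chars_per_page: float = 50.0   # below this, a PDF counts as image-based
    min_size_bytes: int = 500_000           # smaller files are not worth compressing


def needs_ocr(doc: Document, t: Thresholds) -> bool:
    """Image files always qualify; PDFs qualify only if they have little text."""
    if doc.extension in IMAGE_EXTENSIONS:
        return True
    return doc.extension == "pdf" and doc.text_chars_per_page < t.max_text_chars_per_page


def needs_compression(doc: Document, t: Thresholds) -> bool:
    """Image and PDF files qualify once they reach the size threshold."""
    is_candidate = doc.extension in IMAGE_EXTENSIONS or doc.extension == "pdf"
    return is_candidate and doc.size_bytes >= t.min_size_bytes


def plan(doc: Document, t: Thresholds) -> list[str]:
    """Return the processing steps for a document; an empty list means 'ignore'."""
    actions = []
    if needs_ocr(doc, t):
        actions.append("ocr")
    if needs_compression(doc, t):
        actions.append("compress")
    return actions
```

With the assumed defaults, a 2 MB scanned TIF would be planned for both OCR and compression, while a small PDF that already has a full text layer, or a Word document of any size, would be ignored.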
The contentCrawler OCR module converts image-based documents to text-searchable PDFs, saving them back into the Content Repository as new or replacement documents, ready to be indexed and found.
The contentCrawler Compression module compresses image and PDF documents. Converting image documents to PDF and applying compression and downsampling to the files will reduce overall file size.
IT Administrators are able to combine the OCR and Compression modules into a single service.
Cloud to Cloud
Running on Microsoft Azure, contentCrawler integrates with Microsoft SharePoint Online and NetDocuments. These are also cloud-based solutions.
- Processing is faster and more secure since files are never downloaded to local machines.
- Operating in the cloud also means no on-premises infrastructure is necessary.
- Provisioning of the software is very fast and occurs within minutes.
- contentCrawler on Microsoft Azure comes preconfigured and ready to run in Audit mode, providing insight into how much non-searchable content exists in your Document Management System cabinets and libraries.
Trial Audit Mode
contentCrawler can be run in Audit mode to show you how much non-searchable content is in your content repositories. This will provide you with the numbers to build the business case to solve the problem. The audit will run for 48 hours.
contentCrawler cloud Insights
Save up to 240 hours a year per person in lost productivity looking for missing or invisible documents
contentCrawler can run on 4, 8, 16 or 32 CPU cores for faster processing. It OCRs 2 pages per second on 8 CPU cores.
contentCrawler finds 30% more documents than your document management search technology
Save up to 120 hours per year per person OCR’ing documents
Run fully-automated OCR processing 24/7, with no staff intervention needed
Can OCR up to 17,000 pages per day
Over three million documents OCR’ed
Josh Schreiner, Workman Nydegger IT Director, explains how contentCrawler automated the process of OCR’ing over three million legacy documents in iManage Work to make them 100% searchable.
In a profession that is overwhelmed by paperwork, it’s not unusual for some of it to end up lost in a document management system. contentCrawler is our data discovery solution that helps users find files that normal search technology cannot. This ensures our users comply with requests from clients and tax authorities to hand over specific documents or face harsh penalties.
Large volumes of documents are created and managed in the Financial Services industry. contentCrawler ensures all documents can be found and produced when regulations require it. This includes financial statements, contracts and agreements, credit profiles, and loan agreements.
Additionally, when users have full access to files, they can analyze all available data to identify customer characteristics and determine the best offers to present to prospects.
Government employees manage applications, licenses, certificates, reports, contracts, tax documents, and more every single day. contentCrawler ensures government departments are compliant with regulations by making all documents discoverable through search.
Governments have to comply with legislative agreements covering how information is processed and delivered, minimum response times to information requests, and the non-disclosure of private or confidential information.
Law firms require fast and reliable access to documents to be both productive and diligent in the advice they give. Failure to find documents can have serious implications: reputational and financial damage as well as conflicts of interest. contentCrawler ensures legal professionals find the documents they need.
Since much of the work is regulated by government and industry bodies, Life Science companies need to be able to produce documents on demand. Failure to do so can lead to serious fines and penalties. contentCrawler ensures all documents are 100% searchable and retrievable, reducing the risk of non-compliance or lost productivity looking for lost or misfiled documents.
Resources and Energy
Large engineering and resources projects can involve hundreds of thousands of files, including drawings, operations documentation, equipment specifications, and user manuals. contentCrawler ensures all these documents are retrievable.
Since many of the documents will be image-based documents, they are “invisible” to search engines. Failure to find critical documents can have serious impacts on these projects.
What Our Customers Say
"Every time we ran a search more than 1/3 of the documents would not be returned. This was an issue for us going forward with the scanning project."
IT Director, Hugh James
"There were no new processes or staff training required. Everything just worked in the background. Staff members were completely unaware of any changes other than the fact that more documents started to show up in the search results."
System Engineer, AJ Park
"We can control the OCR (optical character recognition) workflow on documents generated internally, but there was no tool or workflow to automatically capture and convert image-based documents from outside sources and profile them into iManage."
Manager of Application Services, Marshall Dennehey Warner Coleman & Goggin
Local Support. Global Reach.
We have support teams based all over the world to assist you with any questions or difficulties you may be experiencing. Support is available to our users 24 hours a day, 5 days a week to ensure we can get you back up and running as soon as possible.
You can submit a support ticket online through the Resource Portal, contact us directly via email or phone, or chat with us on social media.
DocsCorp is a leading provider of productivity software for document management professionals. Our offices and products span the globe with over 500,000 users in 67 countries. Our clients are well known and respected global brands that rely on our software every day.