contentCrawler, our integrated analysis, processing, and reporting framework, now works with Microsoft SharePoint Online and is available for immediate download from the Microsoft Azure Marketplace.
contentCrawler searches SharePoint Online libraries for image-based files such as TIFFs and scanned PDFs, including those attached to emails. It then OCRs these documents and profiles the resulting text-searchable documents back into SharePoint with minimal intervention. contentCrawler can process documents 24/7, taking advantage of multithreaded processing.
A cloud-to-cloud OCR and compression solution
contentCrawler and SharePoint Online are both cloud-based solutions, which means processing is faster and more secure, as files aren't downloaded to local machines. Operating in the cloud also means on-premises infrastructure is unnecessary. The software is provisioned within minutes. contentCrawler running in Microsoft Azure comes preconfigured and ready to run in Audit mode, which shows how much of the content in your SharePoint Online libraries consists of static images.
“contentCrawler enhances the searchability of images stored in SharePoint Online. The contentCrawler free audit process enables the CIO to assess the quantity of content in their system that can benefit from this solution,” said Shane Barnett, DocsCorp CTO and Co-Founder.
contentCrawler modules: OCR and Compression
contentCrawler currently supports two services: OCR and Compression. The Compression module identifies documents that can be compressed to free up storage space. IT administrators can combine contentCrawler modules into a single, multi-process service for even greater efficiency and productivity. For example, a combined OCR and Compression service would locate all the image-based documents in SharePoint Online, OCR them, and convert them to smaller, text-searchable PDFs.