contentCrawler helps find critical documents and drawings, ending engineers’ frustration with DMS

Published on December 18, 2014 by kerryc
PTTEP AA is a wholly owned subsidiary of PTT Exploration and Production (PTTEP), the Thai national petroleum exploration and production company. In Australia, PTTEP AA is the operator of the producing Montara oil field and the Cash Maple gas field in the Timor Sea. PTTEP AA employs more than 300 people based in Perth, Darwin and the Timor Sea.

The Montara oil field is located in the Timor Sea, 180km from the north Kimberley coast off Western Australia. The development includes an unmanned four-legged well-head platform and the Montara Venture, a Floating Production Storage and Offloading (FPSO) vessel with up to 850,000 barrels of storage.

The business challenge

PTTEP AA had commissioned the construction of the Montara Venture FPSO vessel, awarding contracts to various suppliers. Documents from contractors and suppliers were stored in a leading document management system. All in all, there were approximately 500,000 documents relating to the project. To access documents, drawings or information relating to the project, engineers would enter a part or tag number into a search field in the document management system and be presented with all the relevant documentation.

“At least that was how it was supposed to work,” recalled Trina Ireland, PTTEP AA Information Management Team Leader. “We very quickly discovered that many of the documents supplied by vendors, contractors and subcontractors were in fact image-based documents - JPEG, TIFF, PNG and image-based PDFs.”

These types of files cannot be indexed because they contain no text. They are essentially pictures and were in effect “invisible” to the document management system index engine. The engineers grew increasingly frustrated with the system, to the point that they were starting to lose confidence in it. They turned to PTTEP AA’s Document Controllers to find the documents they needed, which led to inevitable delays.

“We looked to our document management system consultants for a solution. They recommended contentCrawler from DocsCorp,” explained Trina.

contentCrawler is an integrated analysis, processing and reporting framework. It intelligently assesses image-based documents in a content repository for batch conversion to text-searchable PDF documents, which can be saved back into the content repository as a new version or as a replacement for the original. Converting image-based documents to text-searchable PDFs can be an automated end-to-end process or a manual one with built-in “Hold for Review” stages. Processing can run in either or both of two modes: Convert Backlog (legacy documents) or Active Monitoring (newly profiled documents).

Our solution

PTTEP AA set up a development lab to test contentCrawler. Non-searchable documents were added to the library as control documents. Testing was conducted over a period of one month, at the end of which PTTEP AA decided to deploy contentCrawler in the production environment.

The document management system environment consisted of three libraries. PTTEP AA decided to run contentCrawler on each of them in turn to clear the backlog. Once the backlog was complete, the team switched to Active Monitoring mode for newly profiled documents. contentCrawler can also run in both modes simultaneously.

PTTEP AA ran contentCrawler as an automated process, replacing each original with a text-searchable PDF. Trina recalls, “it really was a set and forget operation – it just worked in the background with little or no intervention from the team.”

In addition, IT Administrators can install and configure contentCrawler from the centralized monitoring and reporting dashboard. Administrators can also set up email notifications to report on the progress of the crawl, requesting that the Service Statistics and Error reporting be emailed to them. They can export processing reports as CSV files for analysis and review.

While solving the problem with contentCrawler proved to be a fairly straightforward process, PTTEP AA had a much bigger challenge ahead. Many of the engineers had given up on the document management system because they couldn’t find what they were looking for. The company had to “re-educate” the engineers and reassure them that the issue had been resolved and that they would be able to find everything they were looking for.

Other benefits

“contentCrawler really complemented the document management system product, whose reputation had taken a beating. contentCrawler went a long way to restoring everyone’s faith in the product,” concluded Trina.

In summary

Engineers at PTTEP AA, a wholly owned subsidiary of PTT Exploration and Production, were experiencing increasing delays and frustration when retrieving documents from the company’s document management system. Many of the 500,000 documents relating to the Montara Venture FPSO vessel supplied by contractors were discovered to be image-based documents and therefore invisible to search. On the recommendation of its document management system consultants, PTTEP AA deployed DocsCorp’s contentCrawler, a fully automated, end-to-end solution that resolved the problem by finding the image-based documents and converting them to text-searchable PDFs. More importantly, it went a long way to restoring faith in the document management system.