- Find legacy documents that were not searchable
- Make undiscoverable documents in their enterprise content management system (ECM) searchable
- Create a new workflow for document contributors to prevent the creation of non-searchable files
- Implement an OCR workflow that did not add complexity or disruption to staff
contentCrawler as a solution
- Reduced the time spent searching for hidden documents
- Increased user confidence in the ECM with more accurate search results
- Provided a backend process which automatically OCRs files going in
- Uncovered hidden files already in their ECM and processed them through the OCR module
- Supports faster, more efficient processes that reduce the amount of time spent OCR’ing documents
- A more efficient OCR process allows the firm to service clients faster
- Runs as a fully-automated OCR processing framework 24/7 with no staff intervention required
- Is a cost-effective solution that does the job of two separate products; it discovers existing invisible documents while simultaneously and automatically OCR’ing new documents added into the ECM
About SITE Centers
SITE Centers is an owner and manager of value-oriented shopping centers concentrated in high
barrier-to-entry markets with stable population and high growth potential. SITE Centers is a self-administered and self-managed REIT operating as a fully integrated real estate company and is publicly traded on the New York Stock Exchange.
Challenge: Hidden documents impacting the user's trust in the content management system
Kim Scharf is Vice President of IT Enterprise Services at SITE Centers. Scharf noted that, "As a real estate
company, our business is driven by documents. These documents represent agreements with our
tenants, supplier contracts and various other business-critical records. SITE Centers needed to digitize
these critical business documents for ease of access across the enterprise." As such, they invested
in the OpenText Enterprise Information Management (EIM) suite of products including OpenText
Content Server to serve as their records-managed document repository.
The challenge that Scharf and her team faced was a significant number of documents in their
content management system that were not OCR'd. This led to potential issues with performing
full-text searches across the repository and some documents not being returned in search results.
These documents were image-only files such as PNG, TIFF and image PDF documents with
no recognizable text for the search index. Scharf explained, "We recognized this as a potential
technology liability which was impacting the credibility of the content management system and
needed to be resolved as a priority."
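The distinction between searchable and image-only files described above can be illustrated with a minimal Python sketch. It is not how contentCrawler or OpenText Content Server detects such files; the `Document` type, the `needs_ocr` function and the extension list are hypothetical names chosen for illustration, and the check assumes an indexer has already tried to extract text from each file.

```python
from dataclasses import dataclass

# Hypothetical list of image-only formats, based on the examples named in the text.
IMAGE_EXTENSIONS = (".png", ".tif", ".tiff")


@dataclass
class Document:
    name: str
    extracted_text: str  # text an indexer could extract; empty if none was found


def needs_ocr(doc: Document) -> bool:
    """Return True when a document has no recognizable text for a search index."""
    lower_name = doc.name.lower()
    # Pure image formats never carry a text layer.
    if lower_name.endswith(IMAGE_EXTENSIONS):
        return True
    # A PDF is treated as image-only when no text could be extracted from it.
    if lower_name.endswith(".pdf"):
        return not doc.extracted_text.strip()
    # Other formats are assumed (for this sketch) to be text-searchable already.
    return False
```

Under these assumptions, a scanned lease stored as `lease.tiff` or as a PDF with no text layer would be flagged, while a PDF that already contains text would be left alone.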
Christopher Barrett, Director of IT Enterprise Services for SITE Centers, recalled: "We attempted to develop
an in-house solution to resolve this problem in the past. However, we encountered several technical
hurdles. Recently, we discovered a product from DocsCorp that offers direct integration between
OpenText Content Server and DocsCorp's contentCrawler OCR software. We contacted DocsCorp
and put a proof of concept together."
Barrett noted that the recent addition of OpenText Enterprise Connect at SITE Centers was
compounding the issue of non-OCR'd documents being contributed to OpenText Content Server.
Document contributors would drag and drop documents directly into Content Server rather than
contributing them through the existing enterprise scanning solution, so those documents bypassed the
OCR process. According to Barrett, "contentCrawler fixes the problem by OCR'ing documents on
the back end – the document contributor does not have to do anything differently. Our goal was
to not add complexity and disrupt our employees' business processes. Fortunately, we were able
to meet our goal and address the problem from a technology perspective with contentCrawler."
Solution: A dual-mode OCR framework to process legacy and new files simultaneously
Implementation of contentCrawler at SITE Centers was fairly straightforward and quick. Within four
weeks, the company had a "proof of concept" set up where contentCrawler was integrated with
Content Server in a non-production environment. The production implementation took another
month to complete.
Now that the initial install is complete, contentCrawler is working through hundreds of thousands
of legacy documents in batches divided by year, which avoids overloading the processors and inhibiting
business productivity. New content added to Content Server is examined by contentCrawler to make sure
it is OCR'd and text-searchable. For audit purposes, SITE Centers must show that the OCR'd copy is a
separate version, so a completely separate document is created.
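The backlog approach described above (batches divided by year, with the OCR'd result kept as a separate document) can be sketched in Python as follows. This is an illustration under stated assumptions, not contentCrawler's implementation: `Record`, `batches_by_year`, `ocr_as_separate_copy` and `process_backlog` are hypothetical names, and the OCR step is a placeholder that only marks the copy as searchable.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Record:
    doc_id: str
    year: int
    searchable: bool


def batches_by_year(records: Iterable[Record]) -> Dict[int, List[Record]]:
    """Group non-searchable records by year, oldest year first."""
    grouped: Dict[int, List[Record]] = defaultdict(list)
    for record in records:
        if not record.searchable:
            grouped[record.year].append(record)
    return dict(sorted(grouped.items()))


def ocr_as_separate_copy(record: Record) -> Record:
    """Placeholder OCR step: returns a new record and leaves the original untouched."""
    return Record(doc_id=f"{record.doc_id}-ocr", year=record.year, searchable=True)


def process_backlog(records: Iterable[Record]) -> List[Record]:
    """Process one year's batch at a time so the load stays bounded."""
    copies: List[Record] = []
    for _year, batch in batches_by_year(records).items():
        copies.extend(ocr_as_separate_copy(record) for record in batch)
    return copies
```

Keeping the original record immutable and returning a new one mirrors the audit requirement that the OCR'd copy exist as a separate document.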
SITE Centers worked closely with DocsCorp technical support professionals to complete the successful
installation of contentCrawler and integration with OpenText Content Server. Scharf and Barrett
both commented that DocsCorp had fabulous technical and customer support. According to
Scharf: "Working with DocsCorp, the spirit of partnership is definitely there. We appreciate it when
an external vendor recognizes and collaborates with our internal IT talent. Our experience with
DocsCorp was refreshing and we are perfectly satisfied contentCrawler customers."