Case Studies
LexisNexis efficiently transforms operations and improves the accuracy, speed and completeness of content acquisition by partnering with Informatica to deploy a massive-scale content collection and conversion platform as a key element of its strategic architecture.
LexisNexis faces the challenge of assimilating information from billions of documents imported from nearly 50,000 sources. The content spans a vast variety of information, including state and federal government material, news publishers such as the New York Times and Cable News Network, private company and financial information providers such as Dun & Bradstreet, and public records from around the globe. Partners supply LexisNexis with both structured and unstructured data in formats as varied as Microsoft Office documents, Portable Document Format (PDF) files, individual database records, and plain text files. In the past, ingesting all this content was a largely ad hoc process requiring custom coding across multiple parts of the LexisNexis organization. The cost and time needed to accommodate the rising volume of unstructured source data limited business growth.
To improve the onboarding process and speed up the delivery of more and newer content to customer-facing products, LexisNexis embarked on a project to deploy a single integrated content collection and conversion platform that would improve the accuracy, completeness and speed of content acquisition and delivery.
In parallel, LexisNexis focused on improving operational efficiency by redesigning its data architecture to optimize the way content is structured and stored. The company adopted a modularized XML schema approach, allowing the content acquisition process to be quickly tailored to match the characteristics of each individual source content provider. A library of templates for processes, interfaces, and workflows was created to support efficient onboarding of diverse and volatile sources.
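The idea of tailoring acquisition to each source through reusable templates can be illustrated with a minimal Python sketch. It is not the LexisNexis or Informatica implementation. The `SourceTemplate` class, the `convert` function, the field names, and the target element names are all hypothetical, chosen only to show how a per-source field mapping might feed a shared conversion step that emits a common XML structure.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class SourceTemplate:
    """Hypothetical per-source template: maps source fields to target XML elements."""
    source_id: str
    field_map: Dict[str, str]  # source field name -> target element name


def convert(record: Dict[str, str], template: SourceTemplate) -> str:
    """Convert one source record to the (assumed) target XML shape."""
    root = ET.Element("document", attrib={"source": template.source_id})
    for src_field, target_tag in template.field_map.items():
        if src_field in record:
            # Trim stray whitespace so every source yields uniform text.
            ET.SubElement(root, target_tag).text = record[src_field].strip()
    return ET.tostring(root, encoding="unicode")


# Onboarding a new source means defining a new template, not new conversion code.
news_template = SourceTemplate("news-feed", {"headline": "title", "story": "body"})
xml_out = convert(
    {"headline": "  Example headline ", "story": "Example text."},
    news_template,
)
print(xml_out)
```

In this sketch the conversion logic is written once and each provider contributes only a small declarative mapping, which is one plausible reading of how a template library keeps onboarding effort low.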
As originally published on the Informatica.com website
The use of a single integrated platform with enhanced data handling capabilities enabled content to be rapidly collected and transformed to the target XML format. The calendar time required to establish a new content source and onboard it into production operation has decreased by 30 percent. The standardized platform has also reduced maintenance costs by nearly 50 percent.
LexisNexis has successfully implemented collection and conversion processes for hundreds of millions of documents arriving from thousands of sources, transforming a variety of incoming formats into strategic data structures optimized for storage, editorial use, and customer search and retrieval.
The resulting infrastructure uses a set of highly modular components, reusable sessions, and a library of templates, including user interfaces and workflow definitions. The significant improvement in the efficiency of content collection and transformation is reducing the ongoing operational cost of creating richer information for customers.
Created library of reusable transformation modules to expedite content conversion
Improved timeliness, accuracy and completeness of content enhance customer experience and differentiate offerings in the marketplace