How to Accurately Migrate Massive Amounts of Engineering Data in Minutes
Data is an asset, and operators know that digitizing legacy engineering data, moving it into company-managed platforms, and maintaining quality data-centric systems in-house are necessary to ensure competitive advantage and mitigate risk. However, digital transformation can be a tedious and arduous process, not for the faint of heart. The problem is that legacy engineering data is typically not structured in neatly labeled columns and rows, or rich with metadata tags, as it is in many other domains. Instead, engineering data can take the form of PDFs, CAD files, Excel workbooks, Word documents, napkins, and other ambiguous formats. In most cases, these unstructured documents or renderings carry little or no metadata at all.
To add to the confusion, not only does legacy data come in various file types, but even within the same file type there can be numerous formats for the same asset. Consider what happens when digitizing data for common in-service assets, such as control valves. A facility could have hundreds of control valves installed by several different vendors or contractors, each with its own way of classifying and labeling the valve's specifics. As a result, the same data point is labeled differently across the specification documentation: one vendor may call a data point "type," another may call the same data point "material," and yet another might call it "metal." Before this data can be migrated to digitize the facility, it must be normalized, mapped, and rationalized.
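In practice, that rationalization step often boils down to mapping every vendor-specific label onto one canonical schema. The short Python sketch below illustrates the idea; the canonical field names and synonym lists are illustrative assumptions, not taken from any particular vendor's documentation.

```python
# Minimal sketch of field-name rationalization: mapping each vendor's
# label for the same data point onto one canonical attribute.
# All field names and synonyms here are illustrative assumptions.

CANONICAL_FIELDS = {
    "body_material": {"type", "material", "metal", "body matl"},
    "valve_size": {"size", "nominal size", "nps"},
    "pressure_class": {"class", "rating", "ansi class"},
}

def rationalize(record: dict) -> dict:
    """Rewrite one vendor record into the canonical schema.

    Unrecognized fields are kept under a flagged name so a reviewing
    engineer can map them later rather than silently lose data.
    """
    out = {}
    for raw_key, value in record.items():
        key = raw_key.strip().lower()
        for canonical, synonyms in CANONICAL_FIELDS.items():
            if key in synonyms:
                out[canonical] = value
                break
        else:
            out["unmapped:" + raw_key] = value
    return out

# Two vendors describing the same valve differently:
print(rationalize({"Material": "316 SS", "Size": '2"'}))
print(rationalize({"Metal": "316 SS", "NPS": '2"'}))
```

Both records come out identical under the canonical schema, and keeping unmapped fields visible, rather than dropping them, is what lets engineers catch the issues a purely automated pipeline would miss.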
There are traditionally two approaches to data rationalization: manual and automated. Unfortunately, both approaches are flawed in their outcomes and in the way progress is measured: metrics track how quickly the data moves from one system to another, with no appreciation for the time it takes to properly tag and map the data.
The first is the laborious and costly approach of hiring an engineering company to rationalize and migrate the data manually. Of the two traditional methods, this results in the highest data integrity. However, this approach is slow, and the interruption to business can be drawn out and frustrating, often taking months to complete. Additionally, because it is a lengthy manual effort performed by highly skilled resources, it is often cost-prohibitive.
The second is an automated approach, which can rely on A.I. to rationalize and migrate the data. Unfortunately, automated data migration is often done by well-meaning software vendors and I.T. consulting companies not trained in the uniqueness of engineering data, who naively underestimate the effort required to digitize legacy engineering, facilities, and asset data. With little understanding of the intricacies of engineering data, they believe the same approach they would apply in other domains will work here. They will say, "It is easy; we will put it in a data lake, then A.I. and Big Data programs will do the work." But is that true? Technically, yes, the data may be there, but can you find it when you search for it? And in the case of documents and drawings without metadata, the difference between putting them in a data lake and putting them in a black hole is negligible. Additionally, the lack of engineering-specific knowledge prevents these vendors from spotting issues during review. The outcome is low data integrity, resulting in low trust and increased risk. The upfront cost is lower, but the clean-up that follows such an approach is expensive, with overall costs exceeding those of the more laborious manual approach.
There is another approach that marries the two traditional methods: it uses technology to automate the transfer while relying on engineers to rationalize the data and map the outcomes. The result is a migration with high data integrity at a cost far lower than the traditional manual approach, with brief or no business interruption. The secret is in the upfront mapping and rationalization. By assessing each data format, intelligently tagging and mapping it to the new digital environment, and sample-testing along the way, the team produces quality data that is trustworthy.
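The sample testing mentioned above can be as simple as pulling a random slice of migrated records and comparing them field by field against the source. Here is a minimal sketch, assuming the source and target systems share a common record key; the fetch callables and the 1% default sample rate are illustrative assumptions, not a real migration API.

```python
import random

def sample_test(record_ids, fetch_source, fetch_target,
                fields, rate=0.01, seed=42):
    """Spot-check a random sample of migrated records field by field."""
    rng = random.Random(seed)
    sample_size = max(1, int(len(record_ids) * rate))
    mismatches = []
    for record_id in rng.sample(list(record_ids), sample_size):
        src = fetch_source(record_id)
        dst = fetch_target(record_id)
        for field in fields:
            if src.get(field) != dst.get(field):
                mismatches.append(
                    (record_id, field, src.get(field), dst.get(field)))
    return mismatches  # an empty list means the sample passed

# Toy usage with in-memory "systems"; real fetchers would query the
# legacy store and the new platform.
source = {1: {"body_material": "316 SS"}, 2: {"body_material": "WCB"}}
target = {1: {"body_material": "316 SS"}, 2: {"body_material": "WCC"}}
print(sample_test(source, source.get, target.get,
                  fields=["body_material"], rate=1.0))
```

Running checks like this throughout the mapping phase, rather than once at the end, is what lets problems surface while they are still cheap to fix.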
Measuring success by how quickly the migration starts, and judging progress by the migration rather than the rationalization, is a mistake. Patience is key; the adage "measure twice, cut once" applies. Invest the time in intelligently mapping the data, and the migration should be as simple as hitting an "easy button." We recently moved over 5.6 million cells in less than 18 minutes for a client, but it took three months to map and verify. The client is not only thrilled with the high-quality data now available to its engineers and key stakeholders; the project also came in under budget and ahead of schedule.
Data migration can indeed be as easy as hitting the "easy button." ProLytX is an Engineering I.T. firm based in Houston, TX, and a leader in this field, coaching clients to success with a unique combination of engineering and I.T. skills. If you want to learn more about ProLytX and how we can help you bridge the gap between I.T. and Engineering, find us at www.prolytx.com.