Massive amounts of operational data are coming online from an ever-increasing set of advanced devices and equipment, a movement often referred to as the Industrial Internet. Big data is the term for data from these systems, devices, and applications whose size makes it challenging to capture, manage, and process within a tolerable period of time using traditional software solutions.
Businesses everywhere, including industrial enterprises, face mounting pressure to stay competitive with data-driven strategies. These strategies require ever more data, which results in the accumulation of larger and larger data sets. In addition, evolving and ever more stringent regulatory requirements necessitate the collection of more information as proof for audit and compliance purposes.
Beyond the Capability of Traditional Data Management Systems
The volume of data from which to extract value (ranging from a few dozen terabytes to many petabytes in a single data set) is beyond the capability of a traditional data management system. Moreover, the challenge of managing big data for industry goes beyond the sheer volume of information: the data is also diverse and complex, arriving in various formats and from disparate sources. There are typically “islands” of process information that must be aggregated, stored, and analyzed to derive context and meaningful value.
To leverage big data, industrial businesses need the ability to support different types of information, the infrastructure to store massive data sets, and the flexibility to use the information once it is collected and stored. Together these make possible the historical analysis of critical trends that underpins real-time predictive analysis. As businesses increasingly realize that much more of their value proposition is information-based, technologies that can address big data are quickly gaining traction.
Luckily for industrial companies, Google, Yahoo, and Facebook are pushing the envelope on big data needs. Their desire to analyze clickstreams, web logs, and social interactions has forced them to create new tools for storing and analyzing large data sets. One of those tools is Hadoop.
Hadoop for Industrial Data Sets
Hadoop is a rapidly evolving open source technology. It allows data storage to scale on commodity hardware by distributing data across many low-cost computers. Once data is distributed, new challenges arise in locating and processing it. These are addressed by MapReduce, a framework in which data is processed in parallel across many nodes in a cluster. Processing is mapped to the data where it resides, and the outputs for similar data elements are then reduced into a single result.
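To make the map and reduce steps concrete, the following is a minimal single-machine sketch in Python, not Hadoop's actual API. The sensor names, the readings, and the functions map_phase, shuffle, reduce_phase, and run_job are all hypothetical, chosen only for illustration. Each inner list stands in for a chunk of data held on a different node; on a real cluster the map calls would run in parallel where the data lives.

```python
from collections import defaultdict

# Hypothetical sensor readings, split into chunks as if each chunk
# were stored on a different node of a cluster.
chunks = [
    [("pump_a", 71.0), ("pump_b", 64.5)],
    [("pump_a", 73.0), ("pump_b", 65.5)],
    [("pump_a", 72.0)],
]

def map_phase(chunk):
    """Emit (sensor, (value, 1)) pairs for one chunk of readings."""
    return [(sensor, (value, 1)) for sensor, value in chunk]

def shuffle(mapped_chunks):
    """Group intermediate pairs by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for key, partial in pairs:
            grouped[key].append(partial)
    return grouped

def reduce_phase(key, partials):
    """Combine all (value, count) pairs for one sensor into a mean."""
    total = sum(value for value, _ in partials)
    count = sum(n for _, n in partials)
    return key, total / count

def run_job(chunks):
    # In this sketch the map calls run one after another; a cluster
    # would run them in parallel, one per chunk.
    mapped = [map_phase(chunk) for chunk in chunks]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, partials) for key, partials in grouped.items())

averages = run_job(chunks)
print(averages)
```

The map step touches each chunk independently, which is what lets the work spread across many machines, while the reduce step collapses everything emitted for one sensor into a single average.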
While Hadoop may hold great promise for handling large data sets, the complexity involved and the specialized skill set needed to create a Hadoop environment are often beyond the ability of industrial businesses. Yet these businesses still need to scale across the enterprise to handle the large sets of time-series data generated in manufacturing and other industrial operations.
For example, a manufacturing manager may want to understand how temperature variation affects quality as the flow rate of materials through a production line changes. Or a power plant supervisor may want to analyze five years of past data, examining anomalies and variations to see whether they were followed by outages, as a basis for predictive analysis. This level of operational insight requires the ability to quickly run a query against large data sets for specific time periods, a unique and powerful capability that calls for an industrial data solution.
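A query over a specific time period can be sketched as follows. This is a small Python illustration under stated assumptions, not how any particular product works: the timestamps and temperatures are made-up sample data, and query_range and anomalies are hypothetical helper names. It assumes readings are kept sorted by timestamp, so a binary search can locate the window without scanning the whole data set.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime
from statistics import mean

# Hypothetical hourly temperature readings, kept sorted by timestamp.
timestamps = [
    datetime(2013, 1, 1, 0, 0),
    datetime(2013, 1, 1, 1, 0),
    datetime(2013, 1, 1, 2, 0),
    datetime(2013, 1, 1, 3, 0),
    datetime(2013, 1, 1, 4, 0),
]
temperatures = [70.0, 72.0, 90.0, 74.0, 71.0]

def query_range(start, end):
    """Return (timestamp, value) pairs with start <= timestamp <= end."""
    lo = bisect_left(timestamps, start)   # first index at or after start
    hi = bisect_right(timestamps, end)    # one past the last index at or before end
    return list(zip(timestamps[lo:hi], temperatures[lo:hi]))

def anomalies(rows, threshold):
    """Flag readings that differ from the window mean by more than threshold."""
    avg = mean(value for _, value in rows)
    return [(ts, value) for ts, value in rows if abs(value - avg) > threshold]

window = query_range(datetime(2013, 1, 1, 1, 0), datetime(2013, 1, 1, 3, 0))
flagged = anomalies(window, threshold=10.0)
print(flagged)
```

Because the data is ordered by time, the cost of finding the window grows with the logarithm of the data set size rather than with the size itself, which is one reason time-ordered storage suits this kind of question.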
Historian Software
As data sets grow larger and more complex, advanced historian software offers a simple and effective way for companies to leverage vast amounts of real-time and historical process data, a critical need for optimized decision support. It helps companies connect to and collect data from various systems and devices, making that data accessible and uncovering intelligence that would otherwise remain locked away.
While historian software may not yet be top of mind when it comes to industrial big data solutions, what many companies may not realize is that these advanced, out-of-the-box solutions are specifically designed to efficiently collect, store and manage large volumes of time-series process data, which is precisely the industrial big data challenge.
To learn more about industrial big data, Hadoop, and advanced historian software, please download the attached document, GE Intelligent Platforms’ The Rise of Industrial Big Data, which was the source of information for this document.