Big Data, data analytics, and machine learning offer opportunities for diagnostics, prognostics, and decision making in the process industries. This presentation describes the emerging role of these tools, including for:

  1. The extraction of information from high-order tensor data structures, which arise from spatially resolved photoluminescence, spectral imaging, and color video

  2. The integration of information of various types, such as process flowsheet structure, causal relationships between variables, and real-time sensor data

  3. The management of the effects of uncertainties, disturbances, faulty sensors, variation in operator practices, and machine drift

  4. The handling of nonlinear dynamic operating conditions that are simultaneously discrete and continuous in nature, such as startup, shutdown, and product changeovers

Several examples are provided in which seemingly reasonable applications of tools to the analysis of industrial datasets produce wrong conclusions or poor performance. These examples illustrate some common misconceptions and the potential pitfalls of applying the tools naively. Some specific research directions motivated by these challenges are outlined.