Advanced Tools for Data Analytics Workshop

Description

This workshop will introduce the essential machine learning algorithms and software tools to graduate students, experienced researchers and engineers working in industry. Elementary knowledge of probability and statistics is required to attend. The workshop will also feature at least two confirmed guest speakers with decades of experience in data analytics: Prof. Sirish L. Shah of the University of Alberta and Prof. Richard D. Braatz of the Massachusetts Institute of Technology have both agreed to speak. We have also invited an industrial guest speaker.

Workshop Overview

We are currently at the cusp of what is considered the fourth industrial revolution. This revolution is driven by ubiquitous cyber-physical systems, algorithmic developments in artificial intelligence, gargantuan computing power, inexpensive memory and the gigantic volumes of data that are being collected. The process industries possess treasure troves of heterogeneous data that are gravely underutilized. The competitive global environment and the ever-increasing demands on energy, environment and quality are subjecting these industries to a high level of economic pressure. The incredible volumes of data that they already possess are poised to provide a level of automation and efficiency never seen before, and thus to alleviate the economic and competitive pressures.

Process industries have been using data analytics in various forms for more than three decades. In particular, statistical techniques such as principal component analysis (PCA), partial least squares (PLS) and canonical variate analysis (CVA), as well as time series modeling methods such as maximum likelihood estimation and prediction error methods, have been extensively applied to industrial data. The recent developments in machine learning and artificial intelligence provide a new opening for using process data on large-scale problems. However, in order to successfully apply machine learning methods to process data, researchers require not only a high-level understanding of the algorithms but also strong programming knowledge of tools such as Python, TensorFlow, Keras and Jupyter.

Guest Speakers

Dr. Richard Braatz

Dr. Richard D. Braatz is the Edwin R. Gilliland Professor of Chemical Engineering at the Massachusetts Institute of Technology (MIT) where he does research in applied mathematics and control theory and their application to chemical and biological systems. He received an MS and PhD from the California Institute of Technology and was the Millennium Chair and Professor at the University of Illinois at Urbana-Champaign and a Visiting Scholar at Harvard University before moving to MIT. He has consulted or collaborated with more than 20 companies including IBM, United Technologies Corporation, Novartis, and Abbott Laboratories. Honors include the Donald P. Eckman Award from the American Automatic Control Council, the Curtis W. McGraw Research Award from the Engineering Research Council, and the AIChE Computing in Chemical Engineering Award. He is a Fellow of the Institute of Electrical and Electronics Engineers, International Federation of Automatic Control, and the American Association for the Advancement of Science. For more information, see Dr. Braatz's page.

Dr. Sirish Shah

Dr. Sirish L. Shah has been with the University of Alberta since 1978, where he held the NSERC-Matrikon-Suncor-iCORE Senior Industrial Research Chair in Computer Process Control from 2000 to 2012. He received the Albright & Wilson Americas Award in 1989, was named Killam Professor in 2003, and has also received the D.G. Fisher Award for significant contributions in the field of systems and control, the ASTECH Award in 2011 and the IEEE Transition to Practice Award in 2015. He has held visiting appointments at Oxford University and Balliol College as a SERC fellow, Kumamoto University (Japan) as a senior research fellow of the Japan Society for the Promotion of Science (JSPS), the University of Newcastle (Australia), IIT-Madras (India) and the National University of Singapore. The main areas of his current research are process and performance monitoring, and the analysis and rationalization of alarm systems. He has co-authored three books: the first titled "Performance Assessment of Control Loops: Theory and Applications", a second titled "Diagnosis of Process Nonlinearities and Valve Stiction: Data Driven Approaches", and a more recent monograph titled "Capturing Connectivity and Causality in Complex Industrial Processes". He is an emeritus professor at the University of Alberta and a fellow of the Canadian Academy of Engineering and the Chemical Institute of Canada. For more information, visit Sirish Shah's page.

Organizers

Dr. Bhushan Gopaluni

Dr. Bhushan Gopaluni is a professor in the Department of Chemical and Biological Engineering and an Associate Dean for Education and Professional Development in the Faculty of Applied Science at the University of British Columbia. He is also an associate faculty member in the Institute of Applied Mathematics, the Institute for Computing, Information and Cognitive Systems, the Pulp and Paper Center and the Clean Energy Research Center. He is currently an associate editor for the Journal of Process Control and The Journal of Franklin Institute, and a guest editor for the Process Control Special Series in the Canadian Journal of Chemical Engineering. He received a Ph.D. from the University of Alberta in 2003 and a Bachelor of Technology from the Indian Institute of Technology, Madras in 1997, both in the field of chemical engineering. From 2003 to 2005 he worked as an engineering consultant at Matrikon Inc. (now Honeywell Process Solutions), where he designed and commissioned multivariable controllers in British Columbia's pulp and paper industry and implemented numerous controller performance monitoring projects in the oil and gas and other chemical industries. He is one of the leading experts on data analytics for the process industry and has authored over 110 refereed articles in reputed international journals and conferences. His publications have been recognized through best paper awards and keynote presentations. He is also the recipient of the prestigious Killam Teaching Prize and the Dean's service medal from the University of British Columbia. For more information, visit the DAIS page.

Lee Rippon

Lee Rippon is a PhD student studying Chemical and Biological Engineering (CHBE) at UBC. He also holds BASc and MASc degrees in CHBE from UBC, where his research experience includes applications of compressive sensing, adaptive control, system identification and process monitoring on sheet and film processes. His current research interests include applying statistical machine learning techniques to historical process data to perform fault detection, isolation, and diagnosis in a kraft process. For more information, visit the DAIS page.

Yiting Tsai

Yiting Tsai has completed both his BASc and MASc degrees in CHBE. His interests are process control and statistical modelling of time-series data. His current PhD research focuses on the application of machine learning techniques to design smart controllers, which identify and predict process faults ahead of time and apply appropriate control actions to prevent such faults. For more information, visit the DAIS page.

Dr. Aditya Tulsyan

Dr. Aditya Tulsyan is currently a Senior Engineer at Amgen. Prior to joining Amgen in 2016, Aditya was a Postdoctoral Associate in the Process Systems Engineering Laboratory at the Massachusetts Institute of Technology. He received his Ph.D. in Computer Process Control from the University of Alberta, Canada in 2013. He has held research positions at the National University of Singapore, University of British Columbia and the Indian Institute of Technology, Kharagpur. His research interests are in systems engineering, statistical machine learning, signal processing and Bayesian inference. For more information, visit Dr. Tulsyan's page.

Course Plan

Starting with an elementary introduction to statistics and probability, we will develop various regression, classification, dimensionality reduction and advanced learning algorithms that are of interest to engineers. In addition, various widely used machine learning software packages will be introduced. Registrants will solve exercises and receive take-away software code to implement these algorithms. The following is a general outline of the course:

Basics of probability and statistics, underfitting, overfitting and bias-variance tradeoff

Classification Algorithms

Support Vector Machines
Naive Bayes Classifier

Regression Algorithms

Linear Least Squares
Kernel Regression

Dimensionality Reduction Algorithms

Principal Component Analysis (PCA)
Partial Least Squares (PLS)
Isometric Mapping (ISOMAP)

Advanced Learning Algorithms

Deep Learning
Recurrent Neural Networks
Gaussian Processes

Applications in the Process Industry
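As a flavour of the simplest item in the outline above, linear least squares, the following is a minimal sketch in plain Python. It is not the workshop's take-away code: the function name `fit_line` and the sample data are hypothetical, chosen only to illustrate the closed-form fit of a straight line.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x.

    Hypothetical helper for illustration; not taken from the workshop material.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 points")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up, noise-free data lying on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_line(xs, ys)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```

With real, noisy process data the fitted line no longer passes through every point, which is where the underfitting, overfitting and bias-variance topics in the outline become relevant.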

Learning Outcomes

By the end of this workshop, registrants will be able to:

identify and solve classification, regression and dimensionality reduction problems
work with software tools such as Python, TensorFlow, and Keras

Schedule

Time               Speaker(s)                                     Topic
8:30AM - 9:30AM    Richard Braatz                                 Big Data Analytics
9:30AM - 10:30AM   Bhushan Gopaluni                               Classification
10:30AM - 11:30AM  Yiting Tsai                                    Regression
11:30AM - 12:30PM  Bhushan Gopaluni, Yiting Tsai                  Dimensionality Manipulation
12:30PM - 1:30PM   Lunch
1:30PM - 2:30PM    Sirish Shah                                    Alarm Management
2:30PM - 5:30PM    Bhushan Gopaluni, Aditya Tulsyan, Yiting Tsai  Advanced Learning Algorithms

Software

This workshop will be using Python code in Jupyter notebooks. If you would like to follow along with the code during the workshop, it is recommended that you install the required software.

For a guide to installation, follow the TensorFlow installation pages for your platform (Mac, Windows or Ubuntu). These pages contain instructions on setting up the environment properly.
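Once the installation is done, a quick way to confirm that the environment is ready is to check whether the packages can be found by Python. The sketch below uses only the standard library; the function name `check_environment` is hypothetical, and the import names `tensorflow`, `keras` and `notebook` are assumptions about how the tools are installed on your machine.

```python
import importlib.util
import sys

# Assumed import names for the workshop tools; adjust if your setup differs.
PACKAGES = ["tensorflow", "keras", "notebook"]


def check_environment(packages):
    """Return a dict mapping each top-level package name to True if importable."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in packages
    }


print("Python version:", sys.version.split()[0])
for name, found in check_environment(PACKAGES).items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

Any package reported as missing should be installed by following the installation guide before the workshop.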

Abstracts

Sensor and Alarm Data Tools for Process Analytics

By Dr. Sirish L. Shah

Process data analytic methods rely on the notion of sensor fusion, whereby data from many sensors and alarm tags are combined with process information, such as the physical connectivity of process units, to give a holistic picture of the health of an integrated plant. The discovery and learning from process data...

The Emerging Role of Big Data, Data Analytics, and Machine Learning in the Process Industries

By Dr. Richard Braatz

Big Data, data analytics, and machine learning offer opportunities for diagnostics, prognostics, and decision making in the process industries. This presentation describes the emerging role of these tools.
