Why are algorithms and data-driven models available only to the few domain experts who are also fluent in advanced software coding? We certainly don't believe this should be the case.
We are making them available to the rest of the world — particularly to non-coding domain experts.
For many years, industrial data scientists have been building smart algorithms to solve complex industrial problems. These are now available in an intuitive no-code, drag-and-drop interface. Liberate your data and give domain experts the tools to drive impact every day.
Cognite Charts includes several data science toolboxes that give subject matter experts (SMEs) out-of-the-box algorithms to process and manipulate data, conduct root cause analysis (RCA), and develop solutions without having to code.
The toolboxes cover basic operations, statistical methods, data transformation, and advanced models. They work out-of-the-box with Cognite Charts, and we will continuously add new algorithms, features, and functionality.
You can view the detailed documentation for our Industrial Data Science Library (InDSL) online. Of course, you will find and use all of the functions and algorithms in the InDSL via the calculation builder in Charts (charts.cogniteapp.com).
Below, we’ve included a brief description of the different types of toolboxes available in Charts.
Operators
The Operators toolbox contains all the standard arithmetic and algebraic operations you can apply to time series data (addition, subtraction, multiplication, division), as well as more advanced calculations such as differentiation, integration, and time series mapping.
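To give a feel for what these operations do, here is a minimal NumPy sketch of numerical differentiation and integration on a sampled signal. This is an illustration of the underlying math, not the Charts interface or the InDSL API; the signal and grid are made up.

```python
import numpy as np

# Illustrative signal: y(t) = t^2 sampled on a uniform time grid (dt = 0.1).
t = np.linspace(0.0, 10.0, 101)
y = t ** 2

# Numerical derivative; analytically dy/dt = 2t.
dy_dt = np.gradient(y, t)

# Trapezoidal integral over [0, 10]; analytically 1000/3.
area = np.sum((y[1:] + y[:-1]) / 2 * np.diff(t))

print(round(float(dy_dt[50]), 1))   # → 10.0 (derivative at t = 5)
```

Charts applies the same kind of point-wise and calculus operations directly to time series selected in the calculation builder.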
Filters
Filters are algorithms that remove parts of a time series to capture the underlying signal. For example, low-pass filters remove the high-frequency noise of a time series. You can also use filters in conjunction with event detectors to remove undesired phenomena from a time series. For instance, you can use an anomaly detector to map the time series to a binary array indicating the presence or absence of an anomaly, and then apply a boolean mask to the raw time series to remove all detected anomalies.
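The detector-plus-mask workflow described above can be sketched in a few lines of NumPy. This is a simplified stand-in (a moving-average baseline and a fixed threshold play the roles of the low-pass filter and the anomaly detector); the data and threshold are invented for illustration.

```python
import numpy as np

# Synthetic series: a slow sine trend plus noise, with one injected spike.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 500)
raw = np.sin(2 * np.pi * 2 * t) + 0.05 * rng.standard_normal(t.size)
raw[100] += 5.0                         # inject one spike (an "anomaly")

# Crude low-pass filter: centered moving average over an 11-sample window.
kernel = np.ones(11) / 11
baseline = np.convolve(raw, kernel, mode="same")

# Map the series to a boolean array marking anomalies, then mask them out.
anomaly = np.abs(raw - baseline) > 1.0
cleaned = raw[~anomaly]
print(anomaly.sum(), raw.size, cleaned.size)   # → 1 500 499
```

In Charts, the filter, the detector, and the mask would each be a node in the calculation builder rather than lines of code.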
Detect
The Detect module contains algorithms that map continuous time series to discrete variables indicating the presence or absence of an event, based on the behavior of the time series. For instance, steady and transient operation can be distinguished by detecting large step changes in the sensor reading (potentially due to valve changes, start-ups, etc.). Another example is anomalies, where the sensor reading changes significantly and unexpectedly before returning to normal behavior (e.g., a spike in value).
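A minimal sketch of this continuous-to-discrete mapping, using a simple step-change detector on invented readings (the threshold and values are illustrative, not an InDSL function):

```python
import numpy as np

# Invented sensor readings with two step changes (e.g., a valve opening and closing).
reading = np.array([10.0, 10.1, 10.0, 14.8, 15.0, 15.1, 15.0, 9.9, 10.0])

threshold = 2.0                               # what counts as a "large" step
step = np.abs(np.diff(reading)) > threshold   # True where a step change occurs

# Binary event indicator aligned with the series (0 = steady, 1 = transient).
event = np.concatenate(([0], step.astype(int)))
print(event.tolist())   # → [0, 0, 0, 1, 0, 0, 0, 1, 0]
```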
Resample
Industrial data is in most cases non-uniformly sampled, and it has to be pre-processed before it can be used in a model together with other time series. Resampling is a typical pre-processing step. The Resample toolbox offers a variety of methods to resample your data, from classical methods (e.g., interpolation) to advanced machine learning algorithms to down- or up-sample your data.
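As a concrete picture of classical resampling, here is a pandas sketch that puts three non-uniformly spaced readings onto a uniform 30-second grid by time-based interpolation. The timestamps and values are made up; Charts performs the equivalent step without code.

```python
import pandas as pd

# Non-uniformly sampled series: 0 s, 30 s, then a jump to 120 s.
ts = pd.Series(
    [1.0, 2.0, 4.0],
    index=pd.to_datetime(["2023-01-01 00:00:00",
                          "2023-01-01 00:00:30",
                          "2023-01-01 00:02:00"]),
)

# Up-sample to a uniform 30-second grid and fill the gaps by interpolation.
uniform = ts.resample("30s").asfreq().interpolate(method="time")
print([round(v, 2) for v in uniform.tolist()])   # → [1.0, 2.0, 2.67, 3.33, 4.0]
```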
Smoothers
Smoothers modify time series to emphasize the main underlying trend and remove fine-scale phenomena. You can do this in several different ways. Examples of smoothers found in this toolbox are filters that remove higher-frequency phenomena from the raw data (e.g., Butterworth or Chebyshev low-pass filters), regression-based smoothers that estimate the coefficients of a parametric function to predict the underlying signal, and moving averages that apply a rolling operation over a user-defined window.
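The simplest of these, a moving average over a user-defined window, looks like this in pandas (the window size and data are illustrative; the Butterworth and regression-based variants follow the same input-to-smoothed-output pattern):

```python
import pandas as pd

# A short series with one sharp excursion at index 2.
raw = pd.Series([1.0, 2.0, 9.0, 2.0, 3.0, 2.0, 1.0])

# Centered moving average over a 3-sample window; min_periods=1 handles the edges.
smooth = raw.rolling(window=3, center=True, min_periods=1).mean()
print([round(v, 2) for v in smooth.tolist()])
```

The excursion is spread out and damped, which is exactly the trade-off smoothers make: less fine-scale detail, clearer trend.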
Statistics
The Statistics toolbox offers various algorithms to describe, analyze, and model industrial time series data. This toolbox is ideal for describing your data and for conducting root cause analysis and exploratory work. The algorithms range from descriptive statistics to linear and nonlinear regression analysis to machine learning methods (e.g., classification and clustering).
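At the descriptive-statistics end of that range, the computations are as simple as this NumPy sketch over an invented window of readings:

```python
import numpy as np

# A small window of illustrative sensor readings.
values = np.array([3.0, 5.0, 4.0, 6.0, 2.0])

stats = {
    "mean": values.mean(),
    "std": values.std(ddof=1),   # sample standard deviation
    "min": values.min(),
    "max": values.max(),
}
print(stats["mean"], stats["min"], stats["max"])   # → 4.0 2.0 6.0
```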
Data quality
Accurate data is a fundamental part of any industrial model. This toolbox contains a collection of advanced algorithms to evaluate, monitor, and improve the data quality of time series. Time series data quality has multiple dimensions: accuracy, timeliness, completeness, validity, consistency, and uniqueness. The algorithms in this toolbox provide methods across all dimensions while focusing heavily on accuracy; if the data is not correct, the other dimensions are of little importance. Examples of functions found in this toolbox are data gap detection and filling, outlier detection and removal, and sensor drift detection.
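To make gap detection and filling concrete, here is a pandas sketch that flags sampling intervals much longer than expected and fills them by interpolation. The expected 1-minute cadence and the 2x rule are invented for illustration, not the toolbox's actual criteria.

```python
import pandas as pd

# A nominally 1-minute series with a gap between 00:01 and 00:05.
ts = pd.Series(
    [1.0, 1.2, 1.1, 1.4],
    index=pd.to_datetime(["2023-01-01 00:00", "2023-01-01 00:01",
                          "2023-01-01 00:05", "2023-01-01 00:06"]),
)

# Flag any interval more than twice the expected sampling period.
expected = pd.Timedelta("1min")
gaps = ts.index.to_series().diff() > 2 * expected
print(int(gaps.sum()))   # → 1 (the 00:01 → 00:05 gap)

# One way to fill: reindex to the expected frequency and interpolate.
filled = ts.resample("1min").asfreq().interpolate(method="time")
```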
Regression
The Regression toolbox focuses on using classical methods (linear and nonlinear models) and machine learning regression algorithms to describe the relationship between industrial data and physical parameters. It enables you to semiautomatically map parameters to historical data and forecast their behavior several steps into the future.
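The classical end of this is ordinary least squares: fit a parametric model to historical data, then evaluate it at a new point. A minimal NumPy sketch, on invented data that roughly follows y = 2x + 1:

```python
import numpy as np

# Illustrative data: a process variable y against a physical parameter x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.1, 4.9, 7.0, 9.1])

# Fit the linear model y = a*x + b by least squares.
a, b = np.polyfit(x, y, deg=1)
prediction = a * 5.0 + b        # extrapolate one step beyond the data
print(round(a, 1), round(b, 1))   # → 2.0 1.0
```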
Oil and gas
This module contains algorithms particularly relevant to the oil and gas industry. You will find methods to estimate parameters such as the Productivity Index (gas flow rate divided by the difference between the reservoir and bottom hole pressures), pressure drop, single-phase flow rate, hydrostatic head, and many others.
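As described above, the productivity index is the flow rate divided by the drawdown (reservoir pressure minus bottomhole pressure). A minimal sketch of that formula; the function name, units, and numbers are illustrative, not the InDSL implementation:

```python
def productivity_index(flow_rate, p_reservoir, p_bottomhole):
    """PI = q / (p_res - p_wf). Units must be consistent, e.g. Sm3/d and bar."""
    drawdown = p_reservoir - p_bottomhole
    if drawdown <= 0:
        # Flow into the well requires reservoir pressure above bottomhole pressure.
        raise ValueError("reservoir pressure must exceed bottomhole pressure")
    return flow_rate / drawdown

# Illustrative well: 1000 Sm3/d at 50 bar drawdown.
print(productivity_index(1000.0, 250.0, 200.0))   # → 20.0
```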
Forecast
The Forecast toolbox offers a variety of machine learning algorithms to forecast the behavior of industrial time series, with a particular focus on forecasting based on the correlation between a time series and physical parameters. Forecasting involves learning from historical data to make a prediction several time steps into the future. For industrial time series analysis, this typically involves pre-processing the data, training a parametric time series model, and then predicting the result by a user-defined number of steps into the future.
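The train-then-roll-forward loop described above can be sketched with one of the simplest parametric time series models, an AR(1) fit by least squares. The history is synthetic (it decays toward 5.0), and the model choice is purely illustrative; the toolbox's actual algorithms are more sophisticated.

```python
import numpy as np

# Synthetic history generated by y[t] = 0.6*y[t-1] + 2, decaying toward 5.0.
history = np.array([10.0, 8.0, 6.8, 6.08, 5.648])

# Train: regress each value on the previous one to estimate a and c.
a, c = np.polyfit(history[:-1], history[1:], deg=1)

# Predict: roll the fitted model forward a user-defined number of steps.
forecast = []
last = history[-1]
for _ in range(3):
    last = a * last + c
    forecast.append(last)
print([round(v, 3) for v in forecast])   # → [5.389, 5.233, 5.14]
```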