# System Identification

Extracting a mathematical model from experimental data is a tedious task. We look at the world through ‘dirty’ glasses: whenever we measure a length, a mass, a current, or a voltage, we make errors because the instruments we use are not perfect. The measurements are disturbed by systematic errors (e.g. calibration errors) and stochastic errors (random fluctuations due to noise). In addition, we do not always know all the signals that excite the system. Randomly fluctuating, unmeasured inputs create additional variations of the output, called process noise. Because of the measurement and process noise, the output of the system cannot be predicted exactly. Noise in a radio receiver, the Brownian motion of small particles, and the variation of the wind speed in a thunderstorm are all illustrations of this kind of disturbance.

Starting from these noisy observations, we want to build a mathematical model that explains the observations ‘as well as possible’. Usually, the model is split into a deterministic part and a stochastic part. The deterministic plant model captures the “undisturbed” system output, while the noise distortions are described by the noise model. If the data are not properly processed, the estimated model can suffer from systematic (bias) errors and an increased uncertainty. Users who are unaware of these problems can end up with wrong models that result in poor designs.

The aim of system identification is twofold:
– Provide a systematic approach to fit the plant and noise model as well as possible by reducing the impact of the noise distortions.
– Provide a characterization of the noise disturbances (e.g. power spectrum), and calculate uncertainty bounds on the estimated plant model.

A brief introduction to system identification theory is given in the preliminary set of slides, which is organized along the following topics:
Discovery of Ceres: The slides start with the story of the discovery of the planetoid Ceres, which can be considered one of the first modern identification applications. On the 1st of January 1801, Piazzi observed Ceres for the first time and started to track its position. After a few weeks the observations stopped, and the astronomers could not retrieve the planetoid later that year when it reappeared on the other side of the sun. Starting from Piazzi’s early observations, Gauss estimated the parameters of Ceres’ orbital model, which follows from Newton’s laws of gravitation and motion. To do this, he matched the model to the available observations by tuning the orbital parameters so that a ‘cost’ function that quantifies the distance between data and model was minimized. The final model was used to predict the future positions of Ceres. On December 31 of that year, von Zach observed Ceres again, close to the predicted position (model validation). This story brings together the main actors in today’s system identification theory (data, model, matching model-data, validation).

The main actors in System Identification theory:
– data: the experimental data
– model: set of models
– matching model-data: choice of the cost function
– validation: test the model on new data
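
The four actors can be made concrete with a toy problem that is far simpler than the orbit computation described above. The sketch below is only an illustration under stated assumptions: the straight-line model, the parameter values, the noise level, and all function names are hypothetical and do not come from the slides.

```python
import random

def simulate(a, b, xs, noise_std, rng):
    """Data: noisy observations of a hypothetical 'true' line y = a*x + b."""
    return [a * x + b + rng.gauss(0.0, noise_std) for x in xs]

def cost(params, xs, ys):
    """Matching model-data: sum of squared differences between data and model."""
    a, b = params
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))

def fit_line(xs, ys):
    """Pick the member of the model set {y = a*x + b} that minimizes the cost.

    For this cost and model set the minimizer has a closed form.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

rng = random.Random(0)
a_true, b_true, noise_std = 2.0, 1.0, 0.1  # assumed values for the illustration

# Estimation data
xs_est = [0.1 * k for k in range(50)]
ys_est = simulate(a_true, b_true, xs_est, noise_std, rng)
a_hat, b_hat = fit_line(xs_est, ys_est)

# Validation: test the fitted model on new data that was not used for the fit
xs_val = [5.0 + 0.1 * k for k in range(20)]
ys_val = simulate(a_true, b_true, xs_val, noise_std, rng)
rms_val = (cost((a_hat, b_hat), xs_val, ys_val) / len(xs_val)) ** 0.5

print(f"a_hat={a_hat:.3f}  b_hat={b_hat:.3f}  validation RMS error={rms_val:.3f}")
```

If the validation error is of the same order as the assumed noise level, the model predicts the new data about as well as the noise allows.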

Why do you need system identification? To motivate the reader, we analyze a very simple example that illustrates many of the aspects and problems that appear in system identification. Starting from noisy voltage and current measurements, the value of a resistor is estimated. Three simple estimators are proposed, and it is shown that only one of these gives acceptable results while the other two fail completely, without any warning. This shows that there is a clear need for a systematic approach to data-based mathematical modeling. Without such a framework, users can be misled by poor results without noticing.
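
The text here does not say which three estimators the slides use. As a hedged illustration, the sketch below assumes three natural candidates: the mean of the ratios u(k)/i(k), a least-squares fit of u = R·i, and the ratio of the mean voltage to the mean current. The resistor value, the noise levels, and the function names are all assumptions chosen for the demonstration.

```python
import random

def simulate_measurements(n, r0, i0, sigma_u, sigma_i, rng):
    """Noisy voltage and current readings of a resistor r0 carrying current i0."""
    u0 = r0 * i0  # Ohm's law gives the noise-free voltage
    u = [u0 + rng.gauss(0.0, sigma_u) for _ in range(n)]
    i = [i0 + rng.gauss(0.0, sigma_i) for _ in range(n)]
    return u, i

def mean_of_ratios(u, i):
    """Average of the individual ratios u(k)/i(k)."""
    return sum(uk / ik for uk, ik in zip(u, i)) / len(u)

def least_squares(u, i):
    """Value of R that minimizes sum (u(k) - R*i(k))^2."""
    return sum(uk * ik for uk, ik in zip(u, i)) / sum(ik * ik for ik in i)

def ratio_of_means(u, i):
    """Mean voltage divided by mean current."""
    return sum(u) / sum(i)

rng = random.Random(1)
r0, i0 = 1000.0, 0.01           # assumed: 1 kOhm resistor, 10 mA
sigma_u, sigma_i = 5.0, 0.005   # assumed: 50 % noise on both channels

u, i = simulate_measurements(100_000, r0, i0, sigma_u, sigma_i, rng)
r_sa = mean_of_ratios(u, i)
r_ls = least_squares(u, i)
r_ev = ratio_of_means(u, i)

print(f"mean of ratios: {r_sa:10.1f}")
print(f"least squares : {r_ls:10.1f}")
print(f"ratio of means: {r_ev:10.1f}")
```

Under these assumptions the ratio of means settles near the true value. The least-squares estimate converges to r0 / (1 + sigma_i² / i0²), about 800 Ω here, because the current noise inflates the denominator. The mean of ratios behaves erratically, since some current samples lie close to zero. None of the three functions raises a warning, which is the point of the example.
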
Characterizing estimators: This part provides the tools to characterize an estimator and makes clear what can be expected from a good estimator for a given data set. The topics covered are: i) the study of the systematic errors (bias, consistency); ii) the characterization of the variability of an estimator by its covariance matrix; iii) the quantification of the amount of information in experimental data.
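
One common way to study these properties numerically is a Monte Carlo simulation: repeat the same experiment many times and look at the mean and the spread of the estimates. The sketch below is a minimal illustration, assuming a constant `theta0` measured with Gaussian noise of known standard deviation `sigma`; the names and values are hypothetical. For this specific case the Cramér-Rao lower bound on the variance of an unbiased estimator is `sigma**2 / n`, which serves as the measure of the information in the data.

```python
import random
import statistics

def sample_mean(data):
    """Estimator under study: the arithmetic mean of the measurements."""
    return sum(data) / len(data)

def monte_carlo(estimator, theta0, sigma, n_samples, n_runs, rng):
    """Repeat the experiment n_runs times; return empirical bias and variance."""
    estimates = []
    for _ in range(n_runs):
        data = [theta0 + rng.gauss(0.0, sigma) for _ in range(n_samples)]
        estimates.append(estimator(data))
    bias = statistics.fmean(estimates) - theta0
    variance = statistics.variance(estimates)
    return bias, variance

rng = random.Random(2)
theta0, sigma = 5.0, 2.0  # assumed true value and noise level

results = {}
for n in (10, 100, 400):
    bias, var = monte_carlo(sample_mean, theta0, sigma, n, 1000, rng)
    crlb = sigma ** 2 / n  # Cramér-Rao bound for Gaussian noise with known sigma
    results[n] = (bias, var, crlb)
    print(f"N={n:4d}  bias={bias:+.4f}  variance={var:.5f}  CRLB={crlb:.5f}")
```

The bias stays close to zero, and the variance shrinks with the number of samples while staying close to the bound, which is what one expects from an unbiased, consistent, and efficient estimator in this setting.
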
Family of estimators: Finally, an introduction is given to a family of estimators (least squares, weighted least squares, maximum likelihood, Bayes) that use a growing amount of knowledge to obtain improved results.
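
How additional knowledge improves an estimate can be sketched with a toy problem: a constant is measured repeatedly by two instruments of very different quality. The setup, values, and function names below are assumptions made for illustration. Plain least squares ignores the noise levels. Weighted least squares uses the known variances and, for independent Gaussian noise, coincides with the maximum likelihood estimate. The Bayes-style estimator additionally uses an assumed Gaussian prior on the parameter.

```python
import random
import statistics

def least_squares(y):
    """No noise knowledge: every measurement gets the same weight."""
    return sum(y) / len(y)

def weighted_least_squares(y, sigmas):
    """Known noise variances: weight each measurement by 1/sigma^2."""
    w = [1.0 / s ** 2 for s in sigmas]
    return sum(wk * yk for wk, yk in zip(w, y)) / sum(w)

def bayes_gaussian(y, sigmas, prior_mean, prior_std):
    """Known variances plus a Gaussian prior: posterior mean of the parameter."""
    w = [1.0 / s ** 2 for s in sigmas]
    w0 = 1.0 / prior_std ** 2
    num = w0 * prior_mean + sum(wk * yk for wk, yk in zip(w, y))
    return num / (w0 + sum(w))

rng = random.Random(3)
theta0 = 1.0                       # assumed true value
sigmas = [0.1] * 10 + [2.0] * 10   # assumed: 10 precise and 10 poor measurements

ls_estimates, wls_estimates = [], []
for _ in range(2000):
    y = [theta0 + rng.gauss(0.0, s) for s in sigmas]
    ls_estimates.append(least_squares(y))
    wls_estimates.append(weighted_least_squares(y, sigmas))

var_ls = statistics.variance(ls_estimates)
var_wls = statistics.variance(wls_estimates)
mean_wls = statistics.fmean(wls_estimates)

print(f"LS  variance: {var_ls:.5f}")
print(f"WLS variance: {var_wls:.5f}")
print(f"Bayes estimate (last run, prior 0.9 +/- 0.5): "
      f"{bayes_gaussian(y, sigmas, 0.9, 0.5):.4f}")
```

Both least-squares variants are unbiased in this setup, but the weighted version has a much smaller variance because the poor measurements no longer dominate. With a very wide prior, the Bayes-style estimate approaches the weighted least-squares value.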

Preliminary slides
System Identification: from data to model