Model
A mathematical system that represents or
expresses the functional behavior of a real system.
Modeling Data
Also known as "Training
Data", this is the data used by the modeling process to calculate and
set model parameters and coefficients. Models usually fit this data very closely.
Non-Stationary
Relationships
that vary and move between data series. Non-stationary relationships hold true only for
short periods of time.
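One common way to see a relationship move over time is to compute a correlation over a sliding window. The sketch below is only an illustration: the series `a` and `b`, the window length, and the helper names `pearson` and `rolling_correlation` are all hypothetical, and Pearson correlation is just one possible measure of a relationship.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def rolling_correlation(xs, ys, window):
    """Correlation of xs and ys over each consecutive window."""
    return [pearson(xs[i:i + window], ys[i:i + window])
            for i in range(len(xs) - window + 1)]

# Hypothetical data: b tracks a for the first six points, then mirrors it.
a = [1, 3, 2, 5, 4, 6, 8, 7, 9, 10, 12, 11]
b = a[:6] + [-v for v in a[6:]]

corr = rolling_correlation(a, b, window=4)
print([round(c, 2) for c in corr])
```

In this made-up example the early windows show a strong positive correlation and the late windows a strong negative one, so a single correlation computed over the whole series would hide the change.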
Optimization
The act of seeking values for particular actionable
variables such that one or more results are minimized, maximized,
or driven to desired target values.
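As a minimal sketch of both cases, the example below maximizes a result and then drives it to a target value using golden-section search over a single actionable variable. The `response` function, its coefficients, and the search bounds are hypothetical stand-ins for a real model, and golden-section search is only one of many possible methods; it assumes the objective has a single minimum on the search interval.

```python
def golden_section_minimize(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (5 ** 0.5 - 1) / 2  # about 0.618
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2

def response(setting):
    """Hypothetical model: result as a function of one actionable variable."""
    return 90.0 - (setting - 350.0) ** 2

# Maximize the result by minimizing its negative.
best_max = golden_section_minimize(lambda s: -response(s), 300.0, 400.0)

# Target a desired value (80) by minimizing the squared miss.
target = 80.0
best_target = golden_section_minimize(
    lambda s: (response(s) - target) ** 2, 300.0, 350.0
)

print(best_max, response(best_max))
print(best_target, response(best_target))
```

Minimizing, maximizing, and targeting all reduce to the same search once the objective is rewritten as something to minimize.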
Outlier
An outlier is a data point whose value occurs
far less frequently than the other values in the sample.
In uni-modal samples, "infrequent" may be defined as lying more than a given
number of standard deviations (sigma)
from the variable's mean. In multi-modal systems, an outlier may
occur between modes or outside the highest and lowest modes. In
multivariate situations, it may be a point lying more than some defined distance from all cluster
centers.
An outlier may be
valid or erroneous data.
Particular care should be taken before excluding any outlier unless
it can be proven erroneous using underlying knowledge of
the data, its collection, and its application. The decision to
include or omit a data point is a difficult one. It must weigh the
probability that the point is erroneous against the impact and
risk of negative consequences of omitting it, for each reading individually.
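For the uni-modal case, the sigma rule can be sketched as follows. The readings, the threshold, and the function name `flag_outliers` are hypothetical. The function only flags points for review and does not remove them, in keeping with the caution above.

```python
from statistics import mean, stdev

def flag_outliers(values, n_sigma=3.0):
    """Return one boolean per value: True if it lies more than
    n_sigma sample standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    return [abs(v - mu) > n_sigma * sigma for v in values]

# Hypothetical readings with one suspicious value at the end.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.1, 9.7, 10.0, 25.0]

flags = flag_outliers(readings, n_sigma=2.0)
for value, is_outlier in zip(readings, flags):
    if is_outlier:
        print(f"review before excluding: {value}")
```

The threshold here is 2 sigma rather than 3 because, in a sample this small, the extreme reading inflates both the mean and the standard deviation enough that it would not exceed 3 sigma. That sensitivity is one reason the threshold should be chosen with knowledge of the data.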
Out-Of-Sample Data
Data held back from the modeling process
to determine the true performance of a model.
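A minimal sketch of the idea, assuming a simple random hold-back and a straight-line model: the synthetic data, the 75/25 split, and the helper names `fit_line` and `mean_squared_error` are all hypothetical choices made for illustration.

```python
import random

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def mean_squared_error(xs, ys, slope, intercept):
    """Average squared prediction error of the fitted line."""
    return sum((y - (slope * x + intercept)) ** 2
               for x, y in zip(xs, ys)) / len(xs)

# Synthetic data: a known line plus noise (fixed seed for repeatability).
rng = random.Random(0)
xs = [float(i) for i in range(40)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 1.0) for x in xs]

# Hold back 25% of the rows; the model never sees them while fitting.
indices = list(range(len(xs)))
rng.shuffle(indices)
cut = int(0.75 * len(indices))
model_idx, holdout_idx = indices[:cut], indices[cut:]

slope, intercept = fit_line([xs[i] for i in model_idx],
                            [ys[i] for i in model_idx])

in_sample_mse = mean_squared_error([xs[i] for i in model_idx],
                                   [ys[i] for i in model_idx],
                                   slope, intercept)
out_of_sample_mse = mean_squared_error([xs[i] for i in holdout_idx],
                                       [ys[i] for i in holdout_idx],
                                       slope, intercept)
print(in_sample_mse, out_of_sample_mse)
```

The out-of-sample error is the more honest estimate of how the model would perform on new data, because the parameters were never tuned to those rows.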