A Probability Model consists of the triplet $(\Omega, \mathcal{F}, P)$, where $\Omega$ is the sample space, $\mathcal{F}$ is a $\sigma$-algebra (events) and $P$ is a probability measure on $\mathcal{F}$.
Intuitive explanation. A probability model can be interpreted as a known random variable $X$. For example, let $X$ be a Normally distributed random variable with mean $0$ and variance $1$. In this case the probability measure $P$ is associated with the Cumulative Distribution Function (CDF) $F$ through
$$F(x) = P(X \le x) = P(\omega \in \Omega : X(\omega) \le x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{t^2}{2}\right) dt.$$
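As a rough numerical check of the integral above, here is a minimal Python sketch. The function names, the truncation of the lower limit at -10 and the midpoint rule are my own assumptions for illustration; the closed form through `math.erf` is the standard identity for the standard Normal CDF.

```python
import math


def std_normal_pdf(t: float) -> float:
    """Density of the Normal(0, 1) distribution at t."""
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)


def std_normal_cdf_numeric(x: float, lower: float = -10.0, n: int = 20000) -> float:
    """Approximate F(x) with the midpoint rule on [lower, x].

    Assumption: the mass below `lower` is negligible, so the infinite
    lower limit is truncated there.
    """
    if x <= lower:
        return 0.0
    h = (x - lower) / n
    return h * sum(std_normal_pdf(lower + (i + 0.5) * h) for i in range(n))


def std_normal_cdf_erf(x: float) -> float:
    """Closed form of the same CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


if __name__ == "__main__":
    for x in (-1.0, 0.0, 1.96):
        print(x, std_normal_cdf_numeric(x), std_normal_cdf_erf(x))
```

Both functions describe the same known measure $P$: nothing is estimated here, which is what makes this a probability model rather than a statistical one.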
Generalisations. The definition of Probability Model depends on the mathematical definition of probability; see, for example, Free probability and Quantum probability.
A Statistical Model is a set $S$ of probability models, that is, a set of probability measures/distributions on the sample space $\Omega$.
This set of probability distributions is usually selected for modelling a certain phenomenon from which we have data.
Intuitive explanation. In a Statistical Model, the parameters and the distribution that describe a certain phenomenon are both unknown. An example of this is the family of Normal distributions with mean $\mu \in \mathbb{R}$ and variance $\sigma^2 \in \mathbb{R}_+$, that is, both parameters are unknown and you typically want to use the data set for estimating the parameters (i.e. selecting an element of $S$). This set of distributions can be chosen on any $\Omega$ and $\mathcal{F}$, but, if I am not mistaken, in a real example only those defined on the same pair $(\Omega, \mathcal{F})$ are reasonable to consider.
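To make "selecting an element of the set" concrete, here is a small Python sketch under my own assumptions: the data are simulated from a Normal with mean 2.0 and standard deviation 1.5 (arbitrary values chosen for illustration), the selection rule is maximum likelihood (one common choice among several), and `fit_normal_mle` is a hypothetical helper name.

```python
import random


def fit_normal_mle(data):
    """Pick one member of the Normal(mu, sigma^2) family by maximum likelihood.

    Returns (mu_hat, var_hat): the sample mean and the variance estimate
    that divides by n (the MLE, not the unbiased n - 1 version).
    """
    n = len(data)
    mu_hat = sum(data) / n
    var_hat = sum((x - mu_hat) ** 2 for x in data) / n
    return mu_hat, var_hat


# Assumed data-generating process, for illustration only.
random.seed(0)
sample = [random.gauss(2.0, 1.5) for _ in range(5000)]

mu_hat, var_hat = fit_normal_mle(sample)
print(f"selected model: Normal(mu={mu_hat:.3f}, sigma^2={var_hat:.3f})")
```

The family is fixed before seeing the data; the data only decide which single distribution in the family is picked.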
Generalisations. This paper provides a very formal definition of Statistical Model, but the author mentions that "Bayesian model requires an additional component in the form of a prior distribution ... Although Bayesian formulations are not the primary focus of this paper". Therefore the definition of Statistical Model depends on the kind of model we use: parametric or nonparametric. Also, in the parametric setting, the definition depends on how parameters are treated (e.g. Classical vs. Bayesian).
The difference is: in a probability model you know the probability measure exactly, for example a $\mathrm{Normal}(\mu_0, \sigma_0^2)$, where $\mu_0, \sigma_0^2$ are known parameters, while in a statistical model you consider sets of distributions, for example $\mathrm{Normal}(\mu, \sigma^2)$, where $\mu, \sigma^2$ are unknown parameters.
Neither of them requires a data set, but I would say that a Statistical Model is usually selected for modelling one.
This post was edited by Nitara Bobal at August 3, 2020 4:30 PM IST