Differences between a statistical model and a probability model?

  • Applied probability is an important branch of probability, including computational probability. Since statistics, as I understand it, uses probability theory to construct models for dealing with data, I am wondering: what is the essential difference between a statistical model and a probability model? Does a probability model not need real data? Thanks.
      August 3, 2020 4:27 PM IST
  • A Probability Model consists of the triplet $(\Omega, \mathcal{F}, P)$, where $\Omega$ is the sample space, $\mathcal{F}$ is a $\sigma$-algebra (events) and $P$ is a probability measure on $\mathcal{F}$.

    Intuitive explanation. A probability model can be interpreted as a known random variable $X$. For example, let $X$ be a Normally distributed random variable with mean $0$ and variance $1$. In this case the probability measure $P$ is associated with the Cumulative Distribution Function (CDF) $F$ through

    $$F(x) = P(X \le x) = P(\{\omega \in \Omega : X(\omega) \le x\}) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{t^2}{2}\right) dt.$$

    Generalisations. The definition of Probability Model depends on the mathematical definition of probability, see for example Free probability and Quantum probability.
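
    As a rough numerical check of the standard Normal CDF given above, here is a minimal Python sketch. The function names are my own (not from any library), and using -10 as a stand-in for the lower limit of integration is an assumption that works only because the density is negligible beyond it. It evaluates the CDF once through the error function and once by integrating the density directly.

```python
import math


def standard_normal_cdf(x: float) -> float:
    """F(x) = P(X <= x) for X ~ Normal(0, 1), written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cdf_by_integration(x: float, lower: float = -10.0, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of the density from `lower` to x.

    `lower` replaces minus infinity; this is an assumption, not an exact limit.
    """
    width = (x - lower) / steps
    total = 0.0
    for i in range(steps):
        t = lower + (i + 0.5) * width  # midpoint of the i-th sub-interval
        total += math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)
    return total * width


print(standard_normal_cdf(0.0))
print(standard_normal_cdf(1.96))
print(cdf_by_integration(1.96))
```

    The two routes should agree closely, which is the sense in which the probability model is fully "known": every probability can be computed without any data.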

    A Statistical Model is a set $\mathcal{S}$ of probability models, that is, a set of probability measures/distributions on the sample space $\Omega$.

    This set of probability distributions is usually selected for modelling a certain phenomenon from which we have data.

    Intuitive explanation. In a Statistical Model, the parameters and the distribution that describe a certain phenomenon are both unknown. An example of this is the family of Normal distributions with mean $\mu \in \mathbb{R}$ and variance $\sigma^2 \in \mathbb{R}_+$, that is, both parameters are unknown and you typically want to use the data set to estimate the parameters (i.e. to select an element of $\mathcal{S}$). This set of distributions can be chosen on any $\Omega$ and $\mathcal{F}$, but, if I am not mistaken, in a real example only those defined on the same pair $(\Omega, \mathcal{F})$ are reasonable to consider.
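
    To make "selecting an element of the set" concrete, here is a small sketch under stated assumptions: the data are synthetic (drawn by me from a Normal with mean 2 and standard deviation 1.5), the helper name `fit_normal_mle` is hypothetical, and maximum likelihood is just one of several possible ways to pick a member of the Normal family.

```python
import random


def fit_normal_mle(data):
    """Pick one member of the Normal(mu, sigma^2) family by maximum likelihood.

    Returns (mu_hat, sigma2_hat). The variance estimate divides by n, not n - 1,
    because that is the maximum-likelihood form.
    """
    n = len(data)
    mu_hat = sum(data) / n
    sigma2_hat = sum((x - mu_hat) ** 2 for x in data) / n
    return mu_hat, sigma2_hat


# Synthetic data set (an assumption for illustration): true mu = 2.0, sigma = 1.5.
rng = random.Random(0)
data = [rng.gauss(2.0, 1.5) for _ in range(10_000)]

mu_hat, sigma2_hat = fit_normal_mle(data)
print(mu_hat, sigma2_hat)
```

    Before looking at the data, every pair of mean and variance is a candidate; the estimate narrows the statistical model down to a single probability model.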

    Generalisations. This paper provides a very formal definition of Statistical Model, but the author mentions that "Bayesian model requires an additional component in the form of a prior distribution ... Although Bayesian formulations are not the primary focus of this paper". Therefore the definition of Statistical Model depends on the kind of model we use: parametric or nonparametric. Also, in the parametric setting, the definition depends on how parameters are treated (e.g. Classical vs. Bayesian).

    The difference is: in a probability model you know the probability measure exactly, for example a $\mathrm{Normal}(\mu_0, \sigma_0^2)$, where $\mu_0, \sigma_0^2$ are known parameters, while in a statistical model you consider sets of distributions, for example $\mathrm{Normal}(\mu, \sigma^2)$, where $\mu, \sigma^2$ are unknown parameters.

    Neither of them requires a data set, but I would say that a Statistical Model is usually selected for modelling one.

    This post was edited by Nitara Bobal at August 3, 2020 4:30 PM IST
      August 3, 2020 4:29 PM IST
  • Probability and statistics are related areas of mathematics which concern themselves with analyzing the relative frequency of events. Still, there are fundamental differences in the way they see the world:

    • Probability deals with predicting the likelihood of future events, while statistics involves the analysis of the frequency of past events.

    • Probability is primarily a theoretical branch of mathematics, which studies the consequences of mathematical definitions. Statistics is primarily an applied branch of mathematics, which tries to make sense of observations in the real world.

    Both subjects are important, relevant, and useful. But they are different, and understanding the distinction is crucial in properly interpreting the relevance of mathematical evidence. Many a gambler has gone to a cold and lonely grave for failing to make the proper distinction between probability and statistics.

    This distinction will perhaps become clearer if we trace the thought process of a mathematician encountering her first craps game:

    • If this mathematician were a probabilist, she would see the dice and think "Six-sided dice? Presumably each face of the dice is equally likely to land face up. Now assuming that each face comes up with probability 1/6, I can figure out what my chances of crapping out are."

    • If instead a statistician wandered by, she would see the dice and think "Those dice may look OK, but how do I know that they are not loaded? I'll watch a while, and keep track of how often each number comes up. Then I can decide if my observations are consistent with the assumption of equal-probability faces. Once I'm confident enough that the dice are fair, I'll call a probabilist to tell me how to play."

    In summary, probability theory enables us to find the consequences of a given ideal world, while statistical theory enables us to measure the extent to which our world is ideal.
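
    The statistician's side of the craps story can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the rolls are simulated rather than observed, the "loaded" die (a six twice as likely as any other face) is made up, and a Pearson chi-square test at the 5% level is one common way, not the only way, to check whether counts are consistent with equal-probability faces.

```python
import random
from collections import Counter

# Chi-square critical value for 5 degrees of freedom at significance level 0.05.
CRITICAL_VALUE_5DF_5PCT = 11.070


def chi_square_fair_die(rolls):
    """Pearson chi-square statistic against the hypothesis of six equally likely faces."""
    counts = Counter(rolls)
    expected = len(rolls) / 6.0
    return sum((counts.get(face, 0) - expected) ** 2 / expected for face in range(1, 7))


rng = random.Random(1)
fair_rolls = [rng.randint(1, 6) for _ in range(6000)]
# Hypothetical loaded die: face 6 has twice the weight of each other face.
loaded_rolls = rng.choices(range(1, 7), weights=[1, 1, 1, 1, 1, 2], k=6000)

for name, rolls in [("fair", fair_rolls), ("loaded", loaded_rolls)]:
    stat = chi_square_fair_die(rolls)
    if stat > CRITICAL_VALUE_5DF_5PCT:
        verdict = "reject fairness"
    else:
        verdict = "consistent with fairness"
    print(name, round(stat, 2), verdict)
```

    Only once the observed counts look consistent with fairness does the probabilist's assumption of probability 1/6 per face become a reasonable model to compute with.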

    Modern probability theory emerged from the dice tables of France in 1654. Chevalier de Méré, a French nobleman, wondered whether the player or the house had the advantage in a variation of the following betting game. In the basic version, the player rolls four dice, and wins provided none of them are a six. The house collects on the even money bet if at least one six appears.

    De Méré brought this problem to the attention of the French mathematicians Blaise Pascal and Pierre de Fermat, the latter most famous as the source of Fermat's Last Theorem. Together, these men worked out the basics of probability theory, along the way establishing that the house wins the basic version with probability $p = 1 - (5/6)^4 \approx 0.518$, where the probability $p = 0.5$ would denote a fair game where the house wins exactly half the time.

    The jai-alai world of our Monte Carlo simulation assumes that we decide the outcome of a point between two teams by flipping a suitably biased coin. If this world were reality, our simulation would compute the correct probability of each possible betting outcome. But not all players are created equal, of course. By doing a statistical study of the outcomes of all the matches involving a particular player, we can determine an appropriate amount to bias the coin.
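
    De Méré's basic game is small enough to check both ways. The sketch below (function names and the seed are my own choices, and the dice are assumed fair) computes the house's winning probability exactly as $1 - (5/6)^4$ and then estimates the same quantity with a simple Monte Carlo simulation.

```python
import random
from fractions import Fraction


def house_win_probability_exact(num_dice: int = 4) -> Fraction:
    """P(at least one six among `num_dice` fair dice) = 1 - (5/6)^num_dice."""
    return 1 - Fraction(5, 6) ** num_dice


def house_win_probability_simulated(num_games: int, num_dice: int = 4, seed: int = 0) -> float:
    """Monte Carlo estimate: fraction of simulated games in which a six appears."""
    rng = random.Random(seed)
    house_wins = sum(
        any(rng.randint(1, 6) == 6 for _ in range(num_dice))
        for _ in range(num_games)
    )
    return house_wins / num_games


exact = house_win_probability_exact()
print(exact, float(exact))
print(house_win_probability_simulated(200_000))
```

    The simulation is only as good as its assumption that each face has probability 1/6, which is exactly the kind of assumption a statistical study is needed to justify.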

    But such computations only make sense if our simulated jai-alai world is a model consistent with the real world. John von Neumann once said that "the valuation of a poker hand can be sheer mathematics." We have to reduce our evaluation of a pelotari to sheer mathematics.
      September 6, 2021 1:31 PM IST
  • Probability, in its first meaning, is about permutations, combinations and conditional probability, and is often explained with dice, coins, colored marbles and other discrete artifacts. Probability is the measure of the likelihood that an event will occur. Probability can also be calculated using statistical models, and it does not have to come from countable events or be a rational number in [0, 1], but the discrete, countable sense is its first meaning. Statistics, in its first meaning, is mostly about real-number measurements, like the t-, z-, W- and U-statistics. A statistic (singular), or sample statistic, is a single measure of some attribute of a sample (e.g., its arithmetic mean value).

    There is an overlap, or gray zone, between the two. On the probability side it covers the discrete distributions, including the binomial distribution, the multinomial distribution, and the Poisson distribution, which are still finger counting (i.e., they describe literally countable events) but whose parameters are statistics. On the statistics side it covers the statistics for continuous distributions, which are real-number models of probability.

    Probability models, as a first meaning, imply countability, for example, the likelihood of getting exactly 5 heads from 10 coin tosses. Statistical models, as a first meaning, contain some statistic. However, one can say that "Marginal probability is a statistic", so which phrase is used depends on the connotation one intends.
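
    For the coin-toss example, the countable probability can be worked out exactly. This is a minimal sketch assuming a fair coin and independent tosses; `binomial_pmf` is a helper name I made up, not a library function.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(k: int, n: int, p: Fraction) -> Fraction:
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Exactly 5 heads in 10 tosses of a fair coin: C(10, 5) / 2^10.
p_five_heads = binomial_pmf(5, 10, Fraction(1, 2))
print(p_five_heads, float(p_five_heads))  # 63/256, about 0.246
```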
      August 6, 2021 8:58 PM IST