
Beta Distribution

#include <boost/math/distributions/beta.hpp>
namespace boost{ namespace math{

 template <class RealType = double,
           class Policy   = policies::policy<> >
class beta_distribution;

// typedef beta_distribution<double> beta;
// Note that this is deliberately NOT provided,
// to avoid a clash with the function name beta.

template <class RealType, class Policy>
class beta_distribution
{
public:
   typedef RealType  value_type;
   typedef Policy    policy_type;
   // Constructor from two shape parameters, alpha & beta:
   beta_distribution(RealType a, RealType b);

   // Parameter accessors:
   RealType alpha() const;
   RealType beta() const;

   // Parameter estimators of alpha or beta from mean and variance.
   static RealType find_alpha(
     RealType mean, // Expected value of mean.
     RealType variance); // Expected value of variance.

   static RealType find_beta(
     RealType mean, // Expected value of mean.
     RealType variance); // Expected value of variance.

   // Parameter estimators from
   // either alpha or beta, and x and probability.

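   static RealType find_alpha(
     RealType beta, // Shape parameter beta.
     RealType x, // Value of the random variable x.
     RealType probability); // Required value of the cdf at x.

   static RealType find_beta(
     RealType alpha, // Shape parameter alpha.
     RealType x, // Value of the random variable x.
     RealType probability); // Required value of the cdf at x.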
};

}} // namespaces

The class type beta_distribution represents a beta probability distribution function.

The beta distribution is used as a prior distribution for binomial proportions in Bayesian analysis.

See also: beta distribution and Bayesian statistics.

How the beta distribution is used for Bayesian analysis of one parameter models is discussed by Jeff Grynaviski.

The probability density function (PDF) for the beta distribution, defined on the interval [0,1], is given by:

f(x; α, β) = x^(α-1) (1-x)^(β-1) / B(α, β)

where B(α, β) is the beta function, implemented in this library as beta. Division by the beta function ensures that the pdf is normalized, so that it integrates to unity over the interval [0,1].

The following graph illustrates examples of the pdf for various values of the shape parameters. Note that the curve for α = β = 2 (blue line) is dome-shaped, and might be approximated by a symmetrical triangular distribution.

If α = β = 1, then it is a uniform distribution, equal to unity over the entire interval x = 0 to 1. If α and β are both < 1, then the pdf is U-shaped. If α != β, then the shape is asymmetric and could be approximated by a triangle whose apex is away from the centre (where x = 0.5).
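The shapes described above can be checked by evaluating the pdf formula directly. The following is a minimal sketch that uses only the C++ standard library; the names beta_function and beta_pdf are hypothetical helpers for illustration, and this is not how the library itself computes the pdf.

```cpp
#include <cassert>
#include <cmath>

// Beta function B(a, b), computed through log-gamma so that
// large arguments do not overflow intermediate results.
double beta_function(double a, double b)
{
    return std::exp(std::lgamma(a) + std::lgamma(b) - std::lgamma(a + b));
}

// Direct evaluation of f(x; a, b) = x^(a-1) (1-x)^(b-1) / B(a, b).
double beta_pdf(double a, double b, double x)
{
    assert(a > 0 && b > 0);     // shape parameters must be positive
    assert(x >= 0 && x <= 1);   // the pdf is defined on [0, 1]
    return std::pow(x, a - 1) * std::pow(1 - x, b - 1) / beta_function(a, b);
}
```

With these helpers, beta_pdf(1, 1, x) evaluates to unity for any x in the interval, beta_pdf(2, 2, 0.5) gives the peak of the dome-shaped curve, and beta_pdf(0.5, 0.5, x) is larger near the ends of the interval than at the centre.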

Member Functions
Constructor
beta_distribution(RealType alpha, RealType beta);

Constructs a beta distribution with shape parameters alpha and beta.

Requires alpha, beta > 0, otherwise domain_error is called. Note that technically the beta distribution is defined for alpha, beta >= 0, but it's not clear whether any program can actually make use of that latitude or how many of the non-member functions can be usefully defined in that case. Therefore, for now, we regard it as an error if alpha or beta is zero.

For example:

beta_distribution<> mybeta(2, 5);

Constructs the beta distribution with alpha=2 and beta=5 (shown in yellow in the graph above).

Parameter Accessors
RealType alpha() const;

Returns the parameter alpha from which this distribution was constructed.

RealType beta() const;

Returns the parameter beta from which this distribution was constructed.

So for example:

beta_distribution<> mybeta(2, 5);
assert(mybeta.alpha() == 2.);  // mybeta.alpha() returns 2
assert(mybeta.beta() == 5.);   // mybeta.beta()  returns 5
Parameter Estimators

Two pairs of parameter estimators are provided.

One pair estimates either α or β from a presumed-known mean and variance.

The other pair estimates either α or β from the cdf and x.

It is also possible to estimate α and β from a 'known' mode and quantile. Calculators for this are provided by, for example, the Pooled Prevalence Calculator and Beta Buster, but this is not yet implemented here.

static RealType find_alpha(
  RealType mean, // Expected value of mean.
  RealType variance); // Expected value of variance.

Returns the unique value of α that corresponds to a beta distribution with mean mean and variance variance.

static RealType find_beta(
  RealType mean, // Expected value of mean.
  RealType variance); // Expected value of variance.

Returns the unique value of β that corresponds to a beta distribution with mean mean and variance variance.
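The estimation from mean and variance can be illustrated with the moment formulas listed in the implementation table. The sketch below uses only the standard library; alpha_from_moments and beta_from_moments are hypothetical names, and the precondition variance < mean * (1 - mean) is the condition under which both results are positive.

```cpp
#include <cassert>

// alpha = mean * ((mean * (1 - mean)) / variance - 1)
double alpha_from_moments(double mean, double variance)
{
    assert(mean > 0 && mean < 1);
    assert(variance > 0 && variance < mean * (1 - mean));
    return mean * ((mean * (1 - mean)) / variance - 1);
}

// beta = (1 - mean) * ((mean * (1 - mean)) / variance - 1)
double beta_from_moments(double mean, double variance)
{
    assert(mean > 0 && mean < 1);
    assert(variance > 0 && variance < mean * (1 - mean));
    return (1 - mean) * ((mean * (1 - mean)) / variance - 1);
}
```

For a distribution with α = 2 and β = 5, the mean is 2/7 and the variance is 10/392; passing these two values to the helpers recovers 2 and 5 up to rounding.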

static RealType find_alpha(
  RealType beta, // Shape parameter beta.
  RealType x, // Value of the random variable x.
  RealType probability); // Required value of the cdf at x.

Returns the value of α that gives: cdf(beta_distribution<RealType>(alpha, beta), x) == probability.

static RealType find_beta(
  RealType alpha, // Shape parameter alpha.
  RealType x, // Value of the random variable x.
  RealType probability); // Required value of the cdf at x.

Returns the value of β that gives: cdf(beta_distribution<RealType>(alpha, beta), x) == probability.
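To make the meaning of "the value of α that gives cdf == probability" concrete, the following brute-force sketch solves the same equation with Simpson integration and bisection. It is an illustration only: the library uses inverse incomplete beta functions instead, and the names pdf_sketch, cdf_sketch and find_alpha_sketch, the restriction to shape parameters >= 1, and the search bracket [1, 1000] are all assumptions made here to keep the example short.

```cpp
#include <cassert>
#include <cmath>

// Beta pdf evaluated directly from its definition.
double pdf_sketch(double a, double b, double t)
{
    double log_norm = std::lgamma(a + b) - std::lgamma(a) - std::lgamma(b);
    return std::exp(log_norm) * std::pow(t, a - 1) * std::pow(1 - t, b - 1);
}

// cdf by composite Simpson integration of the pdf over [0, x].
// Restricted to a >= 1 and b >= 1 so that the integrand stays finite.
double cdf_sketch(double a, double b, double x)
{
    assert(a >= 1 && b >= 1 && x >= 0 && x <= 1);
    const int n = 2000;             // even number of subintervals
    const double h = x / n;
    double sum = pdf_sketch(a, b, 0.0) + pdf_sketch(a, b, x);
    for (int k = 1; k < n; ++k)
        sum += (k % 2 ? 4 : 2) * pdf_sketch(a, b, k * h);
    return sum * h / 3;
}

// Bisection on alpha: for fixed beta and x, the cdf decreases as alpha grows.
double find_alpha_sketch(double beta, double x, double probability)
{
    assert(x > 0 && x < 1 && probability > 0 && probability < 1);
    double lo = 1, hi = 1000;       // assumed search bracket
    // The root lies in the bracket only if the cdf at alpha = 1 is large enough.
    assert(cdf_sketch(lo, beta, x) >= probability);
    for (int iter = 0; iter < 100; ++iter)
    {
        double mid = 0.5 * (lo + hi);
        if (cdf_sketch(mid, beta, x) > probability)
            lo = mid;               // cdf still too large: alpha must increase
        else
            hi = mid;
    }
    return 0.5 * (lo + hi);
}
```

With β = 1 the cdf reduces to x^α, so find_alpha_sketch(1, 0.5, 0.25) should come out close to 2, which gives a simple way to sanity-check the approach.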

Non-member Accessor Functions

All the usual non-member accessor functions that are generic to all distributions are supported: Cumulative Distribution Function, Probability Density Function, Quantile, Hazard Function, Cumulative Hazard Function, mean, median, mode, variance, standard deviation, skewness, kurtosis, kurtosis_excess, range and support.

The formulae for calculating these are shown in the table below, and at Wolfram MathWorld.

Applications

The beta distribution can be used to model events constrained to take place within an interval defined by a minimum and maximum value: so it is used in project management systems.

It is also widely used in Bayesian statistical inference.

Related distributions

A beta distribution with α = β = 1 is a uniform distribution.

The triangular distribution is used when less precise information is available.

The binomial distribution is closely related when α and β are integers.

With integer values of α and β, the distribution B(i, j) is that of the j-th highest of a sample of i + j - 1 independent random variables uniformly distributed between 0 and 1. The cumulative probability from 0 to x is thus the probability that the j-th highest value is less than x. Equivalently, it is the probability that at least i of the random variables are less than x, a probability given by summing over the Binomial Distribution with its p parameter set to x.
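The relation to the binomial distribution can be sketched in code. The example below assumes a sample size of i + j - 1 (the k-th smallest of n uniform variables follows a beta distribution with parameters k and n + 1 - k) and sums the binomial probabilities of at least i successes. The names binomial_upper_tail and integer_beta_cdf are hypothetical, and only the standard library is used.

```cpp
#include <cassert>
#include <cmath>

// Probability of at least i successes in n Bernoulli trials,
// each with success probability x.
double binomial_upper_tail(int n, int i, double x)
{
    assert(n >= 0 && i >= 0 && i <= n);
    assert(x >= 0 && x <= 1);
    double total = 0;
    for (int k = i; k <= n; ++k)
    {
        // Binomial coefficient C(n, k) computed through log-gamma.
        double log_c = std::lgamma(n + 1.0) - std::lgamma(k + 1.0)
                     - std::lgamma(n - k + 1.0);
        total += std::exp(log_c) * std::pow(x, k) * std::pow(1 - x, n - k);
    }
    return total;
}

// cdf at x of a beta distribution with positive integer shape parameters i and j.
double integer_beta_cdf(int i, int j, double x)
{
    assert(i >= 1 && j >= 1);
    return binomial_upper_tail(i + j - 1, i, x);
}
```

For i = j = 1 this reduces to the uniform cdf x, and for i = j = 2 it reproduces the closed form 3x^2 - 2x^3.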

Accuracy

This distribution is implemented using the beta function beta and the incomplete beta functions ibeta and ibetac; please refer to these functions for information on accuracy.

Implementation

In the following table a and b are the parameters α and β, x is the random variable, p is the probability and q = 1-p.

Function                        Implementation Notes

pdf                             f(x; α, β) = x^(α-1) (1-x)^(β-1) / B(α, β)
                                Implemented using ibeta_derivative(a, b, x).
cdf                             Using the incomplete beta function ibeta(a, b, x)
cdf complement                  ibetac(a, b, x)
quantile                        Using the inverse incomplete beta function ibeta_inv(a, b, p)
quantile from the complement    ibetac_inv(a, b, q)
mean                            a / (a + b)
variance                        a * b / ((a + b)^2 * (a + b + 1))
mode                            (a - 1) / (a + b - 2)
skewness                        2 (b - a) sqrt(a + b + 1) / ((a + b + 2) * sqrt(a * b))
kurtosis excess                 6 ((a - b)^2 (a + b + 1) - a * b * (a + b + 2)) / (a * b * (a + b + 2) * (a + b + 3))
kurtosis                        kurtosis excess + 3

parameter estimation

alpha from mean and variance    mean * (((mean * (1 - mean)) / variance) - 1)
beta from mean and variance     (1 - mean) * (((mean * (1 - mean)) / variance) - 1)
find_alpha and find_beta,       Implemented in terms of the inverse incomplete beta functions
from x, the cdf probability,    ibeta_inva and ibeta_invb respectively.
and either alpha or beta
find_alpha                      ibeta_inva(beta, x, probability)
find_beta                       ibeta_invb(alpha, x, probability)
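The moment formulas in the table can be written out with the denominators fully parenthesized, which removes any ambiguity about operator precedence. This is a standard-library sketch with hypothetical function names, not the library's source. The kurtosis excess expression is the usual textbook formula, and the mode is only computed here for a, b > 1.

```cpp
#include <cassert>
#include <cmath>

double beta_mean(double a, double b)
{
    return a / (a + b);
}

double beta_variance(double a, double b)
{
    return a * b / ((a + b) * (a + b) * (a + b + 1));
}

double beta_mode(double a, double b)
{
    assert(a > 1 && b > 1);   // interior mode exists only in this case
    return (a - 1) / (a + b - 2);
}

double beta_skewness(double a, double b)
{
    return 2 * (b - a) * std::sqrt(a + b + 1)
         / ((a + b + 2) * std::sqrt(a * b));
}

// Textbook formula for the kurtosis excess of a beta distribution.
double beta_kurtosis_excess(double a, double b)
{
    double numerator = 6 * ((a - b) * (a - b) * (a + b + 1) - a * b * (a + b + 2));
    double denominator = a * b * (a + b + 2) * (a + b + 3);
    return numerator / denominator;
}
```

As a quick check, a symmetric distribution such as a = b = 3 has zero skewness, and the uniform case a = b = 1 has a kurtosis excess of -1.2.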

References

Wikipedia Beta distribution

NIST Exploratory Data Analysis

Wolfram MathWorld

