Normal distribution
- 1. Normal distribution = Gaussian distribution
- 1.1. Probability density function
- 1.2. Standard normal distribution
- 1.3. Cumulative distribution function
- 2. Properties of the normal distribution
- References
normal /ˈnɔːrml/ adj. ordinary, usual, typical; mentally sound. n. a normal line (geometry); the usual state, level, or standard
1. Normal distribution = Gaussian distribution
In probability theory and statistics, a normal distribution or Gaussian distribution is a type of continuous probability distribution for a real-valued random variable.
The normal distribution is one of the most commonly encountered continuous probability distributions; in physics it is usually called the Gaussian distribution.
If a random variable $X$ follows a normal distribution with mean $\mu$ and standard deviation $\sigma$, we write $X \sim N(\mu, \sigma^2)$, and its probability density function is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

Equivalently, with the variance placed under the square root, the general form of the probability density function is

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
The parameter $\mu$ is the mean or expectation of the distribution (and also its median and mode), while the parameter $\sigma^2$ is the variance. The standard deviation of the distribution is $\sigma$ (sigma). A random variable with a Gaussian distribution is said to be normally distributed, and is called a normal deviate.
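The expected value $\mu$ can be interpreted as a location parameter: it determines where the distribution is centered. The variance $\sigma^2$, or equivalently the standard deviation $\sigma$, can be interpreted as a scale parameter: it determines how widely the distribution is spread.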
The normal distribution is often referred to as $N(\mu, \sigma^2)$ or $\mathcal{N}(\mu, \sigma^2)$. Thus when a random variable $X$ is normally distributed with mean $\mu$ and standard deviation $\sigma$, one may write $X \sim \mathcal{N}(\mu, \sigma^2)$.
Here $\mu$ is the expectation (equal to the median and the mode), $\sigma^2 > 0$ is the variance, and $x \in (-\infty, +\infty)$ is the range of values the variable can take.
1.1. Probability density function
The red curve is the standard normal distribution.
The probability density function describes the relative likelihood that the random variable takes a value near each point.
For a normal distribution with mean $\mu$ and variance $\sigma^2$ (standard deviation $\sigma$), the probability density function is

$$f(x;\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
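As a minimal sketch, the density formula can be evaluated directly in Python. The function name `normal_pdf` and the sample parameters are illustrative assumptions, not part of the original notes; `statistics.NormalDist` from the standard library is used only as a cross-check.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density of N(mu, sigma^2) at x, written directly from the formula."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Cross-check against the standard library at a few arbitrary points.
reference = NormalDist(mu=1.0, sigma=2.0)
for x in (-1.0, 0.0, 2.5):
    assert math.isclose(normal_pdf(x, mu=1.0, sigma=2.0), reference.pdf(x))

# Peak of the standard normal density, equal to 1 / sqrt(2 * pi).
print(normal_pdf(0.0))
```

Working with the standardized value `z` keeps the exponent and the normalizing constant separate, which mirrors the two factors in the formula.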
Because the graph of its probability density function is bell-shaped, a normal distribution is sometimes informally called a bell curve.
The normal distribution with density $f(x)$ (mean $\mu$ and variance $\sigma^2 > 0$) has the following properties:
- It is symmetric around the point $x = \mu$, which is at the same time the mode, the median and the mean of the distribution.
- It is unimodal: its first derivative is positive for $x < \mu$, negative for $x > \mu$, and zero only at $x = \mu$.
- The area bounded by the curve and the $x$-axis is unity (i.e. equal to one).
- Its first derivative is $f'(x) = -\frac{x-\mu}{\sigma^2} f(x)$.
- Its second derivative is $f''(x) = \frac{(x-\mu)^2 - \sigma^2}{\sigma^4} f(x)$.
- Its density has two inflection points (where the second derivative of $f$ is zero and changes sign), located one standard deviation away from the mean, namely at $x = \mu - \sigma$ and $x = \mu + \sigma$.
- Its density is log-concave.
- Its density is infinitely differentiable, indeed supersmooth of order 2.
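Some of the properties listed above can be spot-checked numerically. The sketch below is only an illustration: the helper names (`pdf`, `second_derivative`, `area`), the parameters `mu = 1.5` and `sigma = 0.7`, and the integration range of ten standard deviations are all arbitrary assumptions. It estimates the area under the curve with the trapezoidal rule and checks that the closed-form second derivative changes sign at $\mu + \sigma$.

```python
import math


def pdf(x, mu, sigma):
    """Normal density f(x; mu, sigma)."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))


def second_derivative(x, mu, sigma):
    """Closed form f''(x) = ((x - mu)^2 - sigma^2) / sigma^4 * f(x)."""
    return ((x - mu) ** 2 - sigma**2) / sigma**4 * pdf(x, mu, sigma)


def area(mu, sigma, half_width=10.0, n=20_000):
    """Trapezoidal estimate of the area under f on [mu - 10*sigma, mu + 10*sigma]."""
    a = mu - half_width * sigma
    b = mu + half_width * sigma
    h = (b - a) / n
    total = 0.5 * (pdf(a, mu, sigma) + pdf(b, mu, sigma))
    total += sum(pdf(a + i * h, mu, sigma) for i in range(1, n))
    return total * h


mu, sigma = 1.5, 0.7
total_area = area(mu, sigma)

# f'' is negative just inside mu + sigma and positive just outside it,
# so the curve switches from concave to convex there (an inflection point).
inside = second_derivative(mu + 0.99 * sigma, mu, sigma)
outside = second_derivative(mu + 1.01 * sigma, mu, sigma)

print(total_area, inside < 0 < outside)
```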
1.2. Standard normal distribution
The simplest case of a normal distribution is known as the standard normal distribution or unit normal distribution. This is the special case $\mu = 0$ and $\sigma^2 = 1$, and it is described by the probability density function (or density)

$$\varphi(z) = \frac{e^{-z^2/2}}{\sqrt{2\pi}}.$$

The variable $z$ has a mean of 0 and a variance and standard deviation of 1. The density $\varphi(z)$ has its peak $\frac{1}{\sqrt{2\pi}}$ at $z = 0$ and inflection points at $z = +1$ and $z = -1$.
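inflection /ɪnˈflekʃn/ n. (grammar) a change in the form of a word, especially its ending; a rise or fall in the pitch of the voice. In "inflection point" it refers to a point where a curve changes concavity.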
If a random variable $X$ follows a normal distribution, we write $X \sim N(\mu, \sigma^2)$. If $\mu = 0$ and $\sigma = 1$, the distribution is called the standard normal distribution, and the density simplifies to

$$f(x) = \frac{1}{\sqrt{2\pi}}\, \exp\left(-\frac{x^2}{2}\right)$$
The probability density of the standard Gaussian distribution (standard normal distribution, with zero mean and unit variance) is often denoted with the Greek letter $\phi$ (phi). The alternative form of the Greek letter phi, $\varphi$, is also used quite often.
Furthermore, the density $\varphi$ of the standard normal distribution (i.e. $\mu = 0$ and $\sigma = 1$) also has the following properties:
- Its first derivative is $\varphi'(x) = -x\varphi(x)$.
- Its second derivative is $\varphi''(x) = (x^2 - 1)\varphi(x)$.
- The probability that a normally distributed variable $X$ with known $\mu$ and $\sigma^2$ is in a particular set can be calculated by using the fact that the fraction $Z = (X - \mu)/\sigma$ has a standard normal distribution.
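The standardization step in the last property can be sketched as follows. The function name `interval_probability` and the numbers (mean 100, standard deviation 15) are hypothetical, chosen only for illustration; the standard normal CDF is taken from `statistics.NormalDist` in the Python standard library.

```python
from statistics import NormalDist

STANDARD = NormalDist()  # standard normal: mu = 0, sigma = 1


def interval_probability(a: float, b: float, mu: float, sigma: float) -> float:
    """P(a < X <= b) for X ~ N(mu, sigma^2), computed via Z = (X - mu) / sigma."""
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    return STANDARD.cdf(z_b) - STANDARD.cdf(z_a)


# Hypothetical example: X ~ N(100, 15^2), probability of landing within one
# standard deviation of the mean, i.e. P(-1 < Z <= 1).
p = interval_probability(85.0, 115.0, mu=100.0, sigma=15.0)
print(round(p, 4))  # 0.6827
```

Only the standardized endpoints depend on $\mu$ and $\sigma$, so a single table or routine for the standard normal distribution is enough for any normal distribution.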
1.3. Cumulative distribution function
The cumulative distribution function gives the probability that the random variable $X$ is less than or equal to $x$. Expressed through the probability density function, it is

$$F(x;\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left(-\frac{(t-\mu)^2}{2\sigma^2}\right) dt$$
For a generic normal distribution with density $f$, mean $\mu$ and variance $\sigma^2$, the cumulative distribution function is

$$F(x) = \Phi\left(\frac{x-\mu}{\sigma}\right) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right].$$
The cumulative distribution function of the normal distribution can therefore be written in terms of a special function called the error function:

$$F(x;\mu,\sigma) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right]$$
The cumulative distribution function (CDF) of the standard normal distribution, usually denoted with the capital Greek letter $\Phi$, is simply the general CDF evaluated at $\mu = 0$, $\sigma = 1$:

$$\Phi(x) = F(x;0,1) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt.$$
For the standard normal distribution, the error-function expression simplifies to

$$\Phi(z) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{z}{\sqrt{2}}\right)\right]$$
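The error-function form translates almost literally into code. This is a minimal sketch built on `math.erf` from the Python standard library; the function names `normal_cdf` and `standard_normal_cdf` are assumptions made for this example.

```python
import math


def normal_cdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """F(x; mu, sigma) = 1/2 * [1 + erf((x - mu) / (sigma * sqrt(2)))]."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def standard_normal_cdf(z: float) -> float:
    """Phi(z), the special case mu = 0, sigma = 1."""
    return normal_cdf(z)


print(standard_normal_cdf(0.0))             # 0.5, by symmetry around the mean
print(round(standard_normal_cdf(1.96), 4))  # 0.975

# The general CDF agrees with Phi evaluated at the standardized point.
assert math.isclose(normal_cdf(3.0, mu=1.0, sigma=2.0), standard_normal_cdf(1.0))
```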
2. Properties of the normal distribution
If $X \sim N(\mu, \sigma^2)$ and $a$ and $b$ are real numbers with $a \neq 0$, then $aX + b \sim N(a\mu + b, (a\sigma)^2)$.
If $X \sim N(\mu_X, \sigma_X^2)$ and $Y \sim N(\mu_Y, \sigma_Y^2)$ are statistically independent normal random variables, then
- Their sum is normally distributed: $U = X + Y \sim N(\mu_X + \mu_Y, \sigma_X^2 + \sigma_Y^2)$.
- Their difference is normally distributed: $V = X - Y \sim N(\mu_X - \mu_Y, \sigma_X^2 + \sigma_Y^2)$.
- $U$ and $V$ are independent of each other, provided that $X$ and $Y$ have equal variances.
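The parameter arithmetic in these rules can be sketched with `statistics.NormalDist`. The helper names (`affine`, `sum_independent`, `diff_independent`) and the example parameters are hypothetical; the point is only that means add or subtract while variances always add.

```python
import math
from statistics import NormalDist


def affine(dist: NormalDist, a: float, b: float) -> NormalDist:
    """Distribution of a*X + b for X ~ dist (a != 0): N(a*mu + b, (a*sigma)^2)."""
    return NormalDist(a * dist.mean + b, abs(a) * dist.stdev)


def sum_independent(x: NormalDist, y: NormalDist) -> NormalDist:
    """Distribution of X + Y for independent X and Y: variances add."""
    return NormalDist(x.mean + y.mean, math.sqrt(x.variance + y.variance))


def diff_independent(x: NormalDist, y: NormalDist) -> NormalDist:
    """Distribution of X - Y for independent X and Y: variances still add."""
    return NormalDist(x.mean - y.mean, math.sqrt(x.variance + y.variance))


# Hypothetical parameters, chosen so the resulting standard deviations are whole numbers.
X = NormalDist(mu=1.0, sigma=3.0)
Y = NormalDist(mu=2.0, sigma=4.0)

U = sum_independent(X, Y)     # N(3, 5^2)
V = diff_independent(X, Y)    # N(-1, 5^2)
W = affine(X, a=-2.0, b=1.0)  # N(-1, 6^2)

print(U, V, W)
```

`NormalDist` also overloads `+`, `-` and `*` with the same rules, treating two instances as independent, so `X + Y` and `X * -2.0 + 1.0` should give the same parameters as `U` and `W`.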
References
[1] Yongqiang Cheng, https://yongqiang.blog.csdn.net/
[2] Normal distribution, https://en.wikipedia.org/wiki/Normal_distribution