How the Student's t-Test Works and How to Implement It in Python

Published August 11, 2018 by yuxiangyu

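The Student's t-test is perhaps one of the most widely used statistical hypothesis tests.

Because you are likely to use the t-test yourself someday, it is important to understand in depth how it works. As a developer, a good way to build that understanding is to implement the hypothesis test yourself from scratch.

In this tutorial, you will discover how to implement the Student's t-test from scratch in Python.

After completing this tutorial, you will know:

The Student's t-test comments on how likely it is to observe two samples, given that the samples were drawn from the same population.

How to implement the Student's t-test from scratch for two independent samples.

How to implement the paired Student's t-test from scratch for two dependent samples.

Let's get started.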

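This tutorial is divided into three parts; they are:

Student's t-Test

Student's t-Test for Independent Samples

Student's t-Test for Dependent Samples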

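Student's t-Test

The Student's t-test is a statistical hypothesis test of whether two samples were drawn from the same population, as expected.

It is named for the pseudonym "Student" used by William Sealy Gosset, who developed the test.

The test works by checking the means of the two samples to see whether they differ significantly. It does this by calculating the standard error of the difference between the means, which can be interpreted as how likely the observed difference is if the two samples have the same mean (the null hypothesis).

The t statistic calculated by the test can be interpreted by comparing it to a critical value from the t-distribution. The critical value can be calculated from the degrees of freedom and a significance level using the percent point function (PPF).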

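We can interpret the statistic in a two-tailed test, meaning that if we reject the null hypothesis, it could be because the first mean is smaller or larger than the second mean. To do this, we calculate the absolute value of the test statistic and compare it to the positive (right-tailed) critical value, as follows:

If abs(t-statistic) <= critical value: accept (fail to reject) the null hypothesis that the means are equal.

If abs(t-statistic) > critical value: reject the null hypothesis that the means are equal.

We can also use the cumulative distribution function (CDF) of the t-distribution to retrieve the cumulative probability of observing the absolute value of the t statistic, and from it calculate a p-value. The p-value can then be compared to a chosen significance level (alpha), such as 0.05, to determine whether the null hypothesis can be rejected:

If p > alpha: accept (fail to reject) the null hypothesis that the means are equal.

If p <= alpha: reject the null hypothesis that the means are equal.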

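Because it works with sample means, the test assumes that both samples were drawn from a Gaussian distribution. The test also assumes that the samples have the same variance and the same size, although corrections to the test exist for when these assumptions do not hold. See, for example, Welch's t-test.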
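As a rough illustration of the correction mentioned above, the sketch below runs both the standard test and Welch's t-test through SciPy's ttest_ind() by toggling its equal_var argument. The two samples, with different sizes and spreads, are made up for this illustration and are not part of the tutorial's data.

```python
# Sketch: Student's t-test vs. Welch's t-test on samples with unequal variance and size
from numpy.random import seed, randn
from scipy.stats import ttest_ind

seed(1)
# hypothetical samples: different sizes and different standard deviations
data1 = 5 * randn(100) + 50
data2 = 10 * randn(40) + 51

# Student's t-test: assumes equal variances (the SciPy default)
stat_student, p_student = ttest_ind(data1, data2)
# Welch's t-test: does not assume equal variances
stat_welch, p_welch = ttest_ind(data1, data2, equal_var=False)

print('Student: t=%.3f, p=%.3f' % (stat_student, p_student))
print('Welch:   t=%.3f, p=%.3f' % (stat_welch, p_welch))
```

Both calls share the same difference in means; they differ only in how the standard error and the degrees of freedom are estimated.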

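There are two main versions of the Student's t-test:

Independent samples. The case where the two samples are unrelated.

Dependent samples. The case where the samples are related, such as repeated measures on the same population. Also called a paired test.

In Python, the independent and dependent t-tests are available via the SciPy functions ttest_ind() and ttest_rel() respectively.

Note: I recommend using these SciPy functions to calculate the Student's t-test in your applications, if they are suitable. The library implementations are faster and less prone to bugs. I only recommend implementing the test yourself for learning purposes, or when you need a modified version of the test.

We will use the SciPy functions to confirm the results from our own version of the tests.

Note, for reference, that all calculations presented in this tutorial are taken directly from Chapter 9 of "Statistics in Plain English", Third Edition, 2010.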

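Student's t-Test for Independent Samples

We start with the most common form of the t-test: the case where we compare the means of two independent samples.

Calculation

The t statistic for two independent samples is calculated as follows: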

t = observed difference between sample means / standard error of the difference between the means

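or

t = (mean(X1) - mean(X2)) / sed

where X1 and X2 are the first and second data samples, and sed is the standard error of the difference between the means.

The standard error of the difference between the means can be calculated as follows:

sed = sqrt(se1^2 + se2^2)

where se1 and se2 are the standard errors of the first and second samples.

The standard error of a sample can be calculated as:

se = std / sqrt(n)

where se is the standard error of the sample, std is the sample standard deviation, and n is the number of observations in the sample.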
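Substituting the standard error of each sample into sed gives the t statistic in a single expression. The symbols below are notation assumed for this sketch: \bar{X}_1 and \bar{X}_2 are the sample means, s_1 and s_2 the sample standard deviations, and n_1 and n_2 the sample sizes.

```latex
t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
```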

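These calculations make the following assumptions:

The samples are drawn from a Gaussian distribution.

The size of each sample is approximately equal.

The samples have the same variance.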
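One possible way to sanity-check these assumptions before running the test is sketched below. It uses SciPy's shapiro() (Shapiro-Wilk normality test) and levene() (equal-variance test); the tutorial itself does not use these checks, so treat this as an optional illustration on synthetic data.

```python
# Sketch: quick checks of the t-test assumptions on synthetic data
from numpy.random import seed, randn
from scipy.stats import shapiro, levene

seed(1)
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51

# Shapiro-Wilk: the null hypothesis is that the sample is Gaussian
for name, data in (('data1', data1), ('data2', data2)):
    stat, p_normal = shapiro(data)
    print('%s: Shapiro-Wilk W=%.3f, p=%.3f' % (name, stat, p_normal))

# Levene: the null hypothesis is that the samples have equal variances
stat, p_levene = levene(data1, data2)
print('Levene W=%.3f, p=%.3f' % (stat, p_levene))

# sample sizes should be roughly equal
print('n1=%d, n2=%d' % (len(data1), len(data2)))
```

A small p-value from either check suggests the corresponding assumption may not hold for the data at hand.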

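Implementation

We can implement these equations easily using functions from the Python standard library, NumPy, and SciPy.

Let's assume that our two data samples are stored in the variables data1 and data2.

We can start by calculating the mean of each sample, as follows:

# calculate means
mean1, mean2 = mean(data1), mean(data2)

Next, we need to calculate the standard errors.

We can calculate them manually, starting with the sample standard deviations:

# calculate sample standard deviations
std1, std2 = std(data1, ddof=1), std(data2, ddof=1)

And then the standard errors:

# calculate standard errors
n1, n2 = len(data1), len(data2)
se1, se2 = std1/sqrt(n1), std2/sqrt(n2)

Alternatively, we can use the sem() SciPy function to calculate the standard errors directly.

# calculate standard errors
se1, se2 = sem(data1), sem(data2)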
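The two routes to the standard error should give the same number, because sem() uses ddof=1 by default. A minimal check, on a synthetic sample generated only for this sketch:

```python
# Sketch: the manual standard error matches scipy.stats.sem
from math import sqrt
from numpy import std, isclose
from numpy.random import seed, randn
from scipy.stats import sem

seed(1)
data1 = 5 * randn(100) + 50

# manual calculation with the sample standard deviation (ddof=1)
se_manual = std(data1, ddof=1) / sqrt(len(data1))
# library calculation (sem uses ddof=1 by default)
se_scipy = sem(data1)

print('manual=%.6f, scipy=%.6f, match=%s' % (se_manual, se_scipy, isclose(se_manual, se_scipy)))
```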

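We can use the standard errors of the samples to calculate the "standard error on the difference between the samples":

# standard error on the difference between the samples
sed = sqrt(se1**2.0 + se2**2.0)

We can now calculate the t statistic:

# calculate the t statistic
t_stat = (mean1 - mean2) / sed

We can also calculate some other values to help interpret and present the statistic.

The number of degrees of freedom for the test is the total number of observations in both samples, minus two.

# degrees of freedom
df = n1 + n2 - 2

The critical value can be calculated using the percent point function (PPF) for a given significance level, such as 0.05 (95% confidence).

This function is available for the t-distribution in SciPy, as follows:

# calculate the critical value
alpha = 0.05
cv = t.ppf(1.0 - alpha, df)

Note that t.ppf(1.0 - alpha, df) is the critical value that leaves alpha in a single tail. For a strictly two-tailed comparison against abs(t_stat), the matching critical value would be t.ppf(1.0 - alpha / 2.0, df).

The p-value can be calculated using the cumulative distribution function on the t-distribution, again in SciPy.

# calculate the p-value
p = (1 - t.cdf(abs(t_stat), df)) * 2

Here, we assume a two-tailed distribution, where rejection of the null hypothesis could be interpreted as the first mean being either smaller or larger than the second mean.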
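An equivalent way to get the same two-tailed p-value is the survival function t.sf(), which returns 1 - CDF directly and tends to be more accurate when the tail probability is tiny. The sketch below compares the two forms; the t_stat and df values are placeholders chosen for illustration.

```python
# Sketch: two-tailed p-value via the CDF and via the survival function
from numpy import isclose
from scipy.stats import t

# hypothetical inputs for illustration
t_stat, df = -2.262, 198

# form used in this tutorial
p_cdf = (1.0 - t.cdf(abs(t_stat), df)) * 2.0
# equivalent form using the survival function (1 - CDF)
p_sf = t.sf(abs(t_stat), df) * 2.0

print('p_cdf=%.6f, p_sf=%.6f, match=%s' % (p_cdf, p_sf, isclose(p_cdf, p_sf)))
```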

We can tie all of these pieces together into a simple function for calculating the t-test for two independent samples:

# function for calculating the t-test for two independent samples
def independent_ttest(data1, data2, alpha):
	# calculate means
	mean1, mean2 = mean(data1), mean(data2)
	# calculate standard errors
	se1, se2 = sem(data1), sem(data2)
	# standard error on the difference between the samples
	sed = sqrt(se1**2.0 + se2**2.0)
	# calculate the t statistic
	t_stat = (mean1 - mean2) / sed
	# degrees of freedom
	df = len(data1) + len(data2) - 2
	# calculate the critical value
	cv = t.ppf(1.0 - alpha, df)
	# calculate the p-value
	p = (1.0 - t.cdf(abs(t_stat), df)) * 2.0
	# return everything
	return t_stat, df, cv, p

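Worked Example

In this section, we will calculate the t-test on some synthetic data samples.

First, let's generate two samples of 100 Gaussian random numbers with the same standard deviation of 5 and different means of 50 and 51. We expect the test to reject the null hypothesis and find a significant difference between the samples:

# seed the random number generator
seed(1)
# generate two independent samples
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51

We can calculate the t-test on these samples using the built-in SciPy function ttest_ind(). This gives us a t statistic and a p-value to compare against, to make sure that we have implemented the test correctly.

The complete example is listed below.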

# Student's t-test for independent samples
from numpy.random import seed
from numpy.random import randn
from scipy.stats import ttest_ind
# seed the random number generator
seed(1)
# generate two independent samples
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51
# compare samples
stat, p = ttest_ind(data1, data2)
print('t=%.3f, p=%.3f' % (stat, p))

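Running the example, we can see a t statistic and a p-value.

We will use these as our expected values for the test on these data.

t=-2.262, p=0.025

We can now apply our own implementation to the same data, using the function defined in the previous section.

The function returns a t statistic and a critical value. We can use the critical value to interpret the t statistic, to see whether the finding of the test is significant and whether the means are indeed different, as we expected.

# interpret via critical value
if abs(t_stat) <= cv:
	print('Accept null hypothesis that the means are equal.')
else:
	print('Reject the null hypothesis that the means are equal.')

The function also returns a p-value. We can interpret the p-value using an alpha, such as 0.05, to determine whether the finding of the test is significant and whether the means are indeed different, as we expected.

# interpret via p-value
if p > alpha:
	print('Accept null hypothesis that the means are equal.')
else:
	print('Reject the null hypothesis that the means are equal.')

We expect the two interpretations to match, provided the critical value corresponds to the same two-tailed test as the p-value.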
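The sketch below illustrates when the two interpretations agree. It compares the p-value rule with a critical value computed as t.ppf(1.0 - alpha, df) and with one computed as t.ppf(1.0 - alpha / 2.0, df). The t statistics and the degrees of freedom are made-up values chosen to include a borderline case.

```python
# Sketch: agreement between the critical-value rule and the p-value rule
from scipy.stats import t

# assumed significance level and degrees of freedom
alpha, df = 0.05, 198

cv_one = t.ppf(1.0 - alpha, df)        # leaves alpha in one tail
cv_two = t.ppf(1.0 - alpha / 2.0, df)  # leaves alpha/2 in each tail

results = []
for t_stat in (1.0, 1.8, 2.5):  # hypothetical t statistics
    # two-tailed p-value
    p = t.sf(abs(t_stat), df) * 2.0
    reject_p = p <= alpha
    reject_one = abs(t_stat) > cv_one
    reject_two = abs(t_stat) > cv_two
    results.append((t_stat, reject_p, reject_one, reject_two))
    print('t=%.1f p=%.3f reject: p-rule=%s, cv(1-alpha)=%s, cv(1-alpha/2)=%s'
          % (t_stat, p, reject_p, reject_one, reject_two))
```

For the borderline statistic, only the critical value based on alpha / 2 gives the same decision as the two-tailed p-value.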

The complete example is listed below.

# t-test for independent samples
from math import sqrt
from numpy.random import seed
from numpy.random import randn
from numpy import mean
from scipy.stats import sem
from scipy.stats import t

# function for calculating the t-test for two independent samples
def independent_ttest(data1, data2, alpha):
	# calculate means
	mean1, mean2 = mean(data1), mean(data2)
	# calculate standard errors
	se1, se2 = sem(data1), sem(data2)
	# standard error on the difference between the samples
	sed = sqrt(se1**2.0 + se2**2.0)
	# calculate the t statistic
	t_stat = (mean1 - mean2) / sed
	# degrees of freedom
	df = len(data1) + len(data2) - 2
	# calculate the critical value
	cv = t.ppf(1.0 - alpha, df)
	# calculate the p-value
	p = (1.0 - t.cdf(abs(t_stat), df)) * 2.0
	# return everything
	return t_stat, df, cv, p

# seed the random number generator
seed(1)
# generate two independent samples
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51
# calculate the t test
alpha = 0.05
t_stat, df, cv, p = independent_ttest(data1, data2, alpha)
print('t=%.3f, df=%d, cv=%.3f, p=%.3f' % (t_stat, df, cv, p))
# interpret via critical value
if abs(t_stat) <= cv:
	print('Accept null hypothesis that the means are equal.')
else:
	print('Reject the null hypothesis that the means are equal.')
# interpret via p-value
if p > alpha:
	print('Accept null hypothesis that the means are equal.')
else:
	print('Reject the null hypothesis that the means are equal.')
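Rather than comparing printed numbers by eye, the from-scratch calculation can be checked against SciPy programmatically. The sketch below uses a trimmed-down helper, independent_ttest_stat(), a hypothetical name introduced here that returns only the t statistic and p-value, and compares it with ttest_ind() using numpy.isclose().

```python
# Sketch: check the from-scratch independent t-test against SciPy
from math import sqrt
from numpy import mean, isclose
from numpy.random import seed, randn
from scipy.stats import sem, t, ttest_ind

# hypothetical helper: same calculation as the tutorial, fewer return values
def independent_ttest_stat(data1, data2):
    # standard error on the difference between the samples
    sed = sqrt(sem(data1)**2.0 + sem(data2)**2.0)
    # t statistic and degrees of freedom
    t_stat = (mean(data1) - mean(data2)) / sed
    df = len(data1) + len(data2) - 2
    # two-tailed p-value
    p = (1.0 - t.cdf(abs(t_stat), df)) * 2.0
    return t_stat, p

seed(1)
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51

t_stat, p = independent_ttest_stat(data1, data2)
stat_scipy, p_scipy = ttest_ind(data1, data2)
print('t matches: %s, p matches: %s' % (isclose(t_stat, stat_scipy), isclose(p, p_scipy)))
```

The two agree here because the samples have the same size. With unequal sizes, ttest_ind() pools the variances by default, so its standard error would differ from sqrt(se1^2 + se2^2).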

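Student's t-Test for Dependent Samples

We can now look at how to calculate the t-test for dependent samples.

In this case, we collect some observations on a sample from the population, then apply some treatment, and then collect observations from the same sample again.

The result is two samples of the same size, where the observations in each sample are related, or paired.

The t-test for dependent samples is referred to as the paired Student's t-test.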

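Calculation

The calculation of the paired t-test is similar to the case with independent samples.

The main difference is in the calculation of the denominator.

t = (mean(X1) - mean(X2)) / sed

where X1 and X2 are the first and second data samples, and sed is the standard error of the difference between the means.

Here, sed is calculated as follows:

sed = sd / sqrt(n)

where sd is the standard deviation of the differences between the paired observations, and n is the total number of paired observations (that is, the size of each sample).

Calculating sd first requires the sum of the squared differences between the samples:

d1 = sum (X1[i] - X2[i])^2 for i in n

It also requires the sum of the (non-squared) differences between the samples:

d2 = sum (X1[i] - X2[i]) for i in n

We can then calculate sd as:

sd = sqrt((d1 - (d2^2 / n)) / (n - 1))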
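The formula for sd is the usual sample standard deviation applied to the paired differences, written in its sum-of-squares form. The sketch below checks that equivalence against NumPy's std() with ddof=1, on synthetic data generated only for this illustration.

```python
# Sketch: sd from d1 and d2 equals the sample standard deviation of the differences
from math import sqrt
from numpy import std, isclose
from numpy.random import seed, randn

seed(1)
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51

n = len(data1)
diff = data1 - data2
# sum of squared differences and sum of differences
d1 = (diff ** 2).sum()
d2 = diff.sum()

# formula from the text
sd_formula = sqrt((d1 - (d2 ** 2 / n)) / (n - 1))
# sample standard deviation of the paired differences
sd_numpy = std(diff, ddof=1)

print('formula=%.6f, numpy=%.6f, match=%s' % (sd_formula, sd_numpy, isclose(sd_formula, sd_numpy)))
```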

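Implementation

We can implement the calculation of the paired t-test directly in Python.

The first step is to calculate the mean of each sample.

# calculate means
mean1, mean2 = mean(data1), mean(data2)

Next, we need the number of pairs (n). We will use it in several of the calculations that follow.

# number of paired samples
n = len(data1)

Next, we must calculate the sum of the squared differences between the samples, as well as the sum of the differences.

# sum squared difference between observations
d1 = sum([(data1[i]-data2[i])**2 for i in range(n)])
# sum difference between observations
d2 = sum([data1[i]-data2[i] for i in range(n)])

We can now calculate the standard deviation of the differences between the paired observations.

# standard deviation of the difference between means
sd = sqrt((d1 - (d2**2 / n)) / (n - 1))

This is then used to calculate the standard error of the difference between the means.

# standard error of the difference between the means
sed = sd / sqrt(n)

Finally, we have everything we need to calculate the t statistic.

# calculate the t statistic
t_stat = (mean1 - mean2) / sed

The only other key difference between this implementation and the one for independent samples is the calculation of the number of degrees of freedom.

# degrees of freedom
df = n - 1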

As before, we can tie all of this together into a reusable function. The function takes two paired samples and a significance level (alpha) and calculates the t statistic, the number of degrees of freedom, the critical value, and the p-value.

The complete function is listed below.

# function for calculating the t-test for two dependent samples
def dependent_ttest(data1, data2, alpha):
	# calculate means
	mean1, mean2 = mean(data1), mean(data2)
	# number of paired samples
	n = len(data1)
	# sum squared difference between observations
	d1 = sum([(data1[i]-data2[i])**2 for i in range(n)])
	# sum difference between observations
	d2 = sum([data1[i]-data2[i] for i in range(n)])
	# standard deviation of the difference between means
	sd = sqrt((d1 - (d2**2 / n)) / (n - 1))
	# standard error of the difference between the means
	sed = sd / sqrt(n)
	# calculate the t statistic
	t_stat = (mean1 - mean2) / sed
	# degrees of freedom
	df = n - 1
	# calculate the critical value
	cv = t.ppf(1.0 - alpha, df)
	# calculate the p-value
	p = (1.0 - t.cdf(abs(t_stat), df)) * 2.0
	# return everything
	return t_stat, df, cv, p
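Another way to look at the paired test is as a one-sample t-test on the paired differences, with a hypothesized mean difference of zero. The sketch below illustrates this with SciPy's ttest_1samp() alongside ttest_rel(); it is a cross-check on synthetic data, not part of the tutorial's own implementation.

```python
# Sketch: the paired t-test equals a one-sample t-test on the differences
from numpy import isclose
from numpy.random import seed, randn
from scipy.stats import ttest_rel, ttest_1samp

seed(1)
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51

# paired test on the two samples
stat_rel, p_rel = ttest_rel(data1, data2)
# one-sample test of the differences against a mean of zero
stat_one, p_one = ttest_1samp(data1 - data2, 0.0)

print('t matches: %s, p matches: %s' % (isclose(stat_rel, stat_one), isclose(p_rel, p_one)))
```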

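Worked Example

In this section, we will use the same dataset in the worked example as we did for the independent t-test.

The data samples are not paired, but we will pretend that they are. We expect the test to reject the null hypothesis and find a significant difference between the samples.

# seed the random number generator
seed(1)
# generate two independent samples
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51

As before, we can evaluate the test problem with the SciPy function for calculating a paired t-test. In this case, that is the ttest_rel() function.

The complete example is listed below.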

# Paired Student's t-test
from numpy.random import seed
from numpy.random import randn
from scipy.stats import ttest_rel
# seed the random number generator
seed(1)
# generate two independent samples
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51
# compare samples
stat, p = ttest_rel(data1, data2)
print('Statistics=%.3f, p=%.3f' % (stat, p))

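Running the example calculates and prints the t statistic and the p-value.

We will use these values to validate the calculation of our own paired t-test function.

Statistics=-2.372, p=0.020

We can now test our own implementation of the paired t-test.

Below is the complete example, including the developed function and the interpretation of its results.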

# t-test for dependent samples
from math import sqrt
from numpy.random import seed
from numpy.random import randn
from numpy import mean
from scipy.stats import t

# function for calculating the t-test for two dependent samples
def dependent_ttest(data1, data2, alpha):
	# calculate means
	mean1, mean2 = mean(data1), mean(data2)
	# number of paired samples
	n = len(data1)
	# sum squared difference between observations
	d1 = sum([(data1[i]-data2[i])**2 for i in range(n)])
	# sum difference between observations
	d2 = sum([data1[i]-data2[i] for i in range(n)])
	# standard deviation of the difference between means
	sd = sqrt((d1 - (d2**2 / n)) / (n - 1))
	# standard error of the difference between the means
	sed = sd / sqrt(n)
	# calculate the t statistic
	t_stat = (mean1 - mean2) / sed
	# degrees of freedom
	df = n - 1
	# calculate the critical value
	cv = t.ppf(1.0 - alpha, df)
	# calculate the p-value
	p = (1.0 - t.cdf(abs(t_stat), df)) * 2.0
	# return everything
	return t_stat, df, cv, p

# seed the random number generator
seed(1)
# generate two independent samples (pretend they are dependent)
data1 = 5 * randn(100) + 50
data2 = 5 * randn(100) + 51
# calculate the t test
alpha = 0.05
t_stat, df, cv, p = dependent_ttest(data1, data2, alpha)
print('t=%.3f, df=%d, cv=%.3f, p=%.3f' % (t_stat, df, cv, p))
# interpret via critical value
if abs(t_stat) <= cv:
	print('Accept null hypothesis that the means are equal.')
else:
	print('Reject the null hypothesis that the means are equal.')
# interpret via p-value
if p > alpha:
	print('Accept null hypothesis that the means are equal.')
else:
	print('Reject the null hypothesis that the means are equal.')

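Running the example calculates the paired t-test.

The calculated t statistic and p-value match what we expect from the SciPy library implementation. This suggests that the implementation is correct.

Interpreting the t statistic with the critical value and the p-value with the significance level both give a significant result, rejecting the null hypothesis that the means are equal.

t=-2.372, df=99, cv=1.660, p=0.020
Reject the null hypothesis that the means are equal.
Reject the null hypothesis that the means are equal.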

Further Reading

This section provides more resources on the topic if you are looking to go deeper.

Books

Statistics in Plain English, Third Edition, 2010.

API

scipy.stats.ttest_ind API: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html

scipy.stats.ttest_rel API: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_rel.html

scipy.stats.sem API: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.sem.html

scipy.stats.t API: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.t.html

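Summary

In this tutorial, you discovered how to implement the Student's t-test from scratch in Python.

Specifically, you learned:

The Student's t-test comments on how likely it is to observe two samples, given that the samples were drawn from the same population.

How to implement the Student's t-test from scratch for two independent samples.

How to implement the paired Student's t-test from scratch for two dependent samples.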
