Descriptive statistics with weights for the simple case.
Assumes that the data is 1d or 2d with (nobs, nvars) observations in rows, variables in columns, and that the same weights apply to each column.
If a degrees of freedom correction is used, then the weights should add up to the number of observations. The t-test also assumes that the sum of weights corresponds to the sample size.
If the weights are integers, this is essentially the same as replicating each observation by its weight.
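The equivalence with replicated observations can be checked directly. The following is a minimal sketch in plain NumPy, not the class's own implementation; the seed, the array names, and the use of a variance without degrees of freedom correction (ddof=0) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)           # seed chosen only for reproducibility
x = 1.0 + rng.standard_normal((20, 3))   # (nobs, nvars): rows are observations
w = rng.integers(1, 4, size=20)          # integer weights in {1, 2, 3}

# Weighted mean and variance (ddof=0), one value per column.
# The same weight vector is applied to every column.
sum_w = w.sum()
mean_w = (w[:, None] * x).sum(axis=0) / sum_w
var_w = (w[:, None] * (x - mean_w) ** 2).sum(axis=0) / sum_w

# The same quantities after repeating row i exactly w[i] times.
x_rep = np.repeat(x, w, axis=0)
mean_rep = x_rep.mean(axis=0)
var_rep = x_rep.var(axis=0)              # NumPy default is ddof=0

same_mean = np.allclose(mean_w, mean_rep)
same_var = np.allclose(var_w, var_rep)
print(same_mean, same_var)
```

Here the sum of the weights plays the role of the sample size, which is why a degrees of freedom correction only makes sense when the weights add up to the number of observations.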
Examples
Note: The seed used for the following example is not known, so the numbers will differ when it is rerun.
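>>> import numpy as np
>>> from scipy import stats
>>> from statsmodels.stats.weightstats import DescrStatsW
>>> x1_2d = 1.0 + np.random.randn(20, 3)
>>> w1 = np.random.randint(1, 4, 20)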
>>> d1 = DescrStatsW(x1_2d, weights=w1)
>>> d1.mean
array([ 1.42739844, 1.23174284, 1.083753 ])
>>> d1.var
array([ 0.94855633, 0.52074626, 1.12309325])
>>> d1.std_mean
array([ 0.14682676, 0.10878944, 0.15976497])
>>> tstat, pval, df = d1.ttest_mean(0)
>>> tstat; pval; df
array([ 9.72165021, 11.32226471, 6.78342055])
array([ 1.58414212e-12, 1.26536887e-14, 2.37623126e-08])
44.0
>>> tstat, pval, df = d1.ttest_mean([0, 1, 1])
>>> tstat; pval; df
array([ 9.72165021, 2.13019609, 0.52422632])
array([ 1.58414212e-12, 3.87842808e-02, 6.02752170e-01])
44.0
# if weights are integers, then asrepeats can be used
>>> x1r = d1.asrepeats()
>>> x1r.shape
...
>>> stats.ttest_1samp(x1r, [0, 1, 1])
...
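The comparison between the weighted t-test and an ordinary t-test on the repeated data can also be sketched without the class. This is a hedged illustration in plain NumPy, assuming the standard error of the mean is the ddof=0 standard deviation divided by sqrt(sum of weights - 1); the helper names `tstat_weighted` and `tstat_plain` are hypothetical and not part of any library.

```python
import numpy as np

rng = np.random.default_rng(1)           # seed chosen only for reproducibility
x = 1.0 + rng.standard_normal((20, 3))   # (nobs, nvars)
w = rng.integers(1, 4, size=20)          # integer weights in {1, 2, 3}
value = np.array([0.0, 1.0, 1.0])        # null hypothesis mean per column


def tstat_weighted(x, w, value):
    """t statistic per column, treating sum(w) as the sample size."""
    sum_w = w.sum()
    mean = (w[:, None] * x).sum(axis=0) / sum_w
    var0 = (w[:, None] * (x - mean) ** 2).sum(axis=0) / sum_w  # ddof=0
    std_mean = np.sqrt(var0 / (sum_w - 1))
    return (mean - value) / std_mean, sum_w - 1


def tstat_plain(x, value):
    """Ordinary one-sample t statistic per column."""
    n = x.shape[0]
    std_mean = x.std(axis=0, ddof=1) / np.sqrt(n)
    return (x.mean(axis=0) - value) / std_mean, n - 1


t_w, df_w = tstat_weighted(x, w, value)
t_r, df_r = tstat_plain(np.repeat(x, w, axis=0), value)
print(np.allclose(t_w, t_r), df_w == df_r)
```

Both routes give the same statistic and the same degrees of freedom, because dividing the ddof=0 variance by n - 1 is algebraically identical to dividing the ddof=1 variance by n.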
Methods
asrepeats() | get array that has repeats given by floor(weights) |
confint_mean([alpha]) | |
corrcoef() | correlation coefficient with default ddof for standard deviation |
cov() | covariance |
demeaned() | |
mean() | |
nobs() | alias for number of observations/cases, equal to sum of weights |
std() | |
std_ddof([ddof]) | |
std_mean() | standard deviation of mean |
std_var() | |
sum() | |
sum_weights() | |
sumsquares() | |
ttest_mean(value[, alternative]) | ttest of Null hypothesis that mean is equal to value. |
ttest_meandiff(other) | |
var() | variance with default degrees of freedom correction |
var_ddof([ddof]) | |