Functions | |
apop_data * | apop_t_test (gsl_vector *a, gsl_vector *b) |
apop_data * | apop_paired_t_test (gsl_vector *a, gsl_vector *b) |
apop_data * | apop_f_test (apop_model *est, apop_data *contrast) |
apop_data * | apop_test_anova_independence (apop_data *d) |
apop_data * | apop_anova (char *table, char *data, char *grouping1, char *grouping2) |
double | apop_test (double statistic, char *distribution, double p1, double p2, char tail) |
apop_data* apop_anova (char *table, char *data, char *grouping1, char *grouping2)
This function produces a traditional one- or two-way ANOVA table. It works from data in an SQL table, using queries of the form select data from table group by grouping1, grouping2.
table | The table to be queried. Anything that can go in an SQL from clause is OK, so this can be a plain table name or a temp table specification like (select ...), with parens. |
data | The name of the column holding the count or other such data. |
grouping1 | The name of the first column by which to group data. |
grouping2 | If this is NULL, then the function will return a one-way ANOVA. Otherwise, the name of the second column by which to group data in a two-way ANOVA. |
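To make the one-way case concrete, here is a minimal sketch of the arithmetic behind a one-way ANOVA row, written in plain C with the data already in memory rather than in an SQL table. The `anova_row` type, the `one_way_anova` function, and the `MAX_GROUPS` limit are hypothetical names introduced for this illustration; they are not part of Apophenia and do not reproduce how apop_anova is implemented.

```c
#include <stddef.h>

#define MAX_GROUPS 16  /* assumption: at most 16 groups, none of them empty */

/* Hypothetical result type for this sketch; not an Apophenia structure. */
typedef struct {
    double ss_between, ss_within;  /* sums of squares */
    int df_between, df_within;     /* degrees of freedom */
    double f_stat;                 /* ratio of the two mean squares */
} anova_row;

/* One-way ANOVA for n observations. label[i], in 0..groups-1, names the
   group of values[i], playing the role of the grouping1 column. */
anova_row one_way_anova(const double *values, const int *label,
                        size_t n, int groups) {
    double sum[MAX_GROUPS] = {0}, grand = 0;
    size_t count[MAX_GROUPS] = {0};
    anova_row out = {0};

    /* Per-group totals, as a "group by grouping1" query would produce. */
    for (size_t i = 0; i < n; i++) {
        sum[label[i]] += values[i];
        count[label[i]]++;
        grand += values[i];
    }
    grand /= n;  /* grand mean over all observations */

    /* Between-group variation: group means around the grand mean. */
    for (int g = 0; g < groups; g++) {
        double mean = sum[g] / count[g];
        out.ss_between += count[g] * (mean - grand) * (mean - grand);
    }

    /* Within-group variation: observations around their own group mean. */
    for (size_t i = 0; i < n; i++) {
        double mean = sum[label[i]] / count[label[i]];
        out.ss_within += (values[i] - mean) * (values[i] - mean);
    }

    out.df_between = groups - 1;
    out.df_within = (int)n - groups;
    out.f_stat = (out.ss_between / out.df_between)
               / (out.ss_within / out.df_within);
    return out;
}
```

For three groups holding {1, 2, 3}, {2, 3, 4}, and {5, 6, 7}, the group means are 2, 3, and 6, so the between-group sum of squares is 26, the within-group sum of squares is 6, and the F statistic is (26/2)/(6/6) = 13.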
apop_data* apop_f_test (apop_model *est, apop_data *contrast)
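Runs an F-test specified by the matrix Q and the vector c, both held in contrast. Your best bet is to see the chapter on hypothesis testing in Modeling With Data, p 309. It will tell you that:
$$\frac{N-K}{q}\,\frac{(\mathbf{Q}'\hat\beta - \mathbf{c})'\,[\mathbf{Q}'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{Q}]^{-1}\,(\mathbf{Q}'\hat\beta - \mathbf{c})}{\mathbf{u}'\mathbf{u}} \sim F_{q,N-K},$$
where N is the number of observations, K the number of parameters, q the number of hypotheses, and u the vector of residuals, and that's what this function is based on.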
est | an apop_model that you have already calculated. (No default) |
contrast | The matrix Q and the vector c. If the matrix is NULL, it is set to the identity matrix with the top row missing. If the vector is NULL, it is set to a zero vector of length equal to the height of the contrast matrix. Thus, if the entire apop_data set is NULL or omitted, we are testing the hypothesis that all but the first coefficient are zero. |
Returns an apop_data set with a few variants on the confidence with which we can reject the joint hypothesis. Given a NULL contrast set, I will generate the set of linear contrasts that are equivalent to the ANOVA-type approach. Readers of Modeling with Data, note that there's a bug in the book, which claims that the traditional ANOVA approach also checks that the coefficient for the constant term is zero; this is not the custom and doesn't produce the equivalence presented in that and other textbooks.
out->error='a' | Allocation error. |
out->error='d' | Dimension-matching error. |
out->error='i' | Matrix inversion error. |
out->error='m' | GSL math error. |
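A small worked case may help make the default contrast concrete. In a simple regression y = b0 + b1 x, dropping the top row of the identity matrix leaves the single hypothesis b1 = 0, and the usual F statistic for linear restrictions then reduces to (N-2) b1^2 Sxx / u'u, where Sxx is the sum of squared deviations of x and u'u is the sum of squared residuals. The sketch below computes that quantity in plain C; `slope_f_stat` is a hypothetical helper written for this illustration and does not reproduce how apop_f_test is implemented.

```c
#include <stddef.h>

/* F statistic for H0: slope = 0 in the simple regression y = b0 + b1*x.
   Hypothetical helper for illustration only; not part of Apophenia.
   Assumes n > 2 and that x is not constant. */
double slope_f_stat(const double *x, const double *y, size_t n) {
    double xbar = 0, ybar = 0;
    for (size_t i = 0; i < n; i++) {
        xbar += x[i];
        ybar += y[i];
    }
    xbar /= n;
    ybar /= n;

    /* Least-squares estimates of the slope and the intercept. */
    double sxx = 0, sxy = 0;
    for (size_t i = 0; i < n; i++) {
        sxx += (x[i] - xbar) * (x[i] - xbar);
        sxy += (x[i] - xbar) * (y[i] - ybar);
    }
    double b1 = sxy / sxx;
    double b0 = ybar - b1 * xbar;

    /* u'u: the sum of squared residuals. */
    double sse = 0;
    for (size_t i = 0; i < n; i++) {
        double u = y[i] - b0 - b1 * x[i];
        sse += u * u;
    }

    /* K = 2 parameters, q = 1 hypothesis. For this single contrast,
       [Q'(X'X)^{-1}Q]^{-1} is just Sxx. */
    const double K = 2, q = 1;
    return ((double)n - K) / q * (b1 * b1 * sxx) / sse;
}
```

With x = {1, 2, 3, 4, 5} and y = {2, 4, 5, 4, 5}, the fit is b1 = 0.6 and b0 = 2.2, the residual sum of squares is 2.4, and the statistic is 3 × 3.6 / 2.4 = 4.5, to be compared against an F distribution with 1 and 3 degrees of freedom.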
double apop_test (double statistic, char *distribution, double p1, double p2, char tail)
This is a convenience function to do the lookup of a given statistic along a given distribution. You give me a statistic, its (hypothesized) distribution, and whether to use the upper tail, lower tail, or both. I will return the probability of a Type I error given the model, which in statistician jargon is the p-value. [Type I error: rejecting the null hypothesis when it is true.]
For example, a two-tailed test of the statistic 1.3 against the standard Normal distribution will return the probability mass of the standard Normal distribution that is more than 1.3 from zero. If this function returns a small value, we can be confident that the statistic is significant. Or, choosing the "t" distribution with 10 degrees of freedom and tail 'u' will give the appropriate probability for an upper-tailed test using the t-distribution with 10 degrees of freedom (e.g., a t-test of the null hypothesis that the statistic is less than or equal to zero).
Several more distributions are supported; see below.
statistic | The scalar value to be tested. |
distribution | The name of the distribution; see below. |
p1 | The first parameter for the distribution; see below. |
p2 | The second parameter for the distribution; see below. |
tail | 'u' = upper tail; 'l' = lower tail; anything else = two-tailed. (default = two-tailed) |
Here is a list of distributions you can use, and their parameters.
"normal" or "gaussian"
"lognormal"
"uniform"
"t"
"chi squared", "chi", "chisq"
"f"