Typedefs | |
typedef const char * | ccp |
Functions | |
void | apop_estimate_parameter_tests (apop_model *est) |
gsl_vector * | apop_vector_unique_elements (const gsl_vector *v) |
apop_data * | apop_text_unique_elements (const apop_data *d, size_t col) |
apop_data * | apop_data_to_dummies (apop_data *d, int col, char type, int keep_first, char append, char remove) |
apop_data * | apop_data_to_factors (apop_data *data, char intype, int incol, int outcol) |
apop_data * | apop_data_get_factor_names (apop_data *data, int col, char type) |
apop_data * | apop_text_to_factors (apop_data *d, size_t textcol, int datacol) |
apop_data * | apop_estimate_coefficient_of_determination (apop_model *m) |
Generally, if it assumes something is Normally distributed, it's here.
apop_data * apop_data_get_factor_names (apop_data *data, int col, char type)
Factor names are stored in an auxiliary table with a name like "<categories for your_var>". Producing this name is annoying (and prevents us from eventually making it human-language independent), so use this function to get the list of factor names.
data | The data set. (No default, must not be NULL) |
col | The column in the main data set whose name I'll use to check for the factor name list. Vector==-1. (default=0) |
type | If you are referring to a text column, use 't'. (default='d') |
apop_data * apop_data_to_dummies (apop_data *d, int col, char type, int keep_first, char append, char remove)
A utility to make a matrix of dummy variables. You give me a single vector that lists the category number for each item, and I'll produce a matrix with a single one in each row in the column specified.
After that, you have to decide what to do with the new matrix and the original data column.
The .remove='y' option specifies that I should use apop_data_rm_columns to remove the column used to generate the dummies. Implemented only for type=='d'.
With .append='y' or .append='e', I will append the matrix of dummies to the end of your data set for you. Your apop_data pointer will not change, but its matrix element will be reallocated (via apop_data_stack).
With .append='i', I will place the matrix of dummies in place, immediately after the data column you had specified. You will probably use this with .remove='y' to replace the single column with the new set of dummy columns. Bear in mind that if there are two or more dummy columns (which there probably are if you are bothering to use this function), subsequent column numbers will change.
If you set .append='i' and asked for a text column, I will append to the end of the table, which is equivalent to .append='e'.
d | The data set with the column to be dummified (No default.) |
col | The column number to be transformed; -1==vector (default = 0) |
type | 'd'==data column, 't'==text column. (default = 't') |
keep_first | If zero, return a matrix where each row has a one in the (column specified MINUS ONE). That is, the zeroth category is dropped, the first category has an entry in column zero, et cetera. If you don't know why this is useful, then this is what you need. If you know what you're doing and need something special, set this to one and the first category won't be dropped. (default = 0) |
append | If 'e' or 'y', append the dummy grid to the end of the original data matrix. If 'i', insert in place, immediately after the original data column. (default = 'n') |
remove | If 'y', remove the original data or text column. (default = 'n') |
Returns an apop_data set whose matrix element is the one-zero matrix of dummies. If you used .append, then this is the main matrix. Also, I add a page named "<categories for your_var>" giving a reference table of names and column numbers (where your_var is the appropriate column heading).
out->error=='a' | allocation error |
out->error=='d' | dimension error |
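To make the keep_first coding concrete, here is a minimal plain-C sketch of the same idea. It does not call Apophenia: the helper name make_dummies and the flat row-major output array are assumptions made for this illustration only.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not part of Apophenia.
   cats[i] is the category number (0..ncats-1) of row i.
   Returns a row-major n x ncols grid of zeros and ones, where
   ncols is ncats if keep_first is nonzero and ncats-1 otherwise.
   The caller frees the result. */
double *make_dummies(const int *cats, size_t n, int ncats, int keep_first,
                     size_t *ncols_out) {
    assert(cats && ncols_out && ncats >= 1);
    int shift = keep_first ? 0 : 1;        /* 1 means: drop category zero */
    size_t ncols = (size_t)(ncats - shift);
    size_t cells = n * ncols;
    *ncols_out = ncols;
    double *out = calloc(cells ? cells : 1, sizeof *out);
    if (!out) return NULL;                 /* allocation error */
    for (size_t i = 0; i < n; i++) {
        assert(cats[i] >= 0 && cats[i] < ncats);
        int c = cats[i] - shift;           /* the column specified MINUS ONE */
        if (c >= 0) out[i * ncols + (size_t)c] = 1.0;
    }
    return out;
}
```

With categories {0, 2, 1, 2} and keep_first set to zero, the rows come out as (0,0), (0,1), (1,0), (0,1): the zeroth category becomes the all-zero baseline row.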
apop_data * apop_data_to_factors (apop_data *data, char intype, int incol, int outcol)
Convert a column of text or numbers into a column of numeric factors, which you can use for a multinomial probit/logit, for example.
If you don't run this on your data first, apop_probit and apop_logit default to running it on the vector or (if no vector) zeroth column of the matrix of the input apop_data set, because those models need a list of the unique values of the dependent variable.
data | The data set to be modified in place. (No default. If NULL, returns NULL and a warning) |
intype | If 't', then incol refers to text; otherwise ('d' is a good choice) it refers to the vector or matrix. Default = 't'. |
incol | The column in the text that will be converted. -1 is the vector. Default = 0. |
outcol | The column in the data set where the numeric factors will be written (-1 means the vector). Default = 0. |
For example:
Notice that the query pulled a column of ones for the sake of saving room for the factors. It reads column zero of the text, and writes it to column zero of the matrix.
Another example:
Here, the type column is converted to sequential integer factors and those factors overwrite the original data. Since a reference table is added as a second page of the apop_data set, you can recover the original values as needed.
Returns an apop_data set with only one column of text. Also, I add a page named "<categories for your_var>" giving a reference table of names and column numbers (where your_var is the appropriate column heading); use apop_data_get_factor_names to retrieve that table.
out->error=='a' | allocation error. |
out->error=='d' | dimension error. |
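The conversion from text to sequential integer factors can be sketched in plain C as follows. This is an illustration under stated assumptions, not Apophenia's implementation: the helper name text_to_codes is hypothetical, and numbering the factors in sorted order of the names is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Compare two array elements of type const char * for qsort. */
static int cmp_str(const void *a, const void *b) {
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Hypothetical helper, not part of Apophenia.
   Writes the factor number of text[i] into codes[i] and returns a
   sorted reference table of the distinct strings. The table holds
   pointers into text, so the caller frees only the table itself.
   *nnames receives the number of distinct names. */
const char **text_to_codes(const char *const *text, size_t n,
                           double *codes, size_t *nnames) {
    assert(text && codes && nnames);
    const char **names = malloc((n ? n : 1) * sizeof *names);
    if (!names) return NULL;               /* allocation error */
    for (size_t i = 0; i < n; i++) names[i] = text[i];
    qsort(names, n, sizeof *names, cmp_str);
    size_t k = 0;                          /* compact out the repeats */
    for (size_t i = 0; i < n; i++)
        if (k == 0 || strcmp(names[k - 1], names[i]) != 0)
            names[k++] = names[i];
    for (size_t i = 0; i < n; i++)         /* look up each row's factor */
        for (size_t j = 0; j < k; j++)
            if (strcmp(text[i], names[j]) == 0) {
                codes[i] = (double)j;
                break;
            }
    *nnames = k;
    return names;
}
```

The returned table plays the role of the reference page described above: indexing it with a factor number recovers the original text value.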
void apop_estimate_parameter_tests (apop_model *est)
For many, it is a knee-jerk reaction to a parameter estimation to test whether each individual parameter differs from zero. This function does that.
est | The apop_model, which includes pre-calculated parameter estimates, var-covar matrix, and the original data set. |
Returns nothing. At the end of the routine, est->info->more includes a set of t-test values: p value, confidence (=1-pval), t statistic, standard deviation, one-tailed Pval, one-tailed confidence.
apop_data * apop_text_to_factors (apop_data *d, size_t textcol, int datacol)
Deprecated. Use apop_data_to_factors.
Convert a column of text in the text portion of an apop_data set into a column of numeric elements, which you can use for a multinomial probit, for example.
d | The data set to be modified in place. |
datacol | The column in the data set where the numeric factors will be written (-1 means the vector, which I will allocate for you if it is NULL) |
textcol | The column in the text that will be converted. |
For example:
Notice that the query pulled a column of ones for the sake of saving room for the factors.
Returns an apop_data set with only one column of text. Also, the more element is a reference table of names and column numbers.
out->error=='d' | dimension error. |
apop_data * apop_text_unique_elements (const apop_data *d, size_t col)
Give me a column of text, and I'll give you a sorted list of the unique elements. This is basically running "select distinct * from datacolumn", but without the aid of the database.
d | An apop_data set with a text component |
col | The text column you want me to use. |
gsl_vector * apop_vector_unique_elements (const gsl_vector *v)
Give me a vector of numbers, and I'll give you a sorted list of the unique elements. This is basically running "select distinct datacol from data order by datacol", but without the aid of the database.
v | a vector of items |
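The sort-then-deduplicate idea behind a sorted list of unique elements can be sketched in plain C with qsort. This is an illustration only: the helper name unique_sorted is hypothetical, it works on a bare double array rather than a gsl_vector, and it assumes the input contains no NaN values.

```c
#include <assert.h>
#include <stdlib.h>

/* Ascending comparison of two doubles for qsort. */
static int cmp_double(const void *a, const void *b) {
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper, not part of Apophenia or the GSL.
   Returns a newly allocated, ascending array holding each distinct
   value of v exactly once; *nout receives its length.
   The caller frees the result. */
double *unique_sorted(const double *v, size_t n, size_t *nout) {
    assert(v && nout);
    double *out = malloc((n ? n : 1) * sizeof *out);
    if (!out) return NULL;                 /* allocation error */
    for (size_t i = 0; i < n; i++) out[i] = v[i];
    qsort(out, n, sizeof *out, cmp_double);
    size_t k = 0;                          /* keep the first of each run */
    for (size_t i = 0; i < n; i++)
        if (k == 0 || out[k - 1] != out[i]) out[k++] = out[i];
    *nout = k;
    return out;
}
```

Sorting first means duplicates end up adjacent, so a single pass is enough to drop them; the input array is left untouched.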