Patterns in static

Apophenia

Macros | Typedefs | Functions | Variables
apop_asst.c File Reference

Macros

#define omp_threadnum   0
 

Typedefs

typedef int(* apop_fn_riip )(apop_data *, int, int, void *)
 

Functions

char * apop_text_paste (apop_data const *strings, char *between, char *before, char *after, char *between_cols, apop_fn_riip prune, void *prune_parameter)
 
double apop_generalized_harmonic (int N, double s)
 
int apop_system (const char *fmt,...)
 
double * apop_vector_percentiles (gsl_vector *data, char rounding)
 
int apop_regex (const char *string, const char *regex, apop_data **substrings, const char use_case)
 
double apop_rng_GHgB3 (gsl_rng *r, double *a)
 
apop_modelapop_beta_from_mean_var (double m, double v)
 
gsl_rng * apop_rng_get_thread_base (int thread)
 
apop_dataapop_model_draws (apop_model *model, int count, apop_data *draws)
 

Variables

char * apop_nul_string
 

Detailed Description

The odds and ends bin. Copyright (c) 2005–2007, 2010 by Ben Klemens. Licensed under the modified GNU GPL v2; see COPYING and COPYING2.

Function Documentation

apop_model* apop_beta_from_mean_var ( double  m,
double  v 
)

The Beta distribution is useful for modeling because it is bounded between zero and one, and can be either unimodal (if the variance is low) or bimodal (if the variance is high), and can have either a slant toward the bottom or top of the range (depending on the mean).

The distribution has two parameters, typically named $\alpha$ and $\beta$, which can be difficult to interpret. However, there is a one-to-one mapping between (alpha, beta) pairs and (mean, variance) pairs. Since we have good intuition about the meaning of means and variances, this function takes in a mean and variance, calculates alpha and beta behind the scenes, and returns a random draw from the appropriate Beta distribution.

Parameters
mThe mean the Beta distribution should have. Notice that m is in [0,1].
vThe variance which the Beta distribution should have. It is in (0, 1/12), where (1/12) is the variance of a Uniform(0,1) distribution. Funny things happen with variance near 1/12 and mean far from 1/2.
Returns
Returns an apop_beta model with its parameters appropriately set.
Exceptions
out->error=='r'Range error: mean is not within [0, 1].
double apop_generalized_harmonic ( int  N,
double  s 
)

Calculate $\sum_{n=1}^N {1\over n^s}$

  • There are no doubt efficient shortcuts do doing this, but I use brute force. [Though Knuth's Art of Programming v1 doesn't offer anything, which is strong indication of nonexistence.] To speed things along, I save the results so that they can just be looked up should you request the same calculation.
  • If N is zero or negative, return NaN. Notify the user if apop_opts.verbosity >=1

For example:

#include <apop.h>
int main(){
double out = apop_generalized_harmonic(270, 0.0);
assert (out == 270);
out = apop_generalized_harmonic(370, -1.0);
assert (out == 370*371/2);
out = apop_generalized_harmonic(12, -1.0);
assert (out == 12*13/2);
}
apop_data* apop_model_draws ( apop_model model,
int  count,
apop_data draws 
)

Make a set of random draws from a model and write them to an apop_data set.

Parameters
modelThe model from which draws will be made. Must already be prepared and/or estimated.
countThe number of draws to make. If draw_matrix is not NULL, then this is ignored and count=draw_matrix->matrix->size1. default=1000.
drawsIf not NULL, a pre-allocated data set whose matrix element will be filled with draws.
Returns
An apop_data set with the matrix filled with size draws. If draw_matrix!=NULL, then return a pointer to it.
Exceptions
out->error=='m'Input model isn't good for making draws: it is NULL, or m->dsize=0.
out->error=='s'You gave me a draws matrix, but its size is less than the size of a single draw from the data, model->dsize.
out->error=='d'Trouble drawing from the distribution for at least one row. That row is set to all NAN.
  • Prints a warning if you send in a non-NULL apop_data set, but its matrix element is NULL, when apop_opts.verbose>=1.
  • See also apop_draw, which makes a single draw.

Here is a two-line program to draw a different set of ten Standard Normals on every run (provided runs are more than a second apart):

#include <apop.h>
#include <time.h>
int main(){
apop_opts.rng_seed = time(NULL);
.count=10,
)
);
}
int apop_regex ( const char *  string,
const char *  regex,
apop_data **  substrings,
const char  use_case 
)

A convenience function for regular expression searching

  • There are three common flavors of regular expression: Basic, Extended, and Perl-compatible (BRE, ERE, PCRE). I use EREs, as per the specs of your C library, which should match POSIX's ERE specification.

For example, "p.val" will match "P value", "p.value", "p values" (and even "tempeval", so be careful).

If you give a non-NULL address in which to place a table of paren-delimited substrings, I'll return them as a row in the text element of the returned apop_data set. I'll return all the matches, filling the first row with substrings from the first application of your regex, then filling the next row with another set of matches (if any), and so on to the end of the string. Useful when parsing a list of items, for example.

Parameters
stringThe string to search (no default)
regexThe regular expression (no default)
substringsParens in the regex indicate that I should return matching substrings. Give me the address of an apop_data* set, and I will allocate and fill the text portion with matches. Default= NULL, meaning do not return substrings (even if parens exist in the regex). If no match, return an empty apop_data set, so output->textsize[0]==0.
use_caseShould I be case sensitive, 'y' or 'n'? (default = 'n', which is not the POSIX default.)
Returns
Count of matches found. 0 == no match. substrings may be allocated and filled if needed.
  • If apop_opts.stop_on_warning='n' returns -1 on error (e.g., regex NULL or didn't compile).
  • If strings==NULL, I return 0—no match—and if substrings is provided, set it to NULL.
  • Here is the test function. Notice that the substring-pulling function call passes &subs, not plain subs. Also, the non-match has a zero-length blank in subs->text[0][1].
#include <apop.h>
int main(){
char string1[] = "Hello. I am a string.";
assert(apop_regex(string1, "hell"));
apop_data *subs;
apop_regex(string1, "(e).*I.*(xxx)*(am)", .substrings = &subs);
//apop_data_show(subs);
assert(!strcmp(subs->text[0][0], "e"));
assert(!strlen(subs->text[0][1]));
assert(!strcmp(subs->text[0][2], "am"));
//Split a comma-delimited list, throwing out white space.
//Notice that the regex includes only one instance of a non-comma blob
//ending in a non-space followed by a comma, but the function keeps
//applying it until the end of string.
char string2[] = " one, two , three ,four";
apop_regex(string2, " *([^,]*[^ ]) *(,|$) *", &subs);
assert(!strcmp(*subs->text[0], "one"));
assert(!strcmp(*subs->text[1], "two"));
assert(!strcmp(*subs->text[2], "three"));
assert(!strcmp(*subs->text[3], "four"));
//Get a parenthetical. For EREs, \( \) match plain parens in the text.
char string3[] = " one (but secretly, two)";
apop_regex(string3, "(\\([^)]*\\))", &subs);
assert(!strcmp(*subs->text[0], "(but secretly, two)"));
//NULL input string ==> no-op.
int match_count = apop_regex(NULL, " *([^,]*[^ ]) *(,|$) *", &subs);
assert(!match_count);
assert(!subs);
}
  • Each set of matches will be one row of the output data. E.g., given the regex ([A-Za-z])([0-9]), the column zero of outdata will hold letters, and column one will hold numbers. Use apop_data_transpose to reverse this so that the letters are in outdata->text[0] and numbers in outdata->text[1].
double apop_rng_GHgB3 ( gsl_rng *  r,
double *  a 
)

RNG from a Generalized Hypergeometric type B3.

Devroye uses this as the base for many of his distribution-generators, including the Waring.

  • If one of the inputs is <=0, error. Returns GSL_NAN if the function doesn't stop.
int apop_system ( const char *  fmt,
  ... 
)

Call system(), but with printf-style arguments. E.g.,

1 char filenames[] = "apop_asst.c apop_asst.o"
2 apop_system("ls -l %s", filenames);
Returns
The return value of the system() call.
char* apop_text_paste ( apop_data const *  strings,
char *  between,
char *  before,
char *  after,
char *  between_cols,
apop_fn_riip  prune,
void *  prune_parameter 
)

Join together a list or array of strings, with optional separators between the strings.

Parameters
stringsAn apop_data set with a grid of text to be combined into a single string
betweenThe text to put in between the rows of the table, such as ", ". (Default is a single space: " ")
beforeThe text to put at the head of the string. For the query example, this would be .before="select ". (Default: NULL)
afterThe text to put at the tail of the string. For the query example, .after=" from data_table". (Default: NULL)
between_colsThe text to insert between columns of text. See below for an example (Default is set to equal .between)
pruneIf you don't want to use the entire text set, you can provide a function to indicate which elements should be pruned out. Some examples:
1 //Just use column 3
2 int is_not_col_3(apop_data *indata, int row, int col, void *ignore){
3  return col!=3;
4 }
5 
6 //Jump over blanks as if they don't exist.
7 int is_blank(apop_data *indata, int row, int col, void *ignore){
8  return strlen(indata->text[row][col])==0;
9 }
Parameters
prune_parameterA void pointer to pass to your prune function.
Returns
A single string with the elements of the strings table joined as per your specification. Allocated by the function, to be freed by you if desired.
  • If the table of strings is NULL or has no text, I will print only the .before and .after parts with nothing in between.
  • if apop_opts.verbose >=3, then print the pasted text to stderr.
  • This function uses the Designated initializers syntax for inputs.

The sample snippet generates the SQL for a query using a list of column names (where the query begins with select , ends with from datatab, and has commas in between each element), re-processes the same list to produce the head of an HTML table, then produces the body of the table with the query result (pasting the trs and tds into the right places).

#include <apop.h>
int main(){
apop_query("create table datatab(name, age, sex);"
"insert into datatab values ('Alex', 23, 'm');"
"insert into datatab values ('Alex', 32, 'f');"
"insert into datatab values ('Michael', 41, 'f');"
"insert into datatab values ('Michael', 14, 'm');");
apop_data *cols = apop_text_alloc(NULL, 3, 1);
apop_text_add(cols, 0, 0, "name");
apop_text_add(cols, 1, 0, "age");
apop_text_add(cols, 2, 0, "sex");
char *query= apop_text_paste(cols, .before="select ", .between=", ");
apop_data *d = apop_query_to_text("%s from datatab", query);
char *html_head = apop_text_paste(cols, .before="<table><tr><td>",
.between="</td><td>", .after="</tr>\n<tr><td>");
char *html_table = apop_text_paste(d, .before=html_head, .after="</td></tr></table>\n",
.between="</tr>\n<tr><td>", .between_cols="</td><td>");
FILE *outfile = fopen("yourdata.html", "w"); fprintf(outfile, "%s", html_table);
fclose(outfile);
}
double* apop_vector_percentiles ( gsl_vector *  data,
char  rounding 
)

Returns an array of size 101, where returned_vector[95] gives the value of the 95th percentile, for example. Returned_vector[100] is always the maximum value, and returned_vector[0] is always the min (regardless of rounding rule).

Parameters
dataa gsl_vector of data. (No default, must not be NULL.)
roundingThis will either be 'u', 'd', or 'a'. Unless your data is exactly a multiple of 101, some percentiles will be ambiguous. If 'u', then round up (use the next highest value); if 'd' (or anything else), round down to the next lowest value; if 'a', take the mean of the two nearest points. If 'u' or 'a', then you can say "5% or more of the sample is below \c returned_vector[5]"; if 'd' or 'a', then you can say "5% or more of the sample is above returned_vector[5]". (Default = 'd'.)
  • You may eventually want to free() the array returned by this function.
  • This function uses the Designated initializers syntax for inputs.

Autogenerated by doxygen on Wed Oct 15 2014 (Debian 0.999b+ds3-2).