samplingsimulatorpy package
Submodules
samplingsimulatorpy.draw_samples module
samplingsimulatorpy.draw_samples.draw_samples(pop, reps, sample_size)
Draws samples of various sizes from a population.
Parameters: - pop (pd.DataFrame) – The virtual population as a dataframe
- reps (integer) – The number of replications for each sample size as an integer
- sample_size (list) – The sample sizes to draw, as a list of integers
Returns: A dataframe containing the sample numbers and sample values
Return type: pd.DataFrame
Raises: - TypeError – if pop is not a valid data frame
- TypeError – if the population name is not a valid string
- TypeError – if reps is not an integer
- ValueError – if reps is not greater than 0
- TypeError – if sample_size contains non-integer values
Examples
>>> pop = generate_virtual_pop(100, "Height", np.random.normal, 0, 1)
>>> samples = draw_samples(pop, 3, [5, 10, 15, 20])
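The replication scheme can be illustrated without the package. The following is a minimal standard-library sketch, not the package's implementation: the helper name draw_samples_sketch, the "value" key, and sampling without replacement are assumptions made for illustration. The "replicate", "size", and "rep_size" keys mirror the column names that plot_sampling_hist checks for, and "rep_size" is assumed here to hold the number of replications.

```python
import random


def draw_samples_sketch(values, reps, sample_sizes, seed=None):
    """Hypothetical stand-in for draw_samples that works on a plain list."""
    if not isinstance(reps, int) or isinstance(reps, bool):
        raise TypeError("reps must be an integer")
    if reps <= 0:
        raise ValueError("reps must be greater than 0")
    if not all(isinstance(n, int) and not isinstance(n, bool) for n in sample_sizes):
        raise TypeError("sample_sizes must contain only integers")

    rng = random.Random(seed)
    rows = []
    for size in sample_sizes:
        for replicate in range(1, reps + 1):
            # Assumption: each sample is drawn without replacement.
            for value in rng.sample(values, size):
                rows.append(
                    {"replicate": replicate, "value": value,
                     "size": size, "rep_size": reps}
                )
    return rows


population = [float(i) for i in range(100)]
rows = draw_samples_sketch(population, reps=3, sample_sizes=[5, 10, 15, 20], seed=0)
print(len(rows))  # 3 * (5 + 10 + 15 + 20) = 150 rows
```

Each row records which replicate and which sample size a value belongs to, so one long table can hold every sample.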
samplingsimulatorpy.generate_virtual_pop module
samplingsimulatorpy.generate_virtual_pop.generate_virtual_pop(size, population_name, distribution_func, *para)
Creates a virtual population.
Parameters: - size (int) – The size of the virtual population
- population_name (str) – The name of the virtual population
- distribution_func (func) – A distribution function from numpy.random
- *para (int) – The parameters passed to distribution_func
Returns: The virtual population as a dataframe
Return type: pd.DataFrame
Raises: - ValueError – if size is not greater than 0
- TypeError – if size is not an integer
- TypeError – if the number of values in *para does not match what the distribution function expects
Examples
>>> import numpy as np
>>> from samplingsimulatorpy.generate_virtual_pop import generate_virtual_pop
>>> pop = generate_virtual_pop(100, "Height", np.random.normal, 0, 1)
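The way *para is forwarded to the distribution function can be sketched with the standard library alone. This is an illustration under stated assumptions, not the package's code: the helper name generate_virtual_pop_sketch is hypothetical, it returns a plain dict rather than a pd.DataFrame, and it uses random.gauss (one draw per call) in place of a numpy.random function.

```python
import random


def generate_virtual_pop_sketch(size, population_name, distribution_func, *para):
    """Hypothetical stand-in: returns {population_name: [values]}, not a DataFrame."""
    if not isinstance(size, int) or isinstance(size, bool):
        raise TypeError("size must be an integer")
    if size <= 0:
        raise ValueError("size must be greater than 0")
    # *para is forwarded unchanged to the distribution function.
    # Assumption: the function returns a single draw per call, as random.gauss
    # does; numpy.random functions can instead return a whole array at once.
    return {population_name: [distribution_func(*para) for _ in range(size)]}


random.seed(1)
pop = generate_virtual_pop_sketch(100, "Height", random.gauss, 0, 1)
print(len(pop["Height"]))  # 100
```

Accepting the parameters as *para keeps the helper independent of any one distribution: a normal distribution takes a mean and a standard deviation, while others take a different number of arguments.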
samplingsimulatorpy.plot_sample_hist module
samplingsimulatorpy.plot_sample_hist.plot_sample_hist(pop, samples)
Creates a faceted plot of sample histograms from a population.
Parameters: - pop (pd.DataFrame) – The virtual population as a dataframe
- samples (pd.DataFrame) – The samples as a dataframe
Returns: A grid of the sample distribution plots
Return type: altair.vegalite.v3.api.Chart
Raises: - TypeError – if pop is not a valid data frame
- TypeError – if pop is an empty data frame
- ValueError – if pop contains non-numeric values
- TypeError – if samples is not a valid data frame
- ValueError – if samples contains non-numeric values
Examples
>>> pop = generate_virtual_pop(100, "variable", np.random.normal, 0, 1)
>>> samples = draw_samples(pop, 3, [5, 10, 15, 20])
>>> plot_sample_hist(pop, samples)
samplingsimulatorpy.plot_sampling_hist module
samplingsimulatorpy.plot_sampling_hist.plot_sampling_hist(samples)
Creates a grid of sampling distribution histograms of the mean for different sample sizes drawn from a population.
Parameters: samples (pd.DataFrame) – The samples as a dataframe. This should be the output of draw_samples, or a dataframe with the same column names; otherwise the function may not work.
Returns: A facet chart of the sampling distribution plots
Return type: altair.vegalite.v3.api.FacetChart
Raises: - TypeError – if samples is not a valid data frame
- ValueError – if samples contains non-numeric values
- ValueError – if samples does not have exactly 4 columns
- KeyError – if samples does not contain the 'replicate', 'size', and 'rep_size' columns
Examples
>>> pop = generate_virtual_pop(1000, "Variable", np.random.normal, 0, 1)
>>> samples = draw_samples(pop, 100, [5, 10, 15, 20])
>>> plot_sampling_hist(samples)
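What a sampling distribution of the mean looks like for different sample sizes can be checked numerically before plotting anything. The sketch below uses only the standard library and is not part of the package: the helper name sample_means, the seed, and the choice of sizes 5 and 50 are assumptions for illustration.

```python
import random
import statistics

rng = random.Random(42)
# A stand-in virtual population of 1000 standard-normal values.
population = [rng.gauss(0, 1) for _ in range(1000)]


def sample_means(values, reps, size):
    """Mean of each of `reps` samples of `size` values, drawn without replacement."""
    return [statistics.mean(rng.sample(values, size)) for _ in range(reps)]


spread = {}
for size in [5, 50]:
    means = sample_means(population, reps=300, size=size)
    # The standard deviation of the sample means estimates the standard error.
    spread[size] = statistics.stdev(means)

# Larger samples are expected to give a narrower sampling distribution.
print(spread[5] > spread[50])
```

A histogram of each list of means, one panel per sample size, is the kind of grid that plot_sampling_hist is described as producing.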
samplingsimulatorpy.samplingsimulatorpy module
samplingsimulatorpy.stat_summary module
samplingsimulatorpy.stat_summary.stat_summary(population, samples, parameter)
Creates summary statistics of the population and the samples for the parameter(s) of interest.
Parameters: - population (pd.DataFrame) – The virtual population
- samples (pd.DataFrame) – The drawn samples
- parameter (list) – The list of summary functions to apply, such as np.mean and np.std
Raises: - TypeError – if population is not a dataframe containing values
- TypeError – if samples is not a dataframe containing values
- TypeError – if parameter is not a list containing values
- AttributeError – if parameter does not contain valid parameters of interest for the summary statistics
Returns: The summary stats as a dataframe
Return type: pd.DataFrame
Examples
>>> from samplingsimulatorpy.stat_summary import stat_summary
>>> stat_summary(pop, samples, [np.mean, np.std])
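Applying a list of summary functions to both the population and the samples can be sketched as follows. This is a standard-library illustration under assumptions, not the package's implementation: the helper name stat_summary_sketch and the output keys are hypothetical, and plain lists of numbers stand in for the dataframes.

```python
import statistics


def stat_summary_sketch(population, samples, parameter):
    """Hypothetical stand-in: applies each function to both lists of numbers."""
    if not isinstance(parameter, list):
        raise TypeError("parameter must be a list of functions")
    rows = []
    for func in parameter:
        rows.append(
            {
                "statistic": func.__name__,
                "population": func(population),
                "samples": func(samples),
            }
        )
    return rows


population = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
samples = [2.0, 4.0, 6.0]
summary = stat_summary_sketch(population, samples, [statistics.mean, statistics.pstdev])
for row in summary:
    print(row)
```

Passing the functions themselves, rather than their names, lets any callable that reduces a sequence to a single number be used as a parameter of interest.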