Reference

haplo.analysis.combine_constantinos_kalapotharakos_split_mcmc_output_files_to_xarray_zarr(split_mcmc_output_directory: Path, combined_output_path: Path, *, elements_per_record: int, overwrite: bool = False, multiprocess_pool_size: int = 1) → None

Combines split MCMC output files in the Constantinos Kalapotharakos format into a single Xarray Zarr data store.

Parameters:

split_mcmc_output_directory – The root directory of the split files.
combined_output_path – The path of the output Zarr data store.
elements_per_record – The number of elements per record in the split files. Similar to columns per row, but the files are not organized into rows and columns.
overwrite – Whether to overwrite existing files. If False, an error is raised when they already exist.
multiprocess_pool_size – The number of processes to use for the conversion.

Returns:

None
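The idea behind elements_per_record can be illustrated with a minimal sketch. It does not read the split file format (which is not described here) and is not the library's implementation. It only shows how a flat run of values is grouped into fixed-size records. The helper name group_into_records and the sample values are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator


def group_into_records(values: Iterable[float], elements_per_record: int) -> Iterator[list]:
    """Yield consecutive fixed-size records from a flat stream of values.

    Hypothetical helper for illustration only; not part of haplo.
    """
    if elements_per_record <= 0:
        raise ValueError("elements_per_record must be positive")
    iterator = iter(values)
    while True:
        # Take the next `elements_per_record` values from the stream.
        record = list(islice(iterator, elements_per_record))
        if not record:
            return
        if len(record) != elements_per_record:
            raise ValueError("trailing values do not form a complete record")
        yield record


# Six values with three elements per record give two records.
flat_values = [0.1, 0.2, 0.3, 1.1, 1.2, 1.3]
records = list(group_into_records(flat_values, elements_per_record=3))
```

In this sketch, a value of elements_per_record that does not evenly divide the data raises an error rather than silently producing a short record. Whether the library behaves the same way is an assumption.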

haplo.analysis.mcmc_output_xarray_dataset_to_pandas_data_frame(dataset: Dataset, limit_from_end: int | None = None, random_sample_size: int | None = None) → DataFrame

Converts the MCMC output Xarray dataset to a Pandas data frame.

Parameters:

dataset – The MCMC output Xarray dataset.
limit_from_end – If set, keeps only this many rows from the end of the dataset.
random_sample_size – If set, randomly samples this many elements from the dataset. If limit_from_end is also set, that limit is applied first and the sample is drawn from the resulting subset.

Returns:

The Pandas data frame.
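The order in which limit_from_end and random_sample_size are applied can be sketched with plain Python lists. This is an illustration of the documented order only, not the library's code. The helper limit_then_sample and its seed argument are hypothetical, and sampling without replacement is an assumption.

```python
import random


def limit_then_sample(rows, limit_from_end=None, random_sample_size=None, seed=None):
    """Apply the tail limit first, then sample from what remains.

    Hypothetical helper for illustration only; not part of haplo.
    """
    if limit_from_end is not None:
        # Keep only the last `limit_from_end` rows (handles a limit of 0 correctly).
        rows = rows[max(len(rows) - limit_from_end, 0):]
    if random_sample_size is not None:
        # Assumption: sampling is without replacement.
        rng = random.Random(seed)
        rows = rng.sample(rows, random_sample_size)
    return rows


# With 100 rows, a tail limit of 10 keeps rows 90..99; the 3 samples come only from those.
rows = list(range(100))
subset = limit_then_sample(rows, limit_from_end=10, random_sample_size=3, seed=0)
```

Because the limit is applied first, every sampled row in this sketch comes from the final ten rows, never from earlier in the chain.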

haplo.analysis.slice_iteration_of_mcmc_output_xarray_dataset(dataset: Dataset, *, start_iteration: int, end_iteration: int) → Dataset

Gets a slice of an MCMC output Xarray dataset along the iteration axis.

Parameters:

dataset – The MCMC output Xarray dataset.
start_iteration – The start of the slice (inclusive).
end_iteration – The end of the slice (exclusive).

Returns:

The Xarray dataset containing the requested slice of the original dataset.
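The inclusive start and exclusive end can be sketched with a plain list of iteration numbers. This illustrates the half-open interval convention only and does not use Xarray or the library. The helper slice_iterations is hypothetical, and zero-based iteration numbering is an assumption.

```python
def slice_iterations(iterations, start_iteration, end_iteration):
    """Keep iterations in the half-open interval [start_iteration, end_iteration).

    Hypothetical helper for illustration only; not part of haplo.
    """
    return [i for i in iterations if start_iteration <= i < end_iteration]


# Assumption: iterations are numbered from 0.
iterations = list(range(10))
selected = slice_iterations(iterations, start_iteration=2, end_iteration=5)

# Adjacent half-open slices cover the data without overlap or gaps.
first_half = slice_iterations(iterations, start_iteration=0, end_iteration=5)
second_half = slice_iterations(iterations, start_iteration=5, end_iteration=10)
```

A practical consequence of the exclusive end is that consecutive slices can reuse the previous end_iteration as the next start_iteration without duplicating an iteration.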