Reference
- haplo.analysis.combine_constantinos_kalapotharakos_split_mcmc_output_files_to_xarray_zarr(split_mcmc_output_directory: Path, combined_output_path: Path, *, elements_per_record: int, overwrite: bool = False, multiprocess_pool_size: int = 1) -> None
Combines split MCMC output files in the Constantinos Kalapotharakos format into an Xarray Zarr data store.
- Parameters:
split_mcmc_output_directory – The root directory containing the split files.
combined_output_path – The path of the output Zarr data store.
elements_per_record – The number of elements per record in the split files. Similar to columns per row, but the files are not organized into rows and columns.
overwrite – Whether to overwrite existing output. If False, an error is raised when the output already exists.
multiprocess_pool_size – The number of processes to use for the conversion.
- Returns:
None
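The elements_per_record parameter can be pictured as grouping a flat run of values into fixed-size records. The following is a minimal plain-Python sketch of that idea only. It is not haplo's implementation, and the helper name group_into_records and its error handling for incomplete records are assumptions made for illustration.

```python
from typing import List, Sequence


def group_into_records(flat_values: Sequence[float], elements_per_record: int) -> List[List[float]]:
    """Group a flat run of values into fixed-size records (hypothetical helper)."""
    if elements_per_record <= 0:
        raise ValueError("elements_per_record must be positive")
    # Assumption of this sketch: a trailing incomplete record is treated as an error.
    if len(flat_values) % elements_per_record != 0:
        raise ValueError("value count is not a multiple of elements_per_record")
    return [
        list(flat_values[start:start + elements_per_record])
        for start in range(0, len(flat_values), elements_per_record)
    ]


# Six values read as one flat stream, interpreted with three elements per record.
flat_values = [0.1, 0.2, 0.3, 1.1, 1.2, 1.3]
records = group_into_records(flat_values, elements_per_record=3)
print(records)
```

Note that elements_per_record, overwrite, and multiprocess_pool_size are keyword-only in the documented signature, so a call to the real function must pass them by name.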
- haplo.analysis.mcmc_output_xarray_dataset_to_pandas_data_frame(dataset: Dataset, limit_from_end: int | None = None, random_sample_size: int | None = None) -> DataFrame
Converts the MCMC output Xarray dataset to a Pandas data frame.
- Parameters:
dataset – The MCMC output Xarray dataset.
limit_from_end – If set, keeps only this many rows, counted from the end of the dataset.
random_sample_size – Randomly samples this many elements from the dataset. If limit_from_end is set, that limit will be applied first, then this many will be sampled from that subset.
- Returns:
The Pandas data frame.
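The order in which limit_from_end and random_sample_size are applied can be illustrated on a plain list of rows. This is a hedged stand-in for the documented behavior, not the library's code. The function name limit_then_sample and the seed argument are hypothetical, and sampling without replacement is an assumption of the sketch.

```python
import random
from typing import List, Optional, Sequence, TypeVar

Row = TypeVar("Row")


def limit_then_sample(
    rows: Sequence[Row],
    limit_from_end: Optional[int] = None,
    random_sample_size: Optional[int] = None,
    seed: Optional[int] = None,
) -> List[Row]:
    """Apply the tail limit first, then sample from the remaining rows."""
    selected = list(rows)
    if limit_from_end is not None:
        # Keep only the last `limit_from_end` rows (all rows if there are fewer).
        selected = selected[max(len(selected) - limit_from_end, 0):]
    if random_sample_size is not None:
        # Assumption: sampling is without replacement, so this raises
        # ValueError if more rows are requested than remain.
        selected = random.Random(seed).sample(selected, random_sample_size)
    return selected


rows = list(range(100))
sample = limit_then_sample(rows, limit_from_end=10, random_sample_size=4, seed=0)
print(sample)
```

Because the limit is applied first, every sampled row in this example comes from the last ten rows (values 90 through 99).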
- haplo.analysis.slice_iteration_of_mcmc_output_xarray_dataset(dataset: Dataset, *, start_iteration: int, end_iteration: int) -> Dataset
Gets a slice of an MCMC output Xarray dataset along the iteration axis.
- Parameters:
dataset – The MCMC output Xarray dataset.
start_iteration – The start of the slice (inclusive).
end_iteration – The end of the slice (exclusive).
- Returns:
The Xarray dataset that is the slice of the original dataset.
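The inclusive start and exclusive end follow the same half-open convention as Python slicing. The sketch below shows that convention on a plain list. It assumes, for illustration only, that iterations are zero-based and that each list position corresponds to one iteration. The helper slice_iterations is hypothetical and is not part of haplo.

```python
from typing import List, Sequence, TypeVar

Row = TypeVar("Row")


def slice_iterations(
    rows_by_iteration: Sequence[Row], *, start_iteration: int, end_iteration: int
) -> List[Row]:
    """Return the rows for start_iteration <= iteration < end_iteration."""
    return list(rows_by_iteration[start_iteration:end_iteration])


# Ten iterations, labeled 0 through 9 for clarity.
iterations = list(range(10))
window = slice_iterations(iterations, start_iteration=2, end_iteration=5)
print(window)
```

With this convention the slice contains end_iteration minus start_iteration iterations, and consecutive slices such as 0 to 5 and 5 to 10 do not overlap.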