Reference

haplo.analysis.combine_constantinos_kalapotharakos_split_mcmc_output_files_to_xarray_zarr(split_mcmc_output_directory: Path, combined_output_path: Path, *, elements_per_record: int, overwrite: bool = False, multiprocess_pool_size: int = 1) → None

Combines split MCMC output files in the Constantinos Kalapotharakos format into a single Xarray Zarr data store.

Parameters:
  • split_mcmc_output_directory – The root of the split files.

  • combined_output_path – The path of the output Zarr file.

  • elements_per_record – The number of elements per record in the split files. Similar to columns per row, but the files are not organized into rows and columns.

  • overwrite – If True, existing files are overwritten. If False, an error is raised when they already exist.

  • multiprocess_pool_size – The number of worker processes used for the conversion.

Returns:

None
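
The meaning of elements_per_record can be illustrated with a minimal sketch. It assumes the split files yield one flat sequence of values with no row or column structure. The helper group_into_records is hypothetical and is not part of haplo; the actual file layout and parsing are not described here.

```python
from typing import List, Sequence


def group_into_records(flat_values: Sequence[float], elements_per_record: int) -> List[List[float]]:
    """Group a flat sequence of values into fixed-size records.

    Hypothetical helper for illustration only; this is not haplo's implementation.
    """
    if elements_per_record <= 0:
        raise ValueError("elements_per_record must be positive")
    if len(flat_values) % elements_per_record != 0:
        raise ValueError("value count is not a multiple of elements_per_record")
    # Step through the flat sequence one record at a time.
    return [
        list(flat_values[start:start + elements_per_record])
        for start in range(0, len(flat_values), elements_per_record)
    ]


# Six values with three elements per record form two records.
records = group_into_records([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], elements_per_record=3)
print(records)
```

In this sketch, a value count that is not a multiple of elements_per_record is treated as an error; how the library handles that case is not stated in this reference.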

haplo.analysis.mcmc_output_xarray_dataset_to_pandas_data_frame(dataset: Dataset, limit_from_end: int | None = None, random_sample_size: int | None = None) → DataFrame

Converts the MCMC output Xarray dataset to a Pandas data frame.

Parameters:
  • dataset – The MCMC output Xarray dataset.

  • limit_from_end – If set, keeps only this many rows, taken from the end of the dataset.

  • random_sample_size – Randomly samples this many elements from the dataset. If limit_from_end is set, that limit will be applied first, then this many will be sampled from that subset.

Returns:

The Pandas data frame.
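
The order in which limit_from_end and random_sample_size are applied can be sketched with plain Python lists standing in for dataset rows. The function limit_then_sample and its seed parameter are hypothetical and exist only to make the example reproducible; this is an illustration of the documented ordering, not the library's code.

```python
import random
from typing import List, Optional, Sequence, TypeVar

Row = TypeVar("Row")


def limit_then_sample(
    rows: Sequence[Row],
    limit_from_end: Optional[int] = None,
    random_sample_size: Optional[int] = None,
    seed: Optional[int] = None,
) -> List[Row]:
    """Apply the tail limit first, then sample from the remaining rows.

    Hypothetical stand-in for illustration; `seed` is not a haplo parameter.
    """
    selected = list(rows)
    if limit_from_end is not None:
        # Guard against a zero limit, because a slice of [-0:] would keep everything.
        selected = selected[-limit_from_end:] if limit_from_end > 0 else []
    if random_sample_size is not None:
        # Sampling without replacement from the already-limited subset.
        selected = random.Random(seed).sample(selected, random_sample_size)
    return selected


# Keep the last 10 of 100 rows, then draw 3 of those 10.
sample = limit_then_sample(list(range(100)), limit_from_end=10, random_sample_size=3, seed=0)
print(sample)
```

Because the limit is applied first, every sampled row here comes from the final 10 rows (values 90 through 99).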

haplo.analysis.slice_iteration_of_mcmc_output_xarray_dataset(dataset: Dataset, *, start_iteration: int, end_iteration: int) → Dataset

Gets a slice of an MCMC output Xarray dataset along the iteration axis.

Parameters:
  • dataset – The MCMC output Xarray dataset.

  • start_iteration – The start of the slice (inclusive).

  • end_iteration – The end of the slice (exclusive).

Returns:

The Xarray dataset that is the slice of the original dataset.
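
The inclusive start and exclusive end follow the same half-open convention as Python's range. A minimal sketch, using a plain sequence of iteration indexes in place of the dataset's iteration axis (slice_iterations is a hypothetical helper, not part of haplo):

```python
from typing import Iterable, List


def slice_iterations(iterations: Iterable[int], start_iteration: int, end_iteration: int) -> List[int]:
    """Keep iterations in the half-open interval [start_iteration, end_iteration).

    Hypothetical helper for illustration only.
    """
    return [
        iteration
        for iteration in iterations
        if start_iteration <= iteration < end_iteration
    ]


# Iteration 2 is included, iteration 5 is not.
selected = slice_iterations(range(10), start_iteration=2, end_iteration=5)
print(selected)  # [2, 3, 4]
```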