tdads.inference
===============

.. py:module:: tdads.inference


Classes
-------

.. autoapisummary::

   tdads.inference.perm_test
   tdads.inference.diagram_bootstrap
   tdads.inference.universal_null


Module Contents
---------------

.. py:class:: perm_test(iterations: int = 20, dims: list = [0], p: float = 2.0, q: float = 2.0, paired: bool = False, n_cores: int = cpu_count() - 1)

   .. py:method:: __str__()

      Describe a permutation test procedure based on the number of permutation iterations and whether the groups are paired or unpaired.

   .. py:method:: compute_loss(diagram_groups)

      Internal method to compute the loss function from Robinson and Turner 2017. This function should not be called directly.

   .. py:method:: test(diagram_groups)

      Run the permutation test.

      :param `diagram_groups`: The groups of persistence diagrams to be analyzed.
      :type `diagram_groups`: list of lists
      :returns: Keys are 'test_statistics' for the test statistic in each dimension, 'permvals' for the null distribution in each dimension and 'p_values' for the p-values in each dimension. For example, `output['p_values']['1']` would give the p-value for the second homological dimension in `self.dims`.
      :rtype: Dict

      .. rubric:: Examples

      >>> # create two groups of persistence diagrams
      >>> from ripser import ripser
      >>> import numpy as np
      >>> data1 = np.random.random((100, 2))
      >>> data2 = np.random.random((100, 2))
      >>> D1 = ripser(data1)
      >>> D2 = ripser(data2)
      >>> group1 = [D1, D2]
      >>> group2 = [D1, D2]
      >>> # create perm test object in dimensions 0 and 1
      >>> from tdads.inference import perm_test
      >>> pt = perm_test(dims = [0, 1], n_cores = 2)
      >>> # run test
      >>> res = pt.test([group1, group2])
      >>> # get p-values
      >>> res['p_values']

      .. rubric:: Citations

      Robinson T, Turner K (2017). "Hypothesis testing for topological data analysis." https://link.springer.com/article/10.1007/s41468-017-0008-7.

      Abdallah H et al. (2021). "Statistical Inference for Persistent Homology applied to fMRI."
      https://github.com/hassan-abdallah/Statistical_Inference_PH_fMRI/blob/main/Abdallah_et_al_Statistical_Inference_PH_fMRI.pdf.

.. py:class:: diagram_bootstrap(diag_fun, dims: list = [0], num_samples: int = 20, distance_mat: bool = False, alpha: float = 0.05)

   .. py:method:: __str__()

      Describe a bootstrap procedure based on the number of bootstrap samples, whether or not the input will be a distance matrix and the Type 1 error rate (alpha).

   .. py:method:: compute(X: numpy.ndarray, thresh: float)

      Carry out the bootstrap procedure.

      :param `X`: The input dataset, either raw tabular data or a distance matrix of samples.
      :type `X`: numpy.ndarray (2D)
      :param `thresh`: The maximum filtration radius for Vietoris-Rips persistent homology.
      :type `thresh`: float
      :returns: Entries are 'diagram' (the computed persistence diagram), 'thresholds' (a Dict of the computed persistence thresholds for each desired dimension) and 'subsetted_diagram' (the persistence diagram thresholded by the threshold values in each dimension).
      :rtype: Dict

      .. rubric:: Examples

      >>> from tdads.inference import diagram_bootstrap
      >>> from ripser import ripser
      >>> from numpy import array, cos, pi, sin
      >>> from numpy.random import uniform
      >>> # build circle dataset
      >>> theta = uniform(low = 0, high = 2*pi, size = 100)
      >>> data = array([[cos(theta[i]), sin(theta[i])] for i in range(100)])
      >>> # define persistent homology function
      >>> def diag_fun(X, thresh):
      ...     return ripser(X = X, thresh = thresh)
      >>> # create bootstrap object and compute significant features
      >>> boot = diagram_bootstrap(diag_fun = diag_fun)
      >>> res = boot.compute(data, 2)
      >>> # print subsetted diagram
      >>> res['subsetted_diagram']

      .. rubric:: Citations

      Chazal F et al. (2017). "Robust Topological Inference: Distance to a Measure and Kernel Distance." https://www.jmlr.org/papers/volume18/15-484/15-484.pdf.

.. py:class:: universal_null(diag_fun, dims: list = [1], distance_mat: bool = False, alpha: float = 0.05, infinite_cycle_inference: bool = False)

   .. py:method:: __str__()

      Describe a universal null procedure based on the dimensions being analyzed, whether or not the input will be a distance matrix, the Type 1 error rate (alpha) and whether or not infinite cycle inference will be carried out.

   .. py:method:: compute(X: numpy.ndarray, thresh)

      Carry out the universal null inference procedure.

      :param `X`: The input dataset, either raw tabular data or a distance matrix of samples.
      :type `X`: numpy.ndarray
      :param `thresh`: The maximum filtration radius for persistent homology. If 'enclosing', the enclosing radius of `X` is computed and used; otherwise `thresh` must be a fixed number.
      :type `thresh`: float or 'enclosing'
      :returns: The entries are 'subsetted_diagram', the list of subsetted persistence diagrams in each dimension (numpy ndarrays), and 'p_values', a list of lists holding the p-values of each remaining topological feature.
      :rtype: Dict

      .. rubric:: Examples

      >>> from tdads.inference import universal_null
      >>> from ripser import ripser
      >>> from numpy import array, cos, pi, sin
      >>> from numpy.random import uniform, normal
      >>> # build circle dataset and add noise
      >>> theta = uniform(low = 0, high = 2*pi, size = 100)
      >>> data = array([[cos(theta[i]), sin(theta[i])] for i in range(100)])
      >>> data = data + normal(scale = 0.2, size = (100, 2))
      >>> # define the persistent homology function
      >>> def diag_fun(X, thresh):
      ...     return ripser(X = X, thresh = thresh)
      >>> # create universal null object
      >>> univ_null = universal_null(diag_fun = diag_fun)
      >>> # carry out the inference procedure (thresh is a required argument)
      >>> res = univ_null.compute(data, thresh = 'enclosing')

      .. rubric:: Citations

      Bobrowski O, Skraba P (2023). "A universal null-distribution for topological data analysis." https://www.nature.com/articles/s41598-023-37842-2.
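
For intuition about where the 'p_values' returned by ``universal_null.compute`` could come from, here is a minimal standard-library sketch of the p-value formula described by Bobrowski and Skraba (2023). It is not taken from the ``tdads`` source: the helper names ``universal_null_p_values`` and ``bonferroni_significant`` are hypothetical, and the input is assumed to be finite features with ``0 < birth < death`` from a single homological dimension of dimension 1 or higher. How ``tdads`` handles edge cases and infinite cycles may differ.

```python
import math

# Euler-Mascheroni constant, used in the centring term B.
EULER_MASCHERONI = 0.5772156649015329


def universal_null_p_values(births, deaths, rips=True):
    """Hypothetical helper: p-values from multiplicative persistence death/birth.

    Assumes the left-skewed Gumbel null of Bobrowski and Skraba (2023):
    l = A * loglog(death / birth) + B and p = exp(-exp(l)).
    """
    if len(births) != len(deaths) or len(births) == 0:
        raise ValueError("births and deaths must be non-empty and of equal length")
    if any(b <= 0 or d <= b for b, d in zip(births, deaths)):
        raise ValueError("each feature needs 0 < birth < death")
    # A = 1 for Vietoris-Rips filtrations, 1/2 for Cech filtrations.
    A = 1.0 if rips else 0.5
    loglog = [math.log(math.log(d / b)) for b, d in zip(births, deaths)]
    # B centres the transformed values on the mean of the null distribution.
    B = -EULER_MASCHERONI - A * sum(loglog) / len(loglog)
    return [math.exp(-math.exp(A * x + B)) for x in loglog]


def bonferroni_significant(p_values, alpha=0.05):
    """Flag features whose p-value is below alpha divided by the feature count."""
    cutoff = alpha / len(p_values)
    return [p < cutoff for p in p_values]


# Nine short-lived features and one long-lived loop (illustrative numbers only).
births = [1.0] * 10
deaths = [1.1, 1.15, 1.2, 1.25, 1.3, 1.2, 1.1, 1.15, 1.25, 20.0]
p_values = universal_null_p_values(births, deaths)
flags = bonferroni_significant(p_values, alpha=0.05)
print(flags)
```

Because the statistic depends only on the ratio ``death / birth``, rescaling the data leaves the p-values unchanged. Features born at radius 0 have no finite ratio, which is why this sketch rejects them.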