tomas.infer.get_dbl_mg

tomas.infer.get_dbl_mg(adata, groupby, groups='all', output=None, num_mg=100, kl_cutoff=None, merging_threshold=5, skip_threshold=2, alphaMin=1)

Extract metagenes from raw UMI counts of heterotypic doublets.

Parameters:
  • adata (AnnData) – The (annotated) UMI count matrix of shape n_obs × n_vars. Rows correspond to droplets and columns to genes.

  • groupby (str) – The key of the droplet categories stored in adata.obs.

  • groups (list or str ('all'), optional) – Groups of heterotypic doublets from which to extract metagenes. Either a list of valid doublet group names or the string ‘all’. The default is ‘all’.

  • output (path, optional) – Path where the results are saved. The default is None.

  • num_mg (int, optional) – Number of exclusive metagenes. The default is 100.

  • kl_cutoff (float, optional) – DESCRIPTION. The default is None.

  • merging_threshold (float, optional) – Stop merging when the sum of alpha values of a metagene exceeds this threshold in the counterpart cell type. The default is 5.

  • skip_threshold (float, optional) – If adding a gene to the current metagene changes the standard deviation of the alpha values by a factor of ‘skip_threshold’ or more, the gene is skipped. The default is 2.

  • alphaMin (float, optional) – Individual genes with alpha greater than this threshold are skipped in the merging step and each treated as a special metagene with a single member gene. The default is 1.
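
The interaction of merging_threshold, skip_threshold and alphaMin can be pictured with a small toy sketch. This is not the TOMAS implementation: the function toy_merge, its inputs, and the exact reading of each rule are assumptions made for illustration. In particular, it assumes alpha values are given per gene as plain dictionaries, that the standard deviation is taken over the member genes of the current metagene, and that skipped genes are simply dropped.

```python
from statistics import pstdev


def toy_merge(alphas, counterpart_alphas,
              merging_threshold=5.0, skip_threshold=2.0, alpha_min=1.0):
    """Hypothetical illustration of the three thresholds (not TOMAS code).

    alphas: dict gene -> alpha in the gene's own cell type (assumed input).
    counterpart_alphas: dict gene -> alpha in the counterpart cell type.
    Returns a list of metagenes, each a list of gene names.
    """
    # Rule for alphaMin: a gene with a large alpha becomes its own metagene.
    singletons = [[g] for g, a in alphas.items() if a > alpha_min]

    metagenes, current = [], []
    for gene, a in alphas.items():
        if a > alpha_min:
            continue  # already handled as a single-gene metagene
        if len(current) >= 2:
            # Rule for skip_threshold (assumed reading): compare the spread
            # of member alphas before and after adding the candidate gene.
            old_sd = pstdev([alphas[g] for g in current])
            new_sd = pstdev([alphas[g] for g in current] + [a])
            if old_sd > 0 and new_sd > skip_threshold * old_sd:
                continue  # the gene would distort the metagene; skip it
        current.append(gene)
        # Rule for merging_threshold: close the metagene once its summed
        # alpha in the counterpart cell type exceeds the threshold.
        if sum(counterpart_alphas[g] for g in current) > merging_threshold:
            metagenes.append(current)
            current = []
    if current:
        metagenes.append(current)
    return singletons + metagenes


# Made-up alpha values, chosen only to trigger each rule once.
alphas = {"g1": 3.0, "g2": 0.2, "g3": 0.3, "g4": 0.9, "g5": 0.25}
counterpart = {"g1": 0.1, "g2": 2.0, "g3": 2.0, "g4": 0.5, "g5": 2.0}
print(toy_merge(alphas, counterpart))
```

With these made-up values, g1 exceeds alphaMin and stands alone, g4 is skipped because it would more than double the spread of the current metagene, and the remaining genes are merged until their counterpart alpha sum passes merging_threshold.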

Raises:

ValueError – If any of the arguments is invalid.

Returns:

adata_mgdic – Keys are names of heterotypic doublets. Values are UMI counts in metagenes of heterotypic doublets.

Return type:

dict