| Title: | A harmonized data resource and software for enrichment analysis of microbial physiologies |
|---|---|
| Description: | bugphyzz is an electronic database of standardized microbial annotations. It facilitates the creation of microbial signatures based on shared attributes, which are used for bug set enrichment analysis. The data also includes annotations imputed with ancestral state reconstruction methods. |
| Authors: | Samuel Gamboa [aut, cre] (ORCID: <https://orcid.org/0000-0002-6863-7943>), Levi Waldron [aut] (ORCID: <https://orcid.org/0000-0003-2725-0694>), Kelly Eckenrode [aut], Jonathan Ye [aut], Jennifer Wokaty [aut], NCI [fnd] (GrantNo.: R01CA230551) |
| Maintainer: | Samuel Gamboa <[email protected]> |
| License: | Artistic-2.0 |
| Version: | 1.7.1 |
| Built: | 2026-05-29 07:59:30 UTC |
| Source: | https://github.com/waldronlab/bugphyzz |
getTaxonSignatures returns the names of all of the signatures
associated with a particular taxon. More details can be found in the main
bugphyzz vignette; please run browseVignettes("bugphyzz").
getTaxonSignatures(tax, bp, ...)
tax |
A valid NCBI ID or taxon name. If taxon name is used, the argument taxIdType = "Taxon_name" must also be used. |
bp |
List of data.frames imported with |
... |
Arguments passed to |
A character vector with the names of the signatures for a taxon.
taxid <- "562"
taxonName <- "Escherichia coli"
bp <- importBugphyzz()
sig_names_1 <- getTaxonSignatures(taxid, bp)
sig_names_2 <- getTaxonSignatures(taxonName, bp, taxIdType = "Taxon_name")
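Conceptually, this lookup asks which signatures contain a given taxon. The following is a minimal base-R sketch of that idea on toy data. The list `toy_sigs` and the helper `signatures_with_taxon` are hypothetical and not part of bugphyzz, the signature names are invented, and the package's own implementation may differ.

```r
# Toy signatures: each element is a character vector of NCBI taxids.
# Names and contents are illustrative only, not real bugphyzz output.
toy_sigs <- list(
  aerobic = c("562", "1280", "1313"),
  anaerobic = c("817", "1496"),
  gram_negative = c("562", "817")
)

# Hypothetical helper: names of all signatures that contain `tax`.
signatures_with_taxon <- function(tax, sigs) {
  hits <- vapply(sigs, function(ids) tax %in% ids, logical(1))
  names(sigs)[hits]
}

toy_result <- signatures_with_taxon("562", toy_sigs)
print(toy_result)
```

A taxon that appears in no signature yields an empty character vector, which is convenient for downstream checks with `length()`.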
importBugphyzz imports bugphyzz annotations as a list of
tidy data.frames. To learn more about the structure of the data.frames
please check the bugphyzz vignette with browseVignettes("bugphyzz") or
vignette("bugphyzz", "bugphyzz").
importBugphyzz(
  version = "10.5281/zenodo.12574596",
  forceDownload = FALSE,
  v = 0.8,
  excludeRarely = TRUE
)
version |
Character string indicating the version. Default is the latest release on Zenodo. Options: Zenodo DOI, GitHub commit hash, or devel. |
forceDownload |
Logical value. Force a fresh download of the data or use the one stored in the cache (if available). Default is FALSE. |
v |
Validation value. Default 0.8 (see details). |
excludeRarely |
Logical value. Default is TRUE. Exclude values with Frequency == "rarely" (see details). |
The structure of the data.frames imported with importBugphyzz is
detailed in the main vignette. Please run browseVignettes("bugphyzz").
Data imported with importBugphyzz includes annotations imputed through
ancestral state reconstruction (ASR) methods. A 10-fold cross-validation
approach was implemented to assess the reliability of the imputed data.
Matthews correlation coefficient (MCC) and R-squared (R2) were used to
validate discrete and numeric attributes, respectively.
Details can be found at: https://github.com/waldronlab/taxPProValidation.
By default, imputed annotations with an MCC or R2 value greater than 0.8
(the default of the v argument) are imported. The minimum value can be
adjusted with the v argument (only values between 0 and 1).
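The effect of a validation threshold can be illustrated with a small base-R sketch on toy data. The column names `Evidence` and `Validation`, the toy values, and the threshold of 0.9 are assumptions made for illustration; this is not the package's own filtering code.

```r
# Toy annotations. Column names are assumed for illustration only.
toy_val <- data.frame(
  NCBI_ID = c("1", "2", "3", "4"),
  Evidence = c("exp", "asr", "asr", "asr"),
  Validation = c(NA, 0.95, 0.85, 0.40),
  stringsAsFactors = FALSE
)

v <- 0.9  # arbitrary example threshold between 0 and 1

# Keep every non-imputed row, plus imputed (ASR) rows whose
# validation score is greater than the threshold.
is_asr <- toy_val$Evidence == "asr"
keep <- !is_asr | (toy_val$Validation > v)
filtered <- toy_val[keep, ]
print(filtered)
```

Only imputed rows are subject to the threshold in this sketch; the row with experimental evidence is kept even though it has no validation score.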
One of the variables in the bugphyzz data.frames is "Frequency", which
can adopt values of
"always", "usually", "sometimes", "rarely", or "never". By default
"never" and "rarely" are excluded. "rarely" could be included with
excludeRarely = FALSE. To learn more about these frequency keywords
please check the bugphyzz vignette with browseVignettes("bugphyzz").
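The frequency rule described above can be sketched in base R on toy data. The helper `drop_frequencies` and the data.frame `toy_freq` are hypothetical, written only to mirror the behaviour described in the text; they are not bugphyzz functions.

```r
# Toy data with one row per frequency keyword.
toy_freq <- data.frame(
  Taxon_name = c("A", "B", "C", "D", "E"),
  Frequency = c("always", "usually", "sometimes", "rarely", "never"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: "never" is always dropped; "rarely" is dropped
# only when excludeRarely is TRUE.
drop_frequencies <- function(dat, excludeRarely = TRUE) {
  excluded <- if (excludeRarely) c("never", "rarely") else "never"
  dat[!dat$Frequency %in% excluded, , drop = FALSE]
}

default_kept <- drop_frequencies(toy_freq)$Frequency
with_rarely <- drop_frequencies(toy_freq, excludeRarely = FALSE)$Frequency
print(default_kept)
print(with_rarely)
```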
By default, the datasets imported with the importBugphyzz function
return a shortened version of the source. Please use
vignette("sources", "bugphyzz") to see the full sources.
A list of tidy data frames.
bp <- importBugphyzz()
names(bp)
makeSignatures creates a list of bug signatures from
a tidy data.frame imported through the importBugphyzz function. Please
run browseVignettes("bugphyzz") for detailed examples.
makeSignatures(
  dat,
  taxIdType = c("NCBI_ID", "Taxon_name"),
  taxLevel = c("mixed", "superkingdom", "phylum", "class", "order", "family", "genus", "species", "strain"),
  evidence = c("exp", "igc", "tas", "nas", "tax", "asr"),
  frequency = c("always", "usually", "sometimes", "unknown"),
  minSize = 10,
  min = NULL,
  max = NULL
)
dat |
A data.frame. |
taxIdType |
A character string. Valid options: NCBI_ID, Taxon_name. |
taxLevel |
A character vector. Taxonomic rank. Valid options: superkingdom, kingdom, phylum, class, order, family, genus, species, strain. They can be combined. "mixed" is equivalent to selecting all valid ranks. |
evidence |
A character vector. Valid options: exp, igc, nas, tas, tax, asr. They can be combined. Default is all. |
frequency |
A character vector. Valid options: always, usually, sometimes, rarely, unknown. They can be combined. By default, "rarely" is excluded. |
minSize |
Minimum number of bugs in a signature. Default is 10. |
min |
Minimum value (inclusive). Only for numeric attributes. Default is NULL. |
max |
Maximum value (inclusive). Only for numeric attributes. Default is NULL. |
A list of character vectors with scientific names or taxids.
bp <- importBugphyzz()
sigs <- purrr::map(bp, makeSignatures)
sigs <- purrr::list_flatten(sigs, name_spec = "{inner}")
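To make the idea of a signature concrete, the sketch below groups taxon IDs by attribute value and drops groups smaller than a minimum size, using base R on toy data. The helper `make_toy_signatures`, the column names `Attribute` and `Attribute_value`, and the `attribute|value` naming scheme are assumptions for illustration; makeSignatures applies further filters (rank, evidence, frequency, numeric range) that are not reproduced here.

```r
# Toy tidy data: one row per taxon annotation (illustrative values).
toy_dat <- data.frame(
  NCBI_ID = c("1", "2", "3", "4", "5"),
  Attribute = "aerophilicity",
  Attribute_value = c("aerobic", "aerobic", "aerobic",
                      "anaerobic", "anaerobic"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one character vector of IDs per attribute value,
# keeping only signatures with at least `minSize` members.
make_toy_signatures <- function(dat, minSize = 3) {
  sig_names <- paste(dat$Attribute, dat$Attribute_value, sep = "|")
  sigs <- split(dat$NCBI_ID, sig_names)
  sigs <- lapply(sigs, unique)
  sigs[lengths(sigs) >= minSize]
}

toy_signatures <- make_toy_signatures(toy_dat, minSize = 3)
print(toy_signatures)
```

With `minSize = 3`, the two-member anaerobic group is discarded and only the aerobic signature remains.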
physiologies imports a list of data.frames. This data is in "raw"
state before cleaning and going through the data imputation steps. It
should be used by developers/curators of the package.
physiologies(keyword = "all", fullSource = FALSE)
keyword |
Character vector with one or more valid keywords.
Valid keywords can be checked with |
fullSource |
Logical. If |
A list of data.frames in tidy format.
l <- physiologies('all')
df <- physiologies('aerophilicity')[[1]]
showPhys prints the names of the available physiologies that can be
imported with the physiologies function. This function
should be used by developers/curators.
showPhys(whichNames = "all")
whichNames |
A character string. Options: 'all' (default), 'spreadsheets', 'bacdive'. |
A character vector with the names of the physiologies.
showPhys()
showPhys('bacdive')
showPhys('spreadsheets')