Resources – ENIGMA

cMonkey1/2

PI: Nitin S. Baliga

URL: https://baliga.systemsbiology.net/projects/cmonkey/
Contact: Serdar Turkarslan | sturkarslan@systemsbiology.org
Source Code: https://github.com/baliga-lab/cmonkey2

Reference: Reiss, D.J.; C.L. Plaisier, W.J. Wu, N.S. Baliga (2015) cMonkey2: Automated, systematic, integrated detection of co-regulated gene modules for any organism. Nucleic Acids Research. [doi]:10.1093/nar/gkv300 {PMID}:25873626 PMCID:PMC4513845

Reference: Reiss, D.J.; N.S. Baliga, R. Bonneau (2006) Integrated biclustering of heterogeneous genome-wide datasets for the inference of global regulatory networks. BMC Bioinformatics. [doi]:10.1186/1471-2105-7-280 {PMID}:16749936 PMCID:PMC1502140

cMonkey detects putative co-regulated gene groupings by integrating the bi-clustering of gene expression data and various functional associations with the de novo detection of sequence motifs. cMonkey2 is the Python implementation of the cMonkey algorithm based on the original R implementation.

CORAL: Contextual Ontology-Based Repository Analysis Library

PI: John-Marc Chandonia

URL: https://coral-demo.lbl.gov/
Contact: John-Marc Chandonia | JMChandonia@lbl.gov

CORAL is a framework for rigorous self-validated data modeling and integrative, reproducible data analysis. CORAL enables new complex data types to be defined on the fly by users, thus avoiding the high maintenance costs of creating specialized data models for every new dataset, but also ensures that such data types are documented in the formal and rigorous manner that is necessary to adhere to all four FAIR principles (https://www.nature.com/articles/sdata201618). In particular, by formally describing all ENIGMA data using a common ontological framework, CORAL emphasizes the Interoperability and Reusability of all our data. In addition to storing ENIGMA data, CORAL includes rich functionality to make the system useful for data analysis, visualization, and managerial oversight. This functionality includes graphing tools, an advanced search, an upload wizard, an API for merging data, and integration with Jupyter notebooks.

Reference: Novichkov, P.S.*; Chandonia, J.-M.*; Arkin, A.P. (2022) CORAL: A framework for rigorous self-validated data modeling and integrative, reproducible data analysis. GigaScience. [doi]:10.1093/gigascience/giac089 {PMID}:36251274 PMCID:PMC9575582 OSTI:1888047

Curated BLAST for Genomes

PI: Adam P. Arkin

URL: https://papers.genomics.lbl.gov/curated
Contact: Adam P. Arkin | aparkin@lbl.gov

Reference: https://doi.org/10.1128/mSystems.00072-19

Curated BLAST for Genomes finds candidate genes for a process or an enzymatic activity within a genome of interest. In contrast to annotation tools, which usually predict a single activity for each protein, Curated BLAST asks whether any of the proteins in the genome are similar to characterized proteins that are relevant.

EGRIN2

PI: Nitin S. Baliga

URL: http://egrin2.systemsbiology.net
Contact: Serdar Turkarslan | sturkarslan@systemsbiology.org
Source Code: https://github.com/baliga-lab?&q=egrin.

Reference: Brooks, A.N.; D.J. Reiss, A. Allard, W.J. Wu, D.M. Salvanha, C.L. Plaisier, S. Chandrasekaran, M. Pan, A. Kaur, N.S. Baliga (2014) A system-level model for the microbial regulatory genome. Molecular Systems Biology [doi]:10.15252/msb.20145160 {PMID}:25028489 PMCID:PMC4299497

EGRIN 2.0 is a systems-level model that delineates the complex relationship between environment, gene regulation, and phenotype in prokaryotes.

FastTree

PI: Adam P. Arkin

URL: http://www.microbesonline.org/fasttree/
Contact: Adam P. Arkin | aparkin@lbl.gov

Reference: Price, M.N., Dehal, P.S., and Arkin, A.P. (2010) FastTree 2 — Approximately Maximum-Likelihood Trees for Large Alignments. PLoS ONE, 5(3):e9490. doi:10.1371/journal.pone.0009490

FastTree infers approximately maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences. FastTree can handle alignments with up to a million sequences in a reasonable amount of time and memory. For large alignments, FastTree is 100-1,000 times faster than PhyML 3.0 or RAxML 7. FastTree is open-source software — you can download the code.

Fitness Browser

PI: Adam P. Arkin

URL: http://fit.genomics.lbl.gov/
Contact: Morgan Price | funwithwords26@gmail.com

Reference: Price, M.N.; K.M. Wetmore, R.J. Waters, M. Callaghan, J. Ray, H. Liu, J.V. Kuehl, R.A. Melnyk, J.S. Lamson, Y. Suh, H.K. Carlson, Z. Esquivel, H. Sadeeshkumar, R. Chakraborty, G.M. Zane, B.E. Rubin, J.D. Wall, A. Visel, J. Bristow, M.J. Blow, A.P. Arkin and A.M. Deutschbauer (2018) Mutant Phenotypes for Thousands of Bacterial Genes of Unknown Function. Nature. [doi]:10.1038/s41586-018-0124-0 {PMID}:29769716

Genome-wide mutant fitness data from diverse bacteria.

GapMind: Automated Annotation of Amino Acid Biosynthesis and Carbon Catabolism

PI: Adam P. Arkin

URL: https://papers.genomics.lbl.gov/gaps
Contact: Adam P. Arkin | aparkin@lbl.gov

Reference: https://doi.org/10.1128/mSystems.00291-20

GapMind is a web-based tool for annotating amino acid biosynthesis and carbon catabolism in bacteria and archaea. GapMind incorporates many variant pathways, and it analyzes a genome in just 15 seconds. To avoid error-prone transitive annotations, GapMind relies primarily on a database of experimentally characterized proteins. GapMind correctly handles fusion proteins and split proteins, which often cause errors for best-hit approaches.

GLAMM: Genome-Linked Application for Metabolic Maps

PI: Adam P. Arkin

URL: http://glamm.lbl.gov/
Contact: Dylan Chivian | dcchivian@lbl.gov

Reference: Bates, J.T.; D. Chivian, A.P. Arkin (2011) GLAMM: Genome-Linked Application for Metabolic Maps. Nucleic Acids Research. [doi]:10.1093/nar/gkr433 {PMID}:21624891 PMCID:PMC3125797

The Genome-Linked Application for Metabolic Maps (GLAMM) is a unified web interface for visualizing metabolic networks, reconstructing metabolic networks from annotated genome data, visualizing experimental data in the context of metabolic networks, and investigating the construction of novel, transgenic pathways. This simple, user-friendly interface is tightly integrated with the comparative genomics tools of MicrobesOnline [Dehal et al. (2010) Nucleic Acids Research, 38, D396–D400]. GLAMM is available for free to the scientific community.

Inferelator

PI: Nitin S. Baliga

URL: https://baliga.systemsbiology.net/the-inferelator/
Contact: Serdar Turkarslan | sturkarslan@systemsbiology.org
Source Code: https://github.com/baliga-lab/cMonkeyNwInf

Reference: Bonneau, R.; D.J. Reiss, P. Shannon, M. Facciotti, L. Hood, N.S. Baliga, V. Thorsson (2006) The Inferelator: an algorithm for learning parsimonious regulatory networks from systems-biology data sets de novo. Genome Biology. [doi]:10.1186/gb-2006-7-5-r36 {PMID}:16686963 PMCID:PMC1779511

The Inferelator is an algorithm for inferring predictive regulatory networks from gene expression data.

Jorg: A Method to Help Circularize and Improve Metagenome-Assembled Genomes

PI: Adam P. Arkin

URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008972

A method to help circularize and improve genomes assembled from short-read shotgun metagenomics data.

KBase: DOE Systems Biology Knowledgebase

PI: Adam P. Arkin

URL: http://kbase.us/
Contact: John-Marc Chandonia | JMChandonia@lbl.gov

Reference: Arkin, A.P.; R.W. Cottingham, C.S. Henry, N.L. Harris, R.L. Stevens, S. Maslov, et al. (2018) KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. [doi]:10.1038/nbt.4163 {PMID}:29979655 PMCID:PMC6870991

KBase is an open-source software and data platform that enables data sharing, integration, and analysis of microbes, plants, and their communities. KBase maintains an internal reference database that consolidates information from widely used external data repositories, such as genome, protein, and protein interaction data. Communities data, expression data, MAK biclusters, phenotype data, and other data types are to follow shortly.

METLIN: Metabolite and Tandem MS Database

PI: Gary Siuzdak

URL: http://metlin.scripps.edu/index.php
Contact: Paul Benton | hpbenton@scripps.edu

Reference: Zhu, Z.J.; A.W. Schultz, J. Wang, C.H. Johnson, S.M. Yannone, G.J. Patti and G. Siuzdak (2013) Liquid chromatography quadrupole time-of-flight mass spectrometry characterization of metabolites guided by the METLIN database. Nature Protocols. [doi]:10.1038/nprot.2013.004 {PMID}:23391889 PMCID:PMC3666335

A repository of over 75,000 endogenous and exogenous metabolites from essentially any living creature, including bacteria.

MicrobesOnline

PI: Adam P. Arkin

URL: http://microbesonline.org
Contact: Morgan Price | funwithwords26@gmail.com

Reference: Dehal, P.S.; M.P. Joachimiak, M.N. Price, J.T. Bates, J.K. Baumohl, D. Chivian, G.D. Friedland, K.H. Huang, K. Keller, P.S. Novichkov, I.L. Dubchak, E.J. Alm and A.P. Arkin (2010) MicrobesOnline: an integrated portal for comparative and functional genomics. Nucleic Acids Research. [doi]:10.1093/nar/gkp919 {PMID}:19906701 PMCID:PMC2808868

Database, browser, and tools for comparative and functional genomics.

MicroDesign

PI: Jizhong Zhou

URL: http://www.ou.edu/ieg/tools/data-analysis-pipeline.html
Contact: Naijia Xiao | naijia.xiao@ou.edu

Reference: Shi, Z.; Yin, H., Van Nostrand, J.D., Voordeckers, J.W., Tu, Q., Deng, Y., Yuan, M., Zhou, A., Zhang, P., Xiao, N., Ning, D., He, Z., Wu, L., Zhou, J. (2019) Functional Gene Array-Based Ultrasensitive and Quantitative Detection of Microbial Populations in Complex Communities. mSystems. [doi]:10.1128/mSystems.00296-19 {PMID}:31213523 PMCID:PMC6581690

This pipeline was developed to comprehensively analyze microbial functional genes for the design of functional gene arrays (FGAs). The program contains multiple modules to download gene sequences, remove low-homology sequences, design oligonucleotide probes, check probe specificity, output selected oligonucleotide probes, and finally store data in local databases. The pipeline is currently in beta testing by internal ENIGMA members, and the high-throughput microarrays designed with it are available to all applicable ENIGMA studies.

Network Portal

PI: Nitin S. Baliga

URL: http://networks.systemsbiology.net/
Contact: Serdar Turkarslan | sturkarslan@systemsbiology.org
Source Code: https://github.com/baliga-lab/network_portal

Reference: Turkarslan, S.; E.J. Wurtmann, W.J. Wu, N. Jiang, J.C. Bare, K. Foley, D.J. Reiss, P. Novichkov, N.S. Baliga (2014) Network Portal: A Database for Storage, Analysis, and Visualization of Biological Networks. Nucleic Acids Research. [doi]:10.1093/nar/gkt1190 {PMID}:24271392 PMCID:PMC3964938

The Network Portal is a database of gene regulatory networks and enables exploration, annotation, and comparative analysis for 13 species.

D. vulgaris Hildenborough Regulatory Network Within Network Portal

URL: http://networks.systemsbiology.net/dvu

A visual interface for analyzing the D. vulgaris Hildenborough (DvH) regulatory network within the Network Portal. We have initiated a program to perform a high-quality reconstruction of the transcriptional regulatory network (TRN) of DvH and to build a predictive model for transcriptional control of its physiology.

OpenMSI: Open Mass Spectrometry Imaging

PI: Trent Northen

URL: https://openmsi.nersc.gov/openmsi/client
Contact: Ben Bowen | bpbowen@lbl.gov

Reference: Rübel, O.; A. Greiner, S. Cholia, K. Louie, E.W. Bethel, T.R. Northen, and B.P. Bowen (2013) OpenMSI: A High-Performance Web-Based Platform for Mass Spectrometry Imaging. Analytical Chemistry. [doi]:10.1021/ac402540a {PMID}:24087878

Advanced science gateway for web-based visualization, analysis, and sharing of huge metabolic images.

PaperBLAST

PI: Adam P. Arkin

URL: http://papers.genomics.lbl.gov/
Contact: Morgan Price | funwithwords26@gmail.com

Reference: Price, M.N.; A.P. Arkin (2017) PaperBLAST: Text-mining papers for information about homologs. mSystems. [doi]:10.1128/mSystems.00039-17

Find papers about a protein or its homologs.

RegPrecise

PI: Pavel Novichkov

URL: https://regprecise.lbl.gov/
Contact: John-Marc Chandonia | JMChandonia@lbl.gov

Reference: Novichkov, P.S.; A.E. Kazakov, D.A. Ravcheev, S.A. Leyn, G.Y. Kovaleva, R.A. Sutormin, M.D. Kazanov, W. Riehl, A.P. Arkin, I. Dubchak and D.A. Rodionov (2013) RegPrecise 3.0 – A resource for genome-scale exploration of transcriptional regulation in bacteria. BMC Genomics. [doi]:10.1186/1471-2164-14-745 {PMID}:24175918 PMCID:PMC3840689

Database for capturing, visualization, and analysis of transcription factor regulons reconstructed by comparative genomic approaches in various prokaryotic genomes.

Syntrophy Portal

PI: Nitin S. Baliga

URL: http://networks.systemsbiology.net/syntrophy/
Contact: Serdar Turkarslan | sturkarslan@systemsbiology.org

Reference: Turkarslan, S.; A.V. Raman, A.W. Thompson, C.E. Arens, M.A. Gillespie, F. von Netzer, K.L. Hillesland, S. Stolyar, A. López García de Lomana, D.J. Reiss, D. Gorman-Lewis, G.M. Zane, J.A. Ranish, J.D. Wall, D.A. Stahl, N.S. Baliga (2017) Mechanism for Microbial Population Collapse in a Fluctuating Resource Environment. Molecular Systems Biology. [doi]:10.15252/msb.20167058 {PMID}:28320772 PMCID:PMC5371734

Syntrophy Portal is a web tool to explore gene regulatory network models, genome annotations, and genomic variants for Desulfovibrio vulgaris Hildenborough and Methanococcus maripaludis S2.

XCMS Online: Scripps Center for Metabolomics

PI: Gary Siuzdak

URL: https://xcmsonline.scripps.edu/
Contact: Paul Benton | hpbenton@scripps.edu

Reference: Huan, T.; E.M. Forsberg, D. Rinehart, C.H. Johnson, J. Ivanisevic, H.P. Benton, M. Fang, A. Aisporna, B. Hilmers, F.L. Poole, M.P. Thorgersen, M.W.W. Adams, G. Krantz, M.W. Fields, P.D. Robbins, L.J. Niedernhofer, T. Ideker, E.L. Majumder, J.D. Wall, N.J.W. Rattray, R. Goodacre, L.L. Lairson, and G. Siuzdak (2017) Systems biology guided by XCMS Online metabolomics. Nature Methods. [doi]:10.1038/nmeth.4260 {PMID}:28448069 PMCID:PMC5933448

Cloud-based metabolomic data processing platform that provides high-quality metabolic analysis in a user-friendly, web-based format.

Web of Microbes

PI: Trent Northen

URL: http://www.webofmicrobes.org/
Reference: https://bmcmicrobiol.biomedcentral.com/articles/10.1186/s12866-018-1256-y

Web of Microbes (WoM) provides manually curated, direct biochemical observations on the changes to metabolites in an environment after exposure to microorganisms. The web interface displays several key features: (1) the metabolites present in a controlled environment before inoculation or microbial activation, (2) heat map-like displays showing metabolite increases or decreases resulting from microbial activities, (3) a metabolic web displaying the actions of multiple organisms on a specified metabolite pool, (4) metabolite interaction scores indicating an organism’s interaction level with its environment, its potential for metabolite exchange with other organisms, and its potential for competition with other organisms, and (5) downloadable datasets for integration with other types of -omics datasets.