Proposed (12)

Ecotype Functional Differentiation
Do ecotypes within a species differ in their COG functional profiles?
High | Medium (2-3 weeks) | COG Analysis
Lifestyle-Based COG Stratification
How does lifestyle (free-living vs host-associated) affect pangenome functional composition?
High | Low (1 week) | COG Analysis
Openness vs Functional Composition
Do "open" and "closed" pangenomes show different COG enrichment patterns?
Scale to 100-200 Species
Do COG enrichment patterns hold at larger scale? Are there phylum-specific deviations?
Medium | Low (computational time, minimal analyst time) | COG Analysis
Gene Copy Number Variation
Beyond presence/absence, do adaptive and housekeeping genes show different copy-number patterns?
Medium | Medium (new analysis type) | COG Analysis
Phylum-Specific Patterns Deep Dive
Which phyla deviate from universal COG enrichment patterns and why?
Medium | Medium | COG Analysis
ANI Distance vs Ecotype Divergence
How does genomic distance (ANI) relate to functional divergence between ecotypes?
Medium | Medium | Ecotype Analysis
Temporal Evolution via ANI
Does the COG enrichment pattern change with evolutionary distance?
Low | High (complex analysis) | COG Analysis
Composite COG Function Networks
Do multi-functional genes (composite COGs like "LV", "EGP") represent functional modules?
Low | High | COG Analysis
Environmental Context of Core Gene Trade-offs
Can we connect the lab-measured trade-offs to natural environment data? Do organisms from more varia...
The 48 Accessory Modules
What are the 48 co-regulated gene modules that are <50% core? Are they mobile elements, niche-specif...
Plasmid vs Chromosomal Gene Functional Profiles
Do plasmid-borne genes show different COG profiles than chromosomal genes?
Low | High (needs plasmid annotation) | New

In Progress (4)

Truly Dark Genes — What Remains Unknown After Modern Annotation?
Among the ~6,400 FB genes that remain hypothetical even after bakta v1.12.0 reannotation, what disti...
Pangenome Openness, Metabolic Pathways, and Biogeography
Do pangenome characteristics (open vs. closed) correlate with metabolic pathway diversity and biogeo...
Pan-bacterial Fitness Modules via ICA
Can robust ICA decomposition of RB-TnSeq fitness compendia reveal conserved functional modules acros...
Metabolic Capability vs Metabolic Dependency
Just because a bacterium's genome encodes a complete amino acid biosynthesis or carbon utilization p...

Completed (21)

Cross-Organism Essential Gene Families
Using FB's ortholog table, identify essential gene families conserved across multiple species. Are...
The 5,526 "Costly + Dispensable" Genes
PGP Gene Distribution Across Environments & Pangenomes
BacDive Phenotype Signatures of Metal Tolerance
Community Metabolic Ecology via NMDC × Pangenome Integration
Lab Fitness Predicts Field Ecology at Oak Ridge
Field vs Lab Gene Importance in DvH
Essential Gene Conservation Analysis
Quantitative Fitness Effects vs Conservation
Fitness Modules × Pangenome Conservation
Core Gene Paradox — Why Are Core Genes More Burdensome?
Co-fitness Predicts Co-inheritance
Cross-Project Synthesis
AlphaEarth Embeddings, Geography & Environment
Ecotype Reanalysis — Environmental-Only Samples
ADP1 Triple Essentiality Concordance
ADP1 Deletion Collection Phenotype Analysis
Aromatic Catabolism Support Network in ADP1
Condition-Specific Respiratory Chain Wiring in ADP1
Counter Ion Effects on Metal Fitness Measurements
Metabolic Consistency of Pseudomonas FW300-N2E3