04 Patient Ecology
Jupyter notebook from the CF Protective Microbiome Formulation Design project.
NB04: Patient Ecology & Engraftability
Project: CF Protective Microbiome Formulation Design
Goal: Test H3 — which species are ubiquitous, metabolically active, and safe for formulation?
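Engraftability = prevalence × log(1 + activity ratio), where the activity ratio is mean metaRS CPM divided by mean metaG CPM, each with a pseudocount of 1. Species that are common across patients AND transcriptionally active (high metaRS/metaG ratio) are hypothesized to be the most likely to persist in a probiotic formulation.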
Input: ~/protect/gold/ (metaG/metaRS CPM, abundance, patient metadata, bridge table)
Output: data/species_engraftability.tsv
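
As a minimal sketch of the score, the function below mirrors the formula the notebook applies in cell In [2]: prevalence × log(1 + activity ratio), with a pseudocount of 1 on both mean CPM values. The function name `engraftability_score` and its argument names are illustrative and do not appear in the notebook. The example values are the rounded figures reported for Moraxella nonliquefaciens in the Top 20 output.

```python
import math

def engraftability_score(prevalence, mean_cpm_metars, mean_cpm_metag, pseudocount=1.0):
    """Hypothetical helper: prevalence x log(1 + activity ratio)."""
    # Pseudocount on numerator and denominator avoids division by zero
    activity_ratio = (mean_cpm_metars + pseudocount) / (mean_cpm_metag + pseudocount)
    return prevalence * math.log1p(activity_ratio)

# Rounded inputs for Moraxella nonliquefaciens, taken from the notebook output
score = engraftability_score(0.743, 5335.421, 33.505)
print(round(score, 2))
```

Because the activity ratio enters through a logarithm, a tenfold change in the ratio moves the score far less than a tenfold change in prevalence would, so prevalence carries most of the weight among highly active species.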
In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path
import warnings
warnings.filterwarnings('ignore')
GOLD = Path.home() / 'protect' / 'gold'
DATA = Path('..') / 'data'
FIGS = Path('..') / 'figures'
sns.set_style('whitegrid')
plt.rcParams['figure.dpi'] = 120
metag = pd.read_parquet(GOLD / 'fact_metag_cpm.snappy.parquet')
metars = pd.read_parquet(GOLD / 'fact_metars_cpm.snappy.parquet')
patients = pd.read_parquet(GOLD / 'dim_patient_sample.snappy.parquet')
bridge = pd.read_parquet(GOLD / 'bridge_isolate_metagenomics.snappy.parquet')
isolates = pd.read_parquet(GOLD / 'dim_isolate.snappy.parquet')
pa_comp = pd.read_parquet(GOLD / 'fact_pa_competitors.snappy.parquet')
print(f'MetaG CPM: {len(metag)} rows, {metag.species.nunique()} species, {metag["sample"].nunique()} samples')
print(f'MetaRS CPM: {len(metars)} rows, {metars.species.nunique()} species, {metars["sample"].nunique()} samples')
MetaG CPM: 9916 rows, 134 species, 74 samples
MetaRS CPM: 9916 rows, 134 species, 74 samples
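
The reported row count, 9916, equals 134 species × 74 samples, which is consistent with dense tables that hold one row per species per sample, zeros included. A small check like the following could make that assumption explicit. The helper name `is_dense` and the toy frame are illustrative; the column names `species` and `sample` follow the notebook.

```python
import pandas as pd

def is_dense(df, row_key='species', col_key='sample'):
    """Return True if every (species, sample) pair occurs exactly once."""
    n_expected = df[row_key].nunique() * df[col_key].nunique()
    no_duplicates = not df.duplicated([row_key, col_key]).any()
    return no_duplicates and len(df) == n_expected

# Toy long-format table (made-up values)
toy = pd.DataFrame({
    'species': ['sp_A', 'sp_A', 'sp_B', 'sp_B'],
    'sample': ['s1', 's2', 's1', 's2'],
    'cpm': [4.0, 0.0, 1.5, 2.0],
})

print(is_dense(toy))           # all four (species, sample) pairs present
print(is_dense(toy.iloc[:3]))  # one pair missing
```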
1. Species Prevalence & Abundance Across Patients
In [2]:
# Per-species metrics from metaG
metag['cpm'] = pd.to_numeric(metag['cpm'], errors='coerce')
metars['cpm'] = pd.to_numeric(metars['cpm'], errors='coerce')
n_samples_g = metag['sample'].nunique()
n_samples_r = metars['sample'].nunique()
# Prevalence: fraction of samples where species is detected (CPM > 0)
prev_g = metag[metag.cpm > 0].groupby('species')['sample'].nunique() / n_samples_g
prev_r = metars[metars.cpm > 0].groupby('species')['sample'].nunique() / n_samples_r
# Mean abundance
abund_g = metag.groupby('species')['cpm'].mean()
abund_r = metars.groupby('species')['cpm'].mean()
# Combine
ecology = pd.DataFrame({
    'prevalence_metag': prev_g,
    'prevalence_metars': prev_r,
    'mean_cpm_metag': abund_g,
    'mean_cpm_metars': abund_r
}).fillna(0)
# Activity ratio: mean metaRS CPM / mean metaG CPM (transcriptional activity per unit DNA),
# with a pseudocount of 1 on each term to avoid division by zero
ecology['activity_ratio'] = (ecology.mean_cpm_metars + 1) / (ecology.mean_cpm_metag + 1)
# Engraftability score: prevalence × log(activity_ratio + 1)
ecology['engraftability'] = ecology.prevalence_metag * np.log1p(ecology.activity_ratio)
print(f'Species in metagenomics: {len(ecology)}')
print('\nTop 20 by engraftability:')
print(ecology.nlargest(20, 'engraftability').round(3).to_string())
Species in metagenomics: 134
Top 20 by engraftability:
prevalence_metag prevalence_metars mean_cpm_metag mean_cpm_metars activity_ratio engraftability
species
Moraxella nonliquefaciens 0.743 0.743 33.505 5335.421 154.658 3.752
Propionibacterium acidifaciens 0.851 0.851 71.755 4435.920 60.985 3.513
Catonella morbi 0.784 0.784 202.738 17677.401 86.770 3.507
Fusobacterium vincentii 0.743 0.743 1362.724 137080.570 100.520 3.434
Alloprevotella sp900095835 0.770 0.770 200.120 8880.138 44.158 2.935
Haemophilus_A sputorum 0.919 0.919 266.315 6087.589 22.777 2.912
Corynebacterium pseudodiphtheriticum 0.905 0.905 172.675 3189.286 18.369 2.683
Actinomyces sp000220835 0.878 0.878 802.105 15660.080 19.501 2.653
Mycobacterium abscessus 0.824 0.824 271.959 5620.054 20.593 2.533
Actinomyces johnsonii 0.959 0.959 4195.893 54119.828 12.895 2.525
Actinomyces dentalis 0.905 0.905 1144.392 13492.889 11.781 2.307
Haemophilus influenzae 0.838 0.838 331.474 4434.133 13.340 2.231
Neisseria bacilliformis 0.770 0.770 189.531 3105.889 16.307 2.196
Actinomyces gerencseriae 0.932 0.932 692.951 6487.663 9.350 2.179
Cryptobacterium curtum 0.689 0.689 30.300 678.484 21.709 2.152
Actinomyces timonensis 0.892 0.892 352.945 3530.449 9.977 2.137
Fusobacterium_C necrophorum 0.716 0.716 175.005 2977.364 16.922 2.067
Filifactor alocis 0.824 0.824 187.694 2113.744 11.207 2.062
Haemophilus influenzae_F 0.892 0.892 569.899 5117.722 8.966 2.051
Selenomonas sputigena 0.784 0.784 169.138 2123.697 12.488 2.039
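
The prevalence step in the cell above filters on `cpm > 0` so that zero-CPM rows do not count as detections. The toy example below, with made-up species names and values, illustrates the same calculation. It also adds a `reindex` step so that a never-detected species gets prevalence 0 instead of dropping out of the result; that step is an alternative to the notebook's later `fillna(0)`, not something the notebook itself does.

```python
import pandas as pd

# Toy dense long-format table: every species has a row in every sample
toy = pd.DataFrame({
    'species': ['sp_A'] * 3 + ['sp_B'] * 3 + ['sp_C'] * 3,
    'sample': ['s1', 's2', 's3'] * 3,
    'cpm': [10.0, 0.0, 5.0, 1.0, 2.0, 3.0, 0.0, 0.0, 0.0],
})

n_samples = toy['sample'].nunique()

# Count samples with a positive CPM per species, then divide by all samples
detected = toy[toy['cpm'] > 0].groupby('species')['sample'].nunique()
prevalence = (detected / n_samples).reindex(toy['species'].unique(), fill_value=0.0)

print(prevalence.round(3).to_dict())
```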
In [3]:
# Prevalence vs activity scatter
fig, ax = plt.subplots(figsize=(10, 8))
# Color by whether species has isolates in PROTECT
bridge_species = set(bridge.isolate_taxon.unique())
has_isolate = ecology.index.isin(bridge_species)
ax.scatter(ecology.loc[~has_isolate, 'prevalence_metag'],
           ecology.loc[~has_isolate, 'activity_ratio'],
           alpha=0.4, s=20, c='grey', label='No isolate')
ax.scatter(ecology.loc[has_isolate, 'prevalence_metag'],
           ecology.loc[has_isolate, 'activity_ratio'],
           alpha=0.7, s=40, c='steelblue', label='Has PROTECT isolate')
# Label top engraftable species with isolates
top_engraft = ecology[has_isolate].nlargest(10, 'engraftability')
for sp in top_engraft.index:
    ax.annotate(sp[:25], (ecology.loc[sp, 'prevalence_metag'], ecology.loc[sp, 'activity_ratio']),
                fontsize=7, alpha=0.8)
ax.set_xlabel('Prevalence (fraction of metaG samples)')
ax.set_ylabel('Activity Ratio (metaRS CPM / metaG CPM)')
ax.set_title('Species Prevalence vs Transcriptional Activity')
ax.set_yscale('log')
ax.legend()
plt.tight_layout()
plt.savefig(FIGS / '04_prevalence_vs_activity.png', dpi=150, bbox_inches='tight')
plt.show()
2. FDA Safety Filter
In [4]:
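# Flag unsafe species by case-insensitive keyword match against a hand-picked blocklist.
# The list is not exhaustive: a species counts as "safe" whenever it matches no keyword,
# so known CF pathogens not listed here (e.g. Mycobacterium abscessus) pass the filter.
unsafe_keywords = ['Pseudomonas aeruginosa', 'Klebsiella', 'Acinetobacter baumannii',
                   'Serratia', 'Citrobacter', 'Enterobacter', 'Enterococcus faecalis',
                   'Stenotrophomonas', 'Burkholderia', 'Staphylococcus aureus',
                   'Escherichia coli', 'Bacillus cereus', 'Bacillus_A cereus',
                   'Achromobacter', 'Ralstonia']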
ecology['is_safe'] = ~ecology.index.str.contains('|'.join(unsafe_keywords), case=False, na=False)
safe_engraftable = ecology[ecology.is_safe & (ecology.prevalence_metag > 0.3)]
print(f'Safe species with >30% prevalence: {len(safe_engraftable)}')
print(safe_engraftable.nlargest(20, 'engraftability').round(3).to_string())
Safe species with >30% prevalence: 129
prevalence_metag prevalence_metars mean_cpm_metag mean_cpm_metars activity_ratio engraftability is_safe
species
Moraxella nonliquefaciens 0.743 0.743 33.505 5335.421 154.658 3.752 True
Propionibacterium acidifaciens 0.851 0.851 71.755 4435.920 60.985 3.513 True
Catonella morbi 0.784 0.784 202.738 17677.401 86.770 3.507 True
Fusobacterium vincentii 0.743 0.743 1362.724 137080.570 100.520 3.434 True
Alloprevotella sp900095835 0.770 0.770 200.120 8880.138 44.158 2.935 True
Haemophilus_A sputorum 0.919 0.919 266.315 6087.589 22.777 2.912 True
Corynebacterium pseudodiphtheriticum 0.905 0.905 172.675 3189.286 18.369 2.683 True
Actinomyces sp000220835 0.878 0.878 802.105 15660.080 19.501 2.653 True
Mycobacterium abscessus 0.824 0.824 271.959 5620.054 20.593 2.533 True
Actinomyces johnsonii 0.959 0.959 4195.893 54119.828 12.895 2.525 True
Actinomyces dentalis 0.905 0.905 1144.392 13492.889 11.781 2.307 True
Haemophilus influenzae 0.838 0.838 331.474 4434.133 13.340 2.231 True
Neisseria bacilliformis 0.770 0.770 189.531 3105.889 16.307 2.196 True
Actinomyces gerencseriae 0.932 0.932 692.951 6487.663 9.350 2.179 True
Cryptobacterium curtum 0.689 0.689 30.300 678.484 21.709 2.152 True
Actinomyces timonensis 0.892 0.892 352.945 3530.449 9.977 2.137 True
Fusobacterium_C necrophorum 0.716 0.716 175.005 2977.364 16.922 2.067 True
Filifactor alocis 0.824 0.824 187.694 2113.744 11.207 2.062 True
Haemophilus influenzae_F 0.892 0.892 569.899 5117.722 8.966 2.051 True
Selenomonas sputigena 0.784 0.784 169.138 2123.697 12.488 2.039 True
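
The filter above joins the keywords into a single regular expression, which works because none of the current keywords contain regex metacharacters. A slightly more defensive variant, sketched below on made-up input, escapes each keyword with `re.escape` before joining. The shortened `blocklist` and the example names are illustrative only, and this is a sketch of the same keyword approach rather than a curated safety assessment.

```python
import re
import pandas as pd

# Shortened, illustrative blocklist (not the notebook's full list)
blocklist = ['Pseudomonas aeruginosa', 'Klebsiella', 'Bacillus_A cereus']

# Escape each keyword so any regex metacharacters are matched literally
pattern = '|'.join(re.escape(k) for k in blocklist)

names = pd.Series(['Pseudomonas aeruginosa', 'Klebsiella pneumoniae',
                   'Rothia dentocariosa', 'bacillus_a cereus'])

# Case-insensitive substring match; missing names are treated as non-matching
is_safe = ~names.str.contains(pattern, case=False, na=False, regex=True)

print(dict(zip(names, is_safe)))
```

Substring matching flags every species in a listed genus (here, any Klebsiella), which is the intended behavior for genus-level keywords but means the result depends entirely on how complete the blocklist is.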
3. PA Competitors from Metagenomics
In [5]:
# Metagenomics-derived PA competition scores
pa_comp['competition_score'] = pd.to_numeric(pa_comp['competition_score'], errors='coerce')
pa_mean = pa_comp.groupby('competitor')['competition_score'].agg(['mean','std','count']).sort_values('mean', ascending=False)
pa_mean.columns = ['mean_competition', 'std_competition', 'n_samples']
print('Top metagenomics-derived PA competitors:')
print(pa_mean.head(20).round(3).to_string())
# Merge with engraftability
pa_mean_eco = pa_mean.merge(ecology, left_index=True, right_index=True, how='inner')
print(f'\nPA competitors with ecology data: {len(pa_mean_eco)}')
Top metagenomics-derived PA competitors:
mean_competition std_competition n_samples
competitor
Achromobacter xylosoxidans 3.124 1.094 2
Actinomyces dentalis 2.993 1.077 3
Arachnia propionica 2.774 0.864 3
Prevotella melaninogenica 2.367 0.833 14
Neisseria elongata 2.288 NaN 1
Haemophilus influenzae_F 2.174 0.354 2
Streptococcus thermophilus 1.919 0.445 7
Rothia dentocariosa 1.906 0.418 12
Prevotella scopos 1.901 0.666 4
Staphylococcus aureus 1.898 0.750 2
Prevotella histicola 1.887 0.011 2
Haemophilus influenzae 1.878 0.426 2
Prevotella denticola 1.877 0.974 2
Rothia sp001808955 1.864 0.475 15
Streptococcus salivarius 1.836 0.464 9
Streptococcus sanguinis_H 1.768 0.584 6
Bifidobacterium breve 1.709 0.424 2
Neisseria sp000186165 1.705 NaN 1
Prevotella nanceiensis 1.683 0.698 2
Streptococcus parasanguinis 1.564 0.346 10
PA competitors with ecology data: 40
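
Several competitors in the table above rest on one or two observations, and `std_competition` is NaN where `n_samples` is 1 because pandas computes the sample standard deviation (ddof=1) by default. One way to keep such rows from dominating a ranking is to require a minimum number of observations before sorting. The threshold `min_n = 3` and the toy scores below are assumptions for illustration, not values used by the notebook.

```python
import pandas as pd

# Toy competition scores (made-up values)
toy = pd.DataFrame({
    'competitor': ['sp_X', 'sp_Y', 'sp_Y', 'sp_Y', 'sp_Z', 'sp_Z', 'sp_Z', 'sp_Z'],
    'competition_score': [3.0, 2.0, 2.5, 1.5, 1.0, 1.2, 0.8, 1.0],
})

summary = (toy.groupby('competitor')['competition_score']
              .agg(['mean', 'std', 'count'])
              .rename(columns={'mean': 'mean_competition',
                               'std': 'std_competition',
                               'count': 'n_samples'}))

min_n = 3  # hypothetical minimum number of observations
robust = (summary[summary['n_samples'] >= min_n]
          .sort_values('mean_competition', ascending=False))

print(summary.round(3))
print(robust.round(3))
```

In the toy data, `sp_X` has the highest mean but only one observation, so it appears in `summary` with a NaN standard deviation and is excluded from `robust`.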
In [6]:
# Bridge: link isolate species to metagenomics features
print('=== Isolate-Metagenomics Bridge ===')
print(bridge.to_string())
print(f'\nMatch levels: {bridge.match_level.value_counts().to_dict()}')
print(f'Match types: {bridge.match_type.value_counts().to_dict()}')
=== Isolate-Metagenomics Bridge ===
isolate_taxon metagenomic_feature match_level match_type isolate_count
0 Streptococcus sp001553685 Streptococcus genus genus_only 10
1 Stenotrophomonas maltophilia Stenotrophomonas maltophilia species exact 16
2 Streptococcus oralis_BH Streptococcus genus genus_only 1
3 Streptococcus sp024295625 Streptococcus genus genus_only 1
4 Pauljensenia Pauljensenia genus genus_aggregate 2
5 Streptococcus sanguinis Streptococcus genus genus_only 92
6 Lacticaseibacillus paracasei Lacticaseibacillus genus genus_only 32
7 Streptococcus mitis_CU Streptococcus genus genus_only 2
8 Abiotrophia Abiotrophia genus genus_aggregate 1
9 Streptococcus sp000296995 Streptococcus genus genus_only 7
10 Neisseria flavescens Neisseria genus genus_only 1
11 Streptococcus mutans Streptococcus genus genus_only 6
12 Streptococcus mitis_CC Streptococcus genus genus_only 1
13 Streptococcus sp001813295 Streptococcus genus genus_only 1
14 Staphylococcus Staphylococcus genus genus_aggregate 572
15 Streptococcus oralis_W Streptococcus genus genus_only 2
16 Granulicatella sp900551535 Granulicatella genus genus_only 2
17 Achromobacter xylosoxidans Achromobacter xylosoxidans species exact 276
18 Gemella haemolysans_D Gemella genus genus_only 3
19 Neisseria sicca Neisseria genus genus_only 5
20 Streptococcus oralis_AC Streptococcus genus genus_only 3
21 Staphylococcus Staphylococcus genus genus_only 4
22 Rothia aeria Rothia aeria species exact 33
23 Streptococcus sp001556435 Streptococcus genus genus_only 135
24 Streptococcus vaginalis Streptococcus genus genus_only 1
25 Gemella sp900766305 Gemella genus genus_only 5
26 Streptococcus constellatus Streptococcus genus genus_only 2
27 Granulicatella sp001058355 Granulicatella genus genus_only 5
28 Staphylococcus hominis Staphylococcus genus genus_only 1
29 Streptococcus mitis_DA Streptococcus genus genus_only 9
30 Actinomyces oris Actinomyces genus genus_only 34
31 Streptococcus sp001811305 Streptococcus genus genus_only 17
32 Neisseria bacilliformis Neisseria bacilliformis species exact 2
33 Stenotrophomonas maltophilia_Q Stenotrophomonas genus genus_only 1
34 Streptococcus oralis_BC Streptococcus genus genus_only 1
35 Streptococcus parasanguinis_A Streptococcus genus genus_only 1
36 Streptococcus infantis_B Streptococcus genus genus_only 3
37 Moraxella catarrhalis Moraxella genus genus_only 7
38 Streptococcus sp001072385 Streptococcus genus genus_only 10
39 Rothia mucilaginosa_A Rothia genus genus_only 11
40 Streptococcus sp000688775 Streptococcus genus genus_only 3
41 Gemella sanguinis Gemella sanguinis species exact 34
42 Streptococcus agalactiae Streptococcus genus genus_only 4
43 Streptococcus sanguinis_M Streptococcus genus genus_only 3
44 Streptococcus infantis_M Streptococcus genus genus_only 4
45 Streptococcus Streptococcus genus genus_aggregate 1400
46 Streptococcus mitis Streptococcus genus genus_only 9
47 Streptococcus sanguinis_Q Streptococcus genus genus_only 2
48 Neisseria subflava Neisseria genus genus_only 4
49 Neisseria sp946902045 Neisseria genus genus_only 1
50 Streptococcus parasanguinis Streptococcus parasanguinis species exact 45
51 Granulicatella sp015264885 Granulicatella genus genus_only 2
52 Streptococcus mitis_CI Streptococcus genus genus_only 1
53 Granulicatella sp916049935 Granulicatella genus genus_only 11
54 Streptococcus oralis_BZ Streptococcus genus genus_only 1
55 Corynebacterium argentoratense Corynebacterium genus genus_only 11
56 Moraxella Moraxella genus genus_aggregate 7
57 Neisseria perflava_A Neisseria genus genus_only 30
58 Streptococcus sp902489265 Streptococcus genus genus_only 1
59 Streptococcus parasanguinis_C Streptococcus genus genus_only 4
60 Streptococcus pseudopneumoniae_N Streptococcus genus genus_only 1
61 Streptococcus sp013277205 Streptococcus genus genus_only 2
62 Streptococcus xiaochunlingii Streptococcus genus genus_only 24
63 Streptococcus sp030546305 Streptococcus genus genus_only 6
64 Streptococcus vestibularis Streptococcus genus genus_only 8
65 Streptococcus sanguinis_P Streptococcus genus genus_only 2
66 Streptococcus mitis_BM Streptococcus genus genus_only 2
67 Neisseria sp000952795 Neisseria genus genus_only 19
68 Streptococcus oralis_S Streptococcus genus genus_only 24
69 Granulicatella Granulicatella genus genus_only 3
70 Streptococcus sanguinis_G Streptococcus genus genus_only 8
71 Streptococcus mitis_CA Streptococcus genus genus_only 3
72 Neisseria sicca_B Neisseria genus genus_only 3
73 Streptococcus infantis_H Streptococcus genus genus_only 7
74 Streptococcus thermophilus Streptococcus thermophilus species exact 1
75 Corynebacterium Corynebacterium genus genus_aggregate 24
76 Streptococcus oralis Streptococcus genus genus_only 2
77 Streptococcus sp902363395 Streptococcus genus genus_only 15
78 Actinomyces oris_E Actinomyces genus genus_only 1
79 Streptococcus sanguinis_N Streptococcus genus genus_only 4
80 Gemella haemolysans_B Gemella genus genus_only 3
81 Streptococcus salivarius_D Streptococcus genus genus_only 150
82 Streptococcus pseudopneumoniae_L Streptococcus genus genus_only 3
83 Streptococcus sp900766505 Streptococcus genus genus_only 17
84 Streptococcus parasanguinis_F Streptococcus genus genus_only 7
85 Neisseria mucosa Neisseria mucosa species exact 22
86 Streptococcus oralis_BY Streptococcus genus genus_only 15
87 Streptococcus sp000187745 Streptococcus genus genus_only 2
88 Neisseria sp001809325 Neisseria genus genus_only 13
89 Staphylococcus aureus Staphylococcus aureus species exact 379
90 Streptococcus parasanguinis_L Streptococcus genus genus_only 3
91 Streptococcus oralis_BW Streptococcus genus genus_only 3
92 Neisseria perflava Neisseria genus genus_only 53
93 Streptococcus oralis_I Streptococcus genus genus_only 1
94 Rothia sp029850875 Rothia genus genus_only 11
95 Streptococcus symci Streptococcus genus genus_only 1
96 Streptococcus mitis_Q Streptococcus genus genus_only 1
97 Gemella sp927911745 Gemella genus genus_only 1
98 Actinomyces sp915069725 Actinomyces genus genus_only 34
99 Streptococcus dentisani Streptococcus genus genus_only 5
100 Streptococcus anginosus Streptococcus anginosus species exact 5
101 Pauljensenia sp902373545 Pauljensenia genus genus_only 2
102 Streptococcus sp002238115 Streptococcus genus genus_only 1
103 Streptococcus Streptococcus genus genus_only 141
104 Streptococcus oralis_BK Streptococcus genus genus_only 5
105 Gemella Gemella genus genus_aggregate 52
106 Abiotrophia sp001815865 Abiotrophia genus genus_only 1
107 Streptococcus parasanguinis_D Streptococcus genus genus_only 4
108 Streptococcus hominis Streptococcus genus genus_only 15
109 Streptococcus cristatus Streptococcus genus genus_only 1
110 Streptococcus sp013277555 Streptococcus genus genus_only 2
111 Streptococcus sp000187445 Streptococcus genus genus_only 3
112 Streptococcus infantis_I Streptococcus genus genus_only 2
113 Neisseria sicca_A Neisseria genus genus_only 4
114 Streptococcus mitis_AC Streptococcus genus genus_only 2
115 Stenotrophomonas maltophilia_P Stenotrophomonas genus genus_only 4
116 Streptococcus parasanguinis_Q Streptococcus genus genus_only 3
117 Streptococcus sanguinis_C Streptococcus genus genus_only 4
118 Streptococcus caecimuris Streptococcus genus genus_only 13
119 Staphylococcus haemolyticus Staphylococcus haemolyticus species exact 47
120 Neisseria sp000227275 Neisseria genus genus_only 2
121 Neisseria sp915066515 Neisseria genus genus_only 32
122 Lacticaseibacillus Lacticaseibacillus genus genus_aggregate 98
123 Streptococcus oralis_BV Streptococcus genus genus_only 5
124 Streptococcus oralis_BX Streptococcus genus genus_only 3
125 Streptococcus oralis_BD Streptococcus genus genus_only 3
126 Actinomyces oris_A Actinomyces oris_A species exact 23
127 Rothia Rothia genus genus_aggregate 453
128 Streptococcus parasanguinis_K Streptococcus genus genus_only 1
129 Actinomyces naeslundii Actinomyces genus genus_only 12
130 Streptococcus oralis_BA Streptococcus genus genus_only 2
131 Lacticaseibacillus rhamnosus Lacticaseibacillus rhamnosus species exact 66
132 Streptococcus sp902479835 Streptococcus genus genus_only 1
133 Streptococcus sp001813105 Streptococcus genus genus_only 17
134 Pseudomonas Pseudomonas genus genus_aggregate 655
135 Neisseria flavescens_D Neisseria genus genus_only 3
136 Streptococcus peroris Streptococcus genus genus_only 2
137 Streptococcus mitis_BP Streptococcus genus genus_only 2
138 Neisseria cinerea Neisseria genus genus_only 1
139 Neisseria mucosa_A Neisseria genus genus_only 6
140 Streptococcus sp030825805 Streptococcus genus genus_only 22
141 Streptococcus parasanguinis_I Streptococcus genus genus_only 9
142 Rothia sp024294905 Rothia genus genus_only 9
143 Streptococcus viridans_A Streptococcus genus genus_only 1
144 Rothia Rothia genus genus_only 1
145 Streptococcus gordonii Streptococcus gordonii species exact 40
146 Rothia sp001836735 Rothia genus genus_only 7
147 Achromobacter Achromobacter genus genus_aggregate 276
148 Rothia mucilaginosa Rothia genus genus_only 38
149 Actinomyces Actinomyces genus genus_aggregate 104
150 Neisseria oralis Neisseria genus genus_only 1
151 Neisseria subflava_A Neisseria genus genus_only 36
152 Streptococcus sanguinis_K Streptococcus genus genus_only 8
153 Gemella Gemella genus genus_only 6
154 Neisseria elongata Neisseria elongata species exact 18
155 Neisseria flavescens_B Neisseria genus genus_only 63
156 Streptococcus oralis_CA Streptococcus genus genus_only 14
157 Streptococcus sanguinis_R Streptococcus genus genus_only 20
158 Streptococcus mitis_BZ Streptococcus genus genus_only 6
159 Neisseria sp001815675 Neisseria genus genus_only 16
160 Granulicatella Granulicatella genus genus_aggregate 29
161 Streptococcus timonensis Streptococcus genus genus_only 1
162 Streptococcus cristatus_I Streptococcus genus genus_only 2
163 Neisseria Neisseria genus genus_only 63
164 Stenotrophomonas Stenotrophomonas genus genus_aggregate 21
165 Streptococcus sp943736975 Streptococcus genus genus_only 1
166 Pseudomonas aeruginosa Pseudomonas aeruginosa species exact 655
167 Streptococcus pseudopneumoniae_M Streptococcus genus genus_only 1
168 Staphylococcus capitis Staphylococcus genus genus_only 24
169 Streptococcus sobrinus Streptococcus genus genus_only 5
170 Streptococcus sanguinis_H Streptococcus sanguinis_H species exact 42
171 Streptococcus intermedius Streptococcus intermedius species exact 1
172 Neisseria Neisseria genus genus_aggregate 494
173 Neisseria sp000186165 Neisseria sp000186165 species exact 76
174 Neisseria macacae Neisseria genus genus_only 7
175 Streptococcus oralis_CO Streptococcus genus genus_only 2
176 Staphylococcus epidermidis Staphylococcus genus genus_only 117
177 Streptococcus salivarius Streptococcus salivarius species exact 276
178 Streptococcus oralis_CF Streptococcus genus genus_only 1
179 Corynebacterium durum Corynebacterium durum species exact 13
180 Neisseria flava Neisseria genus genus_only 13
181 Streptococcus vulneris Streptococcus genus genus_only 11
182 Streptococcus mitis_BA Streptococcus genus genus_only 1
183 Rothia sp001808955 Rothia sp001808955 species exact 25
184 Rothia dentocariosa Rothia dentocariosa species exact 318
185 Granulicatella adiacens Granulicatella adiacens species exact 6
Match levels: {'genus': 162, 'species': 24}
Match types: {'genus_only': 147, 'exact': 24, 'genus_aggregate': 15}
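
Only 24 of the 186 bridge rows are exact species-level matches, so species-level comparisons between isolates and metagenomic features are limited to that subset. The sketch below, built on five rows copied from the printout above, shows one way to select exact matches and to total isolate counts per match type. The variable names `bridge_toy`, `exact` and `counts_by_type` are illustrative.

```python
import pandas as pd

# Five rows copied from the bridge printout
bridge_toy = pd.DataFrame({
    'isolate_taxon': ['Rothia aeria', 'Rothia mucilaginosa', 'Rothia',
                      'Neisseria mucosa', 'Neisseria sicca'],
    'metagenomic_feature': ['Rothia aeria', 'Rothia', 'Rothia',
                            'Neisseria mucosa', 'Neisseria'],
    'match_level': ['species', 'genus', 'genus', 'species', 'genus'],
    'match_type': ['exact', 'genus_only', 'genus_aggregate', 'exact', 'genus_only'],
    'isolate_count': [33, 38, 453, 22, 5],
})

# Keep only rows where the isolate taxon matches a metagenomic species exactly
exact = bridge_toy[bridge_toy['match_type'] == 'exact']

# Total isolates per match type (kept separate rather than summed across types)
counts_by_type = bridge_toy.groupby('match_type')['isolate_count'].sum()

print(exact[['isolate_taxon', 'isolate_count']].to_string(index=False))
print(counts_by_type.to_dict())
```

The `genus_aggregate` rows appear to repeat isolates already counted in other rows (for example, Achromobacter at 276 alongside Achromobacter xylosoxidans at 276), so totals are probably best read per match type rather than added together.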
In [7]:
# Save engraftability scores
ecology.to_csv(DATA / 'species_engraftability.tsv', sep='\t')
print(f'Saved: {DATA}/species_engraftability.tsv ({len(ecology)} species)')
print(f'\n=== NB04 SUMMARY ===')
print(f'Species in metagenomics: {len(ecology)}')
print(f'Species with isolates in PROTECT: {has_isolate.sum()}')
print(f'Safe species (>30% prevalence): {len(safe_engraftable)}')
print(f'PA competitors from metagenomics: {len(pa_mean)}')
Saved: ../data/species_engraftability.tsv (134 species)

=== NB04 SUMMARY ===
Species in metagenomics: 134
Species with isolates in PROTECT: 24
Safe species (>30% prevalence): 129
PA competitors from metagenomics: 40
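
Because the table is written with the species names as its index, a downstream notebook can read it back with `index_col='species'`. A minimal round-trip sketch follows, using a temporary directory and two rows of rounded values from the Top 20 output rather than the real `data/` path.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Two rows of rounded values copied from the notebook output
toy = pd.DataFrame(
    {'prevalence_metag': [0.743, 0.851], 'engraftability': [3.752, 3.513]},
    index=pd.Index(['Moraxella nonliquefaciens', 'Propionibacterium acidifaciens'],
                   name='species'),
)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'species_engraftability.tsv'
    toy.to_csv(path, sep='\t')                                  # same call shape as the notebook
    loaded = pd.read_csv(path, sep='\t', index_col='species')   # restore species as index

print(loaded)
```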