02 Refined Concordance Analysis
Jupyter notebook from the ADP1 Triple Essentiality Concordance project.
Refined Concordance Analysis¶
Comparing essentiality predictions from FBA, RB-TnSeq, and proteomics against KO experimental truth.
This notebook:
- Loads essentiality vectors from notebook 01
- Computes concordance metrics (confusion matrices, Cohen's kappa, F1 scores)
- Performs ROC curve analysis for continuous predictors
- Characterizes discordant genes
- Generates comprehensive visualizations
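As a quick reference for the metrics used throughout this notebook, here is a toy example (hypothetical vectors, not project data) showing how the confusion matrix, F1, and Cohen's kappa are computed with scikit-learn:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, cohen_kappa_score, f1_score

# Hypothetical truth/prediction pair: 3 essential, 5 dispensable genes
y_true = np.array([1, 1, 1, 0, 0, 0, 0, 0])  # KO "truth"
y_pred = np.array([1, 1, 0, 0, 0, 0, 1, 0])  # some binary predictor

# Rows = truth, columns = prediction: [[TN, FP], [FN, TP]]
print(confusion_matrix(y_true, y_pred))
print(f"F1:    {f1_score(y_true, y_pred):.3f}")
# Kappa corrects observed agreement for agreement expected by chance
print(f"Kappa: {cohen_kappa_score(y_true, y_pred):.3f}")
```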
In [ ]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import (
    confusion_matrix, accuracy_score, precision_score, recall_score,
    f1_score, cohen_kappa_score, roc_curve, auc
)
from scipy.stats import pearsonr, spearmanr, mannwhitneyu
import warnings
warnings.filterwarnings('ignore')
sns.set_style('whitegrid')
ev = pd.read_csv('../data/essentiality_vectors.csv')
print(f'Loaded {len(ev):,} genes')
ev.head()
1. FBA Concordance Analysis¶
Set 1: Rich Media (FBA vs KO)¶
In [ ]:
# Filter for genes with both FBA and KO data (rich media)
fba_ko_rich = ev[ev['fba_rich_essential'].notna() & ev['ko_rich_essential'].notna()].copy()
y_true = fba_ko_rich['ko_rich_essential'].astype(int)
y_pred = fba_ko_rich['fba_rich_essential'].astype(int)
# Compute metrics
cm = confusion_matrix(y_true, y_pred)
recall = recall_score(y_true, y_pred)
precision = precision_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
kappa = cohen_kappa_score(y_true, y_pred)
print(f"FBA Rich Media vs KO (n={len(fba_ko_rich):,} genes)")
print(f"Confusion Matrix:\n{cm}")
print(f"\nRecall (Sensitivity): {recall:.3f}")
print(f"Precision (PPV): {precision:.3f}")
print(f"F1 Score: {f1:.3f}")
print(f"Cohen's Kappa: {kappa:.3f}")
Set 2: Minimal Media (FBA vs KO)¶
In [ ]:
# Filter for genes with both FBA and KO data (minimal media)
fba_ko_min = ev[ev['fba_min_essential'].notna() & ev['ko_min_essential'].notna()].copy()
y_true = fba_ko_min['ko_min_essential'].astype(int)
y_pred = fba_ko_min['fba_min_essential'].astype(int)
# Compute metrics
cm = confusion_matrix(y_true, y_pred)
recall = recall_score(y_true, y_pred)
precision = precision_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
kappa = cohen_kappa_score(y_true, y_pred)
print(f"FBA Minimal Media vs KO (n={len(fba_ko_min):,} genes)")
print(f"Confusion Matrix:\n{cm}")
print(f"\nRecall (Sensitivity): {recall:.3f}")
print(f"Precision (PPV): {precision:.3f}")
print(f"F1 Score: {f1:.3f}")
print(f"Cohen's Kappa: {kappa:.3f}")
2. TnSeq Concordance Analysis (Multiple Thresholds)¶
Test five thresholds for essentiality_fraction and compute concordance with KO rich-media results.
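Cohen's κ can be negative, as the TnSeq results below show. A minimal illustration with made-up labels: a predictor that agrees with truth less often than chance would yields κ < 0.

```python
from sklearn.metrics import cohen_kappa_score

y_true = [1, 1, 1, 0, 0, 0]
anti   = [0, 0, 1, 1, 1, 0]  # agrees on only 2 of 6 calls

# Observed agreement (2/6) is below chance agreement (1/2), so kappa < 0
print(f"{cohen_kappa_score(y_true, anti):.3f}")  # -0.333
```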
In [ ]:
# Filter for genes with both TnSeq and KO data (rich media)
tnseq_ko_rich = ev[ev['essentiality_fraction'].notna() & ev['ko_rich_essential'].notna()].copy()
thresholds = [0.01, 0.025, 0.05, 0.10, 0.20]
results = []
for threshold in thresholds:
    tnseq_ko_rich['tnseq_binary'] = (tnseq_ko_rich['essentiality_fraction'] >= threshold).astype(int)
    y_true = tnseq_ko_rich['ko_rich_essential'].astype(int)
    y_pred = tnseq_ko_rich['tnseq_binary']
    cm = confusion_matrix(y_true, y_pred)
    recall = recall_score(y_true, y_pred)
    precision = precision_score(y_true, y_pred, zero_division=0)
    f1 = f1_score(y_true, y_pred, zero_division=0)
    kappa = cohen_kappa_score(y_true, y_pred)
    results.append({
        'threshold': threshold,
        'n': len(tnseq_ko_rich),
        'recall': recall,
        'precision': precision,
        'f1': f1,
        'kappa': kappa,
        'cm': cm
    })
# Create summary dataframe
tnseq_summary = pd.DataFrame(results)
print("\nTnSeq Threshold Analysis (Rich Media vs KO)")
print(tnseq_summary[['threshold', 'n', 'recall', 'precision', 'f1', 'kappa']].to_string(index=False))
# Save to CSV
tnseq_summary[['threshold', 'n', 'recall', 'precision', 'f1', 'kappa']].to_csv(
    '../data/tnseq_threshold_comparison.csv', index=False
)
print("\nSaved to: ../data/tnseq_threshold_comparison.csv")
3. Proteomics Correlation Analysis¶
In [ ]:
# Filter for genes with both proteomics and KO data (minimal media)
prot_ko = ev[ev['proteomics_avg_log2'].notna() & ev['ko_min_essential'].notna()].copy()
# Separate essential and dispensable
essential = prot_ko[prot_ko['ko_min_essential'] == 1]['proteomics_avg_log2']
dispensable = prot_ko[prot_ko['ko_min_essential'] == 0]['proteomics_avg_log2']
# Compute statistics
pearson_r, pearson_p = pearsonr(prot_ko['ko_min_essential'], prot_ko['proteomics_avg_log2'])
spearman_r, spearman_p = spearmanr(prot_ko['ko_min_essential'], prot_ko['proteomics_avg_log2'])
mw_stat, mw_p = mannwhitneyu(essential, dispensable, alternative='greater')
print(f"Proteomics vs Essentiality (Minimal Media, n={len(prot_ko):,} genes)")
print(f"\nEssential genes (n={len(essential):,}):")
print(f" Mean log2: {essential.mean():.2f} ± {essential.std():.2f}")
print(f"\nDispensable genes (n={len(dispensable):,}):")
print(f" Mean log2: {dispensable.mean():.2f} ± {dispensable.std():.2f}")
print(f"\nDifference: {essential.mean() - dispensable.mean():.2f} log2 units")
print(f"Fold change: {2**(essential.mean() - dispensable.mean()):.1f}x")
print(f"\nPearson r: {pearson_r:.3f} (p={pearson_p:.2e})")
print(f"Spearman ρ: {spearman_r:.3f} (p={spearman_p:.2e})")
print(f"Mann-Whitney U: p={mw_p:.2e}")
# Save results
prot_results = pd.DataFrame([{
    'n_total': len(prot_ko),
    'n_essential': len(essential),
    'n_dispensable': len(dispensable),
    'essential_mean': essential.mean(),
    'essential_std': essential.std(),
    'dispensable_mean': dispensable.mean(),
    'dispensable_std': dispensable.std(),
    'log2_diff': essential.mean() - dispensable.mean(),
    'fold_change': 2**(essential.mean() - dispensable.mean()),
    'pearson_r': pearson_r,
    'pearson_p': pearson_p,
    'spearman_r': spearman_r,
    'spearman_p': spearman_p,
    'mannwhitney_p': mw_p
}])
prot_results.to_csv('../data/proteomics_correlation.csv', index=False)
print("\nSaved to: ../data/proteomics_correlation.csv")
4. ROC Curve Analysis¶
Evaluate continuous predictors: fitness, essentiality_fraction, proteomics
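A note on sign conventions: roc_curve treats higher scores as more positive, so fitness is negated in the code below (lower fitness = more essential). Negating a score simply mirrors the ROC curve, so the AUCs of a score and its negation sum to 1. A small synthetic check (made-up scores, not project data):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
# 100 negatives, 100 positives; positives score higher on average
y = np.array([0] * 100 + [1] * 100)
score = np.concatenate([rng.normal(size=100), rng.normal(size=100) + 1.0])

auc_pos = roc_auc_score(y, score)
auc_neg = roc_auc_score(y, -score)
print(auc_pos, auc_neg, auc_pos + auc_neg)  # the two AUCs sum to 1.0
```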
In [ ]:
# Prepare data for ROC analysis
roc_results = []
# Rich media: Fitness vs KO
fitness_rich = ev[ev['fitness_mean'].notna() & ev['ko_rich_essential'].notna()].copy()
if len(fitness_rich) > 0:
    y_true = fitness_rich['ko_rich_essential'].astype(int)
    # Invert fitness: lower fitness = more essential
    y_score = -fitness_rich['fitness_mean']
    fpr, tpr, _ = roc_curve(y_true, y_score)
    auc_score = auc(fpr, tpr)
    roc_results.append({
        'set': 'Rich Media',
        'predictor': 'Fitness (inverted)',
        'n': len(fitness_rich),
        'auc': auc_score
    })
    print(f"Fitness (inverted) vs KO Rich: AUC = {auc_score:.3f} (n={len(fitness_rich):,})")
# Rich media: Essentiality fraction vs KO
ef_rich = ev[ev['essentiality_fraction'].notna() & ev['ko_rich_essential'].notna()].copy()
if len(ef_rich) > 0:
    y_true = ef_rich['ko_rich_essential'].astype(int)
    y_score = ef_rich['essentiality_fraction']
    fpr, tpr, _ = roc_curve(y_true, y_score)
    auc_score = auc(fpr, tpr)
    roc_results.append({
        'set': 'Rich Media',
        'predictor': 'Essentiality Fraction',
        'n': len(ef_rich),
        'auc': auc_score
    })
    print(f"Essentiality Fraction vs KO Rich: AUC = {auc_score:.3f} (n={len(ef_rich):,})")
# Minimal media: Fitness vs KO
fitness_min = ev[ev['fitness_mean'].notna() & ev['ko_min_essential'].notna()].copy()
if len(fitness_min) > 0:
    y_true = fitness_min['ko_min_essential'].astype(int)
    y_score = -fitness_min['fitness_mean']
    fpr, tpr, _ = roc_curve(y_true, y_score)
    auc_score = auc(fpr, tpr)
    roc_results.append({
        'set': 'Minimal Media',
        'predictor': 'Fitness (inverted)',
        'n': len(fitness_min),
        'auc': auc_score
    })
    print(f"Fitness (inverted) vs KO Min: AUC = {auc_score:.3f} (n={len(fitness_min):,})")
# Minimal media: Essentiality fraction vs KO
ef_min = ev[ev['essentiality_fraction'].notna() & ev['ko_min_essential'].notna()].copy()
if len(ef_min) > 0:
    y_true = ef_min['ko_min_essential'].astype(int)
    y_score = ef_min['essentiality_fraction']
    fpr, tpr, _ = roc_curve(y_true, y_score)
    auc_score = auc(fpr, tpr)
    roc_results.append({
        'set': 'Minimal Media',
        'predictor': 'Essentiality Fraction',
        'n': len(ef_min),
        'auc': auc_score
    })
    print(f"Essentiality Fraction vs KO Min: AUC = {auc_score:.3f} (n={len(ef_min):,})")
# Minimal media: Proteomics vs KO
prot_min = ev[ev['proteomics_avg_log2'].notna() & ev['ko_min_essential'].notna()].copy()
if len(prot_min) > 0:
    y_true = prot_min['ko_min_essential'].astype(int)
    y_score = prot_min['proteomics_avg_log2']
    fpr, tpr, _ = roc_curve(y_true, y_score)
    auc_score = auc(fpr, tpr)
    roc_results.append({
        'set': 'Minimal Media',
        'predictor': 'Proteomics (log2)',
        'n': len(prot_min),
        'auc': auc_score
    })
    print(f"Proteomics vs KO Min: AUC = {auc_score:.3f} (n={len(prot_min):,})")
# Save ROC results
roc_df = pd.DataFrame(roc_results)
roc_df.to_csv('../data/roc_summary.csv', index=False)
print("\nSaved to: ../data/roc_summary.csv")
print("\nROC Summary:")
print(roc_df.to_string(index=False))
5. Comprehensive Concordance Summary¶
Combine FBA and TnSeq concordance metrics
In [ ]:
# Create comprehensive concordance summary
concordance_data = []
# FBA Rich
fba_ko_rich = ev[ev['fba_rich_essential'].notna() & ev['ko_rich_essential'].notna()].copy()
y_true = fba_ko_rich['ko_rich_essential'].astype(int)
y_pred = fba_ko_rich['fba_rich_essential'].astype(int)
concordance_data.append({
    'Set': 'Rich Media',
    'Source': 'FBA',
    'N': len(fba_ko_rich),
    'Recall': recall_score(y_true, y_pred),
    'Precision': precision_score(y_true, y_pred),
    'F1': f1_score(y_true, y_pred),
    'Kappa': cohen_kappa_score(y_true, y_pred)
})
# TnSeq Rich (all thresholds)
for threshold in thresholds:
    tnseq_ko_rich = ev[ev['essentiality_fraction'].notna() & ev['ko_rich_essential'].notna()].copy()
    tnseq_ko_rich['tnseq_binary'] = (tnseq_ko_rich['essentiality_fraction'] >= threshold).astype(int)
    y_true = tnseq_ko_rich['ko_rich_essential'].astype(int)
    y_pred = tnseq_ko_rich['tnseq_binary']
    concordance_data.append({
        'Set': 'Rich Media',
        'Source': f'TnSeq ({threshold})',
        'N': len(tnseq_ko_rich),
        'Recall': recall_score(y_true, y_pred),
        'Precision': precision_score(y_true, y_pred, zero_division=0),
        'F1': f1_score(y_true, y_pred, zero_division=0),
        'Kappa': cohen_kappa_score(y_true, y_pred)
    })
# FBA Minimal
fba_ko_min = ev[ev['fba_min_essential'].notna() & ev['ko_min_essential'].notna()].copy()
y_true = fba_ko_min['ko_min_essential'].astype(int)
y_pred = fba_ko_min['fba_min_essential'].astype(int)
concordance_data.append({
    'Set': 'Minimal Media',
    'Source': 'FBA',
    'N': len(fba_ko_min),
    'Recall': recall_score(y_true, y_pred),
    'Precision': precision_score(y_true, y_pred),
    'F1': f1_score(y_true, y_pred),
    'Kappa': cohen_kappa_score(y_true, y_pred)
})
concordance_df = pd.DataFrame(concordance_data)
concordance_df.to_csv('../data/concordance_summary.csv', index=False)
print("Comprehensive Concordance Summary:")
print(concordance_df.to_string(index=False))
print("\nSaved to: ../data/concordance_summary.csv")
6. Discordant Gene Characterization¶
Analyze genes where TnSeq and KO disagree (using optimal threshold 0.05)
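The classifier in the next cell uses a row-wise apply, which is clear but slow on large gene tables; a vectorized equivalent, sketched here on hypothetical rows with np.select, is one alternative worth considering:

```python
import numpy as np
import pandas as pd

# Hypothetical rows covering all four concordance outcomes
df = pd.DataFrame({'ko_essential':    [1, 0, 1, 0],
                   'tnseq_essential': [1, 0, 0, 1]})

# Conditions are checked in order; unmatched rows get the default label
conditions = [
    (df['ko_essential'] == 1) & (df['tnseq_essential'] == 1),
    (df['ko_essential'] == 0) & (df['tnseq_essential'] == 0),
    (df['ko_essential'] == 1) & (df['tnseq_essential'] == 0),
]
labels = ['Both Essential', 'Both Dispensable',
          'KO Essential, TnSeq Dispensable']
df['concordance_class'] = np.select(
    conditions, labels, default='KO Dispensable, TnSeq Essential')
print(df['concordance_class'].tolist())
```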
In [ ]:
# Use threshold 0.05 for detailed discordance analysis
threshold = 0.05
tnseq_ko = ev[ev['essentiality_fraction'].notna() & ev['ko_rich_essential'].notna()].copy()
tnseq_ko['tnseq_essential'] = (tnseq_ko['essentiality_fraction'] >= threshold).astype(int)
tnseq_ko['ko_essential'] = tnseq_ko['ko_rich_essential'].astype(int)
# Classify concordance
def classify_concordance(row):
    if row['ko_essential'] == 1 and row['tnseq_essential'] == 1:
        return 'Both Essential'
    elif row['ko_essential'] == 0 and row['tnseq_essential'] == 0:
        return 'Both Dispensable'
    elif row['ko_essential'] == 1 and row['tnseq_essential'] == 0:
        return 'KO Essential, TnSeq Dispensable'
    else:
        return 'KO Dispensable, TnSeq Essential'
tnseq_ko['concordance_class'] = tnseq_ko.apply(classify_concordance, axis=1)
# Summary by class
discord_summary = tnseq_ko.groupby('concordance_class').agg({
    'feature_id': 'count',
    'essentiality_fraction': 'mean',
    'fitness_mean': 'mean'
}).rename(columns={'feature_id': 'count'})
print("Discordance Summary (threshold=0.05):")
print(discord_summary)
print(f"\nTotal genes: {len(tnseq_ko):,}")
# Save discordance summary
discord_summary.to_csv('../data/discordance_summary.csv')
print("\nSaved to: ../data/discordance_summary.csv")
# Save discordant gene lists
ko_ess_tn_disp = tnseq_ko[tnseq_ko['concordance_class'] == 'KO Essential, TnSeq Dispensable']
ko_disp_tn_ess = tnseq_ko[tnseq_ko['concordance_class'] == 'KO Dispensable, TnSeq Essential']
ko_ess_tn_disp[['feature_id', 'gene_names', 'rast_function', 'essentiality_fraction', 'fitness_mean']].to_csv(
    '../data/discordant_ko_essential_tnseq_dispensable.csv', index=False
)
ko_disp_tn_ess[['feature_id', 'gene_names', 'rast_function', 'essentiality_fraction', 'fitness_mean']].to_csv(
    '../data/discordant_ko_dispensable_tnseq_essential.csv', index=False
)
print(f"Saved {len(ko_ess_tn_disp):,} KO essential/TnSeq dispensable genes")
print(f"Saved {len(ko_disp_tn_ess):,} KO dispensable/TnSeq essential genes")
7. Visualizations¶
Generate comprehensive figures for the report
In [ ]:
# Figure 1: FBA Comparison (Rich vs Minimal)
fig, axes = plt.subplots(1, 2, figsize=(12, 5))
# Rich media
fba_ko_rich = ev[ev['fba_rich_essential'].notna() & ev['ko_rich_essential'].notna()].copy()
cm_rich = confusion_matrix(fba_ko_rich['ko_rich_essential'], fba_ko_rich['fba_rich_essential'])
sns.heatmap(cm_rich, annot=True, fmt='d', cmap='Blues', ax=axes[0],
            xticklabels=['Disp', 'Ess'], yticklabels=['Disp', 'Ess'])
axes[0].set_title(f'FBA Rich Media (n={len(fba_ko_rich):,})\nκ={cohen_kappa_score(fba_ko_rich["ko_rich_essential"], fba_ko_rich["fba_rich_essential"]):.3f}')
axes[0].set_xlabel('FBA Prediction')
axes[0].set_ylabel('KO Truth')
# Minimal media
fba_ko_min = ev[ev['fba_min_essential'].notna() & ev['ko_min_essential'].notna()].copy()
cm_min = confusion_matrix(fba_ko_min['ko_min_essential'], fba_ko_min['fba_min_essential'])
sns.heatmap(cm_min, annot=True, fmt='d', cmap='Greens', ax=axes[1],
            xticklabels=['Disp', 'Ess'], yticklabels=['Disp', 'Ess'])
axes[1].set_title(f'FBA Minimal Media (n={len(fba_ko_min):,})\nκ={cohen_kappa_score(fba_ko_min["ko_min_essential"], fba_ko_min["fba_min_essential"]):.3f}')
axes[1].set_xlabel('FBA Prediction')
axes[1].set_ylabel('KO Truth')
plt.tight_layout()
plt.savefig('../figures/fba_comparison.png', dpi=300, bbox_inches='tight')
plt.show()
print("Saved: ../figures/fba_comparison.png")
In [ ]:
# Figure 2: ROC Curves
fig, axes = plt.subplots(1, 2, figsize=(14, 6))
# Rich media
ax = axes[0]
# Fitness
fitness_rich = ev[ev['fitness_mean'].notna() & ev['ko_rich_essential'].notna()].copy()
y_true = fitness_rich['ko_rich_essential'].astype(int)
y_score = -fitness_rich['fitness_mean']
fpr, tpr, _ = roc_curve(y_true, y_score)
auc_score = auc(fpr, tpr)
ax.plot(fpr, tpr, label=f'Fitness (AUC={auc_score:.3f})', linewidth=2)
# Essentiality fraction
ef_rich = ev[ev['essentiality_fraction'].notna() & ev['ko_rich_essential'].notna()].copy()
y_true = ef_rich['ko_rich_essential'].astype(int)
y_score = ef_rich['essentiality_fraction']
fpr, tpr, _ = roc_curve(y_true, y_score)
auc_score = auc(fpr, tpr)
ax.plot(fpr, tpr, label=f'Essentiality Fraction (AUC={auc_score:.3f})', linewidth=2)
ax.plot([0, 1], [0, 1], 'k--', label='Random (AUC=0.5)')
ax.set_xlabel('False Positive Rate')
ax.set_ylabel('True Positive Rate')
ax.set_title('ROC Curves: Rich Media')
ax.legend(loc='lower right')
ax.grid(True, alpha=0.3)
# Minimal media
ax = axes[1]
# Fitness
fitness_min = ev[ev['fitness_mean'].notna() & ev['ko_min_essential'].notna()].copy()
y_true = fitness_min['ko_min_essential'].astype(int)
y_score = -fitness_min['fitness_mean']
fpr, tpr, _ = roc_curve(y_true, y_score)
auc_score = auc(fpr, tpr)
ax.plot(fpr, tpr, label=f'Fitness (AUC={auc_score:.3f})', linewidth=2)
# Essentiality fraction
ef_min = ev[ev['essentiality_fraction'].notna() & ev['ko_min_essential'].notna()].copy()
y_true = ef_min['ko_min_essential'].astype(int)
y_score = ef_min['essentiality_fraction']
fpr, tpr, _ = roc_curve(y_true, y_score)
auc_score = auc(fpr, tpr)
ax.plot(fpr, tpr, label=f'Essentiality Fraction (AUC={auc_score:.3f})', linewidth=2)
# Proteomics
prot_min = ev[ev['proteomics_avg_log2'].notna() & ev['ko_min_essential'].notna()].copy()
y_true = prot_min['ko_min_essential'].astype(int)
y_score = prot_min['proteomics_avg_log2']
fpr, tpr, _ = roc_curve(y_true, y_score)
auc_score = auc(fpr, tpr)
ax.plot(fpr, tpr, label=f'Proteomics (AUC={auc_score:.3f})', linewidth=2)
ax.plot([0, 1], [0, 1], 'k--', label='Random (AUC=0.5)')
ax.set_xlabel('False Positive Rate')
ax.set_ylabel('True Positive Rate')
ax.set_title('ROC Curves: Minimal Media')
ax.legend(loc='lower right')
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('../figures/roc_comprehensive.png', dpi=300, bbox_inches='tight')
plt.show()
print("Saved: ../figures/roc_comprehensive.png")
In [ ]:
# Figure 3: Concordance Heatmap
fig, ax = plt.subplots(1, 1, figsize=(10, 8))
# Prepare concordance matrix
conc_pivot = concordance_df.pivot_table(
    index='Source',
    columns='Set',
    values='Kappa'
)
sns.heatmap(conc_pivot, annot=True, fmt='.3f', cmap='RdYlGn', center=0,
            vmin=-0.2, vmax=0.6, ax=ax, cbar_kws={'label': "Cohen's Kappa"})
ax.set_title("Concordance with KO Experiments (Cohen's Kappa)\nκ>0.4=Moderate, κ<0=Systematic Disagreement")
ax.set_xlabel('')
ax.set_ylabel('')
plt.tight_layout()
plt.savefig('../figures/concordance_comprehensive.png', dpi=300, bbox_inches='tight')
plt.show()
print("Saved: ../figures/concordance_comprehensive.png")
In [ ]:
# Figure 4: Discordance Analysis
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
# Concordance class counts
ax = axes[0, 0]
counts = tnseq_ko['concordance_class'].value_counts()
counts.plot(kind='barh', ax=ax, color=['green', 'lightgreen', 'orange', 'red'])
ax.set_xlabel('Number of Genes')
ax.set_title(f'Concordance Classification (n={len(tnseq_ko):,}, threshold=0.05)')
# Essentiality fraction by class
ax = axes[0, 1]
tnseq_ko.boxplot(column='essentiality_fraction', by='concordance_class', ax=ax)
ax.set_xlabel('')
ax.set_ylabel('Essentiality Fraction')
ax.set_title('Essentiality Fraction by Concordance Class')
plt.sca(ax)
plt.xticks(rotation=45, ha='right')
# Fitness by class
ax = axes[1, 0]
fitness_data = tnseq_ko[tnseq_ko['fitness_mean'].notna()]
fitness_data.boxplot(column='fitness_mean', by='concordance_class', ax=ax)
ax.set_xlabel('')
ax.set_ylabel('Fitness')
ax.set_title('Fitness by Concordance Class')
plt.sca(ax)
plt.xticks(rotation=45, ha='right')
# Scatter: Essentiality fraction vs Fitness
ax = axes[1, 1]
for cls, color in zip(['Both Essential', 'Both Dispensable', 'KO Essential, TnSeq Dispensable', 'KO Dispensable, TnSeq Essential'],
                      ['green', 'lightgreen', 'orange', 'red']):
    subset = tnseq_ko[tnseq_ko['concordance_class'] == cls]
    subset = subset[subset['fitness_mean'].notna()]
    ax.scatter(subset['essentiality_fraction'], subset['fitness_mean'],
               label=cls, alpha=0.6, s=20, color=color)
ax.set_xlabel('Essentiality Fraction')
ax.set_ylabel('Fitness')
ax.set_title('Essentiality Fraction vs Fitness')
ax.legend(fontsize=8, loc='best')
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('../figures/discordance_analysis.png', dpi=300, bbox_inches='tight')
plt.show()
print("Saved: ../figures/discordance_analysis.png")
Summary¶
This notebook performed comprehensive concordance analysis:
Key Findings:
- FBA: Moderate concordance (κ≈0.49, F1≈0.62-0.67), better in minimal media
- TnSeq: Systematic discordance (κ<0 across all thresholds)
- Fitness: Best TnSeq-derived continuous predictor (AUC=0.70-0.73)
- Proteomics: Strong correlation with essentiality (AUC=0.74, 6.5-fold expression difference)
- Essentiality fraction: Performs worse than random (AUC<0.5)
Actionable Recommendations:
- Use continuous fitness scores, not binary essentiality_fraction
- FBA is useful for first-pass screening but requires experimental validation
- TnSeq and KO measure different biology (fitness vs lethality)
All results saved to:
- ../data/concordance_summary.csv
- ../data/tnseq_threshold_comparison.csv
- ../data/roc_summary.csv
- ../data/proteomics_correlation.csv
- ../data/discordance_summary.csv
- ../data/discordant_*.csv
Figures saved to:
- ../figures/fba_comparison.png
- ../figures/roc_comprehensive.png
- ../figures/concordance_comprehensive.png
- ../figures/discordance_analysis.png