The pipeline identifies rewired proteins — proteins that undergo significant changes in their interaction landscape without proportional changes in gene expression — across eight independent RNA-seq studies comparing Fusarium response to fungal and bacterial biological control agents (BCAs).
The Differential Interactome methodology (Q-value framework) used to quantify protein-protein interaction rewiring is adapted from:
Gulfidan, G., Turanli, B., Beklen, H. et al. (2020). "Pan-cancer mapping of differential protein-protein interactions." Scientific Reports 10, 3272. https://doi.org/10.1038/s41598-020-60127-x
Protein-protein interaction data: STRING v12.0 (combined score ≥ 400, taxid 229533 for F. graminearum).
scripts/
├── 00_setup.R # Shared configuration (sourced by all scripts)
├── 01_orthology_mapping.R # Cross-species gene ID mapping via KEGG KO
├── 02_differential_interactome.R # Core Q-value differential PPI algorithm
├── 03_rewired_analysis.R # Rewired protein identification and classification
├── 04_deg_volcano.R # Volcano plots with rewired protein overlay
├── 05_overlap_analysis.R # Venn/UpSet cross-study overlap analysis
├── 06_enrichment.R # KEGG + GO hypergeometric enrichment
└── 07_network_module.R # Network construction, Louvain modules, hub genes
`00_setup.R` is sourced at the top of every pipeline script. It defines all file paths, analysis thresholds, study metadata (8 studies × organism/BCA/condition), shared color palettes, and utility functions (`save_fig()`, `ensure_dir()`).
It requires no inputs and sets `BASE_DIR` relative to the script location automatically.
Step 01: `01_orthology_mapping.R`
- Input: KEGG KO assignment files (`fgr_ko.txt`, `fox_ko.txt`, `fpu_ko.txt`)
- Process: Maps F. oxysporum (FOXG) and F. pseudograminearum (FPSE) gene identifiers to F. graminearum (FGSG) IDs via shared KEGG Orthology groups. One-to-many mappings are resolved deterministically by selecting the lowest FGSG number.
- Output: `resources/fox_to_fgr_orthology.csv`, `resources/fpu_to_fgr_orthology.csv`
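
The mapping rule above can be sketched in base R. This is a minimal illustration, not the script itself: the column names (`gene`, `ko`), the toy identifiers, and the helper `map_to_fgr()` are all hypothetical, and the real KO files may be laid out differently.

```r
# Toy KO assignments; column names and IDs are illustrative only.
fgr_ko <- data.frame(
  gene = c("FGSG_00010", "FGSG_00002", "FGSG_00500"),
  ko   = c("K00001", "K00001", "K00002"),
  stringsAsFactors = FALSE
)
fox_ko <- data.frame(
  gene = c("FOXG_01234", "FOXG_05678", "FOXG_09999"),
  ko   = c("K00001", "K00002", "K00003"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: join on KO, then keep the lowest-numbered FGSG ID
# for each source gene so that one-to-many mappings resolve deterministically.
map_to_fgr <- function(src_ko, fgr_ko) {
  merged <- merge(src_ko, fgr_ko, by = "ko", suffixes = c("_src", "_fgr"))
  # Numeric part of the FGSG identifier, used as the tie-break key.
  merged$fgr_num <- as.integer(sub("^FGSG_", "", merged$gene_fgr))
  merged <- merged[order(merged$gene_src, merged$fgr_num), ]
  # After sorting, the first row per source gene has the lowest FGSG number.
  best <- merged[!duplicated(merged$gene_src), c("gene_src", "gene_fgr", "ko")]
  rownames(best) <- NULL
  best
}

fox_to_fgr <- map_to_fgr(fox_ko, fgr_ko)
print(fox_to_fgr)
```

In this toy input, `FOXG_01234` shares a KO with two FGSG genes and is assigned the lower-numbered one, while `FOXG_09999` has no shared KO and is dropped.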
Step 02: `02_differential_interactome.R`
- Input: featureCounts expression matrices (`counts/1.txt` … `counts/8.txt`), STRING PPI network, orthology mappings from Step 01
- Process: For each STRING edge in each study:
  - Binarizes gene expression per sample (1 if expression > mean − 1 SD, else 0)
  - Measures the co-expression frequency of the two interactors (the fraction of samples in which both are binarized to 1) separately in the Control (NNC) and Treatment (NND) conditions
  - Computes Q-value = NND / (NNC + NND)
  - Classifies edges: Activated (Q ≥ 0.90), Repressed (Q ≤ 0.10)
  - Applies a minimum frequency filter: max(NNC, NND) ≥ Δ, with Δ = 0.20
- Output: Per-study `protein_stats.csv` and `interaction_stats.csv` under `outputs/differential_interactome/Study_N/`
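
The per-edge scoring above can be sketched as follows. This is a simplified illustration under stated assumptions: NNC and NND are taken as fractions of samples, the mean and SD used for binarization are computed per gene over whatever samples are passed in, and the names `binarize()`, `coexpr_freq()`, `score_edge()`, and the `"Unchanged"` label are hypothetical rather than taken from the pipeline.

```r
# Binarize one gene's expression vector: 1 if above (mean - 1 SD), else 0.
# Assumption: mean and SD are computed per gene across the supplied samples.
binarize <- function(x) as.integer(x > mean(x) - sd(x))

# Fraction of samples in which both interactors are "on".
coexpr_freq <- function(a_bin, b_bin) mean(a_bin == 1 & b_bin == 1)

# Hypothetical helper: score one edge from binarized profiles and group labels.
score_edge <- function(a_bin, b_bin, is_treatment,
                       q_act = 0.90, q_rep = 0.10, delta = 0.20) {
  nnc <- coexpr_freq(a_bin[!is_treatment], b_bin[!is_treatment])
  nnd <- coexpr_freq(a_bin[is_treatment], b_bin[is_treatment])
  # Q is undefined when the pair is never co-expressed in either condition.
  q <- if (nnc + nnd > 0) nnd / (nnc + nnd) else NA_real_
  passes <- max(nnc, nnd) >= delta
  status <- if (is.na(q) || !passes) {
    "Unchanged"   # illustrative label for edges that fail the filters
  } else if (q >= q_act) {
    "Activated"
  } else if (q <= q_rep) {
    "Repressed"
  } else {
    "Unchanged"
  }
  list(NNC = nnc, NND = nnd, Q = q, status = status)
}

# Binarization on a toy profile: only the low outlier falls below the cut-off.
toy_bin <- binarize(c(10, 10, 10, 10, 1))

# Toy design: 3 control samples followed by 3 treatment samples.
is_treatment <- c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE)
edge1 <- score_edge(c(0, 0, 0, 1, 1, 1), c(1, 0, 1, 1, 1, 1), is_treatment)
edge2 <- score_edge(c(1, 1, 1, 0, 0, 1), c(1, 1, 1, 1, 1, 0), is_treatment)
print(c(edge1$status, edge2$status))
```

Here `edge1` is co-expressed only in treatment samples (Q = 1) and `edge2` only in control samples (Q = 0), so they fall on opposite sides of the two thresholds.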
Step 03: `03_rewired_analysis.R`
- Input: `protein_stats.csv` from Step 02, DEG lists (`degs/`)
- Process: A protein is classified as rewired if it meets both criteria:
  - Differential degree (Ntotal) in the top 25% of proteins within the study
  - Not a DEG: |log₂FC| ≤ 1.0 or Padj ≥ 0.05
- Output: `rewired_proteins.csv` per study under `outputs/rewired_analysis/Study_N/`; aggregated `outputs/tables/rewired_summary.csv`
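
A minimal sketch of the two-criterion rule, using a toy table. The column names, the helper `classify_rewired()`, and the use of `quantile()` to locate the top 25% are assumptions made for illustration; the script may rank proteins differently or handle missing adjusted p-values in its own way.

```r
# Toy per-protein table; column names are illustrative only.
protein_stats <- data.frame(
  protein = paste0("P", 1:8),
  Ntotal  = c(40, 35, 12, 10, 8, 5, 3, 1),
  log2FC  = c(0.2, 2.5, -0.4, 1.8, 0.1, -3.0, 0.0, 0.9),
  padj    = c(0.60, 0.001, 0.03, 0.20, 0.90, 0.0001, 1.0, 0.04),
  stringsAsFactors = FALSE
)

# Hypothetical helper implementing: high differential degree AND not a DEG.
classify_rewired <- function(stats, top_frac = 0.25,
                             lfc_cut = 1.0, padj_cut = 0.05) {
  # Top 25% of differential degree = at or above the 75th percentile.
  degree_cut <- quantile(stats$Ntotal, probs = 1 - top_frac, names = FALSE)
  high_degree <- stats$Ntotal >= degree_cut
  # A DEG must pass both the fold-change and the adjusted p-value cut-offs.
  is_deg <- abs(stats$log2FC) > lfc_cut & stats$padj < padj_cut
  stats$rewired <- high_degree & !is_deg
  stats
}

result <- classify_rewired(protein_stats)
print(result[result$rewired, ])
```

In this toy table both `P1` and `P2` have high differential degree, but `P2` is also a DEG, so only `P1` is flagged as rewired.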
Step 04: `04_deg_volcano.R`
- Input: DEG lists, rewired protein lists from Step 03
- Process: Generates volcano plots (log₂FC vs −log₁₀ Padj) for each of the 8 studies. Rewired proteins are overlaid as highlighted points; top-ranked rewired proteins are labeled.
- Output: `outputs/figures/volcano_*.png` / `.svg`; panel figure `Fig1_volcano_panel_all_studies`
Step 05: `05_overlap_analysis.R`
- Input: Rewired protein lists from Step 03, DEG lists
- Process: Computes pairwise and multi-way overlaps. Generates separate four-way Venn diagrams for the fungal-BCA and bacterial-BCA studies, and an UpSet plot for all 8 studies combined.
- Output: `outputs/figures/FigS2_venn_fungal_rewired`, `FigS3_venn_bacterial_rewired`, `Fig3_upset_rewired_proteins`
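
The overlap computations behind these figures can be sketched in base R. The study names and protein IDs below are toy values, and the object names are hypothetical; the plotting itself is left out.

```r
# Toy rewired-protein sets for three studies (illustrative IDs only).
rewired_sets <- list(
  Study_1 = c("FGSG_00001", "FGSG_00002", "FGSG_00003"),
  Study_2 = c("FGSG_00002", "FGSG_00003", "FGSG_00004"),
  Study_3 = c("FGSG_00003", "FGSG_00005")
)

# Pairwise overlap counts: entry [i, j] is the size of the intersection
# of set i and set j; the diagonal holds the set sizes.
pairwise_overlap <- sapply(rewired_sets, function(a) {
  sapply(rewired_sets, function(b) length(intersect(a, b)))
})

# Binary membership matrix (proteins x studies), the kind of 0/1 table
# that multi-way overlap plots summarise.
all_ids <- sort(unique(unlist(rewired_sets)))
membership <- sapply(rewired_sets, function(s) as.integer(all_ids %in% s))
rownames(membership) <- all_ids

# Proteins rewired in every study.
shared_all <- Reduce(intersect, rewired_sets)

print(pairwise_overlap)
print(shared_all)
```

Row sums of `membership` give the number of studies in which each protein is rewired, which is a convenient way to rank recurrent candidates.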
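Step 06: `06_enrichment.R`
- Input: Rewired protein lists from Step 03; KEGG KO file (`fgr_ko.txt`); UniProt GO annotations (`fgr_uniprot_go.tsv`); STRING alias table for ID mapping
- Process: Hypergeometric enrichment test for KEGG pathways and GO terms (MF/CC/BP). The P-value is computed as P(X ≥ k) = `phyper(k - 1, m, n, N, lower.tail = FALSE)`. Read against R's `phyper(q, m, n, k)` argument order, k is the number of rewired proteins annotated to the term, m and n are the numbers of background proteins with and without the annotation, and N is the number of rewired proteins tested. FDR correction by Benjamini–Hochberg. Pathway names are fetched live from the KEGG REST API (rest.kegg.jp), so this step needs network access.
- Output: Enrichment tables and dot plots under `outputs/enrichment/` and `outputs/figures/Fig6_kegg_*`, `Fig7_go_*`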
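
The test above can be sketched with base R alone. The helper `enrich_terms()`, the toy gene sets, and the choice of background are hypothetical; only `phyper()` and `p.adjust()` are standard R functions, and the symbol meanings follow `phyper`'s own argument order.

```r
# Hypothetical helper: one-sided hypergeometric test per term, then BH.
enrich_terms <- function(hits, background, term2gene) {
  hits <- intersect(hits, background)
  N <- length(hits)                         # proteins drawn (tested list)
  res <- lapply(names(term2gene), function(term) {
    members <- intersect(term2gene[[term]], background)
    m <- length(members)                    # background proteins in the term
    n <- length(background) - m             # background proteins outside it
    k <- length(intersect(hits, members))   # tested proteins in the term
    data.frame(term = term, k = k, m = m, N = N,
               # P(X >= k): upper tail starting at k, hence k - 1.
               p = phyper(k - 1, m, n, N, lower.tail = FALSE),
               stringsAsFactors = FALSE)
  })
  res <- do.call(rbind, res)
  res$padj <- p.adjust(res$p, method = "BH")
  res[order(res$p), ]
}

# Toy universe of 20 genes with two annotated terms.
background <- paste0("g", 1:20)
term2gene <- list(
  pathA = paste0("g", 1:5),
  pathB = paste0("g", 6:15)
)
hits <- paste0("g", c(1, 2, 3, 4, 16))

enrichment <- enrich_terms(hits, background, term2gene)
print(enrichment)
```

Passing `k - 1` matters: `phyper(k, ..., lower.tail = FALSE)` would return P(X > k) and understate the p-value for the observed overlap.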
Step 07: `07_network_module.R`
- Input: Rewired protein lists from Step 03, STRING PPI network
- Process:
  - Builds aggregated PPI sub-networks for fungal and bacterial rewired proteins
  - Applies Louvain community detection (`igraph::cluster_louvain()`, seed = 42, default resolution = 1)
  - Annotates modules with functional labels and hub genes (highest intra-module degree)
  - Saves node/edge/hub CSVs for downstream network visualization
- Output: Network CSVs under `outputs/networks/`; module table `outputs/tables/module_summary.csv`; figures `Fig4a/b`, `Fig5_module_traits_heatmap`
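
Hub selection by intra-module degree can be sketched without any graph library. In this illustration the module assignment is written by hand as a stand-in for community detection output, and the edge list, node names, and tie-break rule (first node wins) are assumptions rather than the script's behavior.

```r
# Toy undirected edge list (illustrative node names).
edges <- data.frame(
  from = c("A", "A", "A", "B", "A", "D", "D", "C"),
  to   = c("B", "C", "G", "C", "D", "E", "F", "D"),
  stringsAsFactors = FALSE
)

# Stand-in for community detection output: node -> module ID.
module <- c(A = 1, B = 1, C = 1, G = 1, D = 2, E = 2, F = 2)

# Keep only edges whose two endpoints belong to the same module.
same_module <- unname(module[edges$from] == module[edges$to])
intra <- edges[same_module, ]

# Intra-module degree: number of intra-module edges touching each node.
deg <- table(factor(c(intra$from, intra$to), levels = names(module)))
node_tab <- data.frame(
  node = names(module),
  module = unname(module),
  intra_degree = as.integer(deg),
  stringsAsFactors = FALSE
)

# One hub per module: highest intra-module degree (first node wins on ties).
node_tab <- node_tab[order(node_tab$module, -node_tab$intra_degree), ]
hubs <- node_tab[!duplicated(node_tab$module), ]
print(hubs)
```

Edges that cross modules (here `A`-`D` and `C`-`D`) are excluded from the count, so a node that mainly bridges modules is not mistaken for a module hub.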
Scripts are designed to be run sequentially. Each script sources 00_setup.R automatically.
# Run full pipeline in order
Rscript 01_orthology_mapping.R
Rscript 02_differential_interactome.R
Rscript 03_rewired_analysis.R
Rscript 04_deg_volcano.R # optional
Rscript 05_overlap_analysis.R # optional
Rscript 06_enrichment.R
Rscript 07_network_module.R
Note: Scripts 04 and 05 are optional visualization scripts and do not affect downstream analysis.
project_root/
├── counts/
│ └── 1.txt … 8.txt # featureCounts output (tab-separated, with header)
├── degs/
│ └── {study_id}_{Up,Down}regulated_Genes.xlsx
├── metadata/
│ ├── study_metadata.csv # Sample-level metadata (study_id, condition, replicate)
│ └── study_table.csv # Study-level metadata (organism, BCA, type)
├── string_data/
│ ├── 229533.protein.links.v12.0.txt # F. graminearum STRING PPI
│ ├── 229533.protein.aliases.v12.0.txt # F. graminearum protein aliases
│ ├── 426428.protein.aliases.v12.0.txt # F. oxysporum protein aliases
│ ├── fgr_ko.txt # F. graminearum KEGG KO assignments
│ ├── fox_ko.txt # F. oxysporum KEGG KO assignments
│ ├── fpu_ko.txt # F. pseudograminearum KEGG KO assignments
│ └── fgr_uniprot_go.tsv # F. graminearum UniProt GO annotations
├── resources/ # Auto-generated by 01_orthology_mapping.R
└── scripts/ # This pipeline
install.packages(c(
"dplyr", "readr", "readxl", "openxlsx", "tidyr", "tibble",
"ggplot2", "ggrepel", "ggraph", "igraph",
"pheatmap", "VennDiagram", "UpSetR", "RColorBrewer",
"gridExtra", "svglite", "httr", "stringr", "futile.logger"
))
Optional: RCy3 for Cytoscape integration.
| Parameter | Value | Source |
|---|---|---|
| Q-value — repression | ≤ 0.10 | Gulfidan et al. (2020) |
| Q-value — activation | ≥ 0.90 | Gulfidan et al. (2020) |
| Δ (minimum co-expression frequency) | ≥ 0.20 | Gulfidan et al. (2020) |
| DEG absolute log₂FC threshold | > 1.0 | Standard |
| DEG adjusted p-value | < 0.05 | Standard |
| Rewired protein percentile | Top 25% of differential degree | This study |
| STRING PPI score | ≥ 400 (medium confidence) | STRING v12.0 |
All outputs are written to outputs/ relative to the project root:
outputs/
├── differential_interactome/Study_N/ # Q-value tables per study
├── differential_expression/ # DEG summaries
├── rewired_analysis/Study_N/ # Rewired protein lists per study
├── enrichment/ # KEGG + GO enrichment results
├── networks/ # Node/edge/hub CSVs for network figures
├── modules/ # Louvain module assignments
├── figures/ # All generated figures (PNG + SVG)
└── tables/ # Summary tables