TeBaAb: Text-Based Antigen-Conditioned Antibody Redesign via Directed Evolution

An earlier version of this work was presented at NeurIPS 2025 workshops as a non-archival presentation.

Abstract

The design of antibodies with high affinity and specificity for target antigens is a cornerstone of therapeutic and diagnostic innovation. Traditional optimization strategies, such as phage or yeast display and directed evolution, remain resource-intensive and limited in their ability to integrate contextual information. Recent AI-driven approaches have accelerated protein engineering, but most rely exclusively on structured inputs, overlooking the potential of natural language as a flexible design interface. In this work, we introduce TeBaAb, a novel text-based antigen-conditioned framework for antibody redesign that combines generative modeling with iterative optimization inspired by directed evolution. TeBaAb integrates a Conditional Variational Autoencoder (CVAE) jointly conditioned on antigen sequences and textual descriptions of antibody properties, coupled with a two-stage binding affinity predictor and an iterative enrichment loop. To support this approach, we curated AbDes, a new dataset of 7,684 text–antibody–antigen pairs with accompanying structural and binding information.

In silico evaluations demonstrate that TeBaAb improves predicted binding affinity by an average of 10% compared to the original antibodies, while preserving structural confidence (RMSPE < 1.0 Å) and generating sequences that are diverse and novel. By enabling text-conditioned, antigen-specific antibody design, TeBaAb provides a promising new paradigm for accelerating therapeutic antibody discovery and expanding the antibody design space beyond traditional methods.

Installation

Requirements

  • Python 3.10 or higher
  • A virtual environment (e.g., venv or conda) is recommended
  • Dependencies listed in requirements.txt

Steps

  1. Clone the repository:

    git clone https://github.com/HySonLab/TeBaAb.git
    cd TeBaAb
  2. Set up a virtual environment and install dependencies:

    conda env create -f environment.yml 

    or

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    pip install -r requirements.txt
  3. Create the checkpoints directory:

    mkdir -p checkpoints

Usage

Datasets

  • AbDes: Contains antibody-antigen sequences paired with text descriptions.

Place the dataset files at the paths the scripts expect (for example, the preprocessing command below reads datasets/abdes/train.csv), or update the configuration files accordingly.

Configuration

Configuration is managed via YAML files in configs/:

  • training.yaml: Defines architectures for protein/text encoders, CVAE, and fitness predictor.
  • optimize.yaml: Configures directed evolution parameters (e.g., generations, mutation rate).

Preparing Data

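Precompute the embeddings used for CVAE training. The example below processes the training split and writes its output to datasets/cvae: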

cd datasets

python3 extract_embedding.py \
    --input_csv ./abdes/train.csv \
    --output_dir ./cvae \
    --output_prefix train \
    --modality all \
    --embedding_type pooler \
    --device cuda:0 \
    --esmc_cache /path/to/esmc_300m_2024_12_v0.pth

Training

  1. Train the CVAE:

    ./scripts/run_train_cvae.sh
    • Trains on datasets/cvae.
  2. Train the oracle (the binding affinity predictor):

    python3 scripts/train_predictor.py
    • Trains on datasets/affinity.

Antibody Design

Generate and optimize antibody sequences using directed evolution (parameters are read from configs/optimize.yaml):

python3 scripts/optimize.py
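
For readers new to directed evolution, the loop that the optimization step is built around can be sketched as follows. This is a minimal illustration, not the repository's implementation: random point mutation stands in for sampling variants from the CVAE, `toy_oracle` is a hypothetical placeholder for the trained affinity predictor, and the parameter names (`generations`, `population_size`, `top_k`, `mutation_rate`) are assumptions that need not match the keys in configs/optimize.yaml.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq, mutation_rate, rng):
    """Resample each residue uniformly at random with probability mutation_rate."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < mutation_rate else aa
        for aa in seq
    )


def toy_oracle(seq):
    """Hypothetical scoring function: fraction of aromatic residues (F, W, Y)."""
    return sum(aa in "FWY" for aa in seq) / len(seq)


def directed_evolution(seed, score_fn, generations=10, population_size=32,
                       top_k=4, mutation_rate=0.05, rng_seed=0):
    """Iteratively propose variants, score them, and keep the top_k candidates."""
    rng = random.Random(rng_seed)
    parents = [seed]
    for _ in range(generations):
        # Propose new variants from randomly chosen parents.
        children = [
            mutate(rng.choice(parents), mutation_rate, rng)
            for _ in range(population_size)
        ]
        # Keep the parents in the pool (elitism) so the best score never drops;
        # dict.fromkeys removes duplicates while preserving order.
        pool = list(dict.fromkeys(parents + children))
        pool.sort(key=score_fn, reverse=True)
        parents = pool[:top_k]
    return parents


# Usage: evolve a short example sequence (illustrative, not from AbDes).
seed_sequence = "QVQLVQSGAEVKKPGASVKVSCKAS"
best = directed_evolution(seed_sequence, toy_oracle, generations=5)
print(best[0], round(toy_oracle(best[0]), 3))
```

Because the parents stay in the candidate pool, the top score is non-decreasing across generations; in a real run the selection step would use the oracle's predicted binding affinity instead of the toy score.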

Evaluation

Training Metrics

  • CVAE: Reconstruction loss, KL divergence, validation loss.
  • Oracle: Mean Squared Error (MSE) for fitness prediction.
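
As a reference for how these quantities are commonly defined, the sketch below computes the KL divergence between a diagonal Gaussian posterior and a standard normal prior, a weighted CVAE objective, and the MSE. It assumes a standard Gaussian prior and an optional weight `beta` on the KL term; the exact reconstruction loss and weighting used in this repository are not stated here, so treat the function names and `beta` as illustrative.

```python
import math


def kl_diag_gaussian(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) for a single latent vector."""
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar)
    )


def cvae_loss(recon_loss, mu, logvar, beta=1.0):
    """Reconstruction loss plus a beta-weighted KL term (beta is an assumption)."""
    return recon_loss + beta * kl_diag_gaussian(mu, logvar)


def mse(pred, target):
    """Mean squared error between predicted and target values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


# Usage with made-up numbers.
print(kl_diag_gaussian([0.0, 0.0], [0.0, 0.0]))
print(cvae_loss(1.5, [1.0], [0.0], beta=2.0))
print(mse([1.0, 2.0], [1.0, 4.0]))
```

The KL term is zero exactly when the posterior matches the prior (zero mean, unit variance), which is why it acts as a regularizer on the latent space.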

Antibody Design Metrics

  • Binding affinity scores from the oracle.
  • Diversity, novelty.
  • 3D structure error, estimated with ABodyBuilder2.
  • Antibody developability, assessed with TAP (Therapeutic Antibody Profiler).
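
Diversity and novelty can be defined in several ways. The sketch below shows one common choice, assuming normalized Levenshtein distance as the sequence dissimilarity: diversity is the mean pairwise distance among generated sequences, and novelty is the mean distance from each generated sequence to its nearest neighbor in a reference set. The definitions used in this repository may differ, and all function names here are illustrative.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance, computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def normalized_distance(a, b):
    """Edit distance divided by the length of the longer sequence."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def diversity(seqs):
    """Mean pairwise normalized distance within a set of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    return sum(normalized_distance(a, b) for a, b in pairs) / len(pairs)


def novelty(seqs, reference):
    """Mean normalized distance from each sequence to its nearest reference."""
    return sum(
        min(normalized_distance(s, r) for r in reference) for s in seqs
    ) / len(seqs)


# Usage with short toy sequences.
generated = ["AAAT", "AATT"]
training = ["AAAA"]
print(diversity(generated), novelty(generated, training))
```

Both scores lie in [0, 1]; higher diversity means the generated set is less redundant, and higher novelty means it sits further from the reference sequences.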

Please cite our work

@inproceedings{
nguyen2025tebaab,
title={TeBaAb: Text-Based Antigen-Conditioned Antibody Redesign via Directed Evolution},
author={Cuong Manh Nguyen and Huy-Hoang Do-Huu and Viet Thanh Duy Nguyen and Truong-Son Hy},
booktitle={NeurIPS 2025 AI for Science Workshop},
year={2025},
url={https://openreview.net/forum?id=Imw5NGMgje}
}