To set up the training environment, run:

```bash
cd LongRM
pip install -r requirements.txt
# Install flash attention: download a wheel matching your environment from
# https://github.com/Dao-AILab/flash-attention/releases
pip install <path_to_flash_attn_whl_file>
pip install ring_flash_attn
```
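To verify that the flash-attn wheel matches your local PyTorch/CUDA build, a quick import check (illustrative only):

```python
# Quick sanity check: this import fails if the installed wheel does not
# match your Python/CUDA/torch versions.
import flash_attn
print(flash_attn.__version__)
```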
To run the first training process:

```bash
bash scripts/sft.sh
```

To run the second training process:

```bash
bash scripts/simpo_grm.sh
```

To directly run the second training process:

```bash
bash scripts/simpo_disrm.sh
```
We provide the benchmark dataset and trained models on ModelScope.
```bash
# Download and evaluate the generative RM
modelscope download LCM_group/LongReward_Qwen3-8B --repo-type model --local_dir ./LongReward_Qwen3-8B
python evaluate/eval.py --model-path ./LongReward_Qwen3-8B --data-path ./LongReward-Bench

# Download and evaluate the discriminative RM
modelscope download LCM_group/LongReward_Skywork-Reward-V2-Llama-3.1-8B --repo-type model --local_dir ./LongReward_Skywork-Reward-V2-Llama-3.1-8B
python evaluate/eval.py --model-path ./LongReward_Skywork-Reward-V2-Llama-3.1-8B --data-path ./LongReward-Bench --is-disrm
```
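If you want to score individual (prompt, response) pairs with the discriminative RM outside of `evaluate/eval.py`, here is a minimal sketch, assuming the model keeps the usual Skywork-Reward sequence-classification interface (the message contents below are placeholders):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_dir = "./LongReward_Skywork-Reward-V2-Llama-3.1-8B"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSequenceClassification.from_pretrained(
    model_dir, torch_dtype=torch.bfloat16, device_map="auto"
)

# A single-label classification head returns one scalar per sequence;
# a higher score indicates a better response to the prompt.
messages = [
    {"role": "user", "content": "<long context and question>"},
    {"role": "assistant", "content": "<candidate response>"},
]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
with torch.no_grad():
    score = model(input_ids).logits[0][0].item()
print(score)
```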