- [2026.03.13] 🔥 We have released RealGen-V2, introducing a dynamic GAN-inspired online RL paradigm! Check out the [RealGen-V2 Documentation].
- [2025.12.02] 🔥 We have released RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards. Check out the [Paper].
- ✨ What did we do? We propose RealGen, a text-to-image generator capable of producing highly convincing photorealistic images. It applies detector-reward-guided GRPO post-training so that generated images evade AI-image detectors, which reduces artifacts and improves realism and detail.
- 📐 How to evaluate performance? We introduce RealBench, a new benchmark for evaluating photorealism that achieves human-free automated scoring through Detector-Scoring and Arena-Scoring.
- 🔧 How effective was it? RealGen significantly outperforms both general image models (like GPT-Image-1, Qwen-Image) and specialized realistic models (like FLUX-Krea) in realism, details, and aesthetics on the T2I task.
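
The detector-guided reward idea described above can be illustrated with a minimal sketch. It assumes a hypothetical detector that returns `p_fake`, the probability that an image is AI-generated, and it uses the standard GRPO group-relative normalization (reward minus group mean, divided by group standard deviation). The function names, the `1 - p_fake` reward, and the sample scores are illustrative assumptions, not the actual RealGen implementation.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def detector_reward(p_fake: float) -> float:
    """Map a (hypothetical) detector's fake-probability to a reward.

    Assumption: the reward is simply 1 - p_fake, so images the detector
    considers real score higher. The real reward may be shaped differently.
    """
    if not 0.0 <= p_fake <= 1.0:
        raise ValueError("p_fake must be a probability in [0, 1]")
    return 1.0 - p_fake


def group_relative_advantages(rewards: Sequence[float], eps: float = 1e-6) -> List[float]:
    """Standard GRPO-style advantages for one prompt's group of samples.

    Each reward is centered on the group mean and scaled by the group
    standard deviation; eps avoids division by zero for uniform groups.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Illustrative detector outputs for four images sampled from the same prompt.
p_fake_scores = [0.9, 0.5, 0.1, 0.5]
rewards = [detector_reward(p) for p in p_fake_scores]
advantages = group_relative_advantages(rewards)

for p, a in zip(p_fake_scores, advantages):
    print(f"p_fake={p:.1f} -> advantage={a:+.3f}")
```

In this sketch, the sample the detector is most confident is fake receives a negative advantage and the sample it considers most likely real receives a positive one, so a policy-gradient update would shift probability toward images that are harder to detect.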
We are pleased to find that the strategy of utilizing AIGC detectors as reward signals has been independently explored by other excellent concurrent works. We acknowledge and recommend checking out:
- LongCat-Image: They innovatively incorporate an AIGC detection model as a reward during the RL phase, utilizing adversarial signals to guide the model toward generating images with the texture and fidelity of the real physical world.
- Z-Image: In their RLHF pipeline, they design a comprehensive reward model where AI-Content Detection perception serves as a critical dimension, alongside instruction-following capability and aesthetic quality.
It is exciting to see the community converging on this effective paradigm to bridge the gap between generated and real distributions.
[Figure: two image galleries comparing samples from Z-Image (Baseline) with samples after RealGen-V2 (Ours); images not shown here.]
- Note that our proposed detection-for-generation framework is compatible with all diffusion-model-based GRPO paradigms, such as Dance GRPO and Flow GRPO.
For the original V1 implementation, please refer to:
👉 RealGen-V1 Documentation
For the latest V2 implementation, please refer to:
👉 RealGen-V2 Documentation
Inference and evaluation are implemented in /RealGen/eval.
This repo is based on Flow GRPO. We thank the authors for their valuable contributions to the AIGC community.
@article{ye2025realgen,
title={RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards},
author={Ye, Junyan and Zhu, Leqi and Guo, Yuncheng and Jiang, Dongzhi and Huang, Zilong and Zhang, Yifan and Yan, Zhiyuan and Fu, Haohuan and He, Conghui and Li, Weijia},
journal={arXiv preprint arXiv:2512.00473},
year={2025}
}