@NP2241 commented Jan 22, 2026

Resolves #979

This PR adds a new GRPO example that demonstrates natural-language-to-SQL training with an external execution environment. It shows how to integrate Tunix GRPO with a structured task in which the reward is computed by executing the generated SQL against a small SQLite database and checking whether the result is correct.

The example is intentionally minimal and mirrors the structure of the existing GSM8K GRPO example, so that it can serve as a clear reference for users.
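
For reviewers who want a concrete picture of the execution-based reward described above, here is a minimal sketch. It is not the code in this PR and does not use any Tunix API: the schema, `build_demo_db`, `run_query`, and `execution_reward` are hypothetical names chosen for illustration, and the binary 1.0/0.0 scoring with order-insensitive row comparison is an assumption about what "result correctness" could mean.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database (hypothetical schema, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE employees (
            id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER
        );
        INSERT INTO employees (name, dept, salary) VALUES
            ('Ada', 'eng', 120), ('Bob', 'eng', 100), ('Cy', 'ops', 90);
        """
    )
    return conn


def run_query(conn: sqlite3.Connection, sql: str):
    """Return the query's rows in a canonical order, or None if execution fails."""
    try:
        rows = conn.execute(sql).fetchall()
    except (sqlite3.Error, sqlite3.Warning):
        # Invalid SQL, unknown tables/columns, multiple statements, etc.
        return None
    # Sort by repr so row order does not matter and NULLs do not break sorting.
    return sorted(rows, key=repr)


def execution_reward(conn: sqlite3.Connection, generated_sql: str, reference_sql: str) -> float:
    """Assumed scoring: 1.0 if both queries return the same rows, else 0.0."""
    actual = run_query(conn, generated_sql)
    if actual is None:
        return 0.0  # generated SQL did not execute
    expected = run_query(conn, reference_sql)
    return 1.0 if actual == expected else 0.0


if __name__ == "__main__":
    conn = build_demo_db()
    reference = "SELECT name FROM employees WHERE dept = 'eng' ORDER BY name"
    for candidate in (
        "SELECT name FROM employees WHERE dept = 'eng'",  # same rows
        "SELECT name FROM employees",                     # extra row
        "SELEC name FROM employees",                      # syntax error
    ):
        print(candidate, "->", execution_reward(conn, candidate, reference))
```

This sketch runs candidate SQL directly on the connection, so a real environment would likely also want to guard against destructive statements, for example by rebuilding the database per episode or opening it read-only.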

Reference

Colab Notebook
N/A — this PR adds an example recipe under examples/ and does not introduce a new public API.

Checklist

  • I have added all the necessary unit tests for my change.
    N/A — this PR adds an example and does not modify core library logic.
  • I have verified that my change does not break existing code and all unit tests pass.
  • I have added all appropriate doc-strings/documentation.
  • My PR is based on the latest changes of the main branch.
  • I have signed the Contributor License Agreement.
  • I have followed Contribution Guidelines.

Development

Successfully merging this pull request may close these issues.

Natural Language to SQL GRPO Example
