Summary
We're considering removing the `allow_training_without_logprobs` option from ART. This RFC gathers community feedback before we make that change.
Background
The `allow_training_without_logprobs` option allows training to proceed without generation logprobs from the model. However, this approach has several drawbacks:
- Importance sampling requires logprobs for stable training: in our experiments and across the wider RL community, generation-time logprobs are essential for importance sampling, which is critical for stable training. Training without them leads to less reliable results (see the sketch after this list).
- Code complexity: maintaining this alternative path adds complexity to the codebase and makes the training flow harder to reason about.
- Subtle bugs: the additional code path creates opportunities for subtle bugs. For example, in PR #527 we discovered tool-call tokenization issues that were partially enabled by this mode's complexity.
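
To make the importance sampling point concrete, here is a minimal sketch of a PPO-style clipped objective that depends on generation-time logprobs. It is illustrative only, not ART's actual training code; every name in it (`importance_weighted_loss`, `gen_logprobs`, `clip_eps`, and so on) is hypothetical:

```python
import torch

def importance_weighted_loss(
    new_logprobs: torch.Tensor,  # logprobs of the sampled tokens under the current policy
    gen_logprobs: torch.Tensor,  # logprobs recorded at generation time (what this option skips)
    advantages: torch.Tensor,    # per-token advantage estimates
    clip_eps: float = 0.2,
) -> torch.Tensor:
    # Importance sampling ratio between the current policy and the
    # policy that actually generated the tokens.
    ratio = torch.exp(new_logprobs - gen_logprobs)

    # PPO-style clipping keeps updates near the generating policy. This
    # correction is only meaningful when gen_logprobs are the real
    # sampling-time values; a proxy silently degrades it.
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()
```

Without recorded `gen_logprobs`, the ratio has no trustworthy denominator, so the clipping step has nothing reliable to anchor to; this is the instability described above.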
Proposal
Remove the `allow_training_without_logprobs` option entirely, simplifying the codebase and ensuring all users benefit from the more robust training path that uses logprobs.
Request for Feedback
Is anyone in the community actively using `allow_training_without_logprobs` with good results?
If you're using this option and it's working well for your use case, please let us know:
- What is your use case?
- Why do you need to train without logprobs?
- What results are you seeing?
If we don't hear from users who depend on this feature, we plan to remove it in an upcoming release.
Related: #527