
Add upcoming model deprecations #379

Merged
max-braintrust merged 8 commits into main from bra-3966-update-model-deprecation on Jan 30, 2026

Conversation

@max-braintrust (Contributor) commented on Jan 29, 2026

  • Adds an optional deprecation_date field to ModelSpec so deprecation changes can be made ahead of the date on which a model should actually be deprecated.
  • Adds deprecation dates for Haiku 3.5, Sonnet 3.7, gemini-2.0-flash, and gemini-2.0-flash-lite.
  • Updates the sync_models script to also pull in deprecation dates.
  • Updates the API to mark models as deprecated once their deprecation date has passed (a rough sketch of this check follows the list).
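
As a rough illustration of the check described above (the field and helper names here are assumptions based on this description, not the repo's actual identifiers):

```typescript
// Illustrative sketch only: shape and names inferred from the PR description.
interface ModelSpec {
  displayName: string;
  max_input_tokens?: number;
  max_output_tokens?: number;
  // Optional, so entries can land ahead of the actual cutoff date.
  deprecation_date?: string; // any Date-parseable string, e.g. "2026-02-19"
}

// API-side check: a model counts as deprecated once its date has passed.
function isDeprecated(spec: ModelSpec, now: Date = new Date()): boolean {
  if (!spec.deprecation_date) return false;
  return new Date(spec.deprecation_date).getTime() <= now.getTime();
}
```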

@vercel (bot) commented on Jan 29, 2026

The latest updates on your projects.

| Project  | Deployment | Actions          | Updated (UTC)        |
| -------- | ---------- | ---------------- | -------------------- |
| ai-proxy | Ready      | Preview, Comment | Jan 30, 2026 10:44pm |

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 383221f841

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

"max_input_tokens": 200000,
"max_output_tokens": 128000
"max_output_tokens": 128000,
"deprecationDate": "19 Feb 2026 00:00:01 GMT"
Contributor commented:

q: thoughts on using YYYY-MM-DD instead, like the litellm list has? We have this sync_models.ts script, which I think is currently broken, but then we could use it to pull dates from there.

Alternatively, we could update the script to convert that date format into what you have here, so either way could work.

Contributor (author) commented:

Sure, any format that the Date API can parse should work, so YYYY-MM-DD should work as is. I will update the config to keep it consistent.
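
For illustration (this snippet is not from the PR), both formats discussed here parse with the built-in Date constructor; note that only the ISO form is guaranteed by the ECMAScript spec, while other strings parse in an implementation-defined (though widely supported) way:

```typescript
// Both date strings from this thread, parsed with the built-in Date API.
const fromRfc = new Date("19 Feb 2026 00:00:01 GMT"); // RFC-2822-style, implementation-defined
const fromIso = new Date("2026-02-19");               // YYYY-MM-DD, spec-guaranteed (UTC midnight)

console.log(fromRfc.toISOString()); // 2026-02-19T00:00:01.000Z
console.log(fromIso.toISOString()); // 2026-02-19T00:00:00.000Z
```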

Contributor (author) commented:

ok, updated the dates. Let me look at that script; if it's an easy fix, I can import those other deprecation dates as well.

"displayName": "GPT-5 mini",
"reasoning": true,
"max_input_tokens": 400000,
"max_input_tokens": 272000,
Contributor commented:

maybe @ibolmo you can confirm this is ok? The sync_models script (which pulls from litellm) made this change, but there seems to be a discrepancy with the OpenAI documentation: the context window used to correspond to max_input_tokens, but for the gpt-5 models it is the combined input and output. So while the docs say a 400,000-token context window, you infer that max_input_tokens is 272,000 because max_output_tokens is 128,000. Not sure if you've seen issues with the current config we have.

[Screenshot: OpenAI documentation listing the gpt-5 context window and max output tokens]

https://community.openai.com/t/huge-gpt-5-documentation-gap-flaw-causing-bugs-input-tokens-exceed-the-configured-limit-of-272-000-tokens/1344734
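
A hedged sketch of the arithmetic at issue (variable names are illustrative, not from the repo): if the advertised context window is the combined input-plus-output budget, the usable input limit is what remains after reserving the output budget.

```typescript
// Numbers from the discussion above; the variable names are hypothetical.
const contextWindow = 400_000;   // what the OpenAI docs advertise for gpt-5
const maxOutputTokens = 128_000; // budget reserved for the model's output

// Under the combined-budget reading, the usable input limit is the remainder:
const maxInputTokens = contextWindow - maxOutputTokens;

console.log(maxInputTokens); // 272000, matching the litellm value
```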

Contributor commented:

@max-braintrust if you want to revert just these max_input_tokens changes so you can land this change, go ahead; we can update again once we get confirmation that it won't affect current functionality.

Collaborator commented:

fwiw, I would agree that 272,000 is likely true. We've had to split Fireworks' max tokens in half on purpose for this reason (they don't specify input vs. output max tokens).

Contributor commented:

ok, that's good to know. It would be nice if we could use the script and litellm as a source of truth.

@ibolmo (Collaborator) commented on Jan 31, 2026:

agreed. The tough part is that litellm is at times slower than us in updating their registry, so we have to update this manually. I could see a cron/routine job, though, acting as a watchdog/helper, so we can fix our manual updates and/or capture models we didn't think to add.

Collaborator commented:

seems like the sync script is getting mature enough to try to automate this

@max-braintrust merged commit a7e7043 into main on Jan 30, 2026
5 checks passed
"format": "openai",
"flavor": "chat"
"flavor": "chat",
"input_cost_per_mil_tokens": 0,
Collaborator commented:

should we keep these? seems odd that the cost would be 0.00
