vllm test cleanup #415

@planetf1

Description

Continues work from #397 and #326.

Problems:

  • vLLM tests fail outright when no GPU is available instead of skipping gracefully (see the first conftest sketch after this list)
  • Running multiple vLLM tests sequentially causes CUDA out-of-memory errors
  • Tests fail with cryptic errors when Ollama is not running instead of skipping
  • There are no CLI options to selectively skip backend tests during development (see the second sketch below)
  • pytest hooks throw deprecation warnings and duplicate option registration errors
  • Examples in docs/examples/ fail when the required backends are unavailable
  • Token limit errors in vLLM structured output tests cause intermittent failures
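
A minimal sketch of how the skip conditions and the OOM cleanup could look, assuming torch and requests are already test dependencies and Ollama listens on its default local port; the helper and fixture names are illustrative, not the ones used in this PR:

```python
# conftest.py (sketch) -- skip backend tests gracefully instead of failing

import gc

import pytest
import requests
import torch

# Skip vLLM tests when no CUDA device is present, rather than erroring out.
requires_gpu = pytest.mark.skipif(
    not torch.cuda.is_available(),
    reason="vLLM tests require a CUDA-capable GPU",
)


def ollama_running(url: str = "http://localhost:11434") -> bool:
    """Return True if a local Ollama server answers; used to skip, not fail."""
    try:
        return requests.get(url, timeout=2).status_code == 200
    except requests.RequestException:
        return False


# Skip Ollama-backed tests when the server is not reachable.
requires_ollama = pytest.mark.skipif(
    not ollama_running(),
    reason="Ollama server is not reachable on localhost:11434",
)


@pytest.fixture
def free_gpu_memory():
    """Release cached CUDA memory after each vLLM test to avoid sequential OOM."""
    yield
    if torch.cuda.is_available():
        gc.collect()
        torch.cuda.empty_cache()
```

Tests would then opt in with `@requires_gpu` / `@requires_ollama` and request the `free_gpu_memory` fixture where back-to-back vLLM runs exhaust GPU memory.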

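For the missing CLI options and the duplicate option registration errors, one possible shape for the pytest hooks is sketched below; the `--skip-vllm` / `--skip-ollama` flag names and the `vllm` / `ollama` markers are assumptions for illustration:

```python
# conftest.py (sketch) -- opt-in flags to skip backend suites during development

import pytest


def pytest_addoption(parser):
    # Guard against double registration when more than one conftest.py or
    # plugin tries to add the same flag; pytest raises ValueError in that case.
    for flag, help_text in [
        ("--skip-vllm", "skip tests that need a vLLM backend"),
        ("--skip-ollama", "skip tests that need a running Ollama server"),
    ]:
        try:
            parser.addoption(flag, action="store_true", default=False, help=help_text)
        except ValueError:
            pass  # flag already registered elsewhere


def pytest_collection_modifyitems(config, items):
    # Turn the flags into skip markers instead of letting backend tests fail.
    skips = {
        "vllm": pytest.mark.skip(reason="--skip-vllm given"),
        "ollama": pytest.mark.skip(reason="--skip-ollama given"),
    }
    for name, marker in skips.items():
        if not config.getoption(f"--skip-{name}"):
            continue
        for item in items:
            if name in item.keywords:
                item.add_marker(marker)
```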