feat(python-sdk)!: async-first evaluators with evaluate_sync wrapper #63

Merged
czi-fsisenda merged 2 commits into main from fsisenda/python_async on May 15, 2026

@czi-fsisenda

Summary:

async-first evaluators with evaluate_sync wrapper

Test Plan:

  • Wrote automated tests
    • Unit tests
  • Manually tested my changes, and here are the details:

- Make evaluate, evaluate_impl, execute_step, and execute_prompt_chain_step async; add evaluate_sync via asyncio.run for sync callers.
- Use ainvoke for LangChain prompt chains.
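
A minimal sketch of the async-first pattern with a sync wrapper described above. `SketchEvaluator` and `WordCountEvaluator` are hypothetical stand-ins, not the SDK's actual classes, and the `asyncio.sleep(0)` call only marks where an awaited prompt step (such as a LangChain `ainvoke` call) would go.

```python
import asyncio
from typing import Generic, TypeVar

InputT = TypeVar("InputT")
OutputT = TypeVar("OutputT")


class SketchEvaluator(Generic[InputT, OutputT]):
    """Hypothetical stand-in for a base evaluator; not the SDK's real class."""

    async def evaluate_impl(self, input: InputT) -> OutputT:
        raise NotImplementedError

    async def evaluate(self, input: InputT) -> OutputT:
        # Async-first entry point: callers already inside an event loop await this.
        return await self.evaluate_impl(input)

    def evaluate_sync(self, input: InputT) -> OutputT:
        # Sync wrapper: asyncio.run starts a fresh event loop, so it raises
        # RuntimeError when called from code that is already running in a loop.
        return asyncio.run(self.evaluate(input))


class WordCountEvaluator(SketchEvaluator[str, int]):
    """Hypothetical evaluator used only to exercise the pattern."""

    async def evaluate_impl(self, input: str) -> int:
        await asyncio.sleep(0)  # placeholder for an awaited prompt step
        return len(input.split())


# Synchronous callers use the wrapper; async callers would `await evaluate(...)`.
sync_result = WordCountEvaluator().evaluate_sync("async first evaluators")
```

Because the wrapper relies on `asyncio.run`, it is suited to scripts and plain synchronous code; code that already runs inside an event loop should await `evaluate` directly.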

Copilot AI left a comment

Pull request overview

This PR makes the Python SDK evaluator execution async-first by converting BaseEvaluator.evaluate, prompt-chain execution, and evaluator implementations to async, while adding evaluate_sync for synchronous callers.

Changes:

  • Converted base evaluator flow and prompt-chain execution to async/await.
  • Updated built-in vocabulary and conventionality evaluators to await prompt steps.
  • Updated tests and README examples to use evaluate_sync.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| sdks/python/src/learning_commons_evaluators/evaluators/base.py | Introduces async evaluation and sync wrapper. |
| sdks/python/src/learning_commons_evaluators/evaluators/conventionality.py | Converts conventionality implementation to async prompt execution. |
| sdks/python/src/learning_commons_evaluators/evaluators/vocabulary.py | Converts vocabulary implementation and helper chain to async prompt execution. |
| sdks/python/README.md | Updates usage examples for evaluate_sync. |
| sdks/python/tests/evaluators/test_base.py | Updates base evaluator tests for async helpers and sync wrapper. |
| sdks/python/tests/evaluators/test_conventionality.py | Updates evaluator calls to evaluate_sync. |
| sdks/python/tests/evaluators/test_vocabulary.py | Updates evaluator calls to evaluate_sync. |
| sdks/python/tests/contract_tests/harness.py | Updates harness usage example. |
| sdks/python/tests/contract_tests/test_conventionality.py | Updates contract evaluator call to evaluate_sync. |
| sdks/python/tests/contract_tests/test_vocabulary.py | Updates contract evaluator calls to evaluate_sync. |

Comments suppressed due to low confidence (1)

sdks/python/src/learning_commons_evaluators/evaluators/base.py:84

  • The new async public entrypoint is only exercised indirectly through evaluate_sync; there is no test that calls await evaluator.evaluate(...) from an existing event loop. Because this PR makes evaluate the primary async API, add a direct async test so regressions in the awaited API are caught independently of the sync wrapper.
    async def evaluate(
        self,
        input: InputT,
        evaluation_settings: SettingsT | None = None,
    ) -> OutputT:
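
One way the direct async test suggested above could look, as a sketch only. `FakeEvaluator` is a hypothetical stand-in for a real evaluator, and the standard library's `unittest.IsolatedAsyncioTestCase` is used to keep the example self-contained; the SDK's own test suite may use a different runner or an async test plugin.

```python
import asyncio
import unittest


class FakeEvaluator:
    """Hypothetical evaluator for this sketch; not part of the SDK."""

    async def evaluate(self, input, evaluation_settings=None):
        await asyncio.sleep(0)  # placeholder for awaited prompt steps
        return {"input": input, "settings": evaluation_settings}


class EvaluateAsyncTest(unittest.IsolatedAsyncioTestCase):
    async def test_evaluate_is_awaitable_inside_running_loop(self):
        # The test method itself runs inside an event loop, so this exercises
        # the awaited API directly rather than through evaluate_sync.
        result = await FakeEvaluator().evaluate("some text")
        self.assertEqual(result["input"], "some text")
        self.assertIsNone(result["settings"])


# Run the single test case programmatically and keep the result object.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(EvaluateAsyncTest)
test_result = unittest.TextTestRunner(verbosity=0).run(suite)
```

A test of this shape would fail if `evaluate` stopped being a coroutine function, independently of whether `evaluate_sync` still works.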

Comment thread sdks/python/README.md
Comment thread sdks/python/src/learning_commons_evaluators/evaluators/vocabulary.py Outdated
@czi-fsisenda changed the title from "feat(python-sdk): async-first evaluators with evaluate_sync wrapper" to "feat!(python-sdk): async-first evaluators with evaluate_sync wrapper" on May 15, 2026
@czi-fsisenda changed the title from "feat!(python-sdk): async-first evaluators with evaluate_sync wrapper" to "feat(python-sdk)!: async-first evaluators with evaluate_sync wrapper" on May 15, 2026
@czi-fsisenda czi-fsisenda merged commit 70ef965 into main May 15, 2026
9 checks passed
@czi-fsisenda czi-fsisenda deleted the fsisenda/python_async branch May 15, 2026 23:11

3 participants