feat: modernize langchain integration core tools by daveomri · Pull Request #28 · apify/langchain-apify

daveomri · 2026-04-21T14:19:52Z

Summary

First PR into feat/modernize-langchain-integration; adds the foundational tools layer and modernizes auth conventions across the package. Upcoming PRs will add search & crawling tools, social media tools, LangChain-native components, and docs to feat/modernize-langchain-integration before merging it all to main.

New code: ~880 lines - Tests: ~1000 lines

Note on scope: While building the new tools layer, I spotted several pre-existing issues in the legacy code (plain-string token handling, outdated get_from_dict_or_env + mode='before' validator pattern, tokens leaking into model_dump() / repr()). Because the new tools reuse the same SecretStr-based auth, keeping two parallel conventions in the package would have been confusing and short-lived, so I folded the fixes into this PR. Remaining, more independent improvements I noticed along the way (e.g., docs & examples refresh, LangChain-native component additions, actor-specific tool classes) will be split out as follow-up tasks on the integration branch rather than bundled here.

ApifyToolsClient (_client.py)
- Internal helper wrapping ApifyClient, one method per tool operation. Accepts both SecretStr and raw str tokens and falls back to the APIFY_API_TOKEN env var. Shared _list_items_or_raise helper wraps dataset-fetch errors into RuntimeError.
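The token handling described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in _client.py: the function name `resolve_token` and the `SecretLike` protocol are hypothetical, and only the `get_secret_value()` accessor mirrors pydantic's SecretStr.

```python
import os
from typing import Protocol, Union


class SecretLike(Protocol):
    """Hypothetical stand-in for pydantic.SecretStr (same accessor name)."""

    def get_secret_value(self) -> str: ...


def resolve_token(
    token: Union[str, SecretLike, None] = None,
    env_var: str = "APIFY_API_TOKEN",
) -> str:
    """Return a plain-string token from an explicit value or the environment."""
    if token is None:
        # Fall back to the environment variable named in the PR description.
        token = os.environ.get(env_var)
    if token is None or token == "":
        raise ValueError(f"No Apify token given and {env_var} is not set.")
    if isinstance(token, str):
        return token
    # Anything else is assumed to expose get_secret_value(), like SecretStr.
    return token.get_secret_value()
```

Unwrapping the secret in one place means the rest of the client only ever handles a plain string, whichever form the caller supplied.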
6 new BaseTool subclasses
- ApifyRunActorTool, ApifyGetDatasetItemsTool, ApifyRunActorAndGetItemsTool, ApifyScrapeUrlTool, ApifyRunTaskTool, ApifyRunTaskAndGetItemsTool. Exported via the APIFY_CORE_TOOLS: list[type[BaseTool]] convenience list for selective agent binding.
_ApifyGenericTool base class
- Common client handling, handle_tool_error=True, developer-controlled safety clamping (_clamp_timeout, _clamp_memory, _clamp_items) with configurable ceilings (max_timeout_secs, max_memory_mbytes, max_items) and hardcoded floor of 1 to enforce API protocol minimums.
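The clamping behaviour can be sketched in a few lines. This is a minimal sketch, assuming each clamp is a plain min/max bound: the `Limits` class, the `_clamp` helper, and the default ceiling values are hypothetical; only the ceiling names and the floor of 1 come from the description above.

```python
from dataclasses import dataclass

FLOOR = 1  # The PR describes a hardcoded floor of 1.


def _clamp(requested: int, ceiling: int) -> int:
    """Bound a requested value to the range [FLOOR, ceiling]."""
    return max(FLOOR, min(requested, ceiling))


@dataclass(frozen=True)
class Limits:
    # Field names follow the PR description; the default values are made up.
    max_timeout_secs: int = 300
    max_memory_mbytes: int = 4096
    max_items: int = 100

    def clamp_timeout(self, requested: int) -> int:
        return _clamp(requested, self.max_timeout_secs)

    def clamp_memory(self, requested: int) -> int:
        return _clamp(requested, self.max_memory_mbytes)

    def clamp_items(self, requested: int) -> int:
        return _clamp(requested, self.max_items)
```

With `Limits(max_timeout_secs=60)`, a requested timeout of 600 is reduced to 60 and a requested timeout of 0 is raised to 1, so the developer-set ceilings win over whatever value an agent passes in.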
Auth pattern modernized (document_loaders.py, wrappers.py, tools.py)
- Replaced legacy get_from_dict_or_env + @model_validator(mode='before') with SecretStr field type and secret_from_env('APIFY_API_TOKEN', default=None) default factory, matching langchain-openai / langchain-anthropic conventions. Tokens are automatically redacted in logs/traces and additionally excluded from model_dump() / repr() via exclude=True, repr=False. Client construction moved to @model_validator(mode='after') / model_post_init. Added populate_by_name=True to ConfigDict on loader and wrapper. The new tools reuse this same auth pattern; fixing it here avoids shipping two parallel conventions across the package.
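The shape of this auth pattern can be illustrated with pydantic v2 alone. This is a sketch under assumptions rather than the package's code: `ExampleLoader`, `_token_from_env`, and `_has_token` are hypothetical names, and `_token_from_env` is a local stand-in for the `secret_from_env('APIFY_API_TOKEN', default=None)` factory so that the block needs only pydantic.

```python
import os
from typing import Any, Optional

from pydantic import BaseModel, Field, PrivateAttr, SecretStr


def _token_from_env() -> Optional[SecretStr]:
    """Local stand-in for the secret_from_env default factory (assumption)."""
    value = os.environ.get("APIFY_API_TOKEN")
    return SecretStr(value) if value else None


class ExampleLoader(BaseModel):
    """Hypothetical model; not one of the package's classes."""

    dataset_id: str
    apify_api_token: Optional[SecretStr] = Field(
        default_factory=_token_from_env,
        exclude=True,  # omitted from model_dump()
        repr=False,  # omitted from repr()
    )
    _has_token: bool = PrivateAttr(default=False)

    def model_post_init(self, __context: Any) -> None:
        # Runs after field validation; a real class would build its client here.
        self._has_token = self.apify_api_token is not None


loader = ExampleLoader(dataset_id="demo", apify_api_token="secret-123")
print(loader.model_dump())
print(repr(loader))
```

Passing a raw string still works because pydantic coerces it to SecretStr; the secret is only readable through `get_secret_value()`.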
Backward compatible
- ApifyActorsTool, ApifyDatasetLoader, ApifyWrapper retain their public API; auth changes are internal.
Tests
- Unit tests for all tools & client (~1000 lines across test_tools.py, test_client.py, test_document_loaders.py), integration smoke tests under tests/integration_tests/, and error-scenario coverage (missing token, run failure, network error, clamp floor/ceiling, token excluded from model_dump, APIFY_TOKEN env-var fallback on the loader).

Review strategy

The diff is larger than a typical PR (~1.9k lines, half of which is tests). Suggested reading order to make it tractable:

1. _client.py: the new ApifyToolsClient abstraction
2. _ApifyGenericTool base class in tools.py, then the 6 tool classes (they are homogeneous; once one clicks, the rest read fast)
3. Auth diff in document_loaders.py, wrappers.py, and the ApifyActorsTool.__init__ change in tools.py
4. Tests last: mostly linear, grouped by the module they cover

Merge strategy

This PR targets feat/modernize-langchain-integration, not main. The plan is to accumulate all reviewed modernization work (core tools -> actor-specific tools -> LangChain-native components -> docs) on that integration branch, then open a single PR from feat/modernize-langchain-integration -> main once everything is complete and reviewed. Any smaller, pre-existing issues I find along the way will be split out as separate follow-up tasks on the integration branch rather than bundled into the larger PRs.

daveomri added 18 commits April 20, 2026 16:12

feat: implement apifyclient wrapper

8cad430

feat: removed redundant const file

2404b9c

feat: add few more input schemas, helpers and tool classes

b1a89a4

feat: export new tools from __init__

0aa9175

feat: add unit tests

4e46d36

feat: implement tests and introduce tools list

fc6ef12

fix: lint fix

cc5be9e

feat: enhance error handling and documentation for apify tools

c2b9cb6

fix: iso format fix

3edf126

feat: add apify run task and apify run task and get items tools with input schemas

8c36edc

feat: introduce _ApifyGenericTool base class for Apify tools to streamline client handling and error management

026175a

feat: add _actor_tools.py file to define upcoming search and social media tools for apify integration

110c971

fix: add try/except to match others

a08f63e

fix: update timeout constants and improve input schema description in Apify tools

d028531

fix: enhance error handling for missing dataset id in run_actor and run_task methods

429a3ed

fix: update apifygetdatasetitemstool to return a json object with items and message for empty dataset

b914e47

feat: add integration smoke tests for generic Apify tools to validate api interaction

0f71181

feat: implement clamping for timeout, memory, and item limits in apify tools to enforce safety constraints

50c52f2

daveomri self-assigned this Apr 21, 2026

daveomri added 11 commits April 22, 2026 07:37

feat: clean up _actor_tools.py and tools.py for improved readability and maintainability; update test cases for better formatting and error handling

ba179a6

ref: align private scope conventions with langchain partner package standards

005294b

ref: migrate auth to SecretStr + secret_from_env pattern

2f74c29

fix: backward-compat fix

6258b2b

fix: update stale doc string

2905b67

chore: removed redundant file

3238c02

fix: extracted repeated code, fixed secretstr compatibility to apifytoolsclient

92df406

fix: set min value to timeout, memory and items, add exclude and repr to apify_api_token

3a0f666

feat: added repr and exclude to apify api token

8614cfd

feat: add type checking to apify core tools list

2bf130a

feat: add tests for clamped values and apify api token

98293d4

fix: lint fix

863ed8d

daveomri changed the title ~~Feat: modernize langchain integration core tools~~ feat: modernize langchain integration core tools Apr 23, 2026

ref: update apify_api_token type to support SecretStr in document loaders

70527e0

daveomri mentioned this pull request Apr 24, 2026

feat: modernize langchain integration native components #29

Draft

daveomri wants to merge 31 commits into feat/modernize-langchain-integration from feat/modernize-langchain-integration-core-tools

2 participants
