fix(memory): handle punctuation and whitespace in extractWords by koriyoshi2041 · Pull Request #624 · google/adk-go

koriyoshi2041 · 2026-03-06T07:59:12Z

Summary

Fixes #569.

extractWords in memory/inmemory.go currently splits text only on spaces (strings.SplitSeq(text, " ")), which means:

Words separated by tabs or newlines are not properly tokenized
Punctuation attached to words (e.g., "great!", "banana,") prevents keyword matching

Changes

Replace strings.SplitSeq(text, " ") with strings.Fields(text) to handle all Unicode whitespace (tabs, newlines, multiple spaces)
Add strings.TrimFunc to strip non-letter/non-number characters from word boundaries
Add 3 test cases covering punctuation, multi-line text, and comma-separated values

Test plan

go test ./memory/ -run Test_inMemoryService_SearchMemory — all 8 tests pass (5 existing + 3 new)
go vet ./memory/ — clean

gemini-code-assist · 2026-03-06T07:59:16Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

google-cla · 2026-03-06T07:59:29Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

hyangah · 2026-03-06T13:02:31Z

 	res := make(map[string]struct{})

-	for s := range strings.SplitSeq(text, " ") {
+	for _, s := range strings.Fields(text) {


Can you keep using strings.SplitSeq? That avoids unnecessary allocation.

koriyoshi2041 · 2026-03-08

Good point! Switched to strings.FieldsSeq, which gives us the iterator (no slice allocation) while also splitting on all Unicode whitespace (tabs, newlines, etc.). Best of both worlds.

extractWords previously used strings.SplitSeq with a space delimiter, which missed tabs, newlines, and other whitespace. It also stored words with surrounding punctuation (e.g. "great!" instead of "great"), causing keyword search to miss relevant results.

Replace strings.SplitSeq with strings.Fields to split on all Unicode whitespace, and add strings.TrimFunc to strip leading/trailing non-letter, non-number characters from each word.

Add test cases for punctuation stripping, multi-line text with tabs/newlines, and comma-separated values.

Fixes google#569

Switch from strings.Fields (allocates []string) to strings.FieldsSeq (returns iterator) per reviewer feedback, while keeping the whitespace and punctuation handling improvements.

koriyoshi2041 · 2026-06-08T08:16:09Z

Rebased this onto current main and cleaned up the gofmt alignment in the added memory tests. The change still uses strings.FieldsSeq, so it keeps the iterator/no-slice-allocation path from your review note while handling punctuation and other whitespace.

Checked locally:

go test ./memory
go test -race -mod=readonly ./memory
go vet ./memory
git diff --check

hyangah reviewed Mar 6, 2026

koriyoshi2041 added 2 commits June 8, 2026 16:14

refactor: use strings.FieldsSeq to avoid slice allocation

93ee368

koriyoshi2041 force-pushed the fix/extractwords-punctuation-handling branch from 092b1cc to 93ee368 · June 8, 2026 08:15

koriyoshi2041 wants to merge 2 commits into google:main from koriyoshi2041:fix/extractwords-punctuation-handling
