Skip to content

12030/fix/title search requires phrase match of all words#12416

Open
AKADaniel-hub wants to merge 3 commits intointernetarchive:masterfrom
AKADaniel-hub:12030/fix/title-search-requires-phrase-match-of-all-words
Open

12030/fix/title search requires phrase match of all words#12416
AKADaniel-hub wants to merge 3 commits intointernetarchive:masterfrom
AKADaniel-hub:12030/fix/title-search-requires-phrase-match-of-all-words

Conversation

@AKADaniel-hub
Copy link
Copy Markdown

@AKADaniel-hub AKADaniel-hub commented Apr 20, 2026

Closes #12030

fix: search requires phrase match of all words

Technical

The error is related to how Solr search handles phrase queries. Currently, using title:"query" matches the exact phrase, requiring terms to appear in that specific order.

To allow the title to contain terms regardless of their order (e.g., "harry" AND "potter"), I updated the logic to use grouped terms: title:(term1 term2).

Changes:

Modified marshalBookSearchQuery to use the title:(...) format for Solr.

Updated initFromUrlParams to properly handle the removal of the title: prefix and parentheses when parsing the URL.

Testing

In the right corner search bar, choose Title.

Search for "responsive design".

    Original behavior: Returns "No books directly matched your search" (looking for exact phrase).

    After fix: Returns the book Responsive Web Design.

Screenshot

Local test Screenshot:

old
imagem

after
imagem
Production

old:
imagem

expected:
imagem

Stakeholders

@cdrini


Please let me know if i went to right direction to address the issue or if there are any improvements I should make.

@mekarpeles
Copy link
Copy Markdown
Member

Thanks for contributing to Open Library, @AKADaniel-hub! Welcome — this appears to be your first PR to the project.

🤖 Copilot has been assigned for an initial review.

@cdrini is assigned to this PR and currently has:

  • 4 open PR(s) of equal or higher priority to review first
PR triage checklist (maintainers / Pam)
  • PR description — not empty; explains what the change does and how to verify it
  • References an issue — PR body contains a #NNN reference
    • Linked issue is triaged — has a Priority: * label (not just Needs: Triage)
    • Linked issue is assigned — has at least one assignee
  • Commit history clean — no WIP/fixup/conflict noise; commit messages are meaningful
  • CI passing — no failing check-runs
  • Test cases present — if the change touches substantive logic, test coverage exists or is explained
  • Proof of testing — PR body includes a description of what was tested, a screenshot, or a video

Note

This comment was automatically generated by Pam, Open Library's Project AI Manager, on behalf of @mekarpeles. Pam is designed to provide status visibility, perform basic project management functions and relevant codebase research, and provide actionable feedback so contributors aren't left waiting.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the header “Title” search facet query marshaling to avoid Solr phrase-query behavior (order-dependent matches) by emitting grouped term queries (e.g. title:(term1 term2)) and adjusting URL parsing/tests accordingly.

Changes:

  • Update SearchBar.marshalBookSearchQuery to emit title:(...) instead of title:"...".
  • Update SearchBar.initFromUrlParams (and unit tests) to strip the title: prefix and surrounding parentheses.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File Description
openlibrary/plugins/openlibrary/js/SearchBar.js Changes title-facet query marshaling and URL query parsing for the search bar.
tests/unit/js/SearchBar.test.js Updates unit tests to reflect the new title:(...) query format.
Comments suppressed due to low confidence (1)

tests/unit/js/SearchBar.test.js:121

  • The new title:(...) marshaling behavior should be covered for inputs containing lucene special characters / boolean operators (e.g. C++ Primer, E-mail, A AND B) to ensure the generated query remains Solr-safe after removing the protective quotes.
    describe('marshalBookSearchQuery', () => {
        const fn = SearchBar.marshalBookSearchQuery;
        test('Empty string', () => {
            expect(fn('')).toBe('');
        });

        test('Adds title prefix to plain strings', () => {
            expect(fn('Harry Potter')).toBe('title:(Harry Potter)');
        });

        test('Does not add title prefix to lucene-style queries', () => {
            expect(fn('author:"Harry Potter"')).toBe('author:"Harry Potter"');
            expect(fn('"Harry Potter"')).toBe('"Harry Potter"');
        });

Comment on lines 186 to 195
if (urlParams.q && this.getCurUrl().pathname.match(/^\/search/)) {
let q = urlParams.q.replace(/\+/g, ' ');
if (this.facet.read() === 'title' && q.indexOf('title:') !== -1) {
const parts = q.split('"');
if (parts.length === 3) {
q = parts[1];
q = q.replace('title:', '').trim();
if (q.startsWith('(') && q.endsWith(')')){
q = q.substring(1, q.length -1);
}

}
this.$input.val(q);
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

initFromUrlParams now only strips title:(...) and will leave legacy title:"..." queries showing quotes in the input (e.g. title:"Harry Potter" becomes "Harry Potter"). Since existing shared URLs may still use the quoted form, consider supporting both wrappers (parentheses and quotes) when q starts with the title: field.

Copilot uses AI. Check for mistakes.
Comment on lines 188 to 192
if (this.facet.read() === 'title' && q.indexOf('title:') !== -1) {
const parts = q.split('"');
if (parts.length === 3) {
q = parts[1];
q = q.replace('title:', '').trim();
if (q.startsWith('(') && q.endsWith(')')){
q = q.substring(1, q.length -1);
}
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The title-prefix stripping logic checks q.indexOf('title:') !== -1 and then does q.replace('title:', ''), which will also strip title: if it appears mid-query (or inside another field name), not just as a leading field. To avoid mangling arbitrary lucene queries in the URL, restrict this to cases where the trimmed query starts with title: (and ideally handle optional whitespace after the colon).

Copilot uses AI. Check for mistakes.
Comment on lines 314 to 317
static marshalBookSearchQuery(q) {
if (q && q.indexOf(':') === -1 && q.indexOf('"') === -1) {
q = `title: "${q}"`;
q = `title:(${q})`;
}
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

marshalBookSearchQuery now interpolates the raw user input inside title:(...) without escaping lucene/solr special characters. This is a behavioral regression vs the previous quoted form for inputs like C++ Primer, E-mail, M*A*S*H, or A AND B, which can be parsed as operators/wildcards and may error or change meaning. Before wrapping, escape special characters (and lowercase boolean operators) similarly to openlibrary/solr/query_utils.py::fully_escape_query.

Copilot uses AI. Check for mistakes.
Comment on lines +190 to +191
if (q.startsWith('(') && q.endsWith(')')){
q = q.substring(1, q.length -1);
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formatting: if (q.startsWith('(') && q.endsWith(')')){ is missing spaces before { and around - 1 compared to the surrounding style in this file; this will likely be flagged by linting and hurts readability.

Suggested change
if (q.startsWith('(') && q.endsWith(')')){
q = q.substring(1, q.length -1);
if (q.startsWith('(') && q.endsWith(')')) {
q = q.substring(1, q.length - 1);

Copilot uses AI. Check for mistakes.
expect(sb.$input.val()).toBe('Harry Potter');
sb.initFromUrlParams({q: 'title: "Harry"', facet: 'title'});
sb.initFromUrlParams({q: 'title: (Harry)', facet: 'title'});
expect(sb.$input.val()).toBe('Harry');
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The updated initFromUrlParams test no longer covers the legacy title:"..." URL format. Add a test case asserting that title:"Harry Potter" (and title: "Harry") are still parsed into plain input text, so we don't regress existing shared URLs.

Suggested change
expect(sb.$input.val()).toBe('Harry');
expect(sb.$input.val()).toBe('Harry');
sb.initFromUrlParams({q: 'title:"Harry Potter"', facet: 'title'});
expect(sb.$input.val()).toBe('Harry Potter');
sb.initFromUrlParams({q: 'title: "Harry"', facet: 'title'});
expect(sb.$input.val()).toBe('Harry');

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Title search requires phrase match of all words

5 participants