refactor: migrate elasticsearch to opensearch #2562

Open
Junjiequan wants to merge 52 commits into master from elasticSearch-to-openSearch

@Junjiequan Junjiequan commented Feb 24, 2026

Description

This PR migrates the search integration from Elasticsearch to OpenSearch.

Changes

Dependencies

  • Removed @elastic/elasticsearch and @nestjs/elasticsearch
  • Added the @opensearch-project/opensearch ^3.5.1 JavaScript client, which is compatible with opensearchproject/opensearch:3.5.0

Code Changes

  • Renamed all references from Elasticsearch to OpenSearch

Configuration

  • Removed hardcoded index settings and mappings. Configuration is now loaded from opensearchConfig.json if present; otherwise defaults from opensearchConfig.example.json are used
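
The fallback order described above could look roughly like the following. This is a minimal sketch, not the PR's actual loader: the function name `loadOpensearchConfig`, the `ReadFile` callback, and the returned shape are assumptions made for illustration. Only the two file names come from the description.

```typescript
// Hypothetical reader: returns the file contents, or undefined if the file is missing.
type ReadFile = (path: string) => string | undefined;

// File names taken from the PR description; the user file wins over the example file.
const DEFAULT_CANDIDATES = [
  "opensearchConfig.json",
  "opensearchConfig.example.json",
];

// Parse and return the first candidate file that exists.
export function loadOpensearchConfig(
  read: ReadFile,
  candidates: string[] = DEFAULT_CANDIDATES,
): Record<string, unknown> {
  for (const path of candidates) {
    const raw = read(path);
    if (raw !== undefined) {
      return JSON.parse(raw) as Record<string, unknown>;
    }
  }
  throw new Error(`No OpenSearch config found; tried: ${candidates.join(", ")}`);
}
```

In Node, `read` could be a small wrapper around `fs.existsSync` and `fs.readFileSync`. Injecting the reader keeps the fallback logic easy to test without touching the file system.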

Search Integration

  • Added opensearchQuery and opensearchFacet, used when OpenSearch is enabled and contains data
  • The new OpenSearch integration supports only $text queries, performing a partial-match search on description and datasetName
  • Restricted OpenSearch endpoints to admin users only. Search functionality is handled internally through the OpenSearch service
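
As an illustration of the $text handling, the sketch below splits a filter into its text part and the remainder, then builds a search body over the two fields named above. It assumes a MongoDB-style filter of the form `{ $text: { $search: "..." } }` and uses a `multi_match` query; the function names and the choice of query type are assumptions, not taken from the PR's code.

```typescript
type MongoFilter = Record<string, unknown>;

// Split a Mongo-style filter into its $text search string and everything else.
// A missing or malformed $text yields no text; the $text key is dropped from `rest`.
export function splitTextQuery(filter: MongoFilter): {
  text?: string;
  rest: MongoFilter;
} {
  const { $text, ...rest } = filter;
  const search =
    typeof $text === "object" && $text !== null
      ? ($text as { $search?: unknown }).$search
      : undefined;
  if (typeof search !== "string" || search.trim() === "") {
    return { rest };
  }
  return { text: search.trim(), rest };
}

// Build an OpenSearch request body that matches the text against the two
// fields from the PR description. multi_match is an assumed query type.
export function buildSearchBody(text: string, limit = 1000) {
  return {
    size: limit,
    _source: false, // only the document IDs are needed
    query: {
      multi_match: {
        query: text,
        fields: ["datasetName", "description"],
      },
    },
  };
}
```

Whether a plain `multi_match` gives partial matching depends on the analyzers in the index mappings (for example an edge n-gram analyzer on those fields).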

Data Synchronization

  • Dataset synchronization from MongoDB to OpenSearch excludes scientificMetadata, history, and datasetlifecycle
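
A minimal sketch of the field exclusion, assuming datasets are handled as plain objects before indexing. The helper name `toSearchDocument` is hypothetical; the three field names are the ones listed above.

```typescript
// Fields the PR description says are left out of the OpenSearch index.
const EXCLUDED_FIELDS = [
  "scientificMetadata",
  "history",
  "datasetlifecycle",
] as const;

// Return a shallow copy of the dataset without the excluded fields.
// The input object is not modified.
export function toSearchDocument(
  dataset: Record<string, unknown>,
): Record<string, unknown> {
  const doc: Record<string, unknown> = { ...dataset };
  for (const field of EXCLUDED_FIELDS) {
    delete doc[field];
  }
  return doc;
}
```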

Notes:

  • opensearchQuery extracts the $text query, searches OpenSearch, retrieves dataset IDs up to the provided limit (default: 1000), and then applies additional filtering using MongoDB queries

  • opensearchFacet extracts the $text query, searches OpenSearch, and retrieves results up to the index setting max_result_window. The example configuration sets this to 2000000, which may impact performance depending on server capacity and dataset volume

If performance issues arise with large result sets, alternative approaches such as pagination, scroll queries, or other optimizations can be considered as future improvements
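
The two-stage flow described for opensearchQuery (text search in OpenSearch, then further filtering in MongoDB) can be sketched as follows. The `searchIds` and `findInMongo` callbacks, the `_id` field name, and both function names are assumptions for illustration; only the default limit of 1000 comes from the notes above.

```typescript
type MongoFilter = Record<string, unknown>;

// Combine the ID restriction from OpenSearch with the caller's remaining filter.
// `_id` is an assumed ID field name.
export function mergeIdFilter(ids: string[], rest: MongoFilter): MongoFilter {
  const idClause: MongoFilter = { _id: { $in: ids } };
  return Object.keys(rest).length === 0 ? idClause : { $and: [idClause, rest] };
}

// Hypothetical dependencies, injected so the flow can run without real services.
interface SearchDeps<T> {
  // Returns at most `limit` matching dataset IDs from OpenSearch.
  searchIds: (text: string, limit: number) => Promise<string[]>;
  // Runs a MongoDB find with the given filter.
  findInMongo: (filter: MongoFilter) => Promise<T[]>;
}

// Stage 1: text search in OpenSearch. Stage 2: remaining filters in MongoDB.
export async function textSearchThenFilter<T>(
  text: string,
  rest: MongoFilter,
  deps: SearchDeps<T>,
  limit = 1000,
): Promise<T[]> {
  const ids = await deps.searchIds(text, limit);
  if (ids.length === 0) {
    return []; // no text match, so there is nothing left to filter
  }
  return deps.findInMongo(mergeIdFilter(ids, rest));
}
```

Because the ID list is capped by the limit before MongoDB filtering runs, the final result can contain fewer documents than the limit even when more matches exist.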

Tests included

  • Included for each change/fix?
  • Passing?

Documentation

  • swagger documentation updated (required for API changes)
  • official documentation updated

official documentation info

@Junjiequan Junjiequan changed the title from "init" to "elasticsearch to opensearch" on Feb 24, 2026
@Junjiequan Junjiequan changed the title from "elasticsearch to opensearch" to "WIP: elasticsearch to opensearch" on Feb 24, 2026
@Junjiequan Junjiequan force-pushed the elasticSearch-to-openSearch branch 2 times, most recently from 640451b to 48c958d on March 10, 2026 16:51
@Junjiequan Junjiequan marked this pull request as ready for review March 11, 2026 09:01
@Junjiequan Junjiequan requested a review from a team as a code owner March 11, 2026 09:01
@Junjiequan Junjiequan changed the title from "WIP: elasticsearch to opensearch" to "refactor: elasticsearch to opensearch" on Mar 11, 2026
@Junjiequan Junjiequan changed the title from "refactor: elasticsearch to opensearch" to "BREAKING CHANGE: elasticsearch to opensearch" on Mar 11, 2026
@Junjiequan Junjiequan changed the title from "BREAKING CHANGE: elasticsearch to opensearch" to "BREAKING CHANGE: migrate elasticsearch to opensearch" on Mar 11, 2026
@Junjiequan Junjiequan changed the title from "BREAKING CHANGE: migrate elasticsearch to opensearch" to "refactor: migrate elasticsearch to opensearch" on Mar 11, 2026
@Junjiequan Junjiequan self-assigned this Mar 11, 2026

//Tokenizers
-export const autocomplete_tokenizer: AnalysisEdgeNGramTokenizer = {
+export const autocomplete_tokenizer: any = {
Member

I'm confused by the diff. It says a bunch of files are changed in folder elastic-search but the folder doesn't exist anymore when checking out this branch. When I search for lines like these, they don't exist anywhere either 🤔

Member Author

The folder is now renamed to opensearch

Member

Yes I know. But the diff was incorrect. Now it says outdated and the problem is gone...

@Junjiequan Junjiequan force-pushed the elasticSearch-to-openSearch branch 2 times, most recently from 267be4c to c2710bb on March 17, 2026 13:56
Member

@minottic minottic left a comment

thanks a lot! I am not sure I see where this is used in the v4 datasets, isn't it only applied to v3 datasets?

I also don't fully understand the plethora of docker composes, why so many?

@Junjiequan
Member Author

@minottic
Opensearch was previously used by datasets v3 only, so this PR only includes changes for datasets v3. For v4 integration with Opensearch, I think it's better to have a new PR; this PR is too big already.
The Docker compose files need to be cleaned up. I know docker-compose.api.yaml and docker-compose.local.yaml are in use, but I'm not sure about the others.

@Junjiequan Junjiequan force-pushed the elasticSearch-to-openSearch branch from 72e4205 to 259a95d on March 17, 2026 16:56
@minottic
Member

Opensearch was previously used by datasets v3 only so this PR only included changes for datasets v3

ah ok, thanks. I was a bit puzzled because the tests are modified for the v4 controllers, but I did not see changes in the code that made me think opensearch was applied to v4. I agree v4 changes can come later if so

Member

@minottic minottic left a comment

thanks a lot for the feature and for the patience!

@Junjiequan Junjiequan force-pushed the elasticSearch-to-openSearch branch from f0341b4 to c5ca364 on March 23, 2026 12:51
@Junjiequan Junjiequan force-pushed the elasticSearch-to-openSearch branch from c5ca364 to 5fa33ad on March 23, 2026 15:02
@Junjiequan Junjiequan force-pushed the elasticSearch-to-openSearch branch from cb52972 to 572130e on March 24, 2026 15:51