1.1. Create and activate a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`
1.2. Install the required packages using pip:
pip install -r requirements.txt
2.1. Move the data-images folder from https://github.com/msikyna/MetricLearningExperiment2 to the root directory of the project.
2.2. Add the 561.txt file from https://drive.google.com/drive/folders/1sFfDmAFW5OcI3wNIEdYVQsZnUoxp-N-5?usp=sharing directly into the data-images folder.
2.3. The final structure should look like this:
/sim-search
├── data-images
│ ├── 561
│ │ ├── 000.jpg
│ │ ├── 001.jpg
│ │ ├── 002.jpg
│ │ └── ...
│ └── 561.txt
├── requirements.txt
├── README.md
├── manage.py
├── djangoProject
├── base
└── ...
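A quick way to confirm the layout before starting the server is a small check script. The sketch below is not part of the repository; the helper name `missing_paths` is hypothetical, and the paths are simply those shown in the tree above.

```python
from pathlib import Path

# Paths taken from the tree above; the helper name is hypothetical.
EXPECTED_DIRS = ["data-images/561"]
EXPECTED_FILES = ["data-images/561.txt", "requirements.txt", "manage.py"]


def missing_paths(project_root):
    """Return the expected paths that do not exist under project_root."""
    root = Path(project_root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    problems = missing_paths(".")
    if problems:
        print("Missing:", ", ".join(problems))
    else:
        print("Layout looks complete.")
```

Run it from the project root; an empty result means the data-images folder and 561.txt are where the application expects them.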
python manage.py makemigrations
python manage.py migrate
python manage.py runserver
- Open Run Configuration Settings:
  - Click the "Add Configuration" button in the top-right toolbar
  - Or go to `Run` → `Edit Configurations...`
- Add Django Server Configuration:
  - Click the `+` button
  - Select `Django Server` from the list
- Configure Settings:
  - Name: Django Server
  - Host: `127.0.0.1`
  - Port: `8000`
  - Python interpreter: Ensure your project's Python interpreter is selected
  - Working directory: Should point to your project root (where `manage.py` is located)
  - Environment variables: Leave blank unless needed
- Apply and Run:
  - Click `Apply` and `OK`
  - Click the green play button (▶) in the toolbar to start the server
Open your web browser and navigate to http://127.0.0.1:8000/ to use the image similarity search application.
- Select a base image from the selection, or search by text.
This setup is prepared so you can pull the repo and run with Docker directly.
cp .env.example .env
mkdir -p runtime user_matrices cache/L2 cache/IP
docker compose up -d --build
If the project was moved and the new ./cache bind mount is empty or root-owned, seed the FAISS mapping files from the old deployment cache:
SIMSEARCH_SEED_CACHE_ROOT=/home/xsikyna/sim-search/cache \
docker compose -f docker-compose.yml -f docker-compose.seed-cache.yml up -d --build
Follow the logs:
docker compose logs -f sim-search
Update to the latest version:
git pull
docker compose up -d --build
The app is configured for the URL prefix:
http://disa.fi.muni.cz/demos/personalized-similarity-search
through:
DJANGO_FORCE_SCRIPT_NAME=/demos/personalized-similarity-search
Apache reverse proxy configuration:
ProxyPreserveHost On
ProxyPass /demos/personalized-similarity-search/ http://cybela14.fi.muni.cz:8942/
ProxyPassReverse /demos/personalized-similarity-search/ http://cybela14.fi.muni.cz:8942/
RequestHeader set X-Forwarded-Proto "https"
- Gunicorn inside the container binds `0.0.0.0:8942`.
- The published port is `${SIMSEARCH_BIND_IP}:${SIMSEARCH_PORT}` (default `0.0.0.0:8942`).
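The path rewriting implied by this setup can be illustrated with a few lines of Python. This is only a sketch of the idea, not Django or Apache internals: it assumes that `DJANGO_FORCE_SCRIPT_NAME` feeds Django's `FORCE_SCRIPT_NAME` setting, and the function names `backend_path` and `public_path` are hypothetical.

```python
# Illustration only: mimics the path rewriting described above.
PREFIX = "/demos/personalized-similarity-search"  # value of DJANGO_FORCE_SCRIPT_NAME


def backend_path(public_path, prefix=PREFIX):
    """Path the backend receives once the proxy has stripped the prefix."""
    mount = prefix + "/"
    if not public_path.startswith(mount):
        raise ValueError(f"{public_path!r} is not under {mount!r}")
    return "/" + public_path[len(mount):]


def public_path(app_path, prefix=PREFIX):
    """Path the app has to emit in links so requests go back through the proxy."""
    return prefix.rstrip("/") + "/" + app_path.lstrip("/")


if __name__ == "__main__":
    print(backend_path(PREFIX + "/images/561/000.jpg"))
    print(public_path("/images/561/000.jpg"))
```

The two functions are inverses of each other: the proxy removes the prefix on the way in, and the application has to add it again to every URL it generates.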
The repo defaults are tuned for a modest multi-user deployment:
- `GUNICORN_WORKERS=8`
- `FAISS_NUM_THREADS=1`
- file-based Django sessions in `/app/runtime/sessions`
- SQLite WAL + busy timeout enabled
- `SIMSEARCH_QUERY_LOG_EXPAND_RESULTS=0` to avoid a second search only for logging
- in-process user matrix cache enabled (`SIMSEARCH_MATRIX_CACHE_SIZE=32`)
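For readers unfamiliar with the SQLite settings mentioned above, the following sketch shows what "WAL + busy timeout" means at the connection level, using only the standard `sqlite3` module. It is not the project's actual configuration code; the function name and the 5000 ms timeout are assumptions chosen for illustration.

```python
import sqlite3


def open_connection(db_path, busy_timeout_ms=5000):
    """Open a SQLite connection with WAL journaling and a busy timeout."""
    conn = sqlite3.connect(db_path)
    # WAL lets readers proceed while one writer is active; the mode is stored in the database file.
    conn.execute("PRAGMA journal_mode=WAL")
    # Wait up to busy_timeout_ms for a lock instead of failing immediately.
    conn.execute(f"PRAGMA busy_timeout={int(busy_timeout_ms)}")
    return conn
```

With several Gunicorn workers sharing one database file, the busy timeout makes a worker wait briefly for a competing write to finish rather than raising a "database is locked" error straight away.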
Recommended next steps on the server:
1. Keep only the index you really use enabled.
2. Serve `/images/...` outside Gunicorn if possible.
Why step 2 matters:
- the search backend can handle concurrency reasonably with multiple workers
- image streaming from shared storage through Gunicorn will still consume workers quickly
If your reverse proxy supports it, prefer one of these:
- Apache/Nginx direct file serving for `/images/`
- or enable `SIMSEARCH_IMAGE_X_SENDFILE=1` / `SIMSEARCH_IMAGE_X_ACCEL_REDIRECT_PREFIX=...`
That change usually helps concurrency more than just increasing Gunicorn workers.
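The idea behind both options is that the application returns only a header and the proxy transfers the file. The sketch below shows how such headers could be built; it is an assumption about how the two settings might be interpreted, not the project's code. `offload_headers`, the `/data/images` root, and the `/protected-images` prefix are hypothetical. `X-Sendfile` (Apache mod_xsendfile) expects a filesystem path, while `X-Accel-Redirect` (Nginx) expects an internal URI.

```python
import posixpath


def offload_headers(relative_path, image_root="/data/images",
                    x_sendfile=False, accel_redirect_prefix=""):
    """Return response headers that delegate the file transfer to the proxy.

    An empty dict means the application has to stream the file itself.
    """
    # Normalise so that ".." segments cannot climb above the root.
    clean = posixpath.normpath("/" + relative_path).lstrip("/")
    if accel_redirect_prefix:
        # Nginx: the header value is an internal URI.
        return {"X-Accel-Redirect": accel_redirect_prefix.rstrip("/") + "/" + clean}
    if x_sendfile:
        # Apache mod_xsendfile: the header value is a filesystem path.
        return {"X-Sendfile": posixpath.join(image_root, clean)}
    return {}


if __name__ == "__main__":
    print(offload_headers("561/000.jpg", accel_redirect_prefix="/protected-images"))
    print(offload_headers("561/000.jpg", x_sendfile=True))
```

In either mode the Gunicorn worker is released as soon as the headers are sent, which is why this tends to help more than adding workers.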
There is a standalone HTTP load test script at scripts/loadtest_sim_search.py. To list its options:
python scripts/loadtest_sim_search.py --help
It exercises the real personalization flow over HTTP:
- login
- text search
- save positive and negative feedback
- apply feedback to learn/update the matrix
- rerun the query
- optionally fetch the progressive full rerun
python scripts/loadtest_sim_search.py \
--base-url http://disa.fi.muni.cz/demos/personalized-similarity-search/ \
--users 20 \
--rounds 5 \
--num-results 20 \
--metric cosine \
--output-json runtime/loadtest_20_users.json
Notes:
- By default the script prepares test users named `loadtest_user_001`, `loadtest_user_002`, ...
- The first version uses the text-search path because it covers the full search + feedback + matrix-learning flow without needing extra image-anchor fixtures.
- If you want to reuse pre-created users, add `--no-prepare-users`.
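The overall shape of such a load test can be sketched as follows. This is not the code of scripts/loadtest_sim_search.py: the step names merely mirror the flow listed above, the `do_step` callable is a placeholder for the real HTTP requests, and logging in once per user rather than once per round is an assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Step names follow the flow listed above; the step callable is a placeholder.
STEPS = ["login", "text_search", "save_feedback", "apply_feedback", "rerun_query"]


def user_names(count):
    """Names in the loadtest_user_001, loadtest_user_002, ... pattern."""
    return [f"loadtest_user_{i:03d}" for i in range(1, count + 1)]


def run_user(name, rounds, do_step):
    """Log in once, then repeat the remaining steps per round; return (step, seconds) pairs."""
    timings = []
    for step in STEPS[:1] + STEPS[1:] * rounds:
        started = time.perf_counter()
        do_step(name, step)
        timings.append((step, time.perf_counter() - started))
    return timings


def run_load(users, rounds, do_step):
    """Run all simulated users concurrently, one thread per user."""
    names = user_names(users)
    with ThreadPoolExecutor(max_workers=users) as pool:
        results = pool.map(lambda n: run_user(n, rounds, do_step), names)
        return dict(zip(names, results))


if __name__ == "__main__":
    # Stand-in step that only sleeps; a real test would issue HTTP requests here.
    report = run_load(users=3, rounds=2, do_step=lambda name, step: time.sleep(0.01))
    for name, timings in report.items():
        print(name, f"{sum(t for _, t in timings):.3f}s")
```

Keeping the per-step timings separate makes it easier to see whether search, feedback, or matrix learning dominates latency as the number of users grows.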