DynBench is a web application for generating robust benchmark records by transforming question-query pairs using AI models.
- Generate new benchmark items from question-query pairs (natural-language questions with SPARQL queries)
- Generate new question-query pairs from questions written in any of 20 supported languages
- Adjustable comparability of generated questions through entity selection (the importance of the entities in a generated question, relative to those in the original question, is chosen so that the difficulty is "easy", "similar", "hard", or "random")
- Provide random, multilingual samples from benchmark datasets for easier tool exploration
- Collect feedback on computed transformation steps

The application requires environment variables to be set. Create a .env file in the project root:
DYNBENCH=<Path to LLM transform endpoint>
MODEL=<Your model>

| Variable | Description | Default |
|---|---|---|
| DYNBENCH | URL of the DynBench backend service | - |
| MODEL | AI model to use for transformations | - |
| | GitHub repository URL for the fork ribbon | - |
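
The variables above could be checked once at startup so that a missing value fails early instead of surfacing as a failed request later. The following is a minimal sketch, assuming the application reads its settings from the process environment; `load_config` and `REQUIRED_VARS` are hypothetical names and are not taken from `app.py`.

```python
import os
from typing import Mapping

# Names taken from the .env example; the helper itself is a hypothetical sketch.
REQUIRED_VARS = ("DYNBENCH", "MODEL")


def load_config(env: Mapping[str, str] = os.environ) -> dict:
    """Return the required settings, raising if one is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Strip stray whitespace that often sneaks into hand-edited .env files.
    return {name: env[name].strip() for name in REQUIRED_VARS}
```

Passing a plain dictionary instead of `os.environ` keeps the check easy to try out, for example `load_config({"DYNBENCH": "http://your-backend-url:port/transform", "MODEL": "mistral-small"})`.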

- Ensure you have installed the dependencies:
    pip install -r requirements.txt
- Set up your .env file (see Configuration)
- Run the Streamlit application:
    streamlit run app.py
- The application will open in your browser at http://localhost:8501

- Build the Docker image:
    docker build -t dynbench-frontend .
- Run the container with environment variables:
    docker run -p 8501:8501 \
      -e DYNBENCH=http://your-backend-url:port/transform \
      -e MODEL=mistral-small \
      dynbench-frontend
- The application will be available at http://localhost:8501

- Select difficulty: Choose from easy, normal, hard, or random
- Select language: Pick from 20 supported languages
- Enter question and query: Provide a natural language question and its corresponding SPARQL query
- Generate: Click the "Generate" button to transform the pair
- Review results: Check the transformed question and query
- Provide feedback: Use the OK/Wrong buttons to give feedback on each result

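
The inputs collected in the steps above have to be bundled into one request for the transform endpoint. The backend's request format is not documented here, so the following is only a sketch under assumptions: the field names (`question`, `query`, `language`, `difficulty`, `model`) and the helper `build_payload` are hypothetical, and the real service may expect something different.

```python
import json

# Difficulty values as listed in the usage steps.
DIFFICULTIES = ("easy", "normal", "hard", "random")


def build_payload(question: str, query: str, language: str,
                  difficulty: str, model: str) -> dict:
    """Validate the form inputs and bundle them into one dictionary.

    All field names are assumptions made for illustration only.
    """
    if difficulty not in DIFFICULTIES:
        raise ValueError(f"Unknown difficulty: {difficulty!r}")
    if not question.strip() or not query.strip():
        raise ValueError("Both a question and a SPARQL query are required")
    return {
        "question": question.strip(),
        "query": query.strip(),
        "language": language,
        "difficulty": difficulty,
        "model": model,
    }


payload = build_payload(
    "Who wrote Faust?",
    "SELECT ?author WHERE { ?work rdfs:label 'Faust'@en ; dbo:author ?author }",
    language="de",
    difficulty="hard",
    model="mistral-small",
)
body = json.dumps(payload)  # JSON text that could be sent to the DYNBENCH URL
```

Validating before sending keeps malformed input from reaching the model, which matters when each transformation is comparatively slow.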
The application includes sample datasets:
- benchmarks/DynQALD.json - Question Answering Benchmark
- benchmarks/DynRuBQ.json - Russian/English Benchmark
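
Drawing a few random items from one of these files could look like the sketch below. The file schema is not described here, so the code assumes either a top-level JSON array or a QALD-style object with a `questions` list; `extract_items`, `sample_items`, and `load_benchmark` are hypothetical helpers.

```python
import json
import random


def load_benchmark(path: str):
    """Read a benchmark file such as benchmarks/DynQALD.json."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def extract_items(data) -> list:
    """Return the list of records (the accepted layouts are assumptions)."""
    if isinstance(data, list):
        return data
    if isinstance(data, dict) and isinstance(data.get("questions"), list):
        return data["questions"]
    raise ValueError("Unrecognized benchmark layout")


def sample_items(data, k: int = 3, seed=None) -> list:
    """Pick up to k distinct records; a fixed seed makes the draw repeatable."""
    items = extract_items(data)
    rng = random.Random(seed)
    return rng.sample(items, min(k, len(items)))
```

With a real file this would be called as `sample_items(load_benchmark("benchmarks/DynQALD.json"), k=5)`.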
DynBench-Frontend/
├── app.py # Main Streamlit application
├── requirements.txt # Python dependencies
├── Dockerfile # Docker configuration
├── .env # Environment variables
├── css/ # Stylesheets
├── images/ # Application images
├── js/ # JavaScript files
└── benchmarks/ # Benchmark data

- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Open a Pull Request

(top 20 according to Wikipedia)
| Position | Language | Code | Native speakers | Total speakers |
|---|---|---|---|---|
| 1 | Russian | ru | 106,000,000 | 160,000,000 |
| 2 | German | de | 85,000,000 | 130,000,000 |
| 3 | French | fr | 67,000,000 | 112,000,000 |
| 4 | English | en | 67,000,000 | 280,000,000 |
| 5 | Italian | it | 58,000,000 | 72,000,000 |
| 6 | Spanish | es | 40,000,000 | 76,000,000 |
| 7 | Polish | pl | 38,000,000 | |
| 8 | Ukrainian | uk | 32,600,000 | |
| 9 | Romanian | ro | 24,000,000 | 28,000,000 |
| 10 | Dutch | nl | 22,000,000 | 24,000,000 |
| 11 | Turkish | tr | 15,752,673 | |
| 12 | Bavarian | bar | 14,000,000 | |
| 13 | Portuguese | pt | 11,000,000 | |
| 14 | Hungarian | hu | 11,000,000 | |
| 15 | Greek | el | 11,000,000 | |
| 16 | Czech | cs | 10,600,000 | |
| 17 | Swedish | sv | 10,000,000 | 11,000,000 |
| 18 | Catalan | ca | 10,000,000 | |
| 19 | Serbian | sr | 9,000,000 | |
| 20 | Bulgarian | bg | 7,800,000 | |