Overview
This document collects the Spatial Ninjas literature summaries, synthesis notes, and next steps.

Project OVERVIEW (updated 13/04 2026)

LLMs chosen for this project: GPT and Gemini.
Gemma3 is used for testing in Test_LLM_for_routing.py as an example, although the code can be used with any LLM. The routingpy module serves as ground truth, and a selected area of Helsinki from OpenStreetMap (OSM) is the reference map.

Our Current Script Inventory

Our team has developed three scripts that connect raw map data with AI-driven navigation:

Data Processing (network_to_ssal.py): We extract road data from OpenStreetMap (OSM) for Southern Helsinki. To save tokens, we convert complex geographical data into SSAL (Simplified Semantic Adjacency List). SSAL only keeps essentials: Node IDs, street names, lengths, and one-way status.
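
The conversion described above can be sketched roughly as follows. This is a minimal illustration, not the contents of network_to_ssal.py: the function name edges_to_ssal, the output keys, and the sample edges are assumptions, and it assumes each two-way street appears only once in the input.

```python
from collections import defaultdict


def edges_to_ssal(edges):
    """Build a node -> neighbours adjacency list from edge records.

    Only u, v, name, length and oneway are read; any other attribute
    (speed limit, lane count, road type) is silently dropped.
    """
    ssal = defaultdict(list)
    for edge in edges:
        entry = {
            "name": edge.get("name", ""),
            "length": round(float(edge["length"]), 1),
            "oneway": bool(edge.get("oneway", False)),
        }
        ssal[edge["u"]].append({"to": edge["v"], **entry})
        # Assumption: a two-way street is listed once, so add the reverse link.
        if not entry["oneway"]:
            ssal[edge["v"]].append({"to": edge["u"], **entry})
    return dict(ssal)


# Illustrative edges only (node IDs and lengths are made up).
sample_edges = [
    {"u": 1, "v": 2, "name": "Bulevardi", "length": 52.34, "oneway": False,
     "maxspeed": "30", "lanes": "2"},
    {"u": 2, "v": 3, "name": "Annankatu", "length": 80.0, "oneway": True,
     "highway": "residential"},
]
ssal = edges_to_ssal(sample_edges)
```

Dropping the unused attributes and rounding lengths is what keeps the token count low; whether rounding to one decimal is acceptable depends on how distance is scored.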

The Engine (Test_LLM_for_routing.py): This script feeds the SSAL data and a routing prompt to the LLM. It asks the LLM to act as a GPS and output a route in JSON format.
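
A prompt of the kind described above might be assembled as in the sketch below. The function name build_routing_prompt and the exact wording are assumptions; of the JSON keys, only "origin" appears in the recorded responses, while "destination", "path", and "total_length_m" are hypothetical.

```python
import json


def build_routing_prompt(ssal, origin, destination):
    """Return a single prompt string containing the network and the task."""
    # Compact separators avoid spending tokens on whitespace.
    network = json.dumps(ssal, separators=(",", ":"))
    return (
        "You are a GPS routing engine. Use only the road network below.\n"
        f"Network (SSAL): {network}\n"
        f"Origin node: {origin}\n"
        f"Destination node: {destination}\n"
        "Respect one-way streets. Answer with JSON only, in the form\n"
        '{"origin": "...", "destination": "...", '
        '"path": ["..."], "total_length_m": 0.0}'
    )


# Illustrative one-edge network (made-up IDs and length).
prompt = build_routing_prompt(
    {1: [{"to": 2, "name": "Bulevardi", "length": 52.3, "oneway": False}]}, 1, 2
)
```

Note that json.dumps turns integer node IDs used as dictionary keys into strings, so the model sees "1" rather than 1 as the node key.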

The Interface (app.py): A GUI that allows us to compare OpenAI and Gemini models side-by-side. It persists all test history in a SQLite database (history.db) for later analysis.
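
Persisting test history with the standard sqlite3 module could look like the sketch below. The table name runs, its columns, and the helper names are assumptions for illustration; the actual schema of history.db is not documented here.

```python
import sqlite3

# Hypothetical schema: one row per model call.
SCHEMA = """
CREATE TABLE IF NOT EXISTS runs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
    model TEXT NOT NULL,
    origin TEXT NOT NULL,
    destination TEXT NOT NULL,
    response_text TEXT
)
"""


def open_history(path="history.db"):
    """Open (or create) the history database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def log_run(conn, model, origin, destination, response_text):
    """Insert one test run and return its row id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO runs (model, origin, destination, response_text) "
            "VALUES (?, ?, ?, ?)",
            (model, origin, destination, response_text),
        )
    return cur.lastrowid


# An in-memory database keeps the example free of side effects.
conn = open_history(":memory:")
run_id = log_run(conn, "model-a", "25291564", "25291567", "{}")
```

Storing the raw response text alongside the request makes it possible to re-score old runs if the comparison metrics change later.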

Technical Specs & Data Logic

To keep the models efficient, we have pruned the OSM attributes:

Feature | Logic
SSAL Format | Node: Neighbor {Length, Name, Direction}. Minimalist for token efficiency.
Edge Filtering | We keep u, v, name, length, and oneway. We discard speed limits, lane counts, and road types.
Node Filtering | We keep osmid and x/y coordinates so the LLM understands "North/South" and relative positions.
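
One way a Direction value could be derived from the kept x/y coordinates is sketched below. This is an assumption about how the field might be computed, not a description of the project's code; it treats x as longitude and y as latitude and scales longitude by the cosine of the latitude, which matters at Helsinki's latitude.

```python
import math

COMPASS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def compass_direction(x1, y1, x2, y2):
    """Return the 8-point compass direction from (x1, y1) to (x2, y2).

    x is longitude (east positive), y is latitude (north positive).
    """
    mean_lat = math.radians((y1 + y2) / 2)
    dx = (x2 - x1) * math.cos(mean_lat)  # shrink longitude to ground scale
    dy = y2 - y1
    # atan2(dx, dy) gives a bearing: 0 degrees = north, increasing clockwise.
    bearing = math.degrees(math.atan2(dx, dy)) % 360
    return COMPASS[int((bearing + 22.5) // 45) % 8]
```

For example, compass_direction(24.94, 60.16, 24.95, 60.16) describes a move due east. A single letter code per edge costs far fewer tokens than raw coordinates for every node.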

Update 15/04 2026

Compare Routes.py was created and successfully tested. The script takes the JSON route that the LLM returns through the GUI, converts the nodes into coordinates, asks routingpy to compute a route between the same points, and compares the two routes on correctly selected nodes and on distance estimation, both expressed as percentages.
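
The two scores could be computed along the lines of the sketch below. Both definitions are assumptions: the node score here is the share of reference nodes that also occur in the LLM path, and the distance score is the ratio of the shorter length to the longer one. The distance definition is at least consistent with results that report both lengths, for example 16.0 m against 106.0 m giving 15.1%.

```python
def node_sequence_accuracy(llm_path, reference_path):
    """Percentage of reference nodes that also appear in the LLM path."""
    if not reference_path:
        return 0.0
    llm_nodes = set(llm_path)
    hits = sum(1 for node in reference_path if node in llm_nodes)
    return 100.0 * hits / len(reference_path)


def distance_precision(llm_length_m, reference_length_m):
    """Shorter length divided by longer length, as a percentage."""
    if llm_length_m <= 0 or reference_length_m <= 0:
        return 0.0
    shorter = min(llm_length_m, reference_length_m)
    longer = max(llm_length_m, reference_length_m)
    return 100.0 * shorter / longer


# Illustrative paths with made-up node IDs: 3 of 4 reference nodes are hit.
accuracy = node_sequence_accuracy([1, 2, 9, 4], [1, 2, 3, 4])
precision = distance_precision(16.0, 106.0)
```

A set-based node score ignores ordering, so a path that visits the right nodes in the wrong order would still score well; an order-aware measure may be worth considering.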

The following tests were made so far:

Test 1
Find path between Origin Node: 25291537 (Bulevardi) to Destination Node: 313984198 (Bulevardi).

Results:

Gemini 2.5 Flash: Failed to deliver correct format in response.

GPT 5.4 Mini: Node Sequence Accuracy: 66.7%, Distance Precision: 99.1%

Test 2
Find path between Origin Node: 25291537 (Bulevardi) to Destination Node: 313984198 (Bulevardi).

Results:

Gemini 2.5 Flash: Failed to deliver correct format in response.

GPT 5.4: Node Sequence Accuracy: 66.7%, Distance Precision: 99.1%


Test 3
Find path between Origin Node: 25291537 (Bulevardi) to Destination Node: 311112501 (Korkeavuorenkatu).

Results:

Gemini 2.5 Flash: Failed to deliver correct format in response.

GPT 5.4: Node Sequence Accuracy: 8.1%, Distance Precision: 2.9%

Test 4
Find path between Origin Node: 25291564 (Bulevardi) to Destination Node: 25291567 (Yrjönkatu).

Results:

Gemini 2.5 Flash: Failed to deliver correct format in response.

GPT 5.4: Node Sequence Accuracy: 25.0%, Distance Precision: 24.1%

Test 5
Find path between Origin Node: 25291537 (Bulevardi) to Destination Node: 25291550 (Uudenmaankatu).

Results:

Gemini 2.5 Flash: Failed to deliver correct format in response.

GPT 5.4 Mini: Node Sequence Accuracy: 15.8%, Distance Precision: 4.2%

Test 6
Find path between Origin Node: 313984203 (Bulevardi) to Destination Node: 3232013778 (Annankatu).

Results:

Gemini 2.5 Flash: Failed to deliver correct format in response.

GPT 5.4: Failed to deliver correct format in response.

Test 7
Find path between Origin Node: 25291537 (Bulevardi) to Destination Node: 3228745582 (Hietalahdenranta).

Results:

Gemini 2.5 Flash: Failed to deliver correct format in response.

GPT 5.4: Failed to deliver correct format in response.

Comment: As the tests become more difficult, accuracy drops sharply, even for GPT 5.4. Gemini 2.5 Flash fails every time to deliver a correct response format: it gets stuck and stops short of the full answer, returning a response that looks like this:

response_text"```json { "origin": "25291564", "
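
A truncated response like the one above can be told apart from a response with no JSON at all. The sketch below is one possible way to do that with the standard json module; the function name extract_route_json and the error wording are assumptions, not the project's actual parser.

```python
import json


def extract_route_json(response_text):
    """Return (route, error): route is a dict on success, otherwise None."""
    start = response_text.find("{")
    if start == -1:
        return None, "No JSON found"
    try:
        # raw_decode parses the first JSON value and ignores what follows,
        # so a trailing code fence does not cause a failure.
        route, _ = json.JSONDecoder().raw_decode(response_text[start:])
    except json.JSONDecodeError:
        return None, "JSON incomplete or malformed (possibly truncated output)"
    return route, None


truncated = '```json { "origin": "25291564", "'
complete = '```json\n{"origin": "1", "path": ["1", "2"]}\n```'

route_a, error_a = extract_route_json(truncated)
route_b, error_b = extract_route_json(complete)
```

Distinguishing the two failure modes helps decide whether to raise the output token limit (truncation) or to revise the prompt (no JSON produced).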


Test Update 20/04 2026 (Gemini thinking budget switched off; maximum output tokens varied)

Test 8
Find path between Origin Node: 25291537 (Bulevardi) to Destination Node: 25291550 (Uudenmaankatu). Max Output Tokens: 1024

Results:

Gemini 2.5 Flash: Failed to deliver correct format in response (No JSON found).

GPT 5.4: Failed - Path empty or Origin Node not in network.

Test 9
Find path between Origin Node: 25291564 (Bulevardi) to Destination Node: 25291567 (Yrjönkatu). Max Output Tokens: 1024

Results:

Gemini 2.5 Flash: Failed to deliver correct format in response (No JSON found).

GPT 5.4: Node Sequence Accuracy: 25.0%, Distance Precision: 15.1% (Length: LLM 16.0m | Algorithm 106.0m).

Test 10
Find path between Origin Node: 25291564 (Bulevardi) to Destination Node: 25291567 (Yrjönkatu). Max Output Tokens: 2048

Results:

Gemini 2.5 Flash: Failed to deliver correct format in response (No JSON found).

GPT 5.4: Node Sequence Accuracy: 25.0%, Distance Precision: 38.5% (Length: LLM 40.8m | Algorithm 106.0m).

Test 11
Find path between Origin Node: 25291564 (Bulevardi) to Destination Node: 25291567 (Yrjönkatu). Max Output Tokens: 1600

Results:

Gemini 2.5 Flash: Node Sequence Accuracy: 25.0%, Distance Precision: 10.2% (Length: LLM 10.8m | Algorithm 106.0m).

GPT 5.4: Node Sequence Accuracy: 25.0%, Distance Precision: 24.1% (Length: LLM 25.5m | Algorithm 106.0m).

Test 12
Find path between Origin Node: 25291537 (Bulevardi) to Destination Node: 25291550 (Uudenmaankatu). Max Output Tokens: 1600

Results:

Gemini 2.5 Flash: Node Sequence Accuracy: 11.5%, Distance Precision: 26.4% (Length: LLM 69.1m | Algorithm 262.0m).

GPT 5.4: Failed - Path empty or Origin Node not in network.