[RTI][SUB] Write the Ingestion Script for RTIs by ChanukaUOJ · Pull Request #5 · LDFLK/RTI-Tracker

ChanukaUOJ · 2026-03-19T08:44:16Z

This PR adds the ingestion script that inserts RTI data into the databases through OpenGIN. As a first step, it covers inserting nodes and creating relations in the graph database.

gemini-code-assist · 2026-03-19T08:44:43Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request delivers the foundational ingestion script for the RTI Tracker project, written in Go. Its primary purpose is to automate the population of RTI data into the OpenGIN graph database. The script intelligently processes structured CSV data, creates new RTI entities as nodes, and dynamically links them to existing organizational entities, thereby building a comprehensive relationship graph. This initial step significantly streamlines the data onboarding process for the RTI system.

Highlights

RTI Data Ingestion Script: Introduced a new Go-based ingestion script designed to process RTI (Right to Information) data from CSV files and insert it into the OpenGIN graph database.
Graph Database Integration: The script handles the creation of RTI document nodes and establishes relationships with receiver institutions within the graph database, leveraging OpenGIN's ingestion and read APIs.
Data Processing Logic: Implemented logic to walk through data directories, parse request.csv files, extract relevant RTI request details, and validate input before processing.
API Client and Services: Developed internal API clients and services for interacting with the ingestion and read endpoints of the OpenGIN system, including retry mechanisms and custom error handling.
Testing and Utilities: Included unit tests for the core RTI service logic, a utility for date conversion, and a test_run.sh script for easy execution of tests and the main application.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces an ingestion script for RTI data, which is a great step forward. The code is generally well-structured, separating concerns into different packages. However, I've identified several critical issues related to error handling, resource management, and logic that could lead to runtime panics, silent failures, or data duplication. Key areas for improvement include properly handling errors from file I/O and API calls, ensuring resource cleanup (like closing files), fixing a logical flaw in parent entity lookup, and making the data ingestion process idempotent. Please see my detailed comments for specific suggestions.

ingestion/cmd/app/main.go

ingestion/internals/core/rti_service.go

ingestion/tests/rti_service_test.go

ingestion/internals/models/rti_models.go

ingestion/internals/ports/ingestion_service.go

data/Lanka Fabric Ltd/2026-03-18/1/request.csv

data/Lanka Fabric Ltd/2026-03-18/1/status.csv

ingestion/internals/models/models.go

zaeema-n · 2026-03-23T06:14:29Z

ingestion/internals/core/rti_service.go

+
+	// 1. Insert the RTI Entity to Graph
+	// Create a deterministic UUID so rerunning the script on the same data doesn't duplicate nodes
+	hashInput := fmt.Sprintf("%s_%s", entity.Created, entity.Index)


If a user tries to create a new RTI request with the same created date and accidentally passes an index that already exists, this will produce the same ID and overwrite the existing entity.

Index is passed by the user, and its uniqueness is not enforced anywhere.

In the script, the Index is taken from the folder structure. If multiple RTIs are sent on a particular date, they are stored in separate folders named 1, 2, 3, 4, and so on. The user therefore can't create multiple directories with the same name inside one date folder.
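To make the idempotency argument concrete, here is a minimal sketch of a deterministic ID. It is not the PR's implementation: it uses only the standard library, formats a SHA-256 digest as a version 8 (custom) UUID, and additionally folds the receiver name into the hash input on the assumption that index folders are unique only within one receiver/date directory. The function name `deterministicID` is hypothetical.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// deterministicID derives a stable UUID-shaped ID from the fields that locate
// an RTI on disk. Hypothetical helper; the receiver component is an assumption
// that goes beyond the "%s_%s" input shown in the diff.
func deterministicID(receiver, created, index string) string {
	// NUL separators keep ("ab", "c") and ("a", "bc") from hashing identically.
	sum := sha256.Sum256([]byte(receiver + "\x00" + created + "\x00" + index))
	b := sum[:16]
	b[6] = (b[6] & 0x0f) | 0x80 // version nibble 8: custom UUID (RFC 9562)
	b[8] = (b[8] & 0x3f) | 0x80 // RFC 4122/9562 variant bits
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16])
}

func main() {
	first := deterministicID("Lanka Fabric Ltd", "2026-03-18", "1")
	again := deterministicID("Lanka Fabric Ltd", "2026-03-18", "1")
	other := deterministicID("Lanka Fabric Ltd", "2026-03-18", "2")

	fmt.Println(first == again) // rerunning on the same folder yields the same ID
	fmt.Println(first == other) // a different index folder yields a different ID
}
```

Because the ID depends only on the inputs, rerunning the script on an unchanged folder maps to the same node, while any change to the folder name or date produces a new one.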

ingestion/internals/core/rti_service.go

zaeema-n · 2026-03-23T06:37:06Z

ingestion/internals/core/rti_service.go

+	var parentID string
+	if len(filteredSearchResult) > 0 {
+		sort.Slice(filteredSearchResult, func(i, j int) bool {
+			// Sort in descending order by created date
+			timeI, errI := time.Parse(time.RFC3339, filteredSearchResult[i].Created)
+			timeJ, errJ := time.Parse(time.RFC3339, filteredSearchResult[j].Created)
+			if errI != nil || errJ != nil {
+				return filteredSearchResult[i].Created > filteredSearchResult[j].Created
+			}
+			return timeI.After(timeJ)
+		})
+
+		entityCreatedTime, err := time.Parse(time.RFC3339, entity.Created)
+		if err != nil {
+			return nil, fmt.Errorf("failed time parsing")
+		}
+
+		for _, result := range filteredSearchResult {
+			resultTime, err := time.Parse(time.RFC3339, result.Created)
+			if err == nil && !resultTime.After(entityCreatedTime) {
+				parentID = result.ID
+				break
+			}
+		}
+
+		// No receiver was created on or before the RTI date, so return an error instead of guessing
+		if parentID == "" {
+			return nil, fmt.Errorf("skipping relation update (receiver not found for the given date): %s", entity.Created)
+		}
+	}


When it comes to ministers and departments, we can't rely on the created time of the node; we need to check the created time of the relationship. For each filtered node, retrieve all incoming AS_DEPARTMENT or AS_MINISTER relationships (depending on the node's minor kind) and check their active dates to find which node has a relationship that was active on the RTI's created date. If multiple nodes with the same name have a relationship active on that date, return an error.

ingestion/internals/core/rti_service.go

ingestion/cmd/app/main.go

ingestion/parse_test.go

ChanukaUOJ added 2 commits March 18, 2026 23:01

ingestion script added for node and relation creation for RTI

9fe83de

test cases added and test data refactored

d3ee700

gemini-code-assist bot reviewed Mar 19, 2026

ChanukaUOJ added 2 commits March 19, 2026 14:34

bot commit addressed

717a3d5

entity creation conflict handled

78e5267

zaeema-n reviewed Mar 20, 2026

data/Lanka Fabric Ltd/2026-03-18/1/request.csv

data/Lanka Fabric Ltd/2026-03-18/1/status.csv