Do Microfinance Institutions Prioritise Need? Evidence from Loan Allocation and Repeat Borrowing Patterns
Replication files & code for an ECO225 course project.
This repository contains the paper PDF, the conference presentation PDF, code, and supporting files for a regression and machine learning analysis of microloans from Kiva.org. The project uses loan descriptions classified as low-need or high-need to investigate whether microfinance institutions allocate loan capital in proportion to borrower financial need, and whether repeat borrowing grows or shrinks over time.
Main File:
ECO225_Code.ipynb – A Jupyter Notebook containing:
- Data cleaning and preprocessing
- Visualisations of key trends
- Feature engineering
- Machine learning pipeline for loan classification
- Regression analysis of loans and disbursement amounts
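As a rough illustration of what a loan-classification step of this kind can look like, here is a minimal sketch using a TF-IDF plus logistic regression pipeline from scikit-learn. This is an assumption for illustration only: the notebook's actual model, features, and label coding may differ, and the toy descriptions and the 1/0 coding below are made up rather than taken from the project data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy, made-up descriptions standing in for labeled loan text (not project data).
texts = [
    "needs loan to buy food and medicine for her children",
    "requests funds to pay for urgent medical care and food",
    "wants to expand his shop and buy new inventory",
    "plans to buy inventory and expand the retail shop",
]
# Hypothetical coding: 1 = high-need, 0 = low-need.
labels = [1, 1, 0, 0]

# Turn text into TF-IDF features, then fit a linear classifier on them.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)

# Predict a label for an unseen description.
pred = clf.predict(["loan to buy medicine and food"])
print(pred)
```

In a setup like this, a small labeled subsample is used to fit the classifier, which can then be applied to the remaining unlabeled loan descriptions.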
To Replicate:
- Download the required datasets (see below)
- Place them in the appropriate folder
- Open the notebook and run all cells sequentially
External (must be downloaded):
📥 Kiva Kaggle Dataset
Required files:
- loans.csv – Primary loan-level data
- loan_coords.csv – Geographical coordinates for each loan
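Before running the notebook, it can help to confirm that the downloaded files are where the code expects them. The sketch below is one way to do that; the `data` folder name is a hypothetical placeholder, since the expected location is defined by the notebook itself, and `missing_files` is a helper introduced here for illustration.

```python
from pathlib import Path

# File names taken from the list above.
REQUIRED_FILES = ["loans.csv", "loan_coords.csv"]


def missing_files(data_dir, required=REQUIRED_FILES):
    """Return the required file names that are not present in data_dir."""
    data_dir = Path(data_dir)
    return [name for name in required if not (data_dir / name).is_file()]


if __name__ == "__main__":
    # "data" is a hypothetical folder; adjust it to match the notebook's paths.
    missing = missing_files("data")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required files found.")
```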
Included in this repository:
- gdp_data.csv – Country-level GDP
- mpi_data.csv – Multidimensional Poverty Index
- llm_subsample.csv – Subsample of 2,000 LLM-labeled loans used in classification tasks
- ECO225_Paper.pdf – Paper PDF for the project
- Tech-Econference_Presentation.pdf – Presentation PDF for the project at the 5th Annual Econ-Tech Conference (2025), University of Toronto
Requirements:
- Python 3.8+
- Jupyter Notebook
- Core packages used: pandas, numpy, sklearn (installed as scikit-learn), matplotlib, seaborn
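To check an environment against the list above before opening the notebook, a short script along these lines may be useful. It is a sketch using only the standard library; `missing_packages` is a helper name introduced here, and it checks only that each package can be found, not its version.

```python
import importlib.util
import sys

# Import names of the core packages listed above.
REQUIRED = ["pandas", "numpy", "sklearn", "matplotlib", "seaborn"]


def missing_packages(names=REQUIRED):
    """Return the package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if sys.version_info < (3, 8):
        print("Python 3.8+ is required; found", sys.version.split()[0])
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All core packages found.")
```

Note that `sklearn` is the import name; the package is installed as `scikit-learn`.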
Due to GitHub’s file size limits, only essential supporting files are included. For full replication, refer to the Kaggle dataset linked above.