Code repository of "Transferable Human Mobility Network Reconstruction with neuroGravity, 2025."
NeuroGravity is a physics-informed deep learning model that combines the interpretability of the gravity model with the adaptability of graph neural networks. It generates robust mobility flow estimates from limited observations and can be transferred to cities without any observations.
- Tasks and Challenges
- Repository Contents
- Getting Started
- NeuroGravity Estimated Mobility Networks
- Collaborators
Note on Data Preprocessing & Reproducibility: The computational pipeline for preprocessing raw Call Detail Records (CDRs) to generate the training data is not included in this repository. Please refer to the sparkmobility repository if you possess raw CDR records and wish to construct the OD matrix from scratch.
In most underdeveloped regions, comprehensive travel surveys and mobility data are absent, which creates a long-standing need to reconstruct human mobility networks from limited data.
- Reconstruct Mobility Network from Partial Observation.
  - Completely random observation (`mask_mode=0`)
  - Observation from/to partial nodes (`mask_mode=1`)
  - Internal observation (`mask_mode=2`)
We aim to tackle extremely data-scarce scenarios in which only the internal flows among 10% of regions (accounting for less than 1% of all region pairs) are observed.
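The three observation settings above can be sketched as boolean masks over the OD matrix. This is a minimal illustration and not the repository's implementation: the function name `observation_mask`, its `ratio` and `seed` parameters, and the choice to ignore self-flows are assumptions made for the example. Only the `mask_mode` values come from the text.

```python
import numpy as np

def observation_mask(n_regions, mask_mode, ratio=0.1, seed=0):
    """Return a boolean (n_regions x n_regions) matrix; True marks an observed OD pair.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    mask = np.zeros((n_regions, n_regions), dtype=bool)
    if mask_mode == 0:
        # Completely random observation: each OD pair is observed independently.
        mask = rng.random((n_regions, n_regions)) < ratio
    else:
        # Sample a subset containing ratio * n_regions regions.
        k = max(1, int(round(ratio * n_regions)))
        nodes = rng.choice(n_regions, size=k, replace=False)
        if mask_mode == 1:
            # Observation from/to partial nodes: every flow touching a sampled region.
            mask[nodes, :] = True
            mask[:, nodes] = True
        elif mask_mode == 2:
            # Internal observation: only flows with both ends in the sampled regions.
            mask[np.ix_(nodes, nodes)] = True
        else:
            raise ValueError("mask_mode must be 0, 1, or 2")
    np.fill_diagonal(mask, False)  # assumption: self-flows are ignored in this sketch
    return mask

mask = observation_mask(n_regions=200, mask_mode=2, ratio=0.1)
n_pairs = 200 * 199  # ordered region pairs without self-pairs
print(mask.sum() / n_pairs)  # 20*19 / (200*199) ≈ 0.0095, below 1% of region pairs
```

With 10% of regions sampled, the internal pairs make up roughly 0.1 × 0.1 of all pairs, which is why the internal-observation setting covers less than 1% of region pairs.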
- Cross-City Mobility Flow Generation.
  - To ensure applicability, we only use widely available data as model inputs, namely: `population`, `built-environment features`, and `distance`.
  - The spatial segregation of income can block mobility flows and reduce cross-city flow generation accuracy. Cities with relatively minor spatial income segregation, such as Boston, are normally better choices as the source city for training flow reconstruction models.
Leveraging the outstanding generalizability of neuroGravity, we reconstructed human mobility networks for over 1,200 cities on a global scale.
- Estimate Regional Socio-economic and Livability Indicators
  - Node embeddings extracted by neuroGravity during flow reconstruction can help predict regional attributes such as `household income`, `education attainment`, `household carbon footprint`, `nitrogen dioxide concentration`, and `radius of gyration`.
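As a rough illustration of how node embeddings might be evaluated as predictors of a regional attribute, the sketch below runs a k-fold cross-validated ridge regression in plain NumPy. The regressor, the function name `ridge_cv_r2`, and the synthetic embeddings and target are all assumptions made for the example. They are not the repository's notebook or data.

```python
import numpy as np

def ridge_cv_r2(embeddings, target, n_folds=5, alpha=1.0, seed=0):
    """Mean out-of-fold R^2 of a ridge regression from embeddings to one attribute."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(target)), n_folds)
    scores = []
    for i in range(n_folds):
        test = folds[i]
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        # Standardise with training statistics only, to avoid leakage.
        mu = embeddings[train].mean(axis=0)
        sd = embeddings[train].std(axis=0) + 1e-12
        x_tr = (embeddings[train] - mu) / sd
        x_te = (embeddings[test] - mu) / sd
        y_mean = target[train].mean()
        # Closed-form ridge solution: w = (X^T X + alpha I)^-1 X^T (y - mean(y)).
        gram = x_tr.T @ x_tr + alpha * np.eye(x_tr.shape[1])
        w = np.linalg.solve(gram, x_tr.T @ (target[train] - y_mean))
        pred = x_te @ w + y_mean
        ss_res = np.sum((target[test] - pred) ** 2)
        ss_tot = np.sum((target[test] - target[test].mean()) ** 2)
        scores.append(1.0 - ss_res / ss_tot)
    return float(np.mean(scores))

# Synthetic stand-ins for region embeddings and an attribute such as household income.
rng = np.random.default_rng(1)
emb = rng.normal(size=(300, 16))
income = emb @ rng.normal(size=16) + 0.1 * rng.normal(size=300)
print(ridge_cv_r2(emb, income))
```

Any regressor could replace the ridge step. The point of the sketch is only that scores are computed on held-out regions.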
This repository contains:
- Codebase of neuroGravity: Implementation of neuroGravity, including training, reconstruction, and cross-city generation scripts.
- Trained neuroGravity checkpoints: 20 neuroGravity checkpoints independently trained on sub-sampled Boston mobility networks, along with a trained connection predictor, ready to be applied to generate flows for other cities.
- Scripts for Data Preparation: Scripts to prepare OSM feature & population data for neuroGravity inference.
- Socio-economic Inference: Jupyter notebook for socio-economic inference cross-validation.
- Spatial Income Segregation Index: Python code for calculating the spatial income segregation index
- Sample Data:
  - Sample data for mobility flow reconstruction & transfer (since the CDR/LBS data is proprietary, we only provide neuroGravity-generated synthetic flow data here to demonstrate how to use the neuroGravity model).
- Sample data for socio-economic inference (Boston).
- Baseline Models:
| Baseline Model | Description |
|---|---|
| Gravity model | Erlander, Sven, and Neil F. Stewart. The gravity model in transportation analysis: theory and extensions. Vol. 3. Vsp, 1990. |
| meta-Gravity | The deep-parameterized Gravity model, referred to as meta-Gravity, is tested independently as a baseline. |
| Deep Gravity | Simini, Filippo, Gianni Barlacchi, Massimiliano Luca, and Luca Pappalardo. "A deep gravity model for mobility flows generation." Nature communications 12, no. 1 (2021): 6576. |
| GNN Model | Zhang, Jiawei, Haopeng Zhang, Congying Xia, and Li Sun. "Graph-bert: Only attention is needed for learning graph representations." arXiv preprint arXiv:2001.05140 (2020). |
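For orientation, the classical gravity model listed in the table estimates the flow between two regions from their populations and the distance between them, in the textbook form T_ij = K · P_i^α · P_j^β / d_ij^γ. The sketch below implements that generic form. The function name and the default exponents are illustrative assumptions, not the repository's baseline code or fitted parameters.

```python
import numpy as np

def gravity_flows(population, distance, alpha=1.0, beta=1.0, gamma=2.0, scale=1.0):
    """Textbook gravity model: T_ij = scale * P_i^alpha * P_j^beta / d_ij^gamma."""
    pop = np.asarray(population, dtype=float)
    dist = np.asarray(distance, dtype=float)
    flows = np.zeros_like(dist)
    valid = dist > 0  # skip self-pairs (zero distance) to avoid division by zero
    mass = np.outer(pop ** alpha, pop ** beta)
    flows[valid] = scale * mass[valid] / dist[valid] ** gamma
    return flows

pop = [1000, 500, 200]
dist = np.array([[0.0, 2.0, 4.0],
                 [2.0, 0.0, 1.0],
                 [4.0, 1.0, 0.0]])
flows = gravity_flows(pop, dist)
print(flows[0, 1])  # 1000 * 500 / 2**2 = 125000.0
```

In practice the scale and exponents are fitted to observed flows, which is the step that becomes difficult when observations are scarce.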
- Python 3.8 or higher
- Required libraries: See requirements.txt
- Installation guidance: Installing directly from `requirements.txt` may cause unexpected issues due to version conflicts. It is recommended to install the required libraries individually:
  - PyTorch: Follow the instructions at PyTorch Installation
  - PyTorch Geometric (PYG): Follow the instructions at PYG Installation
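After installing the libraries individually, a quick check of what is actually installed can help catch version conflicts early. The snippet below is a small sketch using only the standard library. The helper name `installed_version` is made up for the example, and the distribution names `torch` and `torch-geometric` are the usual PyPI names, assumed here rather than taken from `requirements.txt`.

```python
import sys
from importlib import metadata

def installed_version(package):
    """Return the installed version string of a distribution, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

print("Python", sys.version.split()[0], "(3.8 or higher expected)")
for name in ("torch", "torch-geometric"):  # assumed PyPI distribution names
    print(name, installed_version(name) or "not installed")
```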
Note that because the CDR data is proprietary, we only provide neuroGravity-generated synthetic flow data here to demonstrate how to run the code.
- Reconstruct PORTO mobility network with internal flow observation from 10% of regions (along with baseline models): `python main.py reconstruction`
  - Expected runtime: < 5 minutes (with GPU acceleration).
  - Output directory: `./data/PORTO/experiments/reconstruction_porto/`
  - Key output files: flow estimates `{models}_reconstruction_test_outputs.csv`, along with checkpoints and logs (evaluation metrics).
- Train neuroGravity ensemble model in Boston and generate PORTO mobility network (along with baseline models): `python main.py train_transfer`
  - Expected runtime: < 10 minutes (with GPU acceleration).
  - Output directory: `./data/BOSTON/experiments/train_and_transfer_from_boston_to_porto/`
  - Key output files: flow estimates `{models}_estimation_in_PORTO.csv`, along with checkpoints and logs (evaluation metrics).
The `reconstruction` and `transfer` specifications direct the code to load the corresponding subparsers from `./data/reconstruction_subparser.py` and `./data/train_and_transfer_subparser.py`, respectively.
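The way a task name on the command line selects a task-specific set of arguments can be illustrated with `argparse` subcommands. This is a generic sketch and not the contents of the repository's subparser files: the subcommand names are the ones used with `main.py` in this README, while the options `--mask_mode` and `--source_city` and their defaults are hypothetical.

```python
import argparse

def build_parser():
    """Build a parser whose first positional argument selects a task-specific subparser."""
    parser = argparse.ArgumentParser(prog="main.py")
    sub = parser.add_subparsers(dest="task", required=True)

    # Hypothetical options; the real subparsers may define different arguments.
    rec = sub.add_parser("reconstruction")
    rec.add_argument("--mask_mode", type=int, default=2, choices=[0, 1, 2])

    tr = sub.add_parser("train_transfer")
    tr.add_argument("--source_city", default="BOSTON")

    sub.add_parser("transfer")
    return parser

args = build_parser().parse_args(["reconstruction", "--mask_mode", "2"])
print(args.task, args.mask_mode)  # reconstruction 2
```

Keeping each task's arguments in its own subparser means defaults for one task can be changed without touching the others.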
1. Prepare and process the `division boundary`, `OSM feature`, and `population` data for the target cities following the code in `./data_preparation/`:
   - Generate the target city list along with paths to essential data with `./data_preparation/0_city_list_prepare.ipynb`
   - Process and save target city data with `./data_preparation/1_paralleled_prepare_with_memory_monitor.py`
   - Essential data required for mobility network generation:
     - OSM and population profile: `attr_df.csv`
     - Administrative division boundary (polygon) geojson: `{CITY_NAME}.geojson`
     - OD pair distance (km) and flow (if the ground truth label is available): `distance(_and_flow).csv`

   Note that the order of administrative divisions in `attr_df.csv` is consistent with that in `{CITY_NAME}.geojson`, and their indices (starting from 0) correspond to the O and D identifiers in `distance.csv`.
2. Prepare a targets json file that specifies the target city name and path.
3. [Optional] Modify the default path for the `transfer_targets` argument in ./data/transfer_subparser.py if you assigned a new name or path for your new targets json file in step 2.
4. Execute the following command and check the target data folder for flow estimation: `python main.py transfer`

Leveraging the generalizability of neuroGravity, we applied the ensemble neuroGravity model trained on the data-abundant city of Boston to generate mobility flows for over 1,200 cities and regions on a global scale. We released the estimated flows along with the OSM features and administrative divisions in neuroGravity Mobility Network Estimations.
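When preparing data for a new target city, the index convention described in the data preparation step (row order in `attr_df.csv` defines region indices starting from 0, and the O and D identifiers in `distance.csv` refer to those indices) can be sanity-checked before running the transfer. The sketch below uses in-memory CSV text so that it is self-contained. The helper name `check_od_indices` and the column names `O`, `D`, `population`, and `distance` are assumptions, since the exact headers are not stated here.

```python
import csv
import io

def check_od_indices(attr_csv, distance_csv, o_col="O", d_col="D"):
    """Return the region count and the OD rows whose indices fall outside 0..n-1."""
    n_regions = sum(1 for _ in csv.DictReader(attr_csv))
    bad = []
    for row_no, row in enumerate(csv.DictReader(distance_csv)):
        o, d = int(row[o_col]), int(row[d_col])
        if not (0 <= o < n_regions and 0 <= d < n_regions):
            bad.append((row_no, o, d))
    return n_regions, bad

# Tiny made-up example: three regions, and one OD row pointing at a missing region 3.
attr = io.StringIO("population\n1200\n800\n450\n")
dist = io.StringIO("O,D,distance\n0,1,2.5\n1,2,1.0\n2,3,4.2\n")
n, bad = check_od_indices(attr, dist)
print(n, bad)  # 3 [(2, 2, 3)]
```

For real files, pass open file handles (for example `open("attr_df.csv", newline="")`) instead of the `io.StringIO` objects.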
- Jinming Yang
- Shaoyu Huang
- Zongyuan Huang

Supervisors

- Yanyan Xu
- Marta C. González








