DarkGraph is a high-performance graph computing system targeting the training of deep learning models with trillions of parameters, developed by the DataDarkness Lab at the University of Science and Technology of China. It is designed with both high availability for industrial use and innovation for academic research in mind.
This is the preview of DarkGraph 1.0, which is still under rapid development. Please raise an issue if you need any help.
We welcome everyone interested in machine learning or graph computing to contribute code, open issues, or submit pull requests.
TODO
- Clone the repository.
- Prepare the environment. We use Anaconda to manage packages. The following command creates the conda environment to be used:
    conda env create -f environment.yml
  Please prepare the CUDA toolkit, cuDNN, and gRPC in advance.
- We use CMake to compile DarkGraph. Copy the example configuration for compilation with
    cp cmake/config.example.cmake cmake/config.cmake
  You can modify the configuration file to enable or disable the compilation of each module. For advanced users (who are not using the provided conda environment), the prerequisites for the different modules in DarkGraph are listed in the appendix.
    # modify paths and configurations in cmake/config.cmake
    # generate Makefile
    mkdir build && cd build && cmake ..
    # compile
    # make hetu, version is specified in cmake/config.cmake
    make -j 32
- Prepare the environment for running. Edit the darkgraph.exp file and, if necessary, set the environment path for python and the path to the mpirun executable (for advanced users who are not using the provided conda environment). Then execute the command
    source darkgraph.exp
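The build steps above can be collected into one script. The following is a minimal sketch, not an official DarkGraph script: the `DRY_RUN` and `JOBS` variables and the `run` and `build_darkgraph` helpers are hypothetical names introduced here for illustration. It assumes it is started from the root of an already cloned repository (the clone step is left out because the repository URL is not given here), and by default it only prints the commands so that they can be reviewed first.

```shell
#!/usr/bin/env bash
# Sketch of the build steps above; run from the repository root.
# DRY_RUN=1 (the default) only prints each command, DRY_RUN=0 executes it.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch, not part of DarkGraph
JOBS="${JOBS:-32}"        # parallel make jobs; the steps above use 32

run() {
  # Echo the command, then execute it only when DRY_RUN=0.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

build_darkgraph() {
  run conda env create -f environment.yml
  run cp cmake/config.example.cmake cmake/config.cmake
  # Edit cmake/config.cmake here if modules need to be enabled or disabled.
  run mkdir -p build
  run cd build
  run cmake ..
  run make -j "$JOBS"
}

build_darkgraph
```

Setting `DRY_RUN=0` executes the commands instead of printing them. The final `source darkgraph.exp` step is deliberately left out: it has to be run in the interactive shell that will later launch DarkGraph, because variables set inside a script do not propagate back to the calling shell.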
- Email: zezhongding@mail.ustc.edu.cn
- DarkGraph homepage: TODO
- Committers & Contributors
- Contributing to DarkGraph
- Development plan
If you are an enterprise user and find DarkGraph useful in your work, please let us know, and we will be glad to add your company logo here.
The entire codebase is under license
We have proposed numerous innovative optimization techniques around the DarkGraph system and published several papers, covering a variety of different model workloads and hardware environments.
- [SIGMOD-24] Zezhong Ding, Yongan Xiang, Shangyou Wang, Xike Xie, S. Kevin Zhou, Play like a Vertex: A Stackelberg Game Approach for Streaming Graph Partitioning, SIGMOD 2024 (CCF-A). [code]
- [TC-25] Zezhong Ding, Deyu Kong, Zhuoxu Zhang, Xike Xie, Jianliang Xu, ClusPar: A Game-Theoretic Approach for Efficient and Scalable Streaming Edge Partitioning, TC 2025 (CCF-A). [code]
- [SIGMOD-25] Yongan Xiang, Zezhong Ding, Rui Guo, Shangyou Wang, Xike Xie, S. Kevin Zhou, Capsule: an Out-of-Core Training Mechanism for Colossal GNNs, SIGMOD 2025 (CCF-A). [code]
- [SIGMOD-26] Rui Guo, Zezhong Ding, Xike Xie, Jianliang Xu, SWIFT: Enabling Large-Scale Temporal Graph Learning on a Single Machine, SIGMOD 2026 (CCF-A).
If you use DarkGraph in a scientific publication, we would appreciate citations to the following papers:
TODO
We learned and borrowed insights from several open-source projects, including TinyFlow, autodist, tf.distribute, FlexFlow, Angel, and HETU.
The prerequisites for the different modules in DarkGraph are listed as follows:
- OpenMP (*)
- CMake >= 3.24 (*)
- gRPC 1.6.3 (*)
- CUDA >= 11.8 (*)
- CUDNN >= 8.2 (*)
- MPI >= 4.1 (*)
- NCCL >= 2.19 (*)
- Pybind11 >= 2.6.2 (*)
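For users who are not using the provided conda environment, a quick check of installed versions against the minimums above may save a failed build. This is a sketch under stated assumptions: `version_ge` and `check_tool` are hypothetical helpers, the comparison relies on `sort -V` (available in GNU coreutils), and only CMake and CUDA are probed, by parsing the usual output of `cmake --version` and `nvcc --version`.

```shell
#!/usr/bin/env bash
# version_ge A B: succeed when version A >= version B.
# Assumes a `sort` that supports -V (version sort), e.g. GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# check_tool LABEL MINIMUM DETECTED: report whether DETECTED meets MINIMUM.
check_tool() {
  if [ -z "$3" ]; then
    echo "$1: not found"
  elif version_ge "$3" "$2"; then
    echo "$1: $3 (ok, need >= $2)"
  else
    echo "$1: $3 (too old, need >= $2)"
  fi
}

# Extract version numbers; the variables stay empty if a tool is missing.
cmake_ver="$(cmake --version 2>/dev/null | sed -n 's/^cmake version \([0-9.]*\).*/\1/p')"
nvcc_ver="$(nvcc --version 2>/dev/null | sed -n 's/.*release \([0-9.]*\).*/\1/p')"

check_tool "CMake" 3.24 "$cmake_ver"
check_tool "CUDA" 11.8 "$nvcc_ver"
```

The same `check_tool` call can be repeated for the other entries once a reliable way to query their versions on the target machine is known.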
