I2E: Real-Time Image-to-Event Conversion for High-Performance Spiking Neural Networks

Paper AAAI 2026 Google Scholar


🚀 Introduction

This is the official PyTorch implementation of the paper I2E: Real-Time Image-to-Event Conversion for High-Performance Spiking Neural Networks, accepted for Oral Presentation at AAAI 2026.

I2E is a pioneering framework that bridges the data scarcity gap in neuromorphic computing. By simulating microsaccadic eye movements via highly parallelized convolution, I2E converts static images into high-fidelity event streams in real-time (>300x faster than prior methods).
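To make the idea concrete, here is a toy sketch of turning a static image into ON/OFF event frames by shifting it slightly and thresholding the brightness change at each pixel. It is not the I2E algorithm. The four one-pixel shifts, the use of `np.roll` in place of a parallelized convolution, the function name `toy_image_to_events`, and the way `ratio` scales the threshold are all assumptions made for illustration only.

```python
import numpy as np

def toy_image_to_events(gray, ratio=0.1):
    """Toy conversion: gray is (H, W) in [0, 1]; returns (T, 2, H, W) uint8."""
    # Hypothetical 4-step "microsaccade": shift by one pixel right, down, left, up.
    shifts = [(0, 1), (1, 0), (0, -1), (-1, 0)]
    # Assumed meaning of the ratio: a fraction of the image's dynamic range.
    threshold = ratio * (gray.max() - gray.min() + 1e-8)
    frames = []
    for dy, dx in shifts:
        moved = np.roll(gray, shift=(dy, dx), axis=(0, 1))
        diff = moved - gray              # brightness change seen by each pixel
        on = diff > threshold            # ON channel: brightness increased
        off = diff < -threshold          # OFF channel: brightness decreased
        frames.append(np.stack([on, off]))
    return np.stack(frames).astype(np.uint8)

rng = np.random.default_rng(0)
image = rng.random((32, 32))
events = toy_image_to_events(image, ratio=0.1)
print(events.shape)  # (4, 2, 32, 32)
```

The output layout (time steps, two polarity channels, height, width) matches the `(T, C, H, W)` shape used by the dataset loader in the Quick Start section.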

✨ Key Highlights

  • SOTA Performance: Achieves 60.50% top-1 accuracy on Event-based ImageNet.
  • Sim-to-Real Transfer: Pre-training on I2E data enables 92.5% accuracy on real-world CIFAR10-DVS, establishing a new SOTA benchmark.
  • Real-Time Conversion: Uniquely enables on-the-fly data augmentation for deep SNN training.

📄 Abstract

Spiking neural networks (SNNs) promise highly energy-efficient computing, but their adoption is hindered by a critical scarcity of event-stream data. This work introduces I2E, an algorithmic framework that resolves this bottleneck by converting static images into high-fidelity event streams. By simulating microsaccadic eye movements with a highly parallelized convolution, I2E achieves a conversion speed over 300x faster than prior methods, uniquely enabling on-the-fly data augmentation for SNN training. The framework's effectiveness is demonstrated on large-scale benchmarks. An SNN trained on the generated I2E-ImageNet dataset achieves a state-of-the-art accuracy of 60.50%. Critically, this work establishes a powerful sim-to-real paradigm where pre-training on synthetic I2E data and fine-tuning on the real-world CIFAR10-DVS dataset yields an unprecedented accuracy of 92.5%. This result validates that synthetic event data can serve as a high-fidelity proxy for real sensor data, bridging a long-standing gap in neuromorphic engineering. By providing a scalable solution to the data problem, I2E offers a foundational toolkit for developing high-performance neuromorphic systems. The open-source algorithm and all generated datasets are provided to accelerate research in the field.

👁️ Visualization

Below is the visualization of the conversion process from static RGB images to dynamic event streams. We illustrate the high-fidelity conversion with four examples.

More than 200 additional visualization comparisons can be found in Visualization.md.

[Figure: four example pairs, each showing an original image (Original 1 to 4) next to its converted event stream (Converted 1 to 4).]

📦 Dataset Catalog

We provide a comprehensive collection of standard benchmarks converted into event streams via the I2E algorithm.

1. Standard Benchmarks (Classification)

| Config Name | Original Source | Resolution $(H, W)$ | I2E Ratio | Event Rate | Samples (Train/Val) |
|---|---|---|---|---|---|
| I2E-CIFAR10 | CIFAR-10 | 128 x 128 | 0.07 | 5.86% | 50k / 10k |
| I2E-CIFAR100 | CIFAR-100 | 128 x 128 | 0.07 | 5.76% | 50k / 10k |
| I2E-ImageNet | ILSVRC2012 | 224 x 224 | 0.12 | 6.66% | 1.28M / 50k |

2. Transfer Learning & Fine-grained

| Config Name | Original Source | Resolution $(H, W)$ | I2E Ratio | Event Rate | Samples |
|---|---|---|---|---|---|
| I2E-Caltech101 | Caltech-101 | 224 x 224 | 0.12 | 6.25% | 8.677k |
| I2E-Caltech256 | Caltech-256 | 224 x 224 | 0.12 | 6.04% | 30.607k |
| I2E-Mini-ImageNet | Mini-ImageNet | 224 x 224 | 0.12 | 6.65% | 60k |

3. Small Scale / Toy

| Config Name | Original Source | Resolution $(H, W)$ | I2E Ratio | Event Rate | Samples |
|---|---|---|---|---|---|
| I2E-MNIST | MNIST | 32 x 32 | 0.10 | 9.56% | 60k / 10k |
| I2E-FashionMNIST | Fashion-MNIST | 32 x 32 | 0.15 | 10.76% | 60k / 10k |
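The "Event Rate" column is not defined in this README. A plausible reading, assumed here, is the fraction of entries in a `(T, C, H, W)` event tensor that are active. The helper below (`event_rate` is a name introduced for this sketch) computes that fraction so you can compare a decoded sample against the catalog values.

```python
import numpy as np

def event_rate(events):
    """Fraction of active (non-zero) entries in an event tensor."""
    events = np.asarray(events)
    return float(np.count_nonzero(events)) / events.size

# Dummy (T, C, H, W) tensor: 8 time steps, 2 polarities, 4 x 4 pixels.
demo = np.zeros((8, 2, 4, 4), dtype=np.uint8)
demo[0, :, :2, :] = 1                  # 16 active entries out of 256
print(f"{event_rate(demo):.2%}")       # 6.25%
```

If the authors measure sparsity differently (for example per time step or per polarity), the numbers will not line up exactly.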

🔜 Coming Soon: Object Detection and Semantic Segmentation datasets.

Download Links:

  • Hugging Face

  • Baidu Netdisk

🚀 Quick Start

You do not need to download any extra scripts. Just copy the code below. It handles the binary unpacking (converting Parquet bytes to PyTorch Tensors) automatically.

import io
import torch
import numpy as np
from datasets import load_dataset
from torch.utils.data import Dataset, DataLoader

# ==================================================================
# 1. Core Decoding Function (Handles the binary packing)
# ==================================================================
def unpack_event_data(item, use_io=True):
    """
    Decodes the custom binary format:
    Header (8 bytes) -> Shape (T, C, H, W) -> Body (Packed Bits)
    """
    if use_io:
        with io.BytesIO(item['data']) as f:
            raw_data = np.load(f)
    else:
        raw_data = np.load(item)
        
    # Parse the header: the first 8 bytes hold four uint16 shape values.
    header_size = 4 * 2
    shape_header = raw_data[:header_size].view(np.uint16)
    original_shape = tuple(int(v) for v in shape_header)  # (T, C, H, W) as Python ints

    # Parse the body: each byte stores 8 binary events.
    packed_body = raw_data[header_size:]
    unpacked = np.unpackbits(packed_body)

    # Keep only the valid bits; the last byte may contain padding.
    num_elements = int(np.prod(original_shape))
    event_flat = unpacked[:num_elements]
    event_data = event_flat.reshape(original_shape).astype(np.float32).copy()
    
    return torch.from_numpy(event_data)

# ==================================================================
# 2. Dataset Wrapper
# ==================================================================
class I2E_Dataset(Dataset):
    def __init__(self, cache_dir, config_name, split='train', transform=None, target_transform=None):
        print(f"🚀 Loading {config_name} [{split}] from Hugging Face...")
        self.ds = load_dataset('UESTC-BICS/I2E', config_name, split=split, cache_dir=cache_dir, keep_in_memory=False)
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        return len(self.ds)

    def __getitem__(self, idx):
        item = self.ds[idx]
        event = unpack_event_data(item)
        label = item['label']
        if self.transform:
            event = self.transform(event)
        if self.target_transform:
            label = self.target_transform(label)
        return event, label

# ==================================================================
# 3. Run Example
# ==================================================================
if __name__ == "__main__":
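    import os
    # Optional: use an HF mirror server in regions where huggingface.co is slow.
    # The endpoint is read when `datasets` is imported, so for this to take effect,
    # export HF_ENDPOINT in the shell or move this assignment above the `datasets` import.
    os.environ['HF_ENDPOINT'] = 'https://hf-mirror.com'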

    DATASET_NAME = 'I2E-CIFAR10'                            # Choose your config: 'I2E-CIFAR10', 'I2E-ImageNet', etc.
    MODEL_PATH = 'Your cache path here'                     # e.g., './hf_datasets_cache/'
    
    train_dataset = I2E_Dataset(MODEL_PATH, DATASET_NAME, split='train')
    val_dataset = I2E_Dataset(MODEL_PATH, DATASET_NAME, split='validation')

    train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True, num_workers=32, persistent_workers=True)
    val_loader = DataLoader(val_dataset, batch_size=32, shuffle=False, num_workers=32, persistent_workers=True)

    events, labels = next(iter(train_loader))
    print(f"✅ Loaded Batch Shape: {events.shape}") # Expect: [32, T, 2, H, W]
    print(f"✅ Labels: {labels}")
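The binary layout decoded by `unpack_event_data` (an 8-byte header of four `uint16` shape values followed by bit-packed events, stored as a `.npy` byte array) can be checked with a round trip. The encoder below is a hypothetical inverse inferred from the decoder, not the authors' packing code. `pack_event_data` and `unpack_bytes` are names introduced for this sketch, and only NumPy is needed.

```python
import io
import numpy as np

def pack_event_data(events):
    """Hypothetical encoder: (T, C, H, W) binary array -> bytes of a .npy file."""
    events = np.asarray(events)
    assert events.ndim == 4, "expected (T, C, H, W)"
    header = np.asarray(events.shape, dtype=np.uint16).view(np.uint8)  # 4 x uint16 = 8 bytes
    body = np.packbits(events.astype(np.uint8).ravel())               # 8 events per byte
    buffer = io.BytesIO()
    np.save(buffer, np.concatenate([header, body]))
    return buffer.getvalue()

def unpack_bytes(raw_bytes):
    """Same steps as unpack_event_data, but returning a NumPy array."""
    raw = np.load(io.BytesIO(raw_bytes))
    shape = tuple(int(v) for v in raw[:8].view(np.uint16))
    bits = np.unpackbits(raw[8:])
    return bits[: int(np.prod(shape))].reshape(shape)

rng = np.random.default_rng(0)
events = (rng.random((3, 2, 5, 7)) < 0.1).astype(np.uint8)  # 210 bits -> 27 body bytes
restored = unpack_bytes(pack_event_data(events))
print(np.array_equal(events, restored))  # True
```

Because 210 is not a multiple of 8, the last body byte carries padding bits, which is why the decoder truncates to the number of elements given by the header before reshaping. A `uint16` header also limits each dimension to 65535.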

🛠️ Preprocessing Protocol

To ensure reproducibility, we specify the exact preprocessing pipeline applied to the static images before I2E conversion.

The (H, W) in the code below corresponds to the "Resolution" column in the Dataset Catalog above.

import torch
from torchvision.transforms import v2

H, W = 128, 128  # Example: I2E-CIFAR10; use the "Resolution" value for your dataset

# Standard Pre-processing Pipeline used for I2E generation
transform_train = v2.Compose([
    # Ensure 3-channel RGB (crucial for grayscale datasets like MNIST)
    v2.Lambda(lambda x: x.convert('RGB')),
    v2.PILToTensor(),
    v2.Resize((H, W), interpolation=v2.InterpolationMode.BICUBIC),
    v2.ToDtype(torch.float32, scale=True),
])

🛠️ Requirements

  • python==3.10
  • pytorch==2.2.0
  • torchvision==0.17.0
  • spikingjelly (dev version between 0.0.0.0.14 and 0.0.0.1.0)
  • timm==1.0.19

Environment Setup

We recommend using Anaconda to create a virtual environment:

conda create -n i2e python=3.10
conda activate i2e

Install PyTorch and dependencies:

# Install PyTorch (Choose based on your CUDA version)
# CUDA 11.8
conda install pytorch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 pytorch-cuda=11.8 -c pytorch -c nvidia

# Install SpikingJelly and timm
pip install timm==1.0.19
pip install spikingjelly

💻 Usage

All training scripts are located in the ./Train Script folder. We provide training code for Baseline-I, Baseline-II, and CIFAR10-DVS, as well as inference code for all provided weights.

Training (Baseline-II)

To train the models using the Baseline-II setting (with full augmentation), use the following commands. Please ensure you update the --dataset_path (or -dp) argument to point to your local dataset location.

CIFAR-10

python train.py -bz 128 -dp '/path/to/CIFAR10/' --dataset 'cifar10' -n 'CIFAR10' -cn 10 -e 256 --lr 0.1 --lr_min 5e-5 -wd 2e-4 --label_smooth 0.1 --model 'resnet18' --ratio 0.07 --shuffle 4 -p 30

CIFAR-100

python train.py -bz 128 -dp '/path/to/CIFAR100/' --dataset 'cifar100' -n 'CIFAR100' -cn 100 -e 256 --lr 0.1 --lr_min 5e-5 -wd 2e-4 --label_smooth 0.1 --model 'resnet18' --ratio 0.07 --shuffle 4 -p 30

ImageNet

python train.py -bz 128 -dp '/path/to/ImageNet/' --dataset 'imagenet' -n 'ImageNet' -cn 1000 -e 128 --lr 0.1 --lr_min 5e-5 -wd 1e-5 --label_smooth 0.1 --model 'resnet18' --ratio 0.12 --shuffle 4 -p 200 --multiprocessing_distributed
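All three commands pass `--label_smooth 0.1`. Assuming this flag follows the common label-smoothing convention (the one-hot target is mixed with a uniform distribution over classes), the loss can be sketched as below. The training script may implement it differently, and `smoothed_cross_entropy` is a name introduced here for illustration.

```python
import numpy as np

def smoothed_cross_entropy(logits, target, smoothing=0.1):
    """Cross-entropy against a one-hot target mixed with a uniform distribution."""
    logits = np.asarray(logits, dtype=np.float64)
    num_classes = logits.shape[-1]
    shifted = logits - logits.max()                        # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum())    # log-softmax
    soft_target = np.full(num_classes, smoothing / num_classes)
    soft_target[target] += 1.0 - smoothing
    return float(-(soft_target * log_probs).sum())

# 10 classes, as with `-cn 10`; class 0 is both the target and the top prediction.
logits = np.array([4.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
plain = smoothed_cross_entropy(logits, target=0, smoothing=0.0)
smooth = smoothed_cross_entropy(logits, target=0, smoothing=0.1)
print(plain < smooth)  # True
```

Smoothing raises the loss on confident correct predictions, which discourages the network from pushing logits to extremes.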

🤖 Pre-trained Models

We provide pre-trained models for I2E-CIFAR and I2E-ImageNet.

  • Hugging Face
  • Baidu Netdisk

📊 Main Results & Model Zoo

The experimental settings for the methods listed below are as follows:

  • Baseline-I: Training from scratch with minimal augmentation.
  • Baseline-II: Training from scratch with full augmentation (random crop, etc.), enabled by I2E.
  • Transfer-I: Fine-tuning on target dataset after pre-training on a source dataset.
  • Transfer-II: Fine-tuning on target dataset after pre-training on I2E-CIFAR10.
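
| Dataset | Structure | Method | Top-1 Acc | Downloadable |
|---|---|---|---|---|
| CIFAR10-DVS | MS-ResNet18 | Baseline | 65.6% | ✔ |
| CIFAR10-DVS | MS-ResNet18 | Transfer-I | 83.1% | ✔ |
| CIFAR10-DVS | MS-ResNet18 | Transfer-II | 92.5% | ✔ |
| I2E-CIFAR10 | MS-ResNet18 | Baseline-I | 85.07% | ✔ |
| I2E-CIFAR10 | MS-ResNet18 | Baseline-II | 89.23% | ✔ |
| I2E-CIFAR10 | MS-ResNet18 | Transfer-I | 90.86% | ✔ |
| I2E-CIFAR100 | MS-ResNet18 | Baseline-I | 51.32% | ✔ |
| I2E-CIFAR100 | MS-ResNet18 | Baseline-II | 60.68% | ✔ |
| I2E-CIFAR100 | MS-ResNet18 | Transfer-I | 64.53% | ✔ |
| I2E-ImageNet | MS-ResNet18 | Baseline-I | 48.30% | ✔ |
| I2E-ImageNet | MS-ResNet18 | Baseline-II | 57.97% | ✔ |
| I2E-ImageNet | MS-ResNet18 | Transfer-I | 59.28% | ✔ |
| I2E-ImageNet | MS-ResNet34 | Baseline-II | 60.50% | ✔ |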
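The Transfer settings start fine-tuning from pre-trained weights. One common way to do this in PyTorch, shown here as a sketch and not as the repository's own loading code, is to copy every pre-trained tensor whose name and shape match the target model and leave the rest (typically the classifier head, when class counts differ) at their fresh initialization. `load_pretrained_backbone` and `tiny_net` are hypothetical names, and the small MLP is only a stand-in for MS-ResNet18.

```python
import torch
from torch import nn

def load_pretrained_backbone(model, state_dict):
    """Copy every pretrained tensor whose name and shape match the target model."""
    own = model.state_dict()
    compatible = {k: v for k, v in state_dict.items()
                  if k in own and v.shape == own[k].shape}
    model.load_state_dict(compatible, strict=False)
    return sorted(set(own) - set(compatible))   # parameters left at their fresh init

def tiny_net(num_classes):
    # Stand-in for MS-ResNet18; the real architecture lives in the repository.
    return nn.Sequential(nn.Flatten(), nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, num_classes))

source = tiny_net(num_classes=100)   # e.g. pre-trained on a 100-class source dataset
target = tiny_net(num_classes=10)    # to be fine-tuned on a 10-class target dataset
fresh = load_pretrained_backbone(target, source.state_dict())
print(fresh)  # ['3.bias', '3.weight']
```

When source and target have the same number of classes, as with I2E-CIFAR10 and CIFAR10-DVS, every tensor matches and the returned list is empty.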

📜 Citation

If you find our code useful for your research, or use the I2E algorithm, or use the provided I2E-Datasets, please consider citing:

@inproceedings{ma2026i2e,
  title={I2E: Real-Time Image-to-Event Conversion for High-Performance Spiking Neural Networks},
  author={Ma, Ruichen and Meng, Liwei and Qiao, Guanchao and Ning, Ning and Liu, Yang and Hu, Shaogang},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={40},
  number={3},
  pages={1982--1990},
  year={2026}
}

🖼️ Poster

[Image: poster]
