Deep Physiological-Behavioral Representation Learning for Video-Based Hand Gesture Authentication

PyTorch implementation of the paper:

Deep Physiological-Behavioral Representation Learning for Video-Based Hand Gesture Authentication

Wenwei Song, Xiaorong Gao, Yufeng Zhang, Jinlong Li, Wenxiong Kang, and Zhixue Wang*.

Main Contribution

Dynamic hand gestures encode rich physiological and behavioral characteristics, providing a promising biometric trait for reliable authentication. Existing studies primarily improve video-based gesture authentication by designing network architectures, constructing behavioral pseudo-modalities, and optimizing loss functions. Following this paradigm, PB-Net adopts a decoupled analysis and complementary fusion strategy for the two characteristics, achieving competitive performance. However, its modeling of fine-grained identity characteristics remains limited. In this work, we revisit PB-Net and propose PB-Net v2 by rethinking the modeling requirements of physiological and behavioral characteristics. Specifically, we refine the data-tailoring strategy, including behavioral pseudo-modality design, to reduce redundancy while preserving richer identity information. We then enhance the physiological and behavioral branches to extract more complementary spatiotemporal physiological features and more stable behavioral representations, respectively. Moreover, we improve the feature fusion module to mitigate branch-specific bias while facilitating reliability-aware feature fusion. Extensive experiments on the SCUT-DHGA dataset demonstrate the effectiveness of the proposed improvements. PB-Net v2 consistently achieves the lowest equal error rates among 21 state-of-the-art models under four evaluation protocols.

Overall architecture of PB-Net v2. C1 denotes the Conv1 layer of ResNet, and LxBy indicates the y-th Block in the x-th ResNet Layer. TC represents Temporal Convolution, while TM denotes Temporal Max Pooling. Norm indicates L2 normalization. $\mathcal{L}_1$ to $\mathcal{L}_3$ correspond to three AMSoftmax loss functions.
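For readers unfamiliar with the AMSoftmax (additive-margin softmax) loss named in the caption, its commonly used form is reproduced below. The notation is the standard one and is not taken from this repository: $N$ is the batch size, $x_i$ a feature vector with label $y_i$, $W_j$ the weight vector of class $j$, $s$ a scale factor, and $m$ the additive margin. The values of $s$ and $m$ used for PB-Net v2, and how $\mathcal{L}_1$ to $\mathcal{L}_3$ are combined, are not stated here.

$$
\mathcal{L}_{\mathrm{AMS}} = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{s(\cos\theta_{i,y_i}-m)}}{e^{s(\cos\theta_{i,y_i}-m)}+\sum_{j\neq y_i}e^{s\cos\theta_{i,j}}},
\qquad
\cos\theta_{i,j}=\frac{W_j^{\top}x_i}{\lVert W_j\rVert\,\lVert x_i\rVert}
$$

Because the cosine is computed from normalized vectors, the L2 normalization (Norm) step in the architecture pairs naturally with this loss.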

Comparisons with SOTAs

To comprehensively evaluate the effectiveness of PB-Net v2, we compare it with 21 SOTA video understanding models on the SCUT-DHGA dataset. The performance of some representative models is shown in the following figure. All EERs in the figure are averages over six test configurations in the cross-session setting.

Comparison of PB-Net v2 with representative models under MG and UMG protocols using the AMSoftmax loss. The values at the bottom and top of each bar indicate the MG and UMG EERs, respectively.
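The equal error rate (EER) reported above is the operating point at which the false acceptance rate equals the false rejection rate. The snippet below is a minimal sketch of how an EER can be estimated from two lists of similarity scores. It is not the evaluation code used for the paper: the function name `equal_error_rate`, the assumption that a higher score means "more likely the same user", and the simple threshold sweep are all illustrative choices.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER from similarity scores (higher = more similar).

    genuine:  scores of probe/template pairs from the same user
    impostor: scores of probe/template pairs from different users
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every observed score, plus one value above
    # the maximum so that "reject everything" is also considered.
    thresholds = sorted(set(genuine) | set(impostor))
    thresholds.append(thresholds[-1] + 1.0)

    best_gap = float("inf")
    eer = None
    for t in thresholds:
        # A pair is accepted when its score is at least the threshold.
        far = sum(s >= t for s in impostor) / len(impostor)  # false accepts
        frr = sum(s < t for s in genuine) / len(genuine)     # false rejects
        gap = abs(far - frr)
        if gap < best_gap:
            # Report the midpoint at the threshold where FAR and FRR are closest.
            best_gap = gap
            eer = (far + frr) / 2
    return eer


if __name__ == "__main__":
    genuine_scores = [0.9, 0.8, 0.7, 0.4]
    impostor_scores = [0.6, 0.3, 0.2, 0.1]
    print(equal_error_rate(genuine_scores, impostor_scores))
```

With finite score lists, FAR and FRR rarely cross exactly, so the sketch reports the midpoint at the closest threshold. Interpolating between neighboring thresholds is a common alternative.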

Dependencies

Please make sure the following library is installed:

PyTorch >= 2.2.2
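One way to confirm the requirement above before running the demo is to compare the installed version against 2.2.2. The snippet below is a small sketch using only the standard library. The helper names (`parse_version`, `meets_requirement`, `check_torch`) are hypothetical and not part of this repository, and `torch` is the name under which PyTorch is distributed on PyPI.

```python
import re
from importlib import metadata

REQUIRED = (2, 2, 2)  # minimum PyTorch version listed above


def parse_version(text):
    """Turn a string such as '2.2.2+cu121' into a tuple of leading integers."""
    numbers = []
    # Drop a local build suffix such as '+cu121', then read each dotted part.
    for part in text.split("+")[0].split("."):
        match = re.match(r"\d+", part)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def meets_requirement(installed, required=REQUIRED):
    """Return True if the installed version string is at least `required`."""
    version = parse_version(installed)
    # Pad short versions such as '2.3' with zeros before comparing.
    padded = version + (0,) * (len(required) - len(version))
    return padded >= required


def check_torch():
    """Return (ok, installed_version); installed_version is None if missing."""
    try:
        installed = metadata.version("torch")
    except metadata.PackageNotFoundError:
        return False, None
    return meets_requirement(installed), installed


if __name__ == "__main__":
    ok, installed = check_torch()
    print("PyTorch installed:", installed, "| requirement met:", ok)
```

Reading the version through `importlib.metadata` avoids importing PyTorch itself, so the check still runs when the library is missing.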

How to use

This repository is a demo of PB-Net-v2. By stepping through main.py in a debugger, you can quickly see how PB-Net-v2 is configured and built.

If you want to explore the entire dynamic hand gesture authentication framework, please refer to our previous work SCUT-DHGA.
