A Joint Gesture-Identity Recognition Framework Based on 4D Millimeter-Wave Radar Sensing
Yifan Wu, Li Wu, Taiyang Hu, Zelong Xiao, Jinyu Zhang, Mengxuan Xiao
Abstract
Gestures are an intuitive and natural medium for conveying human intent and personal identity, offering a convenient, contactless, and privacy-preserving modality for human-computer interaction (HCI) systems. This paper proposes a radar-based multimodal framework for joint gesture and identity recognition. First, a robust preprocessing and multimodal feature extraction method is introduced that integrates gesture-range-based valid frame detection with clutter suppression, enabling the extraction of multidimensional gesture features including micro-Doppler maps (MDMs), elevation-time maps (ETMs), and azimuth-time maps (ATMs). Next, a novel Joint Recognition Framework with Cross-Modal Attention Fusion (JRF-CMAF) is proposed, which incorporates Adaptive Rectification Blocks (ARBs) to dynamically leverage the complementary and correlated information across modalities. Extensive experiments were conducted on a custom radar gesture dataset collected from 7 volunteers performing 7 distinct gestures. The proposed JRF-CMAF achieves accuracies of 99.76%, 97.57%, and 96.84% on the gesture recognition, identity recognition, and joint recognition tasks, respectively. Compared with conventional gesture recognition approaches and existing radar-based identity recognition methods, it attains the highest overall recognition accuracy.
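As a rough illustration of the preprocessing stage summarized in the abstract, the following minimal sketch applies a simple clutter filter to a synthetic range/slow-time matrix and computes a micro-Doppler map from one range bin. The abstract does not specify the actual algorithms, so everything here is an assumption made for illustration: mean subtraction as the clutter filter, selecting the strongest range bin as a stand-in for gesture-range detection, the short-time Fourier transform parameters, the synthetic signal, and the function names `suppress_static_clutter` and `micro_doppler_map`.

```python
import numpy as np

def suppress_static_clutter(range_slow_time):
    """Remove static returns by subtracting each range bin's slow-time mean.

    Assumed clutter filter; the paper's actual method is not given in the abstract.
    """
    return range_slow_time - range_slow_time.mean(axis=1, keepdims=True)

def micro_doppler_map(slow_time_signal, win_len=32, hop=8):
    """Short-time Fourier transform magnitude in dB, shape (win_len, n_frames)."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(slow_time_signal) - win_len) // hop
    frames = np.stack([
        slow_time_signal[i * hop : i * hop + win_len] * window
        for i in range(n_frames)
    ])
    # FFT along each windowed segment; fftshift centers zero Doppler.
    spectrum = np.fft.fftshift(np.fft.fft(frames, axis=1), axes=1)
    return 20.0 * np.log10(np.abs(spectrum).T + 1e-12)

# Synthetic data (hypothetical): 16 range bins x 256 chirps.
rng = np.random.default_rng(0)
n_bins, n_chirps = 16, 256
t = np.arange(n_chirps)
noise = rng.standard_normal((n_bins, n_chirps)) + 1j * rng.standard_normal((n_bins, n_chirps))
data = 0.01 * noise + 5.0                             # weak noise plus strong static clutter
data[6] += np.exp(1j * 4.0 * np.sin(2 * np.pi * t / 64))  # oscillating target in bin 6

cleaned = suppress_static_clutter(data)

# Assumed stand-in for gesture-range detection: pick the most energetic bin.
gesture_bin = int(np.argmax((np.abs(cleaned) ** 2).sum(axis=1)))
mdm = micro_doppler_map(cleaned[gesture_bin])
print(gesture_bin, mdm.shape)
```

Elevation-time and azimuth-time maps would additionally require angle estimation across the antenna array, which this single-channel sketch does not attempt.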