Combining enhanced DINO with prototypical networks for self-supervised speaker verification
Published Apr 23, 2025 · Xianmei Wan, Guihua Liao, Ying Lou
Abstract
Training speaker-discriminative and robust speaker verification systems without explicit speaker labels remains a persistent challenge. In this paper, we propose a new self-supervised speaker verification approach, Enhanced DINO with Prototypical Networks (EDPN), for self-supervised speaker representation learning. EDPN adds the prototypical-network training strategy to the enhanced DINO self-distillation framework, integrating the advantages of contrastive and non-contrastive learning and achieving superior performance. Experiments on the VoxCeleb datasets demonstrate the efficacy of our self-supervised score normalization algorithm within the enhanced DINO framework, leading to state-of-the-art results in self-supervised speaker verification on VoxCeleb.
Enhanced DINO with Prototypical Networks (EDPN) effectively facilitates self-supervised speaker verification, achieving state-of-the-art results on VoxCeleb datasets.
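The abstract does not give the training objective, so the following is only an illustrative sketch of how a DINO-style self-distillation term (non-contrastive) and a prototypical term (contrastive) could be summed into one loss. Every function name, the temperature and scale values, the single-segment prototypes, and the weighting factor `proto_weight` are assumptions made for illustration, not details taken from the paper.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_total = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_total for x in scaled]


def dino_loss(student_logits, teacher_logits, center,
              student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between a centered, sharpened teacher distribution
    and the student distribution (temperatures are illustrative)."""
    centered = [t - c for t, c in zip(teacher_logits, center)]
    teacher_probs = [math.exp(lp) for lp in log_softmax(centered, teacher_temp)]
    student_log_probs = log_softmax(student_logits, student_temp)
    return -sum(p * lq for p, lq in zip(teacher_probs, student_log_probs))


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def prototypical_loss(queries, prototypes, scale=10.0):
    """Classify each query embedding against all prototypes.

    Assumption: queries[i] and prototypes[i] come from two segments of the
    same utterance, so index i serves as the pseudo speaker label.
    """
    total = 0.0
    for i, query in enumerate(queries):
        similarities = [scale * cosine(query, proto) for proto in prototypes]
        total -= log_softmax(similarities)[i]
    return total / len(queries)


def combined_loss(dino_term, proto_term, proto_weight=1.0):
    """Weighted sum of the two terms; proto_weight is a hypothetical knob."""
    return dino_term + proto_weight * proto_term
```

In this sketch the prototypical term is small when each query is closest to the prototype from its own utterance and grows when it is closer to another utterance's prototype, while the DINO term only asks the student to match the teacher's output distribution and uses no negatives.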