Publications by authors named "Xibang Yang"

Spoken language identification (SLI) relies on detecting key frequency characteristics like pitch, tone, and rhythm. While the short-time Fourier transform (STFT) generates time-frequency acoustic features (TFAF) for deep learning networks (DLNs), rectangular convolution kernels cause frequency mixing and aliasing, degrading feature extraction. We propose filamentary convolution to replace rectangular kernels, which reduces the parameter count and preserves inter-frame features by focusing solely on frequency patterns.
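
The contrast between a rectangular kernel and a kernel restricted to the frequency axis can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give kernel sizes or layer details, so the 3x3 and 3x1 shapes, the helper names `stft_magnitude` and `conv2d_valid`, and the reading of "filamentary" as a one-column kernel spanning frequency bins only are all assumptions made for illustration.

```python
import numpy as np

def stft_magnitude(signal, frame_len=256, hop=128):
    """Magnitude spectrogram of shape (freq_bins, frames), Hann-windowed."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    return np.abs(np.fft.rfft(frames, axis=0))

def conv2d_valid(x, kernel):
    """'Valid' 2-D cross-correlation, the operation deep learning calls convolution."""
    kh, kw = kernel.shape
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * x[i : i + out_h, j : j + out_w]
    return out

rng = np.random.default_rng(0)
signal = rng.standard_normal(4096)       # stand-in for a speech waveform
spec = stft_magnitude(signal)            # rows: frequency bins, columns: frames

# Assumed shapes: a 3x3 rectangular kernel mixes neighbouring frames,
# while a 3x1 kernel (hypothetical "filamentary" shape) stays within one frame.
rect_kernel = rng.standard_normal((3, 3))
fil_kernel = rng.standard_normal((3, 1))

rect_out = conv2d_valid(spec, rect_kernel)
fil_out = conv2d_valid(spec, fil_kernel)

print("rectangular params:", rect_kernel.size, "output shape:", rect_out.shape)
print("filamentary params:", fil_kernel.size, "output shape:", fil_out.shape)
```

Under these assumptions the one-column kernel uses a third of the weights of the 3x3 kernel, and each output column depends only on the matching input frame, so the number of frames is unchanged and no information is blended across time.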
