Welcome to Yiwen Wang’s personal website.

PhD Candidate, EECS, Peking University

Advisors: Xihong Wu, Tianshu Qu

Email: pku_wyw@pku.edu.cn

Scholar: Google Scholar

Biography

Sep. 2022 - July 2025

Advised by Xihong Wu

Ph.D., Speech and Hearing Research Center, School of Intelligence Science and Technology, Peking University.

Sep. 2019 - July 2022

Advised by Tianshu Qu

Master's, Speech and Hearing Research Center, School of Intelligence Science and Technology, Peking University.

Sep. 2015 - July 2019

Bachelor’s student, Electronics Engineering and Computer Science, Peking University.

Research Interests

speech enhancement and universal sound separation
direction of arrival estimation
sound field analysis
higher-order Ambisonics analysis

Publications

Wang, Y., Yuan, Z., & Wu, X. (2024). DENSE: Dynamic Embedding Causal Target Speech Extraction. arXiv preprint arXiv:2409.06136. [Demo] [Code]
Wang, Y., & Wu, X. (2024). TSE-PI: Target Sound Extraction under Reverberant Environments with Pitch Information. Proc. Interspeech 2024, 602-606, doi: 10.21437/Interspeech.2024-197. [Demo] [Code]
Wang, Y., Lan, Z., Wu, X. and Qu, T., 2023, June. TT-Net: Dual-path transformer based sound field translation in the spherical harmonic domain. In ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 1-5). IEEE.
Wang, Y., Wu, X. and Qu, T., 2022, May. UP-WGAN: Upscaling ambisonic sound scenes using Wasserstein generative adversarial networks. In Audio Engineering Society Convention 152. Audio Engineering Society.
Wang, Y., Wu, X. and Qu, T., 2020, May. Direction of arrival estimation based on transfer function learning using autoencoder network. In Audio Engineering Society Convention 148. Audio Engineering Society.
Wu, D., Wang, Y., Wu, X., & Qu, T. (2024). Cross-attention Inspired Selective State Space Models for Target Sound Extraction. arXiv preprint arXiv:2409.04803.
Li, X., Wang, Y., Sun, Y., Wu, X. and Chen, J., 2023, June. PGSS: pitch-guided speech separation. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 37, No. 11, pp. 13130-13138).
Ma, D., Wang, Y., He, L., Jin, M., Su, D. and Yu, D., 2022, May. DP-DWA: Dual-Path Dynamic Weight Attention Network With Streaming DFSMN-SAN For Automatic Speech Recognition. In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 7692-7696). IEEE.
Peng, C., Wang, Y., Wu, X. and Qu, T., 2022, November. A Multi-channel Speech Separation System for Unknown Number of Multiple Speakers. In 2022 5th International Conference on Information Communication and Signal Processing (ICICSP) (pp. 158-162). IEEE.

Internship Experience

Sep. 2020 - Aug. 2021

Tencent AI Lab, Speech Group II, Beijing, China.

Speech recognition, speech enhancement

Mar. 2020 - Jun. 2020

Beijing Momo Technology

Pitch extraction, autotune

Feb. 2019 - Aug. 2019

ByteDance, Beijing, China

Recommendation Algorithm Intern