Welcome to Yiwen Wang’s personal website.
PhD Candidate, EECS, Peking University
Advisors: Xihong Wu, Tianshu Qu
Email: pku_wyw@pku.edu.cn
Scholar: Google Scholar
Biography
Sep. 2022 - July 2025
Advised by Xihong Wu
Ph.D. candidate, Speech and Hearing Research Center, School of Intelligence Science and Technology, Peking University.
Sep. 2019 - July 2022
Advised by Tianshu Qu
Master's student, Speech and Hearing Research Center, School of Intelligence Science and Technology, Peking University.
Sep. 2015 - July 2019
Bachelor’s student, Electronics Engineering and Computer Science, Peking University.
Research Interests
- speech enhancement and universal sound separation
- direction of arrival estimation
- sound field analysis
- higher order ambisonic analysis
Publications
- Wang, Y., Yuan, Z., & Wu, X. (2024). DENSE: Dynamic Embedding Causal Target Speech Extraction. arXiv preprint arXiv:2409.06136. [Demo] [Code]
- Wang, Y., & Wu, X. (2024). TSE-PI: Target Sound Extraction under Reverberant Environments with Pitch Information. Proc. Interspeech 2024, 602-606, doi: 10.21437/Interspeech.2024-197. [Demo] [Code]
- Wang, Y., Lan, Z., Wu, X. and Qu, T., 2023, June. TT-Net: Dual-path transformer based sound field translation in the spherical harmonic domain. In ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 1-5). IEEE.
- Wang, Y., Wu, X. and Qu, T., 2022, May. Up-wgan: Upscaling ambisonic sound scenes using wasserstein generative adversarial networks. In Audio Engineering Society Convention 152. Audio Engineering Society.
- Wang, Y., Wu, X. and Qu, T., 2020, May. Direction of arrival estimation based on transfer function learning using autoencoder network. In Audio Engineering Society Convention 148. Audio Engineering Society.
- Wu, D., Wang, Y., Wu, X., & Qu, T. (2024). Cross-attention Inspired Selective State Space Models for Target Sound Extraction. arXiv preprint arXiv:2409.04803.
- Li, X., Wang, Y., Sun, Y., Wu, X. and Chen, J., 2023, June. PGSS: pitch-guided speech separation. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 37, No. 11, pp. 13130-13138).
- Ma, D., Wang, Y., He, L., Jin, M., Su, D. and Yu, D., 2022, May. DP-DWA: Dual-Path Dynamic Weight Attention Network With Streaming Dfsmn-San For Automatic Speech Recognition. In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 7692-7696). IEEE.
- Peng, C., Wang, Y., Wu, X. and Qu, T., 2022, November. A Multi-channel Speech Separation System for Unknown Number of Multiple Speakers. In 2022 5th International Conference on Information Communication and Signal Processing (ICICSP) (pp. 158-162). IEEE.
Internship Experience
Sep. 2020 - Aug. 2021
Tencent AI Lab, Speech Group II, Beijing, China.
Speech recognition, speech enhancement
Mar. 2020 - Jun. 2020
Beijing Momo Technology
Pitch extraction, autotune
Feb. 2019 - Aug. 2019
ByteDance, Beijing, China
Recommendation Algorithm Intern