Yiwen Wang

Welcome to Yiwen Wang’s personal website.

PhD Candidate, EECS, Peking University

Advisors: Xihong Wu, Tianshu Qu

Email: pku_wyw@pku.edu.cn

Scholar: Google Scholar

Yiwen Wang (ID photo)

Biography

Sep. 2022 - July 2025

Advised by Xihong Wu

PhD, Speech and Hearing Research Center, School of Intelligence Science and Technology, Peking University.

Sep. 2019 - July 2022

Advised by Tianshu Qu

Master's, Speech and Hearing Research Center, School of Intelligence Science and Technology, Peking University.

Sep. 2015 - July 2019

Bachelor’s, Electronics Engineering and Computer Science, Peking University.

Research Interests

  • speech enhancement and universal sound separation

  • direction of arrival estimation

  • sound field analysis

  • higher-order Ambisonics analysis

Publications

  • Wang, Y., Yuan, Z., & Wu, X. (2024). DENSE: Dynamic Embedding Causal Target Speech Extraction. arXiv preprint arXiv:2409.06136. [Demo] [Code]

  • Wang, Y., & Wu, X. (2024). TSE-PI: Target Sound Extraction under Reverberant Environments with Pitch Information. Proc. Interspeech 2024, 602-606, doi: 10.21437/Interspeech.2024-197. [Demo] [Code]

  • Wang, Y., Lan, Z., Wu, X. and Qu, T., 2023, June. TT-Net: Dual-path transformer based sound field translation in the spherical harmonic domain. In ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 1-5). IEEE.

  • Wang, Y., Wu, X. and Qu, T., 2022, May. UP-WGAN: Upscaling ambisonic sound scenes using Wasserstein generative adversarial networks. In Audio Engineering Society Convention 152. Audio Engineering Society.

  • Wang, Y., Wu, X. and Qu, T., 2020, May. Direction of arrival estimation based on transfer function learning using autoencoder network. In Audio Engineering Society Convention 148. Audio Engineering Society.

  • Wu, D., Wang, Y., Wu, X., & Qu, T. (2024). Cross-attention Inspired Selective State Space Models for Target Sound Extraction. arXiv preprint arXiv:2409.04803.

  • Li, X., Wang, Y., Sun, Y., Wu, X. and Chen, J., 2023, June. PGSS: pitch-guided speech separation. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 37, No. 11, pp. 13130-13138).

  • Ma, D., Wang, Y., He, L., Jin, M., Su, D. and Yu, D., 2022, May. DP-DWA: Dual-Path Dynamic Weight Attention Network With Streaming DFSMN-SAN For Automatic Speech Recognition. In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 7692-7696). IEEE.

  • Peng, C., Wang, Y., Wu, X. and Qu, T., 2022, November. A Multi-channel Speech Separation System for Unknown Number of Multiple Speakers. In 2022 5th International Conference on Information Communication and Signal Processing (ICICSP) (pp. 158-162). IEEE.

Internship Experience

Sep. 2020 - Aug. 2021

Tencent AI Lab, Speech Group II, Beijing, China.

Speech recognition, speech enhancement

Mar. 2020 - Jun. 2020

Beijing Momo Technology

Pitch extraction, autotune

Feb. 2019 - Aug. 2019

Bytedance, Beijing, China

Recommendation Algorithm Intern