Weihao Cui | 崔炜皞
Xtra Computing Group, National University of Singapore
I am currently a postdoctoral research fellow working with Prof. Bingsheng He at the National University of Singapore. I also work closely with Prof. Minyi Guo, Prof. Quan Chen, and Dr. Han Zhao.
I obtained my Ph.D. from the Department of Computer Science and Engineering (CSE), Shanghai Jiao Tong University, China, supervised by Prof. Quan Chen, working on AI systems and cloud computing.
Feel free to contact me via weihao DOT tsui AT gmail DOT com.
News
| Date | News |
|---|---|
| Feb 24, 2026 | One paper accepted to SIGMOD 2026. |
| Jan 31, 2026 | One paper accepted to EuroSys 2026. |
| Jan 17, 2026 | PD-Multiplexing accepted to ASPLOS 2026. |
| Dec 18, 2025 | Honored to be selected for the CCF Doctoral Dissertation Incentive Program 2025. |
| Dec 10, 2025 | Two papers accepted to NSDI 2026. |
| Nov 08, 2025 | One paper accepted to HPCA 2026. |
| Oct 15, 2025 | Serving as the Web Chair for ICPP 2026. Submission details are available in the Call for Papers. |
| Sep 28, 2025 | PD-Multiplexing has been merged into SGLang. |
Selected publications
- [arXiv] Efficient Function-as-a-Service for Large Language Models with TIDAL. arXiv preprint arXiv:2503.06421, 2025
- [ASPLOS ’26] Towards High-Goodput LLM Serving with Prefill-decode Multiplexing. In Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2026
- [NSDI ’26] Flare: Anomaly Diagnostics for Divergent LLM Training in GPU Clusters of Thousand-Plus Scale. In Proceedings of the 23rd USENIX Symposium on Networked Systems Design and Implementation, 2026
- [OSDI ’23] Optimizing Dynamic Neural Networks with Brainstorm. In 17th USENIX Symposium on Operating Systems Design and Implementation, 2023
- [ATC ’22] DVABatch: Diversity-aware Multi-Entry Multi-Exit Batching for Efficient Processing of DNN Services on GPUs. In 2022 USENIX Annual Technical Conference, 2022
- [SC ’21] Enable Simultaneous DNN Services Based on Deterministic Operator Overlap and Precise Latency Prediction. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2021