I'm a Principal AI Researcher at Together AI and an SGLang core maintainer. I initiated and led the end-to-end DeepSeek V3/R1 effort on SGLang, from day-0 support and performance optimization to large-scale expert-parallel (EP) deployment and GB200 NVL72 integration, driving the roadmap, coordination, and execution across community collaborations that pushed the frontier of open-source inference engines. My contributions to AI infrastructure have been recognized by the U.S. government with O-1A and EB-1A extraordinary ability classifications. I am a featured speaker at PyTorch Conference 2025. I have always loved Python as a language, and building a pure-Python LLM inference engine that can run at scale has been one of the most meaningful parts of my work. I hope to share these experiences and lessons with more people in the community.
Yineng Zhang
Presentations
Saturday 5 p.m.–5:30 p.m. in Grand Ballroom A