I am a member of technical staff at xAI working on inference for Grok models.
I graduated with a PhD in Computer Engineering from UIUC, advised by Prof. Ravishankar Iyer. My thesis focused on improving the reliability and efficiency of large-scale ML systems. During my PhD, I was fortunate to collaborate with amazing mentors and productionize my research at IBM, Google, and Meta.
Selected Publications
QLM: Queue Management for SLO-oriented Large Language Model Serving.
SoCC 2024.
- Adopted by vLLM (release notes) and ByteDance AIBrix (blog)
- Media Coverage: IBM, ByteDance, Hugging Face
