
Jin Pan

ML Systems / LLM Inference / RL Infrastructure

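Second-year MS/PhD student in Computer Science at the University of Wisconsin-Madison, working on ML Systems. SGLang community contributor, currently interning on the AMD GenAI team, focusing on RL systems and GPU kernel optimization.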

More about me →

Latest Posts

Autotune End to End · From Triton to FlyDSL

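How kernel autotune is designed in Triton, how it is used in aiter / quack / CuteDSL, and how inference engines such as SGLang consume it. Starting from all of that, the post designs a real autotune path for FlyDSL. Comes with an HTML deep dive featuring 6 hand-drawn SVG figures.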
Long-Sequence MoE RL Training: From First Principles to MI300X

Re-deriving Yan Bai's long-sequence MoE RL optimizations from first principles (Path B recomputation, linear cross-entropy, FSDP2, chunked expert-parallel overlap), and what each of them means on AMD MI300X / MI355X.
FlyDSL Notes · BasisAttr Beneath Layout

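The previous FlyDSL close reading boiled the layout algebra down to five words; this post fills in a sub-concept one level deeper: the Fly_Basis type and BasisAttr. What it is, why layout needs it, and which thread gives the cleanest way into the 'flesh it out' task my mentor gave me.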
From Python to Silicon · A Short Primer on Compilers and Architecture for ML Engineers

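You can write production-grade ML systems for years without knowing what IR, MLIR, LLVM, ISA, or FFI refer to. This post is the patch, written for ML engineers who studied CS as undergraduates but never properly took Compilers or Computer Architecture. Comes with a bilingual (Chinese and English) HTML deep dive with 6 SVG figures.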
Attention Mechanisms Explained: Full, Sparse, Linear, NSA & GLA

Starting from Full Attention, this post breaks down the Sparse and Linear routes, all the way to DeepSeek NSA and Gated Linear Attention.

Recent Projects

View all projects

Miles

Enterprise RL framework for LLM/VLM post-training. Integrates SGLang rollout + Megatron training with FP8 pipeline and MoE support.
SpecForge

Train speculative decoding draft models and port them to SGLang serving. Part of the SGLang ecosystem.
TritonForge

LLM-powered GPU kernel synthesis: Train models to convert PyTorch ops into optimized Triton kernels via SFT+RL.
APRIL

Active Partial Rollouts in Reinforcement Learning to Tame Long-tail Generation. A system-level optimization for scalable LLM training.
SGLang

High-performance serving framework for large language models and multimodal models. Contributor.

Contact

Reach me via social media or email.

GitHub /
LinkedIn /
jpan236@wisc.edu

© 2026 • Jinn's Hub 🔬
