Jinn's Hub

Posts tagged with "CDNA3"

    Source Reading 006 — FlyDSL, A Layout-Algebra Python DSL with an MLIR Spine
    AMD's FlyDSL is the Python front-end for a Fly-dialect MLIR compiler that lowers layout algebra and copy/MMA atoms to ROCDL on CDNA3/CDNA4. Four examples form a strict pedagogical ladder: vectorAdd, tiledCopy, tiledMma, and preshuffle GEMM. Reading them in order shows you every piece of machinery that real production kernels (paged attention, MoE GEMM, flash attention) recombine.
    Source Reading 005 — GCNasm, Sixty-Four Katas for the AMD ISA Manual You Never Finished
    carlushuang's gcnasm repo is the rare middle ground between HIP tutorials and the 1,200-page CDNA3 ISA manual: 64 short, self-contained kernels that show what hand-tuned AMD code actually looks like. Six hours of careful reading buys you a working mental model of MFMA, vmcnt pipelining, DPP cross-lane primitives, and the trick the LLVM assembler refuses to let you write.
© 2026 • Jinn's Hub 🔬