Decoupling Compute and Memory for Async GPUs
8 points | by
yiyingzhang
2 days ago
4 comments
bobbyzhu2008
2 days ago
67% less kernel code is the more interesting number here — Hopper's async capabilities have been underutilized largely because the programming model is painful. Curious how it handles cases where compute and memory phases aren't cleanly separable.
jhap
2 days ago
This seems like a better version of CUDA, for Hopper GPUs?
preetham_rangu
1 day ago
[dead]
jackofficial643
1 day ago
[flagged]