Pinned
-
content-radar (Public): Collect trending AI/dev signal from Hacker News, arXiv, GitHub Trending, Reddit, and X, then synthesize review-ready post drafts with Claude.
Python 2
-
gh-radar (Public): Daily email digest of trending GitHub tools, drawn from GitHub Trending, Hacker News, and new-repo search, with no X API. Runs on GitHub Actions.
Python
-
tensor-core-from-scratch (Public): From naive matmul to tensor cores on NVIDIA Blackwell, step by step. 8 self-contained CUDA kernels, each benchmarked against cuBLAS.
-
blackwell-tensorcore-kernels (Public): Hand-written CUDA Tensor Core GEMM kernels on Blackwell (sm_120) and Hopper (sm_90): raw mma.sync reaching 106% of the cuBLAS-TC kernel on sm_120, CUTLASS 3.x wgmma at 85.5% of nvjet on H100, and …
Cuda
-
trtllm-triton-serving (Public): Controlled head-to-head of TensorRT-LLM vs vLLM on H100: 12 studies including a knob-by-knob waterfall reproducing NVIDIA's published 27.7k tok/s (100.3%) and attributing the gap to real serving, plu…
Python


