Cumulus
Tag

#linear-attention

1 article tagged with "linear-attention"

March 8, 2026 · 10 min read

Day-0 Support for the Qwen3.5 Family

What we found inside Qwen3.5's hybrid Mamba-Transformer weights, and what it took to make the gated delta rule fast on GH200: from matrix-valued recurrences to mixed-batch state corruption.
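The "matrix-valued recurrences" the teaser refers to can be illustrated with the gated delta rule as published for Gated DeltaNet-style layers: a matrix state S is decayed by a gate, the old association along the key direction is erased, and a new key-value association is written. A minimal NumPy sketch under that assumption (function name, shapes, and gate values are illustrative, not the article's actual kernel):

```python
import numpy as np

def gated_delta_step(S, q, k, v, alpha, beta):
    """One token step of a gated delta rule recurrence (illustrative sketch).

    S: (d_v, d_k) matrix-valued state; q, k: (d_k,) query/key; v: (d_v,) value;
    alpha: scalar decay gate in (0, 1]; beta: scalar write strength.
    Implements S_t = alpha * S_{t-1} (I - beta k k^T) + beta v k^T.
    """
    # Decay the state and erase the old value stored along k ...
    S = alpha * (S - beta * np.outer(S @ k, k))
    # ... then write the new key-value association.
    S = S + beta * np.outer(v, k)
    o = S @ q  # read-out for this token
    return S, o

# Toy usage: run a short sequence through the recurrence.
d_k, d_v, T = 4, 4, 8
rng = np.random.default_rng(0)
S = np.zeros((d_v, d_k))
for _ in range(T):
    k = rng.standard_normal(d_k)
    k /= np.linalg.norm(k)  # unit-norm key, as delta-rule analyses assume
    S, o = gated_delta_step(S, rng.standard_normal(d_k), k,
                            rng.standard_normal(d_v), alpha=0.95, beta=0.5)
```

With alpha = beta = 1 and a unit-norm key, one step exactly overwrites the value stored along that key (S @ k == v afterward), which is the "delta" part of the rule; the gate alpha adds Mamba-style decay on top.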

#inference #mamba #qwen3.5 +3

Cumulus Labs

© 2026 Cumulus Compute Labs Corporation. All rights reserved.