Cumulus
Tag

#linear-attention

1 article tagged with "linear-attention"

March 8, 2026 · 10 min read

Day-0 Support for the Qwen3.5 Family

What we found inside Qwen3.5's hybrid Mamba-Transformer weights, and what it took to make the gated delta rule fast on GH200: from matrix-valued recurrences to mixed-batch state corruption.
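The "matrix-valued recurrences" the teaser refers to can be illustrated with the gated delta rule as published for Gated DeltaNet-style layers: a matrix state S is decayed by a gate, the old association along the key direction is erased, and a new key-value association is written. A minimal NumPy sketch under that assumption (function name, shapes, and gate values are illustrative, not the article's actual kernel):

```python
import numpy as np

def gated_delta_step(S, q, k, v, alpha, beta):
    """One token step of a gated delta rule recurrence (illustrative sketch).

    S: (d_v, d_k) matrix-valued state; q, k: (d_k,) query/key; v: (d_v,) value;
    alpha: scalar decay gate in (0, 1]; beta: scalar write strength.
    Implements S_t = alpha * S_{t-1} (I - beta k k^T) + beta v k^T.
    """
    # Decay the state and erase the old value stored along k ...
    S = alpha * (S - beta * np.outer(S @ k, k))
    # ... then write the new key-value association.
    S = S + beta * np.outer(v, k)
    o = S @ q  # read-out for this token
    return S, o

# Toy usage: run a short sequence through the recurrence.
d_k, d_v, T = 4, 4, 8
rng = np.random.default_rng(0)
S = np.zeros((d_v, d_k))
for _ in range(T):
    k = rng.standard_normal(d_k)
    k /= np.linalg.norm(k)  # unit-norm key, as delta-rule analyses assume
    S, o = gated_delta_step(S, rng.standard_normal(d_k), k,
                            rng.standard_normal(d_v), alpha=0.95, beta=0.5)
```

With alpha = beta = 1 and a unit-norm key, one step exactly overwrites the value stored along that key (S @ k == v afterward), which is the "delta" part of the rule; the gate alpha adds Mamba-style decay on top.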

#inference #mamba #qwen3.5 +3

Cumulus Labs

© 2026 Cumulus Compute Labs Corporation. All rights reserved.