
SageAttention


Project Description

Quantized attention that achieves speedups of 2-5x over FlashAttention and 3-11x over xformers, without degrading end-to-end metrics across language, image, and video models.

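A minimal usage sketch, assuming the package exposes the drop-in sageattn(q, k, v) kernel described in the project's README; the tensor shapes and keyword argument below are illustrative assumptions, not part of this listing:

    import torch
    from sageattention import sageattn  # assumed import path from the project README

    # Illustrative shapes: (batch, heads, sequence length, head dim), fp16 on GPU
    q = torch.randn(2, 16, 4096, 128, dtype=torch.float16, device="cuda")
    k = torch.randn(2, 16, 4096, 128, dtype=torch.float16, device="cuda")
    v = torch.randn(2, 16, 4096, 128, dtype=torch.float16, device="cuda")

    # Intended as a drop-in replacement for a standard scaled-dot-product
    # attention call; quantization happens inside the kernel, so the
    # surrounding model code does not need to change.
    out = sageattn(q, k, v, is_causal=False)  # is_causal is an assumed flag
    print(out.shape)  # expected: torch.Size([2, 16, 4096, 128])
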
Project Information

Created on 10/3/2024
Updated on 9/26/2025