Titan AI

FlexPrefill

160
9
Python

Project Description

Code for paper: [ICLR2025 Oral] FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference

Project Information

Created on 2/18/2025
Updated on 1/1/2026