Titan AI

exllama

Stars: 2,903
Forks: 219
Python

Project Description

A more memory-efficient rewrite of the Hugging Face transformers implementation of Llama, for use with quantized weights.

Project Information

Created on 5/4/2023
Updated on 10/8/2025