Titan AI

neural-compressor

Stars: 2,499
Forks: 282
Language: Python

Project Description

State-of-the-art low-bit LLM quantization (INT8, FP8, INT4, FP4, NF4) and sparsity, plus leading model compression techniques for TensorFlow, PyTorch, and ONNX Runtime.

Project Information

Created on 7/21/2020
Updated on 9/26/2025