marlin

Name: marlin
Rating: 1.4705 (941 reviews)
Author: Open Source Community


Python

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

Created on 1/17/2024

Updated on 11/11/2025

Project Description