🎯
Focusing
ML researcher in LLM inference efficiency and compression. Founder @ Entropy
Pinned
-
kv-cache-compression (Public)
Compression of the KV cache using Singular Value Decomposition and 4-bit quantization.
Python
-
llmlingua2 (Public)
My contributions to LLMLingua-2 prompt compression: domain awareness, soft scoring, and round-trip reconstruction.
Python 1
-
qosmic-audit (Public)
Building an agent that optimizes e-commerce consumer conversion, with automatic improvement.
Python
-
AI-coding-workflow (Public)
The most efficient way to integrate agentic coding into your stack.