# general
Tim Dettmers @Tim_Dettmers
We release LLM.int8(), the first 8-bit inference method that saves 2x memory and does not degrade performance for 175B models by exploiting emergent properties. Read more:
Paper: https://arxiv.org/abs/2208.07339
Software: https://huggingface.co/blog/hf-bitsandbytes-integration
Emergence: https://timdettmers.com/2022/08/17/llm-int8-and-emergent-features/
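For context: the baseline that LLM.int8() builds on is per-row absmax int8 quantization. A minimal NumPy sketch of that baseline (illustrative only; the paper's actual method adds vector-wise scaling plus a mixed-precision decomposition for outlier features):

```python
import numpy as np

def quantize_int8(x):
    # Per-row absmax scaling: map each row's values into the int8 range [-127, 127].
    # (Illustrative baseline, not the full LLM.int8() scheme.)
    scale = 127.0 / np.max(np.abs(x), axis=1, keepdims=True)
    q = np.round(x * scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximation of the original floats from int8 values.
    return q.astype(np.float32) / scale

np.random.seed(0)
x = np.random.randn(4, 8).astype(np.float32)
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)
# int8 stores 1 byte per value vs. 2 bytes for fp16 weights,
# which is where the "2x memory" saving in the tweet comes from.
```

The rounding error per entry is bounded by half a quantization step, so `x_hat` stays close to `x`; the paper's contribution is keeping that error from compounding at 175B scale, where emergent outlier features break this naive scheme.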