An open-source accelerator for the state-of-the-art LLM Llama 2, developed using high-level synthesis (HLS) on FPGAs, achieves significant energy reduction and increased inference speeds compared to CPUs and GPUs.

From arxiv.org