Posted by agilob in Performance, 6 months ago

LLaMA Now Goes Faster on CPUs

justine.lol

My kernels go 2x faster than MKL for matrices that fit in L2 cache, which makes them a work in progress, since the speedup works best for prompts having fewer than 1,000 tokens.
