What Matters In Transformers?

What Matters In Transformers? is an interesting paper that finds you can remove roughly half of the attention layers in LLMs like Llama without noticeably reducing modeling performance.

The concept is relatively simple. The authors delete attention layers, MLP layers, or entire transformer blocks:

  • Removing entire transformer blocks leads to significant performance degradation.

  • Removing MLP layers results in significant performance degradation.

  • Removing attention layers causes almost no performance degradation!
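To make the three options concrete, here is a minimal sketch of what "removing" a sublayer means in a residual block: the sublayer is skipped and the residual stream passes through unchanged. This is a toy illustration rather than the paper's code. The function `run_block` and its flags are hypothetical names, layer normalization is omitted, and hidden states are plain lists of floats.

```python
def run_block(x, attention, mlp, drop_attention=False, drop_mlp=False):
    """Toy residual block: x + attention(x), then x + mlp(x).

    Hypothetical helper for illustration only. Layer norms are omitted.
    A dropped sublayer is skipped, so its residual branch acts as identity.
    Setting both flags is equivalent to removing the entire block.
    """
    if not drop_attention:
        x = [xi + ai for xi, ai in zip(x, attention(x))]
    if not drop_mlp:
        x = [xi + mi for xi, mi in zip(x, mlp(x))]
    return x


# Stand-in sublayers (assumptions, not real attention or MLP computations).
toy_attention = lambda h: [0.5 * v for v in h]
toy_mlp = lambda h: [1.0 for _ in h]

full = run_block([1.0, 2.0], toy_attention, toy_mlp)
no_attention = run_block([1.0, 2.0], toy_attention, toy_mlp, drop_attention=True)
no_block = run_block([1.0, 2.0], toy_attention, toy_mlp,
                     drop_attention=True, drop_mlp=True)
```

Because of the residual connection, dropping a sublayer needs no architectural change: the input simply flows on to the next sublayer.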

In Llama 2 70B, even if half of the attention layers are deleted (which results in a 48% speed-up), there's only a 2.4% decrease in benchmark performance. The authors also recently added Llama 3 results to the paper, which are similar.

The attention layers were not removed randomly but selected using a cosine-similarity-based score: if a layer's input and output are very similar, the layer changes little and is considered redundant, so it can be removed.
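A rough sketch of such a score, assuming it is computed as one minus the average cosine similarity between a layer's input and output hidden states over a set of tokens. The function names, the averaging over tokens, and the choice of which tensors count as "input" and "output" (for example, whether the output includes the residual connection) are assumptions for illustration, not details confirmed by this post.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors (assumption)
    return dot / (norm_u * norm_v)


def importance_score(inputs, outputs):
    """Hypothetical layer score: 1 - mean cosine similarity over tokens.

    inputs and outputs are lists of per-token hidden-state vectors.
    A score near 0 means the layer barely changes the direction of its
    input, which marks it as a candidate for removal.
    """
    sims = [cosine_similarity(x, y) for x, y in zip(inputs, outputs)]
    return 1.0 - sum(sims) / len(sims)


# A layer that only rescales its input looks redundant (score near 0),
# while one that rotates it to an orthogonal direction does not (score near 1).
redundant = importance_score([[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 3.0]])
important = importance_score([[1.0, 0.0]], [[0.0, 1.0]])
```

Note that cosine similarity ignores magnitude, so a layer that only rescales its input would count as redundant under this sketch.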

This is a super intriguing result and could potentially be combined with various model compression techniques (like pruning and quantization) for compounding effects.

Furthermore, the layers are removed in a one-shot fashion (versus an iterative fashion), and no (re)training is required after the removal. That said, retraining the model after the removal could potentially recover some of the lost performance.
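One-shot removal can be sketched as a single ranking step: score every layer once, then drop the lowest-scoring ones without re-scoring in between. The helper name `select_layers_to_drop` and the example scores below are made up for illustration, and lower scores are assumed to mean more redundant layers.

```python
def select_layers_to_drop(scores, num_to_drop):
    """Pick the layers with the lowest scores in a single pass.

    scores[i] is a hypothetical importance score for layer i, where a
    lower score means the layer is more redundant. Scores are computed
    once up front and never updated, which is what makes this one-shot.
    Returns the selected layer indices in ascending order.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(ranked[:num_to_drop])


# Made-up scores for a four-layer model; drop the two most redundant layers.
example_scores = [0.30, 0.05, 0.20, 0.01]
dropped = select_layers_to_drop(example_scores, 2)
```

An iterative variant would instead drop one layer, recompute the scores on the pruned model, and repeat, which costs more forward passes.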

Overall, a very simple but very interesting study. It appears there might be lots of computational redundancy in larger architectures.

One big caveat of this study, though, is that the focus is mostly on academic benchmarks (HellaSwag, MMLU, etc.). It's unclear how well the models perform on benchmarks measuring conversational performance.

Author

AiUTOMATING PEOPLE, ABN ASIA was founded by people with deep roots in academia, with work experience in the US, Holland, Hungary, Japan, South Korea, Singapore, and Vietnam. ABN Asia is where academia and technology meet opportunity. With our cutting-edge solutions and competent software development services, we're helping businesses level up and take on the global scene. Our commitment: Faster. Better. More reliable. In most cases: Cheaper as well.
