Microsoft Introduces SliceGPT For Compressing LLMs

Nitika Sharma 29 Jan, 2024 • 2 min read

To address the resource-intensive nature of LLMs, Microsoft has introduced SliceGPT, a new sparsification technique designed to compress models while retaining most of their performance. Let’s examine the details of this approach.

What is Special About SliceGPT?

SliceGPT is Microsoft’s latest technique for compressing large language models. It removes up to 25% of model parameters, including embeddings, while largely preserving zero-shot task performance. A smaller model needs less memory and less compute at inference time, which is where the practical savings come from.



Performance Validation on Diverse Models

A paper released on Hugging Face evaluates SliceGPT on the LLAMA2-70B, OPT 66B, and Phi-2 models. With up to 25% of their parameters removed, the sliced models retain 99%, 99%, and 90% of the dense model’s zero-shot task performance, respectively. The results suggest the technique carries over across different model families and sizes.

Running Faster on Fewer GPUs

One of SliceGPT’s notable advantages is inference efficiency. Sliced models run on fewer GPUs and run faster, without any additional code optimization. According to the paper, total inference compute for LLAMA2-70B drops to 64% of the dense model’s on 24GB consumer GPUs, and to 66% on 40GB A100 GPUs. In other words, the saving is roughly a third in both settings.

Core Concept Behind SliceGPT

SliceGPT’s core idea, as the paper’s abstract describes it, is computational invariance in transformer networks: the weight matrices can be transformed in a way that leaves the model’s output unchanged. SliceGPT uses this property to replace each weight matrix with a smaller dense matrix, which reduces the network’s embedding dimension and, with it, the memory and computation a pre-trained model requires.
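To make the idea more concrete, here is a minimal NumPy sketch of the two ingredients, under stated assumptions. It is not the authors’ implementation: the toy dimensions, the helper rmsnorm, and all variable names are illustrative, and the input is built to be exactly low-rank so that slicing is lossless. Real activations are only approximately low-rank, so slicing an actual model is an approximation. The first part checks that rotating the input by an orthogonal matrix Q and folding Q's transpose into the next weight matrix leaves the output unchanged. The second part picks the rotation with PCA and keeps only the leading directions, which yields a smaller dense weight matrix.

```python
import numpy as np

def rmsnorm(x, eps=1e-8):
    # Scale each row to unit root-mean-square (no learned scale, for simplicity).
    return x / np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)

rng = np.random.default_rng(0)
n, d, d_out = 16, 8, 5           # toy sizes (assumption)
X = rng.normal(size=(n, d))      # stand-in for hidden states
W = rng.normal(size=(d, d_out))  # stand-in for a weight matrix

# Part 1: invariance under an orthogonal rotation.
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))  # random orthogonal matrix
y_ref = rmsnorm(X) @ W
y_rot = rmsnorm(X @ Q) @ (Q.T @ W)            # rotate input, undo it in the weights
invariance_gap = np.max(np.abs(y_ref - y_rot))

# Part 2: slicing. Build an input of exact rank k, rotate it onto its
# principal directions, and drop the directions that carry no signal.
k = 6                                          # keep 6 of 8 dims (25% removed)
X_low = rng.normal(size=(n, k)) @ rng.normal(size=(k, d))
eigvals, eigvecs = np.linalg.eigh(X_low.T @ X_low)
Q_k = eigvecs[:, ::-1][:, :k]                  # top-k principal directions
W_sliced = Q_k.T @ W                           # smaller dense matrix, shape (k, d_out)

y_full = X_low @ W
y_sliced = (X_low @ Q_k) @ W_sliced
slice_gap = np.max(np.abs(y_full - y_sliced))

print(W.shape, "->", W_sliced.shape)
print("invariance gap:", invariance_gap)
print("slice gap:", slice_gap)
```

Both gaps should come out at the level of floating-point rounding. The sliced matrix is still dense, only smaller, which is consistent with the article’s point that sliced models run faster without additional code optimization.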


The Future Path Paved by SliceGPT

Microsoft’s SliceGPT addresses the challenges posed by resource-intensive language models and lays the groundwork for future advancements. With its open-source code available on GitHub, SliceGPT invites collaboration and exploration to reduce pre-trained models’ computational footprints further.

Our Say

As we witness the emergence of SliceGPT, we envision a future where large language models can coexist with optimized resource utilization. Microsoft’s strides in computational invariance pave the way for a more efficient and sustainable era of AI. SliceGPT is a testament to innovation, offering a glimpse into the evolving landscape of language model compression.

You can explore this paper here.


