AI21 Labs has introduced Jamba, a generative AI model designed to push past the usual limits on context handling. According to Or Dagan, product lead at AI21 Labs, the model challenges the notion that large context windows must come with excessive computational cost. Jamba stands out for its unusual blend of transformers and state space models (SSMs), a combination intended to ease the compute-intensive nature of long-context AI models.
Transformers, renowned for their strength in complex reasoning tasks, serve as the cornerstone of models like GPT-4 and Google's Gemini. Central to transformers is the attention mechanism, which weighs the relevance of every piece of input against every other piece when generating output. That all-pairs comparison grows steeply more expensive as inputs lengthen, which is largely why long context windows are so compute-hungry. SSMs, by contrast, combine qualities of older AI architectures such as recurrent and convolutional neural networks, yielding a more computationally efficient design that can handle long sequences of data.
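To make the attention idea concrete, here is a minimal sketch of scaled dot-product attention in plain NumPy. The function name, shapes, and toy input are illustrative only; real transformers add learned projections, multiple heads, and masking on top of this core step.

```python
import numpy as np

def scaled_dot_product_attention(queries, keys, values):
    """queries, keys, values: arrays of shape (seq_len, d_model)."""
    d_model = queries.shape[-1]
    # Pairwise relevance scores between tokens: a (seq_len, seq_len) matrix,
    # which is why attention cost grows quickly with context length.
    scores = queries @ keys.T / np.sqrt(d_model)
    # Softmax turns scores into weights that sum to 1 for each token.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output vector is a weighted mix of all value vectors.
    return weights @ values

# Toy self-attention over 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
print(scaled_dot_product_attention(x, x, x).shape)  # (4, 8)
```

The (seq_len, seq_len) score matrix is the heart of the efficiency problem: doubling the context roughly quadruples it, whereas SSM layers avoid building such a matrix at all.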
Jamba uses Mamba, an open source SSM model, as part of its core architecture, and AI21 Labs says it delivers three times the throughput on long contexts compared with transformer-based models of comparable size. With the capacity to handle up to 140,000 tokens on a single GPU, Jamba aims to set a new bar for efficiency in long-context text generation and analysis. The initial release, made available under the open source Apache 2.0 license, is positioned as a research release rather than a commercial-ready product; AI21 Labs says future iterations will add safety measures and improve performance.
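For readers who want to experiment, a minimal sketch of loading and prompting the released weights through the Hugging Face Transformers library might look like the following. The repository name, hardware requirements, and generation settings here are assumptions about the public release rather than details from this article, so check the official model card before relying on them.

```python
# A minimal sketch, assuming the open weights are published on the
# Hugging Face Hub under a name like "ai21labs/Jamba-v0.1" (an assumed
# checkpoint name, not a detail from this article). Running the full
# model requires a large GPU.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/Jamba-v0.1"  # assumed repository name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Summarize the trade-offs between transformers and state space models:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate a short continuation; long-context use would simply pass a
# much longer prompt within the model's context window.
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```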