
Reka Unveils New Series of Multimodal Language Models, Setting New Industry Benchmarks

In a recent release, Reka has introduced a trio of advanced multimodal language models: Reka Core, Flash, and Edge. These models are trained to process and reason over multiple input modalities, including text, images, video, and audio.

Reka Flash and Edge, with 21 billion and 7 billion parameters respectively, are reported to be state-of-the-art within their compute classes, surpassing many larger models in efficiency and performance. Reka Core, the most capable of the three, competes closely with top models from major players such as OpenAI, Google, and Anthropic.

The evaluations show Reka Core excelling across a range of benchmarks. In image question answering, it performs comparably to OpenAI's GPT-4V, and it exceeds Claude 3 Opus in multimodal chat settings, according to third-party blind evaluations. On language benchmarks, Core posts competitive results on MMLU and GSM8K and even surpasses GPT-4 (0613) in human evaluations.

Reka Core's video question answering capabilities have also drawn attention: it surpasses Google's Gemini Ultra on this task. Reka Edge, the smallest of the three, outperforms other models in its class, such as Gemma 7B and Mistral 7B, demonstrating how capable an efficiently trained dense model at this scale can be.

The models were developed with a comprehensive training regimen spanning a wide range of data, including a significant portion dedicated to STEM and coding-related content. Training drew on a mix of publicly available and proprietary data, yielding a rich and diverse dataset.

Further details include the models' language versatility, with training data covering 32 languages, and their architectural design, which builds on modern transformer techniques such as SwiGLU activations and rotary positional embeddings (RoPE).
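To make the two architectural techniques mentioned above concrete, here is a minimal NumPy sketch of a SwiGLU feed-forward block and rotary positional embeddings. This is an illustrative reconstruction of the general techniques, not Reka's implementation; all function names, shapes, and the `base` frequency constant are assumptions chosen for clarity.

```python
import numpy as np

def swiglu(x, W_gate, W_up, W_down):
    """SwiGLU feed-forward block: (swish(x @ W_gate) * (x @ W_up)) @ W_down.

    The gated activation replaces the single ReLU/GELU branch of a
    standard transformer FFN. Shapes are illustrative assumptions.
    """
    gate = x @ W_gate
    swish = gate * (1.0 / (1.0 + np.exp(-gate)))  # SiLU / swish activation
    return (swish * (x @ W_up)) @ W_down

def rotary_embed(x, base=10000.0):
    """Apply rotary positional embeddings to x of shape (seq_len, dim).

    Each half-pair of feature dimensions is rotated by an angle that
    grows with token position, encoding position multiplicatively.
    """
    seq_len, dim = x.shape
    half = dim // 2
    freqs = base ** (-np.arange(half) / half)      # per-pair rotation frequency
    angles = np.outer(np.arange(seq_len), freqs)   # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # 2-D rotation applied to each (x1_i, x2_i) pair
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)
```

A useful property visible in this sketch is that rotation preserves vector norms, so RoPE injects position without rescaling activations, and position 0 is left unchanged.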

Reka's models are available in production through several channels, including a developer platform and a chat app, making them easy for users and developers to test and integrate.

This release positions Reka as a formidable contender in the rapidly evolving domain of multimodal language models, marking significant advancements in AI capabilities and setting new standards for model performance and efficiency.