
Databricks expands Mosaic AI to help businesses use advanced AI models


Having rebranded MosaicML as Mosaic AI, Databricks unveiled five new Mosaic AI tools at its Data and AI Summit. As more companies move large language models into production and their needs evolve, these tools aim to improve model quality, cost-effectiveness, and data privacy.

Databricks purchased MosaicML for $1.3 billion a year ago. Databricks’ AI offerings now heavily rely on the platform, rebranded as Mosaic AI.

The company is introducing many new features for the service today at its Data and AI Summit. Ahead of the announcements, I spoke with Databricks CTO Matei Zaharia and CEO Ali Ghodsi.

The five new Mosaic AI tools Databricks is introducing at its conference are the Mosaic AI Agent Framework, Mosaic AI Agent Evaluation, Mosaic AI Tools Catalog, Mosaic AI Model Training, and Mosaic AI Gateway.

“GenAI has made incredible strides this year. It’s exciting for everyone,” Ghodsi told me. “However, there are still three things that matter to everyone: First, how can we improve these models’ quality or dependability? Second, what steps can we take to ensure that it is economical? The price gap between these models is enormous; it spans several orders of magnitude. Third, how can we accomplish it while maintaining the confidentiality of our data?”

Today’s releases aim to address the majority of these issues for Databricks customers. Additionally, Zaharia pointed out that the businesses now implementing large language models (LLMs) in production are doing so using multi-component systems.

This usually means that they use a range of external tools to access databases or perform retrieval augmented generation (RAG), and they make multiple calls to a model (or possibly more than one model, too).

These compound systems speed up LLM-based applications, save costs by routing specific queries to less expensive models or caching results, and, perhaps most importantly, improve the reliability and relevance of the results by augmenting the foundation models with proprietary data.
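The routing-and-caching idea described above can be illustrated with a minimal sketch. The model functions here are hypothetical stand-ins, and the routing heuristic (query length) is an assumption for illustration; a real compound system would call actual LLM APIs and use a smarter router.

```python
from functools import lru_cache

# Hypothetical stand-ins for a cheap and an expensive model; in a real
# compound system these would be API calls to two different LLMs.
def cheap_model(query: str) -> str:
    return f"cheap-answer:{query}"

def expensive_model(query: str) -> str:
    return f"expensive-answer:{query}"

@lru_cache(maxsize=1024)  # cache results so repeated queries cost nothing
def answer(query: str) -> str:
    # Route simple (short) queries to the cheaper model and complex
    # ones to the stronger, more expensive model.
    if len(query.split()) <= 5:
        return cheap_model(query)
    return expensive_model(query)
```

The caching layer alone can eliminate a large share of paid model calls when traffic contains many repeated queries.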

He clarified, “We believe that is the direction of really high-impact, mission-critical AI applications in the future.” A modular system, he added, gives engineers control over every aspect of their work.

“So, to make it easier for developers to work with these systems, hook up all the components, trace everything through, and observe what’s happening, we are doing a tremendous amount of basic research on how to design these [systems] for a given task.”

For actually building these systems, Databricks is introducing the Mosaic AI Tools Catalog and the Mosaic AI Agent Framework this week. The Agent Framework lets developers build their RAG-based applications on top of the company’s serverless vector search capabilities, which became generally available last month.

The Databricks vector search system, according to Ghodsi and Zaharia, employs a hybrid strategy that combines traditional keyword-based search with embedding search. All of this is closely linked to the Databricks data lake, and the data on both platforms is automatically kept up-to-date.
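To make the hybrid strategy concrete, here is a toy sketch of blending a keyword signal with an embedding signal. The term-overlap scorer and the `alpha` blending weight are illustrative assumptions, not Databricks' actual ranking formula; production systems use BM25-style scoring and learned embeddings.

```python
import math
from collections import Counter

def keyword_score(query: str, doc: str) -> float:
    # Simple term-overlap count as a stand-in for BM25-style keyword search.
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return float(sum(min(q[t], d[t]) for t in q))

def cosine(a, b) -> float:
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_score(query, doc, q_emb, d_emb, alpha=0.5):
    # Blend the two signals; alpha weights keywords vs. embeddings.
    return alpha * keyword_score(query, doc) + (1 - alpha) * cosine(q_emb, d_emb)
```

Blending the two signals helps because exact keyword matches catch rare terms (IDs, product names) that embeddings tend to blur, while embeddings catch paraphrases that keywords miss.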

This includes the Databricks Unity Catalog governance layer and the platform’s general governance capabilities, which together can prevent, for example, personal data from leaking into the vector search service.

Notably, Databricks is now expanding the Unity Catalog (which the company is also gradually open-sourcing) so that businesses can control which AI tools and features these LLMs can use to generate responses.

According to Databricks, this catalog will also improve these services’ discoverability within an organization.

Ghodsi also pointed out that developers can now use these tools to build their agents by, for example, chaining together models and functions with LlamaIndex or LangChain. Indeed, Zaharia tells me that several Databricks clients are already using these technologies.

“Many companies are using agent-like workflows. The sheer volume of these workflows often catches people off guard, but this seems to be the current trend. We’ve also found that this is the best approach for our in-house AI applications, such as the assistant apps on our platform,” he added.
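The agent-like workflows mentioned above follow a common pattern: a policy (normally an LLM, via a framework like LangChain or LlamaIndex) repeatedly decides whether to call a registered tool or return a final answer. The sketch below uses a deterministic stub policy and a toy tool registry; every name here is illustrative, not a framework API.

```python
# Toy tool registry; real agents would register database lookups,
# retrieval functions, or external APIs here.
TOOLS = {
    "lookup_population": lambda city: {"paris": 2_100_000}.get(city.lower(), 0),
}

def stub_policy(question, observations):
    # A deterministic stand-in for the LLM's decision step:
    # call a tool if we have no observation yet, otherwise answer.
    if not observations:
        return ("call", "lookup_population", "Paris")
    return ("finish", f"Population: {observations[-1]}")

def run_agent(question, policy, max_steps=5):
    # The agent loop: ask the policy for an action, execute tool calls,
    # and feed observations back until it finishes or runs out of steps.
    observations = []
    for _ in range(max_steps):
        action = policy(question, observations)
        if action[0] == "finish":
            return action[1]
        _, tool_name, arg = action
        observations.append(TOOLS[tool_name](arg))
    return "gave up"
```

The `max_steps` cap is the piece production systems cannot skip: without it, a confused policy can loop on tool calls indefinitely.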

To assess these new applications, Databricks is also introducing Mosaic AI Agent Evaluation, an AI-assisted evaluation tool that lets businesses quickly gather user feedback and helps them label early datasets.

The platform integrates LLM-based judges to test how well the AI performs in production. The Agent Evaluation has a user interface (UI) that allows users to search and visualize large text datasets. It is based on Databricks’ acquisition of Lilac earlier this year.

“Every customer that we work with tells us that they need to label things internally and that they will assign certain staff to do it. All I need is maybe 500 or 100 answers, and we can give that to the LLM judges,” Ghodsi clarified.
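The LLM-judge workflow Ghodsi describes can be sketched as follows: a small labeled set seeds the evaluation, and a judge grades each model answer against its reference. The `stub_judge` here is a deterministic substring check standing in for an actual LLM grader; all names are illustrative, not the Agent Evaluation API.

```python
# Sketch of LLM-as-judge evaluation over a small labeled dataset.
def stub_judge(question, answer, reference) -> float:
    # Deterministic stand-in for an LLM judge: score 1 if the
    # reference answer appears in the model's output, else 0.
    return 1.0 if reference.lower() in answer.lower() else 0.0

def evaluate(examples, model, judge=stub_judge) -> float:
    # examples: list of (question, reference_answer) pairs,
    # like the few hundred labels Ghodsi mentions.
    scores = [judge(q, model(q), ref) for q, ref in examples]
    return sum(scores) / len(scores)
```

The point of the pattern is leverage: a few hundred human labels calibrate the judge, which can then grade thousands of production responses cheaply.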

Leveraging Refined Models with Databricks’ Mosaic AI Solutions

Fine-tuned models are another way to improve results. To help with this, Databricks now offers a service called Mosaic AI Model Training, which, you guessed it, lets users fine-tune models on their company’s confidential data to improve performance on specific tasks.

The final new tool the company is launching, Mosaic AI Gateway, is billed as a “unified interface to query, manage, and deploy any open source or proprietary model.”

The goal is to let users query any LLM in a governed way, with centralized credential storage. After all, no business wants its engineers to send arbitrary data to third-party services.

The AI Gateway also lets IT set rate limits per vendor to keep spending under control in times of tight budgets. These businesses also get usage tracking and tracing to troubleshoot their systems.
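A per-vendor rate limit of the kind described above is commonly implemented as a token bucket. The sketch below is an assumption about how such a gateway might work internally, not Databricks' implementation; the class name and fields are invented for illustration.

```python
import time

class VendorLimiter:
    """Per-vendor token-bucket rate limiter with usage tracking,
    sketching the kind of control an AI gateway might enforce."""

    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec       # tokens refilled per second
        self.capacity = burst          # maximum burst size
        self.tokens = float(burst)
        self.last = time.monotonic()
        self.usage = 0                 # allowed requests, for cost tracking

    def allow(self) -> bool:
        # Refill tokens based on elapsed time, then spend one if available.
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            self.usage += 1
            return True
        return False
```

The `usage` counter doubles as the raw input for the cost tracking the article mentions: multiply allowed requests by a vendor's per-call price to estimate spend.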

All of these new capabilities respond to how Databricks users are now working with LLMs, Ghodsi told me. “The market has shifted significantly over the past quarter and a half. Last year, everyone you spoke to was pro-open source and thought it was great. But in the background, everyone was using OpenAI, regardless of what they said or how much they praised open source.”

These customers now use open models (very few are truly open source, of course) and are far more sophisticated, which means they need a whole new set of tools to address the opportunities and challenges of working with open models.
