True to their name, generative AI models generate text, images, code, or other responses based on a user’s prompt. Organizations that utilize them correctly can see a myriad of benefits—from increased operational efficiency and improved decision-making to the rapid creation of marketing content. But what makes the generative functionality of these models—and, ultimately, their benefits to the organization—possible?

That’s where the foundation model enters the picture. It’s the underlying engine that gives generative models the enhanced reasoning and deep learning capabilities that traditional machine learning models lack. Together with data stores, foundation models make it possible to create and customize generative AI tools for organizations across industries that are looking to optimize customer care, marketing, HR (including talent acquisition), and IT functions.

Foundation models: The driving force behind generative AI

A foundation model is an AI model trained on vast amounts of broad data; most modern foundation models are built on the transformer architecture. The term “foundation model” was coined by the Stanford Institute for Human-Centered Artificial Intelligence in 2021.

A foundation model is built on a neural network architecture, which processes information in a way loosely inspired by the human brain. Foundation models can be trained to perform tasks such as data classification, the identification of objects within images (computer vision) and natural language processing (NLP), which involves understanding and generating text, with a high degree of accuracy. They can also perform self-supervised learning to generalize and apply their knowledge to new tasks.

Instead of spending time and effort on training a model from scratch, data scientists can use pretrained foundation models as starting points to create or customize generative AI models for a specific use case. For example, a foundation model might be used as the basis for a generative AI model that is then fine-tuned with additional manufacturing datasets to assist in the discovery of safer and faster ways to manufacture a type of product.
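The advantage of starting from a pretrained model can be illustrated with a deliberately tiny numeric sketch (not a real foundation model): a one-parameter linear model whose "pretrained" weight already sits near the new task's true relationship needs far less adjustment than a model initialized from scratch.

```python
# Illustrative sketch only: a 1-D linear model y = w * x, trained with
# plain stochastic gradient descent on squared error. The values and
# data below are made up for demonstration.

def sgd(w, data, steps, lr=0.01):
    """Update weight w with stochastic gradient descent on (x, y) pairs."""
    for _ in range(steps):
        for x, y in data:
            grad = 2 * (w * x - y) * x  # gradient of (w*x - y)^2 w.r.t. w
            w -= lr * grad
    return w

# Small dataset for the new task, roughly following y = 2x
new_task = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]

w_scratch = sgd(0.0, new_task, steps=1)   # training from scratch, one pass
w_finetune = sgd(1.8, new_task, steps=1)  # "pretrained" weight, one pass

true_w = 2.0
# After the same single pass, the fine-tuned weight is much closer
# to the true relationship than the from-scratch weight.
print(abs(w_finetune - true_w) < abs(w_scratch - true_w))
```

With identical data and compute, the pretrained starting point lands closer to the target, which is the same economics that make fine-tuning a foundation model cheaper than training one from nothing.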

A specific kind of foundation model known as a large language model (LLM) is trained on vast amounts of text data for NLP tasks. BERT (Bidirectional Encoder Representations from Transformers) is one of the earliest LLM foundation models developed. Google created BERT, an open-source model, in 2018. It was pretrained on a large corpus of English language data with self-supervision and can be used for a variety of tasks such as:

  • Analyzing customer/audience sentiment
  • Answering customer service questions
  • Predicting text from input data
  • Generating text based on user prompts
  • Summarizing large, complex documents
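The self-supervision behind models like BERT can be demonstrated without any ML library: the training labels come from the text itself, by masking a word and asking the model to predict it from context. The toy sketch below (illustrative only; nothing like BERT's actual architecture) fills in a masked word by voting over context co-occurrence counts:

```python
from collections import Counter, defaultdict

def train_masked_model(corpus):
    """Self-supervision: the labels come from the raw text itself.
    For each word, count the other words that co-occur with it."""
    context_counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for i, w in enumerate(words):
            for j, ctx in enumerate(words):
                if i != j:
                    context_counts[ctx][w] += 1
    return context_counts

def predict_masked(model, sentence):
    """Predict the [mask] token by voting across the context words."""
    words = sentence.lower().split()
    votes = Counter()
    for w in words:
        if w != "[mask]":
            votes.update(model[w])
    for w in words:          # drop candidates already in the sentence
        votes.pop(w, None)
    return votes.most_common(1)[0][0]

# Tiny made-up corpus for demonstration
corpus = [
    "the customer asked a question",
    "the customer filed a complaint",
    "the agent answered the question",
]
model = train_masked_model(corpus)
print(predict_masked(model, "the customer asked a [mask]"))  # → question
```

Real LLMs replace the co-occurrence counts with billions of learned neural network parameters, but the key idea is the same: no human labeling is required, so the model can learn from arbitrarily large text corpora.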

Foundation models versus traditional machine learning models

A foundation model used for generative AI differs from a traditional machine learning model because it can be trained on large quantities of unlabeled data to support applications that generate content or perform tasks.

Meanwhile, a traditional machine learning model is typically trained to perform a single task using labeled data, such as using labeled images of cars to train the model to then recognize cars in unlabeled images.

Foundation models focused on enterprise value

IBM’s watsonx.ai studio offers a suite of language and code foundation models, each with a geology-themed code name, that can be customized for a range of enterprise tasks. All watsonx.ai models are trained on IBM’s curated, enterprise-focused data lake.

Available now: Slate

Slate refers to a family of encoder-only models, which, while not generative, are fast and effective for many enterprise NLP tasks.

Coming soon: Granite

Granite models are based on a decoder-only, GPT-like architecture for generative tasks.

Coming soon: Sandstone

Sandstone models use an encoder-decoder architecture and are well suited for fine-tuning on specific tasks.

Coming soon: Obsidian

Obsidian models utilize a new modular architecture developed by IBM Research, delivering high inference efficiency and strong performance across a variety of tasks.

Connecting foundation models with data stores for generative AI success

Without secure access to trustworthy and domain-specific knowledge, foundation models would be far less reliable and beneficial for enterprise AI applications. Fortunately, data stores serve as secure data repositories and enable foundation models to scale in terms of both their size and their training data.

Data stores suitable for business-focused generative AI are built on an open lakehouse architecture, combining the qualities of a data lake and a data warehouse. This architecture delivers savings from low-cost object storage and enables sharing of large volumes of data through open table formats like Apache Iceberg, built for high-performance analytics and large-scale data processing.

Foundation models can query very large volumes of domain-specific data in a scalable, cost-effective repository. And because these types of data stores combined with cloud infrastructure allow virtually unlimited scalability, a foundation model’s knowledge gaps are narrowed or even eliminated over time with the addition of more data. The more gaps that are closed, the more reliable a foundation model becomes and the greater its scope.

Data stores provide data scientists with a repository they can use to gather and cleanse the data used to train and fine-tune foundation models. And data stores that take advantage of third-party providers’ cloud and hybrid cloud infrastructures for processing a vast amount of data are critical to generative AI cost-efficiency.

The business benefits of foundation models and data stores

When foundation models access information across data stores and are fine-tuned in how they use this information to perform different tasks and generate responses, the resulting generative AI tools can help organizations achieve benefits such as:

Increased efficiency and productivity

Data science

Data scientists can use pretrained models to efficiently deploy AI tools across a range of mission-critical situations.

Dev

Developers can write, test and document faster using AI tools that generate custom snippets of code.

Internal communications

Executives can receive AI-generated summaries of lengthy reports, while new employees receive concise versions of onboarding material and other collateral.

Operations

Organizations can use generative AI tools for the automation of various tasks, including:

  • Classifying and categorizing data
  • Communicating with customers
  • Routing messages to the appropriate department for faster response times
  • Generating reports
  • Booking meetings and scheduling appointments
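The classification and routing tasks above can be sketched with a simple keyword-based router (illustrative only; the department names and keywords are hypothetical, and a production system would use a trained classifier or an LLM rather than fixed keyword lists):

```python
# Hypothetical department keyword lists -- a real deployment would
# learn these from labeled tickets or use a language model.
ROUTES = {
    "billing": {"invoice", "payment", "charge", "refund"},
    "it_support": {"password", "login", "error", "crash"},
    "sales": {"pricing", "quote", "demo", "upgrade"},
}

def route_message(message, default="general"):
    """Score each department by keyword hits and return the best match."""
    words = set(message.lower().split())
    scores = {dept: len(words & keywords) for dept, keywords in ROUTES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default

print(route_message("I cannot login after the last error"))   # → it_support
print(route_message("please send a refund for this charge"))  # → billing
print(route_message("hello there"))                            # → general
```

A generative AI tool performs the same routing step with far greater flexibility, since it can interpret phrasing it has never seen, but the workflow is identical: classify the incoming message, then hand it to the right department.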

Faster content generation

Marketing teams can use generative AI tools to help create content on a wide range of topics. They can also quickly and accurately translate marketing collateral into multiple languages.

More accurate analytics

Business leaders and other stakeholders can perform AI-assisted analyses to interpret large amounts of unstructured data, giving them a better understanding of the market, reputational sentiment and other key signals.

IBM, foundation models and data stores

To help organizations multiply the impact of AI across their businesses, IBM offers watsonx, its enterprise-ready AI and data platform. The platform comprises three powerful products:

  • The watsonx.ai studio for new foundation models, generative AI and machine learning
  • The watsonx.data fit-for-purpose data store, built on an open lakehouse architecture
  • The watsonx.governance toolkit, to accelerate AI workflows that are built with responsibility, transparency and explainability
Visit the watsonx webpage to learn more
