
Table of Contents

AI Year in Review
AI Adoption Trends
Top Use Cases By Industry
  Insurance
  Retail & eCommerce
  Financial Services
  Logistics & Supply Chain
ML Lifecycle
  Working with Foundation Models
  Data Challenges
  Data Best Practices
  Model Evaluation
Conclusion
About Scale
Methodology

AI YEAR IN REVIEW

The significantly improved capabilities of generative models in 2022 enormously impacted businesses’ AI strategies, with 65% either accelerating their existing strategies or creating an AI strategy for the first time.

While most respondents (60%)
are experimenting with generative
models or plan on working with
them in the next year, only 21%
have these models in production.
Companies see the potential of
generative models to improve
their business, but getting them
into production is challenging.
To unlock the power of their data
and take full advantage of these
models, companies need machine
learning expertise, fine-tuning
infrastructure, and the resources
to perform RLHF at scale.

65%
either accelerated their existing
strategies or created an AI strategy
for the first time


“I think that for this decade, what we’re
really going to see is these tools proliferating.
They’re going to be everywhere. They’re
going to be baked into every company. I
think it’s kind of like the internet transition.
If you’re a company, what’s your internet
strategy? And here we are today, where we
don’t even talk about internet strategies. It’s
just so integral to every business. It’s not a
separate part of your business that you can
pick or choose whether you’re going to have
it. And I think that AI is going to be much
the same.”

—GREG BROCKMAN,
PRESIDENT & CO-FOUNDER, OPENAI

(TransformX October 2022)

Zeitgeist: 2023 AI Readiness Report 07



Most companies don’t have the necessary resources or mandate to create their own generative models, so they must rely on third parties. Of those companies that plan on working with generative models, the vast majority are looking to leverage open-source generative models (41%) or Cloud API generative models (37%), while very few are looking to build their own generative models (22%).

Furthermore, 28% are exclusively using open-source
models, while 26% use cloud APIs (commercially
available models such as Anthropic’s Claude,
OpenAI’s GPT-4, and Cohere’s Command), and only
15% are exclusively building their own.

There are multiple factors that enterprises must weigh when deciding on their Generative AI infrastructure, including their in-house machine learning expertise, budget, security requirements, and need for domain-specific capabilities. Leveraging a cloud provider is the easiest and fastest path to obtain generative capabilities, but it comes with higher security risk, less control over the underlying models, and lower-quality performance at domain-specific tasks, and it can be expensive. Open-source models provide more control and are cheaper, but they require more in-house expertise to deploy and fine-tune. Companies looking to build their own generative models benefit from greater control but incur greater costs from data collection, compute, and hiring machine learning experts to train and deploy them.


61% of companies are looking to AI to help enhance the customer experience, 56% to improve operational efficiency, and 50% to increase profitability. Focusing on customer-centricity benefits organizations immensely, with improved customer goodwill in the short term and greater profitability in the long term.

89% of companies adopting AI benefit from the ability to develop new products or services, 78% from enhanced customer experiences, and 76% from better collaboration across business functions. These companies also see improved organizational process efficiency and profitability. Despite the positive outcomes for AI adopters, even greater outcomes are possible as companies accelerate their AI strategies and increase their investments in AI.

An organization’s goals also shape the effectiveness of its AI implementation. Those companies that list shareholder/investor demand as the top goal for AI adoption also show the poorest results for customer experience, revenue, and profitability. To ensure success with an AI implementation, organizations must avoid implementing AI for its own sake and instead ensure that the goals of the implementation are aligned with company priorities and that AI is a good solution for the given problem.


A New Era
Generative models are already
transforming how we create art,
understand our world, and conduct
business.
Large language models help us write content
such as blogs, emails, or ad copy more quickly
and creatively. They summarize long-form
content so that we can quickly understand the
most critical information from reports and
news articles. Diffusion models streamline
marketing workflows, enabling marketers
to generate unlimited and infinitely creative
product imagery. Developers use LLMs to write
code more efficiently and help them quickly
identify and fix bugs. Advanced chatbots enable
businesses to improve their customer service
at a lower cost. Finally, organizations are
unlocking the power of their knowledge bases
by customizing LLMs with their proprietary
data to perform better on tasks unique to their
business.
We will now look at a few key terms and trends
essential to understanding this new era of
Generative AI.


Models Are Increasing in Size

Over time, generative models have become more capable as they’ve increased in size. A model’s size is typically measured by its training dataset size in tokens (parts of words) or by its number of parameters (the number of values the model can change as it learns).
• BERT (2018) was 3.7B tokens and 240 million parameters.
• GPT-2 (2019) was 9.5B tokens and 1.5 billion parameters.
• GPT-3 (2020) was 499B tokens and 175 billion parameters.
• PaLM (2022) was 780B tokens and 540 billion parameters.

As these models scale in size, they become increasingly capable, providing more incentive for companies to build applications on top. Generative models are now more widely available as many large model developers provide APIs or make them open-source, and companies are quickly adopting these large models for their specific business use cases.

Generative models are trained on a large amount of internet data, making them competent generalists. These models can write poetry, solve logic puzzles, and identify bugs in code. While generative models are great generalists, they are poor specialists when solving problems outside of their data distribution. Since a significant portion of data is proprietary to individual organizations, base large language models are not well adapted to these specific domains. To improve performance on the specific tasks of, say, an insurance company, an eCommerce company, or a logistics company, these models must be fine-tuned and aligned to excel at those particular tasks and provide responses that are useful to customers and employees.


Reinforcement Learning from Human Feedback (RLHF)

Though Reinforcement Learning from Human Feedback (RLHF) is not new to the research community, in 2022, it catapulted in popularity as it was a critical ingredient in the success of ChatGPT.

Instead of attempting to write a loss function with which to train a model, RLHF involves soliciting feedback from human users and training a reward model on that feedback. This human-defined reward model is then used to train a base model. This also allows training on much more data since the human feedback is mimicked by the reward model, so the dataset size is now only constrained by how many prompts you can create.

RLHF tuning results in models better aligned to human preferences, producing more detailed and factual responses.

RLHF also defines the “personality” and “mood” of the model, making it more helpful, friendly, and factual than the base model would be otherwise. This means we get responses from the model that feel more human and less like talking to a machine. RLHF is a critical component to the success of recent LLMs and is also critical to ensuring that enterprises using Generative AI get model responses that align with their policies and brands.

ChatGPT

ChatGPT is a large language model that has been tuned specifically for the task of conversational text generation. ChatGPT was trained with RLHF and data in dialogue formats to enable it to act as a conversational chatbot.

ChatGPT quickly became one of the most impactful product launches of all time, reaching 1 million users in just five days, and it now has over 100 million users.

ChatGPT was initially launched with GPT-3.5 but now also includes GPT-4 for ChatGPT Plus subscribers. These models are highly capable of question answering, content generation, and summarization. While these models provide more robust, informative, and creative responses than their predecessors, the real breakthrough for adoption was their ability to hold conversations with humans. The ability to interact with the models in an intuitive way increased the accessibility of the models so that anyone can use them.
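
To make the reward-model step of RLHF more concrete, here is a minimal sketch of how a reward model can be fit to human preference pairs. It is an illustration only, not how any production system is built: it assumes a tiny linear model, a pairwise logistic loss (a common choice for preference data), and hand-made feature vectors. The feature names, the data, and the function names are all hypothetical.

```python
import math
import random

def reward(weights, features):
    """Linear reward model: a score for one response, given its feature vector."""
    return sum(w * x for w, x in zip(weights, features))

def train_reward_model(preferences, n_features, lr=0.1, epochs=200, seed=0):
    """Fit weights so that human-preferred responses score higher.

    preferences: list of (chosen_features, rejected_features) pairs.
    Minimizes the pairwise logistic loss -log(sigmoid(r_chosen - r_rejected)).
    """
    rng = random.Random(seed)
    weights = [0.0] * n_features
    data = list(preferences)
    for _ in range(epochs):
        rng.shuffle(data)
        for chosen, rejected in data:
            margin = reward(weights, chosen) - reward(weights, rejected)
            # d(loss)/d(margin) = -(1 - sigmoid(margin)); step against it.
            grad_scale = 1.0 - 1.0 / (1.0 + math.exp(-margin))
            for i in range(n_features):
                weights[i] += lr * grad_scale * (chosen[i] - rejected[i])
    return weights

# Hypothetical features per response: [is_polite, cites_source, is_off_topic].
# Each pair records which of two responses a human reviewer preferred.
human_preferences = [
    ([1.0, 1.0, 0.0], [0.0, 0.0, 1.0]),
    ([1.0, 0.0, 0.0], [0.0, 0.0, 0.0]),
    ([0.0, 1.0, 0.0], [0.0, 0.0, 1.0]),
    ([1.0, 1.0, 0.0], [1.0, 0.0, 1.0]),
]
weights = train_reward_model(human_preferences, n_features=3)

# The learned reward model can now score responses no human has rated.
good_score = reward(weights, [1.0, 1.0, 0.0])
bad_score = reward(weights, [0.0, 0.0, 1.0])
```

Once trained, the reward model stands in for the human raters, which is why the amount of training signal is limited by the number of prompts rather than by the number of human judgments. In a full RLHF pipeline the reward model is typically a neural network and its scores drive a reinforcement learning update of the language model; that second stage is not shown here.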


Prompt Engineer

2022 also saw the proliferation of a new role for machine learning teams, the “prompt engineer.” While generative models are competent at generating the desired output for business use cases, the right prompt is required to optimize the model’s effectiveness. Prompt engineers carefully craft natural language inputs to get more consistent and reliable responses from models so that the model outputs can then be used in business applications. Instead of writing an SQL query, these engineers craft finely honed natural language prompts.

For example, when integrating applications with LLMs, the varied responses from the model can break the integration if they are not formatted properly. Say you are creating an application dealing with financial data. A user’s input may be related to the Q4 earnings of a particular company (see graphic, right).
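
As a rough sketch of the formatting problem in the financial-data example, the code below pairs a prompt that pins down the output format with a validator that rejects replies the application cannot consume. No model is called: the prompt wording, the field names, and the sample replies (including the company name and figure) are placeholders invented for illustration, not taken from the report.

```python
import json

# Hypothetical schema the downstream application expects.
REQUIRED_FIELDS = {
    "company": str,
    "quarter": str,
    "revenue_usd_millions": (int, float),
}

def build_prompt(user_question):
    """Wrap the user's question in instructions that fix the output format."""
    return (
        "You are a financial data assistant.\n"
        "Answer using ONLY a JSON object with these keys: "
        '"company" (string), "quarter" (string), '
        '"revenue_usd_millions" (number).\n'
        "Do not add any text before or after the JSON.\n\n"
        f"Question: {user_question}"
    )

def parse_response(raw_text):
    """Return the reply as a dict, or None if it would break the integration."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    for field, expected_type in REQUIRED_FIELDS.items():
        value = data.get(field)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            return None
    return data

prompt = build_prompt("What were ExampleCo's Q4 earnings?")

# Two made-up replies: one follows the format, one is free-form prose.
well_formed = '{"company": "ExampleCo", "quarter": "Q4", "revenue_usd_millions": 12.5}'
free_text = "ExampleCo had a strong fourth quarter, with revenue of about $12.5M."

parsed_ok = parse_response(well_formed)
parsed_bad = parse_response(free_text)
```

A reply that fails validation can be retried or routed to a fallback instead of being passed downstream, which is one way the prompt and the surrounding code share responsibility for a reliable integration.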

Prompt engineers help model responses solve business challenges with greater accuracy, efficiency, and quality. They also ensure that responses align with an organization’s brand guidelines and voice. They are also critical to finding vulnerabilities in models by using prompt injection techniques and helping to resolve those vulnerabilities before an employee or customer finds them. This role is highly valuable in ensuring organizations’ successful implementation of generative models.


“One of the ways that I describe Stable Diffusion is as a generative search engine. You don’t need to use image search anymore cause you can just make the image. By putting this in a pipeline at the right place and having human-in-the-loop interactions like the work that you do at Scale AI, understanding how humans do that at scale, and having these engines, it will allow us to have even better experiences that understand what we want.”

—EMAD MOSTAQUE,
FOUNDER & CEO, STABILITY AI

(TransformX October 2022)


What to Expect in 2023


Increased investment in AI

As Generative AI is now more capable and widely available, companies are quickly incorporating it into their operations. 72% of companies will significantly increase their investment in AI each year for the next three years.

Increasing capabilities of generative models

Many organizations are now building their own large generative models. These models are being incorporated into search engines and paired with other tools, including internal knowledge bases, to become powerful business tools. These models will also become multimodal, meaning that they will be able to consume and generate text, images, and video, making them even more useful than they are today. You can upload a product catalog with both images and text to a multimodal model, and it will recognize specific products, write product descriptions, fill in missing attributes, and generate new images to enrich your product catalog automatically.

Widespread accessibility of generative models

Much like the cloud, widely accessible generative models represent a paradigm shift for companies. Incorporating this type of AI will quickly become the status quo, and those who are slow to adopt will be left behind.

Proprietary data will unlock the power of generative models

On their own, base generative models are valuable tools. Paired with a business’s proprietary data, they become strong differentiators, improving the customer experience, product development, and profitability.

“I’m most excited about...the ability for our models to start using tools out in the world. Giving them access to knowledge bases, giving them access to search engines like Google or Bing, and augmenting them with that knowledge as a resource so that instead of them having to memorize all of these facts, they can make reference to a live, updated, knowledge base. I think that’s going to be super impactful.”

—AIDAN GOMEZ,
CEO, COHERE

(TransformX October 2022)



AI Adoption Trends

Business leaders have identified that AI is critical to the future of their companies and are looking to adopt it as quickly and with as much impact as possible. We examine this trend and provide insights on best practices.

72%
of companies are looking to increase their investment in AI each year for the next three years

59% of companies view AI as critical or highly critical to their business in the next year, and 69% in the next three years. The increasing capabilities and availability of Generative AI will accelerate AI adoption.

As companies view AI as more critical to the future success of their business, they are increasing AI investments over the next three years. 72% of companies plan to increase their investment in AI each year for the next three years.



Companies are spending less time and effort on traditional computer vision applications and instead focusing on LLMs and Generative AI. Of companies making significant investments in AI, 52% are investing heavily in LLMs, 36% in generative visual models, and 30% in computer vision applications. With the recently popularized capabilities of LLMs, companies have rapidly shifted their AI strategies to harness the power of Generative AI.

Of companies making significant investments in AI:
• 52% are investing heavily in LLMs
• 36% are investing heavily in generative visual models
• 30% are investing in computer vision models

What outcomes are achieved by companies that adopt AI?
As mentioned previously, companies adopting AI are seeing positive outcomes from improved customer experiences, the ability to develop new products or services and improve existing products, and improved collaboration across business functions.

Across the board, companies adopting AI are achieving positive outcomes in almost every category. Like any technology program, the success of AI programs depends on the ability to implement AI and align implementation efforts with measurable organizational goals.

“I really believe that we are at a transformative moment today where ML is moving at an incredible
speed and problems that were thought to be too complex to solve with computers a few years ago,
are now being solved by applying machine learning. So we have this great opportunity. If machine
learning becomes more accessible, the world will move faster, our economy will move faster, science
will move faster.”

—FRANCOIS CHOLLET,
AI ENGINEER AND RESEARCHER, GOOGLE

(TransformX October 2022)



Which resources do companies feel they have enough of in
order to successfully implement their AI strategies? Which
resources are they lacking?

Companies that view AI as critical to their business indicate they have the executive support, strategy and vision, and budget they need to succeed in implementing their company’s AI strategy. However, these companies generally lack the necessary expertise, software, and tools required to achieve success.

While leaders have identified the need to adopt AI, the execution of these strategies is difficult, nuanced, and heavily dependent on expertise. The field is moving so quickly that it is difficult to keep up with the pace of advancement. Highly talented people with expertise in Generative AI are simply not available to most organizations. Similarly, selecting, standardizing, and updating the software and tools associated with Generative AI, MLOps, and even DevOps is challenging for companies without dedicated teams to keep up with these changes as the requisite tech stacks are constantly evolving.

“As a product of this shortage in AI talent,
most businesses are missing out on a huge
opportunity to integrate this tech into
products and into their developer’s workflows.
Consumers are missing out on products that
have more magical, intuitive, and smart
experiences. The start of the fix comes with
the product folks making the decision about
what’s prioritized, looking at what they can
do, understanding where the technology is
today, and where they could insert it, and then
building it. I think we need to start making AI a
standard piece of every single product. I don’t
think consumers are going to tolerate dumb
products anymore. We need to make them
much, much smarter.”

—AIDAN GOMEZ,
CEO, COHERE

(TransformX October 2022)

TOP USE CASES BY INDUSTRY

AI Adoption by Industry

Every industry is looking to increase its AI budgets over the next three years. Those that top the list are:
• Insurance: 80%
• Logistics & Supply Chain: 79%
• Financial Services: 77%
• Healthcare & Life Sciences: 75%
• Retail & eCommerce: 74%


While all industries are increasing their AI budgets,
each industry has unique use cases. These range from
insurance companies looking to reduce claim processing
times to eCommerce companies implementing customer
service chatbots. We examine how a few key industries
are adopting AI in 2023.


Insurance
Insurance companies look to AI to help them improve customer experience and improve operational efficiency.

To help achieve these goals, insurance companies are looking to adopt AI to improve claims processing, fraud detection, and risk assessment/underwriting.

For claims processing in particular, insurance companies believe that AI can help reduce time to process claims and reduce processing errors, which will result in a better experience for their customers and improved operational efficiency.


Retail and eCommerce


Retail and eCommerce companies look to AI to help them grow revenue, improve the customer experience, and increase operational efficiency.

To help achieve these goals, retail and eCommerce companies are adopting AI to improve the customer experience with more capable chatbots. They also want to improve operational efficiency with more productive content and marketing operations built on AI-generated product imagery and descriptions. These companies are also enhancing their operational efficiency with better forecasting, purchasing, pricing, and inventory management. Retail and eCommerce companies are not focusing on adopting AI to grow revenue directly; instead, they are looking to influence revenue growth indirectly through increased operational efficiency and an improved customer experience.


Financial Services
Financial services companies look to AI to help them enhance the customer experience, grow revenue, and increase operational efficiency.

To help achieve these goals, financial services companies are looking to adopt AI to improve investment research, fraud detection, and customer-facing process automation, and to power personalized chatbots.

For investment research in particular, financial services companies are applying AI to summarize content, detect trends, and classify topics to improve investment decisions, resulting in increased revenue and improved operational efficiency. By improving their investment decisions, these organizations will indirectly improve the customer experience through presumably higher returns.

Content summarization includes summarizing data sources such as financial statements, historical data, news, and social media. Trend detection is applying AI to data to help identify patterns humans are otherwise ill-equipped to detect.

Financial services companies are using financial statements, historical market data, and third-party data in their investment models. Fewer are relying on social media content and geospatial/satellite imagery.


Logistics and Supply Chain


Logistics and supply chain companies adopt AI to help them improve operational efficiency, improve customer experience, and grow revenue.

To help achieve these goals, logistics and supply chain companies are looking to adopt AI to improve inventory management and demand forecasting, optimize routes, deploy autonomous vehicles, and improve document processing throughput and quality. These tools directly impact operational efficiency, which has downstream impacts on the overall customer experience, with reliable delivery and fewer delays.

For inventory management and demand forecasting, logistics and supply chain companies are adopting AI to help reduce costs, improve customer satisfaction, and improve forecast accuracy.



For route planning, logistics and supply chain companies believe AI can help improve efficiency, reduce costs, improve delivery accuracy, and reduce shipping times. This directly translates to improved operational efficiency and a better customer experience while indirectly translating to revenue growth.

For document processing, logistics and supply chain companies look to AI to help them with information processing, document classification, and compliance. This application is strictly dedicated to increasing operational efficiency and reducing costs.

Logistics and supply chain companies process a lot of paperwork. To improve operational efficiency, this paperwork must be processed as quickly and accurately as possible. Logistics documents, such as bills of lading, commercial invoices, and packing lists, are full of critical information required to clear shipments past customs and onto warehouses for distribution. Traditional OCR applications require the creation of templates for each type of document, which is infeasible and inefficient for global logistics companies. Instead, these companies rely on machine-learning-powered document processing, which requires no templates and still processes the documents at over 95% accuracy.



ML Lifecycle

Working with Foundation Models
As of late 2022, the most widely used foundation models were
BERT, GPT-3, Stable Diffusion, BLOOM, and T5/FLAN.
However, this landscape is quickly changing as more and more powerful generative models are being developed.*

BERT plays a critical but quieter role in many organizations today, providing natural language understanding capabilities at a significantly reduced cost compared to larger models such as GPT-3.

However, this trend is shifting as BERT is not a generative model, so its use cases are limited compared to state-of-the-art models like GPT-4. The compute required to run these more sophisticated models will become cheaper over time, and companies will have more third-party tools and expertise available to help integrate these larger models into their operations. Open-source models such as LLaMA are already being optimized to run on consumer laptops.

HELP IMPROVE THESE INSIGHTS


We’d appreciate your help to improve these insights! Given the rapid progress of Generative AI and the
frequent introduction of new models, we’re curious to learn which models are now the most widely
adopted. Please take a moment to fill out the survey.

*(ChatGPT/GPT-3.5/GPT-4 were not included in this survey as survey collection began in late 2022 before these models were launched)



As previously discussed, of those companies that
plan on working with generative models, the vast
majority are looking to leverage open-source
generative models (41%) or cloud API generative
models (37%), while very few are looking to build
their own generative models (22%).

The majority of companies using Cloud API LLMs are using OpenAI (64%), followed by AI21 Labs (26%) and Cohere (26%). Google and Anthropic recently launched their own LLMs, and several other companies expect to launch their own models in 2023.

To get the most value out of large generative models, many enterprises will need to fine-tune foundation models using their proprietary data and knowledge bases. The biggest challenges for fine-tuning foundation models are acquiring training data and the necessary ML infrastructure. Organizations working with generative models find it challenging and resource-intensive to fine-tune their models in-house. The most effective techniques, like RLHF, require human feedback on model outputs, custom software, and specialized skill sets to ensure high quality and limit human biases in models.

Data Challenges

Data quality remains the most challenging part of acquiring training data, closely followed by data collection. The challenges for acquiring training data are nearly identical to last year’s survey findings.

Curating data and annotation quality are still the top challenges for companies preparing training data for models. The biggest year-over-year change was in how long it takes to get labeled data (typically more than one month): 21% of respondents cite this as their #1 challenge, up from only 13% a year ago. Companies are looking to move faster to label their data, but it is difficult to keep up.



The majority of respondents continue to have problems with
their training data.

Most respondents continue to face problems with the quality of their training data. Data noise remains the number one issue (64%), followed by data bias (50%) and domain gaps (43%).

Data Best Practices
Companies that invest in good data annotation infrastructure
can deploy new models, retrain existing ones, and deploy them
into production faster.

Our data suggest a correlation between teams that get annotated data faster and teams that deploy new models to production faster and update existing models more frequently.



Respondents that only label their data manually most frequently rank labeling quality as their number one challenge in preparing training data, while those that leverage human-in-the-loop and automated labeling rank this challenge less frequently. 60% of respondents leveraging manual labeling in some capacity rank labeling quality as their number one challenge, compared to 47% using automated labeling and only 44% using human-in-the-loop labeling. The combination of automated labeling plus human-in-the-loop is recommended as a best practice as it nearly always outpaces the accuracy and efficiency of either alone.
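
One simple way to combine automated labeling with human-in-the-loop review is to accept automated labels only when the labeler is confident and queue the rest for a person. The sketch below illustrates that pattern under stated assumptions: the 0.9 threshold, the item IDs, the document-type labels, and the function names are all hypothetical, and the report does not describe any specific routing rule.

```python
def route_labels(predictions, confidence_threshold=0.9):
    """Split auto-labeled items into accepted labels and a human review queue.

    predictions: list of (item_id, label, confidence) tuples produced by
    an automated labeler.
    """
    accepted, needs_review = [], []
    for item_id, label, confidence in predictions:
        if confidence >= confidence_threshold:
            accepted.append((item_id, label))
        else:
            needs_review.append((item_id, label))
    return accepted, needs_review

def merge_human_corrections(accepted, needs_review, corrections):
    """Combine auto-accepted labels with human decisions on the review queue.

    corrections: dict of item_id -> label chosen by a human reviewer. Queued
    items the reviewer left unchanged keep the automated suggestion.
    """
    final = dict(accepted)
    for item_id, suggested in needs_review:
        final[item_id] = corrections.get(item_id, suggested)
    return final

# Made-up output from an automated document classifier.
predictions = [
    ("doc-1", "invoice", 0.98),
    ("doc-2", "packing_list", 0.62),
    ("doc-3", "bill_of_lading", 0.91),
    ("doc-4", "invoice", 0.55),
]
accepted, needs_review = route_labels(predictions)

# A reviewer confirms doc-2 as suggested and corrects doc-4.
corrections = {"doc-4": "bill_of_lading"}
final_labels = merge_human_corrections(accepted, needs_review, corrections)
```

The threshold controls the trade-off: raising it sends more items to people and costs more review time, while lowering it lets more automated mistakes through.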

Model Evaluation

While most of the report has focused on RLHF and Generative AI, companies apply machine learning in many different ways, from object detection to recommendation systems. One critical component of these production ML systems is how companies evaluate and monitor their performance.

Just as we found in 2022, measuring the business impact of models remains a challenge, especially for startups or very small companies (those with fewer than 250 employees). These companies rely more on aggregated model metrics as they are building products and a customer base, so the business impact is difficult to measure. However, small to medium size organizations (500-9,999 employees) are measuring business impact more than they were even one year ago, at about 73% now (compared to 55% in last year’s survey).



Larger companies take longer to identify issues in models
As in last year’s report, smaller organizations can usually identify issues with their models quickly, in less than one week. Larger companies (those with 5,000 employees or more) may be more likely to measure business impact but are more likely than smaller ones to take longer (one month or more) to identify issues with their models.

These larger companies are operating complex systems at scale, and the issues they face are less likely to cause immediate business impact, so they are not caught as quickly as at startups or small companies, which are operating simpler systems and closely monitoring model metrics. Additionally, these larger companies typically have a larger customer base, so their users will hit on more edge cases than the smaller companies. For these reasons, it is important that as companies scale, they continually refine their MLOps practices, use data curation tools that can help them identify edge cases, and monitor their models while continuing to measure business impact.

Model Evaluation Methods vs. Time to Deployment

As we found in last year’s report, ML teams that identify issues fastest with their models are most likely to use A/B testing when deploying models.

Aggregate metrics are a useful baseline, but as enterprises develop a more robust modeling practice, tools such as “shadow deployments,” ensembling, A/B testing, and even scenario tests can help validate models in challenging edge-case scenarios or rare classes.

Although small, agile teams at smaller companies may find failure modes, problems in their models, or problems in their data earlier than teams at large enterprises, their validation, testing, and deployment strategies are typically less sophisticated. Thus, with simpler models solving more uniform problems for customers and clients, it’s easier to spot failures. When the system grows to include a large market or even a large engineering staff, complexity and technical debt begin to grow. At scale, scenario tests become essential, and even then, it may take longer to detect issues in a more complex system.
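
A/B testing a model deployment comes down to comparing an outcome rate between traffic served by the current model and traffic served by a candidate. The sketch below shows one common way to make that comparison, a two-proportion z-test. The report does not say which statistical test respondents use, so the choice of test, the function name, and the traffic numbers are assumptions made for illustration.

```python
import math

def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Return (z, two_sided_p_value) for the difference in success rates B - A.

    Assumes both groups are non-empty and the pooled rate is strictly
    between 0 and 1; otherwise the standard error is zero.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Made-up traffic split: model A is the current model, model B the candidate.
# "Success" could be any binary outcome, such as a resolved support ticket.
z, p_value = two_proportion_z_test(
    successes_a=480, n_a=4000,
    successes_b=560, n_b=4000,
)
candidate_is_better = z > 0 and p_value < 0.05
```

A shadow deployment uses the same kind of comparison, except that the candidate’s outputs are logged rather than shown to users. Scenario tests complement both by checking specific edge cases that are too rare to show up in an aggregate rate.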

While we have covered best practices for evaluating models, we must note that for the first time, the performance of Generative AI models is nearly impossible to evaluate automatically. This is because there are many ways for the model to respond appropriately, and human judgment is required to evaluate correctness. That means any sound model evaluation, whether done by a generative model builder or an enterprise, will require human-in-the-loop validation and verification on an ongoing basis.



Conclusion
Generative AI is rapidly transforming the world, and businesses need to understand how to adopt this technology quickly or get left behind.

The most significant AI and ML readiness trend has been the enormous impact of Generative AI on companies, large and small, across all industries. While the 2022 AI Readiness Report focused on companies with in-house machine learning expertise, this year’s report examined how all companies can adopt AI. This change in focus reflects the zeitgeist: dramatic improvements in the capabilities of Generative AI to accelerate innovation and transform every business.

We found that many companies plan to work with or experiment with foundation models, but many lack the expertise and tools to get these models into production. Most companies are adopting AI to enhance the customer experience, optimize operational efficiency, or improve profitability. Generative AI models will become increasingly useful and widely accessible, making them essential to every organization’s business strategy, with an impact similar to that of the internet. Early adopters of AI are seeing an improved ability to develop new products or services, enhanced customer experience, and better collaboration across business functions, in addition to improved revenue and profitability.

Enterprises widely use older non-generative models like BERT, but have realized they must adopt more generative models to stay ahead of the competition. Companies that fine-tune foundation models find their most significant challenges are acquiring training data, building ML infrastructure, and comparing experiments across different models. Human evaluation has replaced benchmarks as the de facto method to analyze large generative models and determine how they will work in a specific enterprise. Enterprises and governments need to leverage their unique data to unlock the full potential of generative models.

At Scale, our mission is to accelerate the development of AI applications to power the most ambitious AI projects in the world. To support that mission, we are excited to share the results of the Scale Zeitgeist: AI Readiness Report with you. We will continue to shed light on what it really takes to adopt AI and help you separate the signal from the noise.

“Large generative models are already
giving people a productivity boost—
we’ve seen how these systems help
people write, code, learn, and more.
We expect the capabilities of these
models to rapidly improve, possibly
beyond our imagination. If we can
learn how to safely integrate AI
into businesses by creating helpful,
harmless, and honest systems, it
could have a transformative effect
on the economy and industries as we
know them.”

—JARED KAPLAN,
CHIEF SCIENTIST, ANTHROPIC

About Scale
At Scale, our mission is to accelerate the development of AI applications. We believe that to make the best models, you need the best data.

The Scale Enterprise Generative AI Platform leverages your enterprise data to customize powerful base generative models to safely unlock the value of AI. The Scale Data Engine consists of all the tools and features you need to collect, curate and annotate high-quality data, in addition to robust tools to evaluate and optimize your models. Scale powers the most advanced LLMs and generative models in the world through world-class RLHF, data generation, model evaluation, safety, and alignment.

scale.com

Methodology

This survey was conducted online within the United States by Scale AI from December 15, 2022, to January 25, 2023. We received 2,909 responses from ML practitioners (e.g., ML engineers, data scientists, and development operations staff) and leaders involved with AI in their companies. After data cleaning and filtering out those who indicated they are not involved with AI or ML projects and/or are not familiar with any steps of the ML development lifecycle, the dataset consisted of 1,699 respondents. We examined the data as follows:

When asked to describe their level of seniority in their organizations, over one-third of respondents (37%) reported they are an individual contributor, about one quarter (25%) said they function as a team lead, 33% are a department head or executive, and 4% are owners. Most come from small companies with fewer than 500 employees (39%) or large companies with more than 10,000 employees (21%).

24% of respondents represent financial services/insurance, 22% represent the software/Internet/telecommunications industry, followed by retail and eCommerce (13%), logistics and supply chain (9%), education (7%), business and customer services (6%), automotive (5%), healthcare and life sciences (4%), media/entertainment/hospitality (3%), manufacturing (2%), and other (6%).

Many respondents (31%) represent organizations that are advanced in their AI/ML adoption: they have multiple models deployed to production that are regularly retrained. About 18% are slightly less advanced, with multiple models deployed to production, while 8% have only one model deployed to production, 12% are developing their first model, and 11% are only evaluating use cases.
