
Thriving in a Data World

A Guide for Leaders and Managers

Sangeeta Krishnan
Thriving in a Data World: A Guide for Leaders and Managers

Copyright © Business Expert Press, LLC, 2023.

Cover design by Charlene Kronstedt

Interior design by Exeter Premedia Services Private Ltd., Chennai, India

All rights reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted in any form or by any
means—electronic, mechanical, photocopy, recording, or any other
except for brief quotations, not to exceed 400 words, without the prior
permission of the publisher.

First published in 2022 by
Business Expert Press, LLC
222 East 46th Street, New York, NY 10017
www.businessexpertpress.com

ISBN-13: 978-1-63742-416-2 (paperback)
ISBN-13: 978-1-63742-417-9 (e-book)

Business Expert Press Big Data, Business Analytics, and Smart
Technology Collection

First edition: 2022

To my late dad
Without whom this book would not be possible
Who would be proud to read my book
Description
Currently, every company, no matter its size, is data-driven in one way
or another, using data to improve customer experience, to create new value
streams, and to stay competitive. However, many business leaders,
professionals, and students (such as executives, business analysts,
UI/UX designers, project managers, and marketing teams) are forced to
interact with data and those who generate data without being taught
the general competencies needed to feel comfortable having these
conversations.
This book focuses on the foundations needed to be successful in
managing and engaging with data analytics initiatives, bridging the gap
between creators and users of data. As a management reference guide, it
discusses the different types of data strategy needed for succeeding with
data, covering topics such as data team composition, types of data
analytics, the importance of data storytelling, and identifying data return on
investment (ROI).
Framed by the author’s personal story, the trove of information is
made tangible through the compelling narrative with its unprecedented
accessibility and readability for a nontechnical audience.
If you suffer from fear of data or anxiety around conversations with
technical teams, this practical book can help with actions
you can start implementing right away.

Keywords
data analytics; marketing business insights; data literacy; presentation;
visualizations; big data businesses; information management tools; data
ROI; data value stream; automation; managing data teams; data mining;
nontechnical guide; agile practices; databases; strategy
Contents
Testimonials
Introduction

Chapter 1 The Ever-Changing Data Ecosystem
Chapter 2 Essential Data Analytics 101 and Beyond
Chapter 3 Finding the Right Mix for Your Analytics Team
Chapter 4 Crack the Data Product Management Code
Chapter 5 Make ROI From Data Analytics a Reality
Chapter 6 Why Do You Need Data Literacy?
Chapter 7 Data Storytelling: Insider Secrets
Chapter 8 A Sneak Peek Into Advanced Analytics
Chapter 9 Data Glossary: Buzzwords

Conclusion
Notes
References
About the Author
Index
Testimonials
“Full of practical advice based on the author’s years of experience working on
and leading data teams. Sangeeta Krishnan focuses on realistic solutions to the
challenges of building effective analytics teams, particularly given today’s
rapidly changing data landscape.”
Shannon Moore, Principal Consultant, Daugherty Business Solutions

“Did you know how many focus areas or what different team models there
are for data teams? This book is an indispensable in-depth guide for leaders
who need to build a dream data team and a data-driven culture.”
Evelyn Münster, Data Visualization Designer, Chart Doktor, Germany

“Practical introduction and guide to use data analytics in any organization,
with examples and ideas ready to implement immediately. A must-read for
every leader who wants to succeed with data.”
Zbigniew Gasiorek, Staff Software Engineer, Walmart Global Tech
Introduction
Do you struggle with conversations about data?
Are you hesitant to ask data questions for fear of looking dumb?
Do you struggle with not knowing where to start with data?
Are you overwhelmed with the data terms and information available?

We hear so many buzzwords every day, like big data, machine learning,
artificial intelligence, data-driven decision making, data insights, and many
more. Take the word insights. Everyone uses insights these days, but without
a common definition or understanding. The assumption seems to be
that if we collect and use data, this magical insight will show up. Machine
learning carries a similar mystique: people are liable to think of it as a
very advanced field only a select few can understand. In addition, there
are so many data tools, and the list keeps growing, that it is confusing
to identify the starting point.
Can you name something that is part of your daily life that uses
machine learning? Think supervised learning algorithms. Can you guess
what I’m talking about? If you’ve ever viewed a notification on a
smartphone such as an iPhone that reads, “You have a new memory,” you’ve
seen machine learning at work. A decade ago, I never would have thought
my phone would identify all my food-related photos and group them
together with a title like “Bon Appétit Over the Years.” It is refreshing
to see the chef side of me making so many mouthwatering dishes over
the years, but I did nothing to create this memory. The data side of me
knew the answer. My phone used on-device machine learning to analyze
every photo in my library in a variety of ways, including scene
classification (grouping my food pictures), people and pet identification, and
audio classification.1
Data is embedded in so many aspects of modern society, and
integrated so seamlessly, that we often don’t even recognize it as artificial
intelligence. Our entertainment recommendations, modern cars (with
drowsiness detection!), and fitness trackers are all driven by data and
artificial intelligence.
Over the past decade, and especially during the COVID-19 pandemic,
the data landscape has changed drastically. For perspective, in 1880 it took the
United States Census an estimated eight years to process census data for
a population of 50 million. By 1890, the German American statistician
Herman Hollerith had introduced the electric tabulating system, improving
processing speeds significantly. The system was the first data processing
machine of its kind to replace manual processing.2
We have come a long way in terms of data processing speed, from
several years in the past to the point where we now sometimes demand
data available in close to real time. Today, every company, no matter
its size, is data driven in one way or another. Companies are using data
to improve customer experience, create a new value stream, and stay
competitive. So, understanding data is a critical necessity for everyone.
I have worked with Fortune 500 organizations, not-for-profits, and
everything in between. And within various organizations, I have repeatedly
seen a gap between data teams on the one hand and leadership or
business stakeholders on the other. These parties are not aligned in their
data goals or in their understanding of the uses and capabilities of data.
This book arose from my desire to close this gap and provide an
easy-to-follow, technology-agnostic reference guide with relatable examples.
Many business professionals and students (such as business analysts,
UI/UX designers, project managers, marketing teams, finance teams,
and executives) are forced to interact with data and those who generate
it. Few have been taught the general competencies needed to feel
comfortable having these conversations. There is a great deal of information
about getting into the field of data and working with data. How many articles,
books, blogs, or videos have you read or watched in the past six months?
This data information overload actually makes people less likely to retain
knowledge. Our brains do not retain what we read unless we use and
experiment with it. My aim is to enable people to want to learn more
about data, to be curious about data, to reach the Data as a
Hobby stage, and to wish to level up to analytical thinking. This book is
arranged with the information you need to thrive in your organization,
and nothing more. (No information overload!)
Working with data requires experimentation and questioning. It is a
quest to discover the unknown. These are elements of a curious mindset.
Unfortunately, curiosity is not considered an essential part of the
recruitment process, nor is it encouraged or promoted as a value in most
organizations. But thriving with data is not about knowing a bunch of
coding languages and technical tools. It is about maintaining a curious
mindset, gaining a foundational data understanding, and seeking out
answers to questions asked, and to questions not yet asked, in addition to
the requested business requirements. If you develop a curious mindset
around data, finding the right tool for the job will be easier (and you will
avoid learning a tool only to realize it is not the one you need most).
This book focuses on developing that necessary core understanding
about data.
This book is the go-to guide for any business reader who wants to
understand the language of organizational data and feel comfortable
conversing in the language of data. What’s more, Thriving in a Data World is
unprecedented in its accessibility and readability for the nontechnical
reader.
This book focuses on the foundations you need to successfully manage
and engage with data analytics initiatives, and to bridge the gap
between the creators and users of data. As a management reference guide,
it discusses the different types of data strategies needed to succeed with
data, and it covers topics such as data team composition, types of data
analytics, the importance of data storytelling, and identifying data ROI.
In psychology, the picture superiority effect, reflected in the famous saying “a
picture is worth a thousand words,” refers to people’s propensity to
remember pictures better than text. Data is no different, and it is therefore
essential to gain the skill to tell stories with data. I have worked
with several people who were technically skilled but who struggled when
it came to presenting their data findings. No one can act on data if no one
follows the data analysis and insights. As a result, this book includes a
chapter about how to persuade with data, irrespective of the tool you use
for visualizations. Even if you follow and understand data, if your company
lacks a data culture, it will resist making any data-driven decisions.
This book discusses how to encourage a data culture and how to overcome
challenges to data culture. It also takes a practical approach to data
analytics: each chapter contains simple, straightforward actions you can
take immediately to start implementing your newfound knowledge. It’s
an enjoyable, engaging way to learn how to confidently interact with, manage,
and work with data analytics teams today.
CHAPTER 1

The Ever-Changing Data Ecosystem
We humans constantly learn and evolve over generations to build our
modern society. At times, however, more sudden changes in the
state of affairs break the regular patterns and create revolutions.
In simple terms, when we exchange one way of doing things for something
altogether different, we hope for a better society at scale. Industrial
revolutions in the past have emerged from a quest to reach a better,
more progressive stage. Industrial Revolution 1.0, for example, involved
coal-powered production. Industrial Revolution 2.0 entailed gas and
electricity (mass production), and Industry 3.0 was electronics (automation).
The boom of the Internet and technology advancements led to the current
revolution: we live in Industry 4.0, the digital age and the Internet of
Things (IoT). This revolution is leading to the creation of a new raw
material, data, and like all other raw materials, data needs to be used
effectively to create something. Data is no longer seen as just something
that benefits corporations by providing competitive advantage. Data is an
economic driver: it accelerates the economic development of a country
and creates more data in the process.
In the past decade, both data collection and data usage have gone
through the roof. Everyday activities such as borrowing books from
a library, banking, fitness and exercise, using smart household appliances
such as washing machines and microwaves, driving cars, and even dating
are all digital, and many are connected to the Internet. As a result, they
create a lot of usage and preference data. This data helps industries
understand user patterns and behaviors and thus create better user
experiences, which again generate data and value. A data ecosystem
operates on a continuous cycle in which we use data
to create more data and value. In this ever-changing technological
landscape, something can become obsolete quickly, while something
that was never a possibility can become feasible. Remember, there was
a time when data was mainly used by technology companies, and other
industries considered big data and analytics to be tech-centered buzzwords
that had nothing to do with them. Today, all companies, regardless
of industry, size, or geography, need to invest in understanding their
data to get ahead of the competition. Governments also collect data
for wider social and economic development. A variety of services,
like emergency and postal services, depend on accurate address
information. Denmark, for example, released its standardized and unified
address data to the public free of charge. This single Denmark Address
Register (Open Data) has an annual return of economic benefit that is
70 times its maintenance cost.3
Let’s talk about another example. What do you think of when you
hear the word farming? It might be acres of land, soil, seeds, crops,
pesticides, and so on. But the word data does not immediately come
to mind in connection with conventional farming. Conventional farming
practices have used pesticides and fertilizers, along with legacy knowledge and
gut feeling, to increase yield. But modern farming is augmenting decision
making with data. It’s a continuous cycle: collect data using field
sensors -> derive insights leading to value-driven farming -> create more data
-> repeat. Climate Corp is transforming the agricultural industry by
using detailed crop yield data, weather observations from one million
locations in the United States, and 14 terabytes of soil-quality data (all
free from the U.S. Government) to help farmers make informed decisions.
A company like Climate Corp is feasible because it uses extensive
open data.4

Traits of Data
Although there are several traits of data, some aspects such as quality,
relevance, and completeness stand out. I like to remember them as the 3 Ds of
data traits: Discover, Digest, Doable. Data is discoverable: you know it
exists and have a way to access it. Data is digestible: you can process
and understand the data as an organization. Data is doable: you can
act on it in meaningful ways that create business value. If you cannot access
the needed data or do not even know it exists, cannot understand or interpret
the data, or are unable to apply data to decision making, collecting
high-quality data is of no use. Hence the 3 Ds are critical traits for your
data journey.

Data as Goods or Resource

Unlike other revolutionary industrial goods, like coal, data is a
nonrivalrous good. That means that even when data is used for one purpose, its
quantity and efficacy are not depleted for other future uses.5

Variety of Data

The term “data” is loosely bandied about in a wide variety of contexts. In
the early years, data was mostly structured, consisting of numbers and values
stored in relational databases like SQL Server. In recent years, the
definition of data has continued to expand to encompass unstructured data
like audio, video, images, and data from connected devices and sensors. In
simple terms, structured data is organized in a table, while unstructured data is
distributed across files.
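
The difference between structured and unstructured data can be made concrete with a minimal Python sketch. Everything in it is an illustrative assumption rather than material from the book: the customer names, cities, amounts, and the helper functions `total_from_table` and `amounts_in_text` are hypothetical. The table can be queried by column name, while the free-text note first has to be searched for anything that looks like an amount.

```python
import csv
import io
import re

# Structured data: every record has the same named columns (hypothetical sample).
structured_csv = """customer,city,amount
Asha,Chennai,120.50
Ben,Austin,80.00
Asha,Chennai,35.25
"""

# Unstructured data: free text with no fixed layout (hypothetical sample).
unstructured_note = (
    "Asha called from Chennai about her two orders; she paid 120.50 first "
    "and then 35.25. Ben in Austin paid 80.00."
)


def total_from_table(text, customer):
    """Sum the 'amount' column for one customer by reading named columns."""
    rows = csv.DictReader(io.StringIO(text))
    return sum(float(row["amount"]) for row in rows if row["customer"] == customer)


def amounts_in_text(text):
    """Pull every number that looks like a price out of free text.

    The pattern cannot tell which amount belongs to which person,
    so all matches are returned in the order they appear.
    """
    return [float(match) for match in re.findall(r"\d+\.\d{2}", text)]


asha_total = total_from_table(structured_csv, "Asha")
found_amounts = amounts_in_text(unstructured_note)
print(asha_total)
print(found_amounts)
```

The structured version answers "how much did Asha spend?" directly. The unstructured version only recovers the numbers, and linking them to a person would need further interpretation, which is why unstructured data usually requires more processing before analysis.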

Types of Data

There are all kinds of data, including but not limited to the following:

• Self-produced or quantified self-data, like that gathered by
fitness devices like Fitbit.6
• Open access data.
• Personal data or Protected Health Information (PHI).
• Automatic data from sensors and the IoT.
• Internal company data.
• Transactional data, like purchases made from a website.

Decisions related to data privacy, storage, archiving, and acceptable
usage are all influenced by the type and variety of data being collected,
organized, or processed.

Big Data or Small Data?


Characteristics of Big and Small Data

In the past, all data was small data because there was not an enormous
variety or volume of data available. Today, there is so much hype around
“big data” (the possession of huge volumes of data) that it can be tough
to see beyond the hype. There is a perception that having more data is
better than less data and that value can be derived only when large
volumes of data exist. There is very little discussion about the benefits of
small data or scenarios in which small data beats big data. Having tons
of data does not mean all of it is equally valuable. Sometimes having small
data can be advantageous.
What does big or small mean in terms of data, anyway? Size is not the
only measure of big data. Data has many facets: the three Vs (volume,
velocity, variety), as well as several others:7, 8

• Volume: size of data;
• Velocity: speed at which data is generated and processed,
even approaching real time;
• Variety: structured, unstructured, or semistructured;
• Value: data’s potential to create value;
• Exhaustiveness: scope of data, for example, from limited to
certain groups to covering an entire population;
• Resolution: from coarse to as detailed as possible;
• Relationality: ability of data to conjoin different data
sets, from weak to strong relations;
• Flexibility: ability of data to accommodate addition of new
fields and scale quickly;
• Experimental: collected as part of research in
human-manageable ways, or machine generated and analyzed.

Different facets increase in strength as we move from small data to
big data:

[Figure: data facets, from small data to big data]

Small data focuses on specific attributes or parts of data sets. It is useful
in analyzing current situations, determining causation, and enhancing
understanding rather than prediction.9 Unlike small data, big data focuses on
crucial, enduring decisions, which can be predictive in nature.

Benefits of Big and Small Data

Big data does not hold the answers to all data-related problems. While
picking your approach related to data, think in terms of acquisition,
the resources you need to process the data, and storage and privacy costs.
The benefits of using the data should outweigh these associated costs. It
is not about choosing between big and small data, since use cases differ
for both of them. Both big and small data can coexist, and companies can
benefit from choosing the right volume of data for the problem at hand.
The goal is to strike a strategic balance, gaining insights while using the
fewest required resources to get there.
Since small data exists in human-manageable volumes, it translates
into something both experts and laymen can instantly understand and
into actions that are easy to deploy. Small data is well suited for research
and experimentation use cases. Small data helps to augment decision
making with minimal resources. Because the volume is low, it can be
processed in-house, which reduces negative externalities (external
entanglements relating to privacy and consent). Start with small data instead of
big data, as it helps to develop skills and ideas with more focus.10 Starting
with big data often emphasizes learning the technical skills rather than the
understanding that small data fosters.
Big data allows organizations to engage with their users in real time.
It also helps organizations make modifications and enhancements to their
products and services in line with their users’ sentiments, responses, and
comments. Big data helps a company personalize its products, which
can contribute to higher traffic and revenue streams. A common use case
of big data is fraud detection, for which there is a need to analyze
millions of transactions to identify patterns and determine areas with the
most fraud cases. American Express analyzed large volumes of data and
found a pattern: people who ran up large bills on their
American Express card and then registered a new forwarding address in
Florida were more likely to declare bankruptcy.11 Florida has some of
the most liberal bankruptcy laws in the United States, and people with
large debts were taking advantage of them. Identifying such correlations in the
data (customers with a high credit card balance and a relocation to Florida)
can help credit card companies proactively trigger an inquiry and/or
limit future credit increases. Data correlation (pulling data from various
sources to understand the relationship between them and determine a
valuable forward path) is an important benefit derived from big data. It
is also important to look out for unrelated or misleading correlations in big
data, where two things appear to be related to each other but are not.
If done correctly, however, correlations in big data can powerfully
predict how acting on one factor will modify or influence another.
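
To illustrate how a misleading correlation can arise, here is a minimal Python sketch. The monthly figures are invented for illustration and the helper names `pearson` and `differences` are hypothetical. The sketch computes the Pearson correlation coefficient between two unrelated series that both happen to grow over time, then repeats the calculation on month-to-month changes. Comparing changes instead of levels is one simple check among many, not a method described in this book.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def differences(values):
    """Month-to-month changes, which removes a shared upward trend."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Hypothetical monthly counts for two unrelated businesses that are both growing.
streaming_subscribers = [100, 112, 119, 133, 140, 152, 161, 170]
gym_signups = [50, 53, 61, 64, 72, 75, 83, 86]

raw_r = pearson(streaming_subscribers, gym_signups)
detrended_r = pearson(
    differences(streaming_subscribers), differences(gym_signups)
)
print(round(raw_r, 2), round(detrended_r, 2))
```

In this made-up data the raw correlation is close to 1 simply because both series rise over time, while the month-to-month changes do not move together at all. A high correlation between two data sets is therefore a prompt for further investigation, not proof of a relationship.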
Bias is all around us in our daily lives. The human brain as a whole can
process 11 million bits of information every second, although our conscious
minds handle only 40 to 50 bits of information per second.12 Our
brain takes shortcuts, leading to both unconscious and conscious bias by
grouping people and things into known types based, for example, on
attributes like gender, economic background, or sexuality. It is important to
look back and analyze how our actions reflect our biases
and how to break free of them. Data is no different when it comes to bias.
Like small data, big data is prone to bias, but due to its huge volumes, bias
in big data is less obvious. This does not mean that it doesn’t exist. So it
is important to prioritize high-quality, accurate, authentic, and bias-free
data, with a clear understanding of the data lineage of big data.

Tools and Techniques for Small and Big Data

Consider implementing tools and techniques in phases. In other
words, consider a data infrastructure that handles the current volume,
and continue to build on it as organizational data needs increase. If your
organization is small and its data volume is small, do not invest in setting
up Hadoop yet. This will avoid unnecessary operational headaches and
complexity that a small data organization does not need. At such low
volumes, data processing can be managed in-house in a relational database
like SQL Server or PostgreSQL. In addition, pick a business intelligence
tool like Power BI to unlock data for everyone in the company. Setting up
data infrastructure for big data organizations is complex, and every company
needs to assess what setup will be most advantageous for its unique
situation; it may be tough to handle this volume in-house. At a high
level, big data organizations can consider a data framework like Apache
Hadoop, along with a NoSQL (Not only SQL) database like MongoDB,
a real-time streaming tool like Kafka, a business intelligence tool, and so
on. These are just examples to help you get started. Big data is vast, and
there is a long list of tools tailored to the specific needs of an organization.
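
To give a feel for how little infrastructure small data needs, here is a minimal sketch using Python's built-in `sqlite3` module. SQLite is used here only as a stand-in, by assumption, for an in-house relational database such as SQL Server or PostgreSQL, and the `orders` table, its columns, and its values are hypothetical.

```python
import sqlite3

# In-memory database as a stand-in for a small in-house relational database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("North", 120.0), ("South", 80.0), ("North", 40.0), ("West", 60.0)],
)

# One SQL query answers a typical reporting question: revenue per region.
revenue_by_region = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
conn.close()
print(revenue_by_region)
```

A query like this is the kind of result a business intelligence tool would display as a chart. At small volumes, a single database and a handful of queries can cover many reporting needs before any big data tooling is required.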

Challenges Maneuvering the Data Landscape


There was a time when data was mainly created and collected on a small
scale, primarily for the operation of companies. We as a society have come
a long way since then. Over the past decade, the data landscape has undergone
a transformation not just in terms of an increase in data volume but also
in terms of customer expectations. Data ecosystem additions include
unstructured social media data, enhanced privacy needs, technological
improvements, and expectations of close-to-real-time insights. With the
evolution of peer-to-peer engagement channels, more people are going
online to seek connections with friends and strangers alike. Peer-to-peer
engagement channels like Facebook, Twitter, and TikTok generate large
amounts of data daily. No consumer-based company can ignore this
social data and base its strategy on internal data alone.
As wearables and smart devices continue to dominate the market,
self-created data is growing further. The pandemic has also forced the world
to go online for almost everything, even goods and services previously
considered impervious to online influence. One major shift was people
buying groceries online instead of visiting stores.13 This has ignited another
big data surge.
Companies are required to compete in this rapidly changing data
landscape, where social media users write about many topics and the
perception of a product exercises a strong social influence. This forces
the development of new tools that recognize this fluid landscape and
somehow make sense of the gargantuan daily data dump. For companies
to succeed in this ever-changing data landscape, they must maximize
their ability to effectively collect, analyze, store, and secure data, and to
innovate and improve efficiencies. These are a few foundational questions
to ask:

• Infrastructure (the what/where/how of data): What types of
data should I collect? Where will I store it? How much data
will I have? How long will I retain it?
• Privacy: How will I secure and protect data?
• Analysis: How will I use this data for decision making? What
kinds of data lineage will benefit me? How frequently will I
need insights (for example, in real time, or once a month)?
What is the main purpose of my analysis (for example, to
improve operational efficiencies, innovate, understand
customer experience, or something else)?

Coping With the Data Boom

We live in a world that is being transformed by datafication. Datafication
is a technological trend turning many aspects of our life into data, which
is subsequently transformed into information realized as a new form of
value.14 The primary barrier to advantageously utilizing the data boom
is a lack of understanding about how to apply analytics to improve business
or create value. Datafication can help with value creation and can be
broken into three concepts: dematerialization, liquification, and density.15
“Dematerialization” separates information from the physical world,
which increases its “liquidity” (its ability to move freely) and thereby
increases “density,” or the value created.
In simple terms, we live increasingly in a data world with more data
than ever before, and there are new ways of using data that create
completely new value streams. Let us take Netflix as an example of datafication.
Netflix is a subscription-based streaming service in over 190 countries.16 It
can be easy to forget that in the beginning, Netflix was a mail-order DVD
delivery business in which subscribers could add and maintain a list
of movies they wanted to rent. The list was ordered, and when a DVD
was returned, Netflix would mail out the next DVD in the list. Although
there was a limit to how many DVDs one customer could borrow at a
time, the list could be as long as they wanted. It was a customer-driven
process: the customer initiated the creation of the list and also managed it
by adding and deleting movies they wished to watch.
This model has changed and become smarter. Netflix’s proactive
recommendations algorithm removes its dependence on the customer to
add movies they want to watch. Of course, Netflix has entered new markets
and countries since its mail-in DVD days, but it has also used its
big data analytics capabilities to better understand content abandonment
(the point at which the customer turned off a show), preferred devices for
viewing, and many other metrics.
Netflix created value through datafication:

• Dematerialization = Move away from physical DVDs.
• Liquidity = Streaming allowed for free movement.
• Density = Create and increase value by evolving from streaming
only to content production.

Big data means a lot of data, but big does not always mean better or
improved. Big data presents an ocean filled equally with opportunities
and challenges, and it is up to the organization to sink or swim. Big data
requires advanced tools and techniques, which makes it difficult to understand,
organize, and process. In addition, some organizations lack the
funding or infrastructure to embark on such a resource-intensive
journey, not to mention limited or nonexistent visualization tools or
technical experts. The recent explosion of data volume amplifies this
issue, but it should not discourage organizations from starting their data
endeavor. One way of coping with this data boom is to start with small
data, processing a manageable volume of information through simple
and widely available open-source tools and gradually progressing toward
large-scale big data initiatives.

Modern Data Stack

When organizations had only small volumes of data, database storage
solutions were sufficient. Organizations now find it impossible to
store this surge of data in the same way, and they are also aware that
in-house data storage is vulnerable to data breaches and hacking. Organizations
are turning to cloud-based technology to cope with growing storage needs and to
avoid the risk of falling behind the competition. They are adopting
cloud solutions that do not require them to install software on their own
premises or servers. This also leads to cost savings, since they don’t have
to purchase or maintain physical hardware to use the solution. When the
customer base grows rapidly, cloud solutions are also helpful because they
are quickly scalable. In the past, organizations needed to buy more servers
as their data grew, or else they found themselves stuck with unused servers
as a result of poor forecasting. With cloud storage, organizations can
expand or reduce their resources as demand changes. Cloud solutions also
offer better accessibility and latency. Organizations need not worry about
enduring downtime or having to head back on-site to deal with a data
issue, because cloud technologies allow them to do everything remotely.
Cloud migration has changed the way businesses operate by making the
benefits of big data available even to organizations that lack the
physical space, expertise, or large upfront budget. Big data
is no longer the preserve of only large organizations, but it still poses
adoption and implementation challenges for smaller businesses.
To survive this data boom, organizations need both cloud-based solutions
to store this enormous amount of data and a strong data strategy.
In other words, they need to develop their advanced analytics capability
by defining how they plan to use their data and what kinds of insights
will benefit them in the short and long term. Without a strong data
strategy, there is no definitive way of using the data, and collecting it
is useless. In my experience, organizations struggle to effectively manage
and analyze data in the cloud, simply because it is tough to adapt to new
tools and methods.

Cognizance of Data Privacy

Privacy is tough to define. It means different things to different people
and in different countries. In addition, the definition of privacy changes
with every advancement in technology. Recently, privacy violations
have been in the news, appearing in headlines like “Austrian Website’s Use
of Google Analytics Found to Breach GDPR” and “France says Google
Analytics Breaches GDPR When It Sends Data to U.S.”17
A decade ago, privacy was defined simply in terms of what personal
information you disclosed on a website or form. Although there are vari-
ous definitions, privacy can be defined in relation to boundaries between
the self and others, between private and shared spaces, or even in wholly
public forums.18, 19
Now with advanced analytics, businesses track users’ digital behavior
and preferences, often with limited or no knowledge on the part of the
user. They perform data tracking in several ways, for example by setting
opt-out policies (in which the user is opted in by default and must explic-
itly choose to opt out), tracking cookies, and using online advertising and
third-party apps.
On one side, organizations are increasingly applying advanced analyt-
ics concepts to better understand user actions. On the other side, users are
increasingly aware of just how much their personal information is driving
the next generation of products. They are becoming more cautious about
what they share and are more likely to question how it may be used.
High-profile data breaches and privacy scandals are increasing our aware-
ness of data privacy issues, and customers are looking for ways to protect
themselves. Among users, there is a general decline in trust and increase
in anxiety over data privacy. Most Americans feel they have lost control
of how much personal information is collected, and feel the government
should regulate data collection.20
This increased demand for better protection of personal data is push-
ing governments across the globe to roll out new privacy laws and update
existing ones. Privacy laws are slowly catching up with rapid technological
growth. There are several privacy laws and initiatives to control
and protect privacy, such as the Health Insurance Portability and
Accountability Act (HIPAA), the General Data Protection Regulation (GDPR),
Federal Trade Commission (FTC) rules, the Payment Card Industry
(PCI) standards, the Children’s Online Privacy Protection Rule (COPPA), and
the California Consumer Privacy Act (CCPA), to name a few. Privacy
laws differ geographically, and organizations must comply with those
laws related to their data handling. Also, some privacy laws are industry
specific: HIPAA is health care–related and PCI governs payment card
data. Navigating different privacy laws is not easy, and ongoing changes
to various laws make it trickier. There are also sometimes fundamental
differences between these laws. For example, GDPR requires entities to
gain user consent via opt-in, while the CCPA requires entities to provide
only an opt-out option. Organizations with global locations, handling
transborder flows of data, need to understand and comply with even more
of these regulations.
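
The opt-in versus opt-out distinction above comes down to what happens when a user has made no choice at all. The following is a minimal Python sketch of that default behavior only. The names ConsentModel and may_process are hypothetical, and treating GDPR as purely opt-in and the CCPA as purely opt-out follows the simplified description in the text rather than the full requirements of either law.

```python
from enum import Enum
from typing import Optional


class ConsentModel(Enum):
    """Hypothetical labels for the two consent styles described in the text."""
    OPT_IN = "opt-in"    # the text's description of GDPR
    OPT_OUT = "opt-out"  # the text's description of the CCPA


def may_process(model: ConsentModel, user_choice: Optional[bool]) -> bool:
    """Decide whether a user's data may be processed.

    user_choice is True if the user agreed, False if the user declined,
    and None if the user has not responded at all.
    """
    if user_choice is not None:
        # An explicit answer always wins, whichever model applies.
        return user_choice
    # No answer yet: only the opt-out model treats silence as permission.
    return model is ConsentModel.OPT_OUT


# A user who has not responded is handled differently under each model.
print(may_process(ConsentModel.OPT_IN, None))
print(may_process(ConsentModel.OPT_OUT, None))
```

The only difference between the two models in this sketch is the treatment of silence, which is why an organization operating under both kinds of law cannot simply reuse one default everywhere.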
Data solutioning efforts should place utmost importance on issues of
privacy. And since privacy is an evolving area undergoing constant change,
companies should have mechanisms in place to run data audits and
identify problems proactively. De-identification requirements vary with
the type(s) of data the company holds and need to be considered.
For example, health care regulations protect identifiable patient health
data (also known as Protected Health Information, or PHI), and so
organizations need to implement a de-identification process to protect
patients’ personal information like name, date of birth, and so on.
Additional measures should also be in place to scan for stray PHI, train
staff, and encrypt databases.
Details matter when protecting data systems against breaches of
privacy. For example, age by itself is not considered PHI. A report on
someone aged 25 cannot by itself be used to identify a person. But in
a less-populated zip code, it would be much easier to identify a patient
cited as a 98-year-old. As a result, PHI de-identification rules require any
age over 89 years to be aggregated into a single 90-plus age group instead
of being reported as an exact age. These are the kinds of details to
consider when protecting the privacy of users.
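
The age rule described above is simple enough to sketch in a few lines of Python. This is an illustration under stated assumptions, not a complete de-identification process: the field names (name, date_of_birth, age, visit_type) and the sample record are made up, and a real process would cover many more identifiers than the two the text mentions.

```python
from typing import Any, Dict

# Hypothetical field names for the direct identifiers mentioned in the text.
DIRECT_IDENTIFIERS = ("name", "date_of_birth")
AGE_CAP = 89  # ages above this are grouped together, as described in the text


def generalize_age(age: int) -> str:
    """Return the exact age as text, or "90+" for any age over 89."""
    if age < 0:
        raise ValueError("age cannot be negative")
    return "90+" if age > AGE_CAP else str(age)


def deidentify(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the record without direct identifiers and with age grouped."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "age" in cleaned:
        cleaned["age"] = generalize_age(cleaned["age"])
    return cleaned


# Made-up sample record, for illustration only.
patient = {
    "name": "Jane Doe",
    "date_of_birth": "1925-03-14",
    "age": 98,
    "visit_type": "checkup",
}
print(deidentify(patient))  # {'age': '90+', 'visit_type': 'checkup'}
```

A 25-year-old would keep an exact age in the output, while the 98-year-old from the example in the text is reported only as 90+.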

There are other new trends and interesting areas of discussion related
to privacy:

• There are now efforts to allow people to donate their health care
data for research. CMS Blue Button is a government initiative that
allows Medicare beneficiaries to share their Medicare data with
third-party applications, doctors, research programs, and more. It
also gives beneficiaries and their caregivers more options and
control over their claims data. Donating data to science in this
way is intended to improve health care for the betterment of society.21
• A much-debated topic is the data dividend model, in which
you can “sell” your user data. But the question of whether
privacy should be treated as a commodity, and who decides the
price, is a contentious one. There are already ad-supported
streaming services that cost less to join. These customers are
exchanging their behavioral data for a slightly lower service
fee. And companies like Datacoup purchase data directly from
the individual in exchange for cash, discounts, or cryptocurrency.22
Placing control of personal data management into the
hands of the individual is known as the personal data economy.23
• The alternative to the personal data economy, of course,
is making users pay for their privacy. This model raises the
question: How important is privacy to the individual, and
how much are they willing to pay to protect and secure it?

The data ecosystem is changing rapidly, not just in terms of volume
but also in areas of privacy regulations, people’s awareness around
data, and technological advancements. As the saying goes, “You
can lead a horse to water, but you can’t make it drink.” Similarly, companies
can gain all the data-related knowledge and tools, but if they fail to apply
their business expertise to supplement data technologies, there will be no
outcome. Organizations that anticipate the changes, apply their proprietary
business expertise along with data, and plan ahead are the only ones
that can survive this data boom.