
Data Visualization Aurora

A Bit of History!
The history of visualizing data dates back to the 18th and 19th centuries, a period
filled with patriotic wars.

One of the most renowned of these wars was Napoleon's invasion of Russia in 1812.
It is considered one of the most significant campaigns of its time, involving
hundreds of thousands of soldiers and enormous casualties.
Charles Joseph Minard recorded Napoleon's march and retreat in one of the most
celebrated visualizations ever drawn.

Let us retreat from history and march towards the concept of Data Visualization!

Data and Visualization


In the digital era, data went from scarce and expensive to abundant, cheap, and
very difficult to process.

And, if you are not aware of Data Science concepts, data can remain obscure.
That is where Data Visualization comes into the picture to rescue us.

They say knowledge is power. But how do we make knowledge powerful, especially when
that knowledge comes in the form of data, lots of data? How do we find the meaning, tell
the story, share the story? Infographics: where data needs design. What makes good data
visualization? Take a few seconds to count the sevens in this number set. How many were
there? Not sure? Now try it. A simple color change makes comprehension almost instant.
Color is one of several pre-attentive attributes, like size, orientation, and flicker:
visual cues that the human brain processes within 250 milliseconds. Now imagine that
we're not looking for specific numbers, but patterns. We can use color to show
correlation, size to show quantity, or orientation to show trends. And not just data: the
power of design can be used to better communicate all sorts of information: processes,
hierarchy, anatomy, and chronology. Better communication through innovation, because
your message is only as good as your ability to share it.
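
The "count the sevens" exercise from the video can be tried in a terminal. The sketch below is only an illustration: the number set is made up, the helper name `highlight` is hypothetical, and it assumes a terminal that renders ANSI color codes.

```python
# ANSI escape codes; assumes a terminal that renders them.
RED = "\033[31m"
RESET = "\033[0m"


def highlight(digits: str, target: str = "7") -> str:
    """Wrap every occurrence of `target` in red so it stands out by color alone."""
    return digits.replace(target, f"{RED}{target}{RESET}")


# Made-up number set, purely for illustration.
number_set = "9871265748930217465"

if __name__ == "__main__":
    print(number_set)             # plain: the sevens must be hunted one by one
    print(highlight(number_set))  # colored: the sevens are marked by color
    print("sevens:", number_set.count("7"))
```

Only one attribute (color) is changed here, which is the point of the exercise: the digits and their order stay the same.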

Fact
Edward Tufte considers Charles Joseph Minard's map of Napoleon's march to be
possibly the best statistical graphic ever drawn.

Data Visualization

The glorious value of a picture is when it stimulates us to notice what we never
expected to see.

Definition

Data Visualization takes raw data and transforms it into charts, graphs, and
images that explain the numbers clearly, so that insights can be gained from them.

EDWARD TUFTE (VOICEOVER): In the arrangement of visualization, every single
pixel should testify directly to content. As Jony Ive, the great Apple designer, said,
we spend most of our time getting design out of the way. It's got to get out of the way,
because it's about the relationship with the viewer and how they reason about the
content. Style and aesthetics cannot rescue failed content. If the words aren't truthful,
the finest optically letter-spaced typography won't turn lies into truths. There are
enormously beautiful visualizations, but it's as a byproduct of the truth and the
goodness of the information. The big steps in showing information began all with
cartography, about 6,000 years ago, when the first map was scratched into a piece of
stone. And that has wound up now with the most widely seen visualization in the world,
which is Google Maps, where people are using a visualization to actually do something.
The next big step was development of real science. Galileo got his telescope going. He
saw things that have never been seen before. He made beautiful drawings of sunspots,
and he'd watched the sun for about 40 days, and he did engravings of the sunspots. So
he visualized what he saw. And so the history of visualizing data is, very substantially,
a history of science.

JULIE STEELE (VOICEOVER): Data visualization is not just some
airy fairy, creative process, but it's actually a very linear process of decision making
that you can do based on some few basic principles. Three things should inform your
design always. One is you, as the designer. What you have to say and what you want
to communicate. Two is the reader. That reader is not you, and they're going to come
with their own context, and their own biases, and their own assumptions, and you need
to account for that. And third is the data itself, and what that has to say, and how that
informs the truth. There's a lot of subconscious brain activity happening. We evolved
for it to happen that way. We evolved to see things and make snap decisions. Are all
those lines in the graph just dried grass, or is that a tiger that's coming to eat you?
[GROWLING] We have to be able to recognize those patterns right away and make
snap decisions on them in order to survive. And that can be an advantage as a
designer. You can communicate a lot of information very quickly, because we all have
brains that are designed to recognize patterns this way. But also, there's the emotional
impact. We react to design, and to art, and to the aesthetics of a piece, just as much as
we react to the information contained in it. And so if you want to change someone's
mind, if you want to change someone's behavior, sometimes presenting the information
in a visual format is the fastest way to get them to engage with that information.

JOSH SMITH (VOICEOVER): Truth is one of those ambiguous things that you can't really
define, and probably changes and evolves, the more understanding you have. Data
itself is a result of research. So I would say that data is just a clue to the end truth. I
think a successful infographic tells a story. It communicates, hopefully, accurate and
sometimes complicated data in a way that many people can understand. I think the first
step, usually, is always dig really deeply into the data ourselves, and find each key
point, and create a hierarchy, and a narrative out of that story. When you start to merge
different pieces of information, and when you start to learn really what it's all saying,
the narrative is clear. The one key fact that everything can revolve around, it's the hero
of the piece. There's one single piece of data or insight that people respond to and kind
of encapsulates the whole vision. And then invite people in to see the nuances and all
of the rest of the story around it. When you look at a piece, it's successful when it
translates data from something that's complicated to something simple. When it
communicates a message that otherwise would have taken somebody hours to
digest and find, in an instant.

JER THORP (VOICEOVER): My deepest interest lies in
the boundary between data and culture. Data are measurements of something. In very
many cases, the somethings that we're talking about are human systems. We're
dealing with data systems that are larger than anything that humans have ever built or
experienced before. And these really large systems, things happen within them that are
emerging. For example, Gate Change combined shot footage from airports, for pretty
much every airport in the world, and then air travel data as well. So the central idea
was to show people that, every time that you're in an airport, you are standing on the
surface of a system that is almost too complex to comprehend. Any given time, there
are more than a million people in the air. And so there's another purpose of data
visualization. There's revelation, which is, show us something that we've never seen
before. This is, for me, much more exciting. Anybody can visualize data in Excel and
see some bar charts. For me, it's about showing them something in this kind of loose
narrative frame that they can interpret. So we show them some pieces of the picture,
and the idea is that they can sort of stand back from that and watch it pass for a little
bit, and come out of it with some deeper understanding. Part of it is leaving it open to
interpretation, but part of it is also not really knowing. I don't have some masterful
understanding of this system that you don't. I have some ideas about how these
systems might be changing, and how they might be growing, and how they may be
important toward culture and society, and I want to share some of those ideas with you.
And maybe you can put together something that I wouldn't have been able to put
together.

EDWARD TUFTE (VOICEOVER): I think in general, audiences are a lot
smarter than a lot of people think. So it's not know your audience, it's respect your
audience, and really know your content. That's what you should be knowing and
reasoning about. Look after truth and goodness, and beauty will look after herself. You
want to see to learn something, not to confirm something. We usually see to confirm
things. It's very economizing for the brain. How can we see not to confirm, but to see to
learn? [MUSIC PLAYING]

Prelude
Every phenomenon in the world abides by standards of its own. It is time to
parade towards the Principles of Design.

Elements
More than inspiration, it is necessary to understand the basics of the subject
to create a beautiful design. Before advancing to the principles, we shall halt to
brace ourselves by learning the seven Elements of Design.

Elements of Design

 Line
 Color
 Shape
 Texture
 Value
 Space
 Size

To Know More!
Watch this video to know more on Elements of Design!

This is an auto-generated transcript

Hey guys, Mr. Gratzel here. Today we're going to get into the fundamentals of graphic
design. Before we can draw and hit the computer, we need to understand how to
compose your design, and it all starts with the elements of design. Elements of design
are the building blocks of everything we do in art, from drawing to painting to graphic
design, even to film.

The first element of design we're going to look at is line. Line is a continuous mark
made on a surface by a moving point. Lines can be used in a variety of ways: they can be
thick or thin, close together or far apart; they can be wavy, they can be jagged, they
can be diagonal, or they can be horizontal and vertical. They can even be wavy and move
around quite a bit.

Next is the element of shape. Shapes are an enclosed space, and it's limited to two
dimensions, so it's flat in nature. There are two types of shapes that we're going to
talk about specifically. There's geometric shapes, which are your triangles, squares,
circles, things you see in your math class or geometry class, hence geometric. The other
type are organic shapes. They're more natural, they're freeform; you see them in nature:
the shape of a leaf, the shape of a saguaro cactus, even the silhouette of a figure. All
of these are organic and natural; they're just kind of freeform shapes.

The third element we're going to talk about is space. Space is the distance or area
between, around, or within things that you see in your design. In design, it can refer
to the white space or empty space around an object. This is very important when it comes
to written design, brochures, and even posters, and it can play a lot in the principles
of design, which we'll talk about later, specifically on emphasis. Also, the positive or
negative space is really important as well. Positive space is the actual object itself;
the negative space is the shape that's created from the open spaces, either between two
objects or cut out through a shape.

The next element of design is texture. Texture refers to the feel of an object. Now, in
design you can't really actually feel it, but it gives the illusion of some kind of a
texture. Texture can be created using pattern designs, so repeating shapes, and it gives
the illusion of a texture, or it can actually mimic real texture. Like if you wanted to
make something look shiny, you would show the white little shine on a balloon, maybe
show some shapes that are darker to show a shadow that's there, so it can make it look
3D and realistic.

And the last element we're going to talk about today is color. Color is the amount of
light that's reflected or refracted off of an object, so light plays a really important
role in color. Color incorporates the hue, the saturation, and the value range that we
see in colors. The color wheel is a valuable tool to help you decide what colors to use
and which are going to be the most effective. There are different groupings of color
that can be used. The first is analogous colors. Analogous colors are 2 to 4 colors that
are next to each other on the color wheel. They can be anything having a warm feeling or
cool feeling. You might understand some of these colors: orange and red and yellow kind
of have a warm feel, like fall, or if you think of hot, you think of red. They can also
have a very cool or calming feeling, like blue and green and purple. Those colors are
all next to each other. You think of snow and it's kind of a bluish tint to it, the
shadow, or the cold water on your faucet, which is blue.

Complementary colors are those colors that are across from each other on the color
wheel. Orange and blue is an example of colors that are across from each other, purple
and yellow, and green and red. So when you think of those, you think of Christmas with
green and red, and you see a lot of those color combinations in logo designs and even
sports teams, because they work really well together and they grab people's attention.
Monochromatic is another form of color combination, and that's one single color, but
it's the whole value range, so it's adding white and black into that one color to create
a variety of mid tones throughout.

Well, that's all the elements of design, but in the next session we're going to talk
about how to use those elements to compose your work of art through principles of
design. Thanks for watching, guys.
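
The color groupings described in the transcript can be approximated in code by rotating a hue around a color wheel. This is a rough sketch under stated assumptions: it uses the HSV wheel from Python's standard `colorsys` module, whose opposite pairs (for example red and cyan) differ from the painter's wheel used in the video (orange and blue, green and red). The function names and the 30-degree step are hypothetical choices, and `monochromatic` only darkens the color (adds black), not white.

```python
import colorsys


def rotate_hue(rgb, degrees):
    """Rotate an (r, g, b) color, components in 0..1, around the HSV hue wheel."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    h = (h + degrees / 360.0) % 1.0
    return colorsys.hsv_to_rgb(h, s, v)


def complementary(rgb):
    """Color directly across the HSV wheel (180 degrees away)."""
    return rotate_hue(rgb, 180)


def analogous(rgb, step=30):
    """Three neighbouring colors on the wheel; `step` is an assumed spacing."""
    return [rotate_hue(rgb, -step), rgb, rotate_hue(rgb, step)]


def monochromatic(rgb, levels=5):
    """Same hue and saturation at `levels` different values (darker to lighter)."""
    h, s, _ = colorsys.rgb_to_hsv(*rgb)
    return [colorsys.hsv_to_rgb(h, s, (i + 1) / levels) for i in range(levels)]


if __name__ == "__main__":
    red = (1.0, 0.0, 0.0)
    print("complementary:", complementary(red))
    print("analogous:", analogous(red))
    print("monochromatic:", monochromatic(red))
```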

Principles of Design
Balance

It is essential to balance the visual elements of the design.

Types of Balance

1. Symmetrical
2. Asymmetrical
3. Radial

Emphasis
Important data can be emphasized with color, size, or contrast to draw the
attention of users.

Movement

 The movement principle helps draw the user's focus in a certain direction.
 It is implemented in animation and interactive services.

Smart Use of Patterns

 Patterns are formed when the objects are repeated.


 It helps in displaying the objects that are similar to each other.

Proportion
 Deals with the size of the object.
 It indicates the weights of different datasets and the relationship between
values.

Example:

 Assume a scenario in which you are told to draw a bird sitting on a tree.
You will draw the tree prominently and the bird smaller in size.
 In a Pie Chart Visualization, the slice representing 50% will be visibly
larger than the slice representing 30%.

Proper Rhythm

 Closely associated with movement.


 Movement of visual elements must be pleasing to the eyes, for it to be
called a proper Rhythm.

Variety

 A critical factor to keep the users fascinated.


 More variety in the visualization can increase the amount of information
that can be remembered by the user.

Theme
A unified theme assures the harmony of the design.

Prelude
Edward Tufte, a statistician and design theorist well known for books such as
The Visual Display of Quantitative Information, has stated six principles. Let us
unveil them in this topic.

Edward Tufte's Principles


The six principles stated by Edward Tufte for an effective Data Visualization are
as follows:

1. Graphical Integrity
2. Maximize Data-Ink
3. Avoid Chart Junk
4. Aim for High Data Density
5. Use Classic Design Solutions
6. Apply Aesthetics and Techniques

1. Graphical Integrity
Definition
Visual representations of data must convey the truth.

Measure
\text{Lie Factor} = \dfrac{\text{Size of Effect shown in Graphic}}{\text{Size of Effect in Data}}
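
As a worked illustration of the measure above, the sketch below computes a Lie Factor from two data values and the two sizes used to draw them. It assumes that "size of effect" means the relative change between the first and second value; the function names and the sample numbers are illustrative assumptions, not part of the course material.

```python
def effect_size(first: float, second: float) -> float:
    """Relative change from `first` to `second` (assumed meaning of 'size of effect')."""
    if first == 0:
        raise ValueError("first value must be non-zero")
    return abs(second - first) / abs(first)


def lie_factor(graphic_first, graphic_second, data_first, data_second) -> float:
    """Size of effect shown in the graphic divided by size of effect in the data."""
    return effect_size(graphic_first, graphic_second) / effect_size(data_first, data_second)


if __name__ == "__main__":
    # Illustrative numbers: the data rises from 18 to 27.5 units, while the
    # drawn lines grow from 0.6 to 5.3 inches.
    print(round(lie_factor(0.6, 5.3, 18, 27.5), 1))
```

A value close to 1 means the graphic represents the data faithfully; values far above or below 1 indicate exaggeration or understatement.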

Principles of Graphical Integrity

1. Representation of numbers, as physically measured on the surface of the
graph itself, should be directly proportional to the numerical quantities
represented.
2. Clear, detailed, and thorough labeling should be used to defeat graphical
distortion and ambiguity.
o Write the explanations of the data on the graph itself.
o Label important events in the data.
3. Show data variation, not design variation.

4. In time-series displays of money, deflated and standardized units of
monetary measurement are nearly always better than the nominal units.
5. The number of information-carrying dimensions depicted should not
exceed the number of dimensions in the data.
o Graphics must not quote data out of context.
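
Principle 4 above can be sketched in a few lines: nominal amounts are converted to constant units by dividing out a price index. The function name `deflate`, the base year, and all numbers below are made up for illustration; they are not real statistics.

```python
def deflate(nominal, price_index, base_year):
    """Convert nominal amounts to constant base-year units.

    real[year] = nominal[year] * index[base_year] / index[year]
    """
    base = price_index[base_year]
    return {year: amount * base / price_index[year] for year, amount in nominal.items()}


# Made-up values purely for illustration (not real statistics).
nominal_budget = {2000: 100.0, 2010: 130.0, 2020: 160.0}
price_index = {2000: 100.0, 2010: 125.0, 2020: 150.0}

real_budget = deflate(nominal_budget, price_index, base_year=2000)

if __name__ == "__main__":
    for year in sorted(real_budget):
        print(year, nominal_budget[year], round(real_budget[year], 1))
```

Plotting the deflated series instead of the nominal one keeps price inflation from being mistaken for real growth.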

2. Maximize Data-Ink
Definition

 Data-ink is the ink on a graph that represents data.
 Good graphical representations maximize data-ink and erase as much
non-data-ink as possible.

Measure

\text{Data-Ink Ratio} = \dfrac{\text{Data-Ink}}{\text{Total Ink used in Graph}}

It is equivalent to 1 minus the proportion of the graph that can be erased
without loss of data-information.

An electroencephalogram has a very high data-ink ratio of 1.
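
The ratio and its "1 minus" form can be checked with a short sketch. How ink is measured is an assumption here (for example, counts of drawn pixels); the function names are hypothetical.

```python
def data_ink_ratio(data_ink: float, total_ink: float) -> float:
    """Share of a graph's ink that represents data (both in the same unit, e.g. pixels)."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must lie between 0 and total_ink")
    return data_ink / total_ink


def erasable_proportion(data_ink: float, total_ink: float) -> float:
    """Proportion of the graph that could be erased without losing data-information."""
    return 1.0 - data_ink_ratio(data_ink, total_ink)


if __name__ == "__main__":
    # Assumed example: 30 of 120 ink units carry data.
    print(data_ink_ratio(30, 120), erasable_proportion(30, 120))
```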

Principles of Data-Ink

1. Above all else show data
2. Maximize the data-ink ratio
3. Erase non-data-ink
4. Erase redundant data-ink
5. Revise and edit

3. Avoid Chart Junk
Definition

Chart junk is the excessive and unnecessary use of graphical effects in graphs:
effects that are not needed to comprehend the information and that distract the
viewer's attention.

The term chartjunk was coined by Edward Tufte.

4. Aim for High Data Density


Definition
The proportion of the total size of the graph that is dedicated to displaying data.

Shrink Principle

Maximize data density and the size of the data matrix, within reason. One way to
attain this is the Shrink Principle:

Most graphs can be shrunk way down without losing information.

The human eye cannot judge circular distances as accurately as linear distances.

5. Use Classic Design Solutions
Classic Design Solutions

 Small Multiples: A series of small graphs of the same design repeated in
one visual.
 Sparklines: Data-intense, simply designed, word-sized graphics.
 Time Series: Graphs with time along one dimension, usually the horizontal
axis, showing variation as time proceeds.
 Micro/Macro Composition: An approach where the visualization contains
enormous detail, yet an overall pattern emerges.
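
To make the idea of a word-sized graphic concrete, here is a minimal sketch of a text sparkline. It maps each value to one of eight Unicode block characters; the function name and the scaling rule are assumptions chosen for illustration, not a standard implementation.

```python
BLOCKS = "▁▂▃▄▅▆▇█"  # eight bar heights, lowest to highest


def sparkline(values):
    """Render a sequence of numbers as a one-line, word-sized chart."""
    if not values:
        return ""
    lo, hi = min(values), max(values)
    if hi == lo:
        return BLOCKS[0] * len(values)  # flat series: draw a flat line
    scale = (len(BLOCKS) - 1) / (hi - lo)
    return "".join(BLOCKS[round((v - lo) * scale)] for v in values)


if __name__ == "__main__":
    # Made-up series for illustration.
    print("weekly sales", sparkline([3, 1, 4, 1, 5, 9, 2, 6]), "latest: 6")
```

Because the result is plain text, it can sit inline with words or table cells, which is what distinguishes a sparkline from a full-sized chart.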

6. Apply Aesthetics and Techniques


Principles

1. Have a properly chosen format and design
2. Reflect balance, proportion, and a sense of relevant scale
3. Display an accessible complexity of detail
4. Have a narrative quality, a story to tell about the data
5. Draw elements in a professional manner
6. Avoid content-free decoration, including chartjunk

Gestalt Psychology
The principles of Gestalt Psychology will provide techniques that could be used
in the design.

So now we are going to discuss eight Gestalt principles in detail. I'm going to present
these eight principles with a lot of examples, so it should be easier for you to
understand what they are and what they mean. These eight principles are proximity,
similarity, closure, symmetry, common fate, continuity, good Gestalt, and figure-ground.

So let's start with proximity. For example, you can see here that all the dots are
equidistant, so you think that this is actually a single object. But on the right-hand
side you can see that these dots are divided into three parts: the first part is the
first and second columns of dots, the second part is the third and fourth columns, and
the third part is the fifth and sixth columns. What we learn from this is that our mind
perceives objects which are closer to each other as forming a group. It may not be a
group, but we think of it as a group because of the proximity, or the distance, they are
at. In summary, the law of proximity states that when an individual perceives objects,
they perceive objects that are closer to each other as forming a group.

Now let's move on to the law of similarity. This law states that if you perceive objects
which are similar to each other, then you think that they are actually grouped together.
For example, in this diagram you can see that there are black circles and white circles.
You can see a horizontal line forming from the black circles and a horizontal line
forming from the white circles. You will not see the vertical lines, because then you
would have the pattern white, black, white, black, which is not similar objects. But all
the white circles and all the black circles, as they are similar, seem to form a group.
So this is the law of similarity.

Moving on to the next law, which is the law of closure. To understand the law of closure,
look at the image on the right-hand side: there are three circles which seem to form a
white triangle, but in reality there is no white triangle. This is something which we
perceive because they are positioned in a certain way, so our perception is actually
filling the visual gap. And the triangle in the background, which we think of as a
triangle, is not a triangle at all: it is actually three arcs which are just positioned
like the three vertices of a triangle. We think that the white triangle is on top of the
background triangle, so we are not able to see the entire triangle, but in reality we
have only three arcs and three circles. In the picture on the left-hand side, too, there
is no circle or rectangle actually drawn, but we perceive that there is a circle and a
rectangle. To summarize, the law of closure says that an individual perceives objects
such as shapes, letters, pictures, etc. as being whole when they are really not.

So, moving on to the law of symmetry. What you see here is a pair of square brackets
first, then a pair of curly brackets, and then again a pair of square brackets, so we
tend to observe three pairs of symmetrical brackets rather than six individual brackets.
It is because we are seeing two brackets as a whole and not as individual entities.
Basically, the law of symmetry states that the mind perceives objects as being
symmetrical and forming around a central point. It is perceptually pleasing to divide
objects into an even number of symmetrical parts. So that's a very interesting fact to
know: our mind is actually very happy to divide objects into an even number of
symmetrical parts.

Now, moving on to the law of common fate. To understand the law of common fate, I have
drawn a few circles. If I do a little animation here, you can see a few of the circles
move to the right-hand side. What you will think is that those circles which moved to
the right-hand side are actually a group, but in reality they may or may not be. Let me
do that again: I come back to the left, and then I go back to the right again. So the
law of common fate tells us that if multiple objects are moving along a single line or a
single path, then we perceive that they are in a group.

Moving on to the next law, the law of continuity. The law of continuity tells us that we
perceive an object to be made up of continuous objects; at any point where we see the
continuity breaking, we don't consider that. So here we see that the cross is made of
two slashes instead of a greater-than and a less-than sign.

Moving on to the law of good Gestalt. We see an image which is made up of a rectangle, a
triangle, and a circle, so basically what we are seeing is three different objects
instead of one single object. The law of good Gestalt says that elements of an object
tend to be perceived as grouped together if they form a pattern that is regular, simple,
or orderly. For example, if you are presenting data and your data shows the pattern in
this image, the user will actually see the pattern as three distinct shapes: one is a
rectangle, another is a triangle, and the third is a circle. This gives us a good idea of
what the user will think, instead of what we want to show them.

Moving on to the last law, which is the law of past experience. The law of past
experience implies that, under certain circumstances, visual stimuli are categorized
according to past experience. In these two images, one on the left-hand side and one on
the right-hand side, the elements used are similar, but their positions differ. If I put
these two filled circles outside of my main circle, then you will really not see
anything important here. But once I put them inside the bigger circle with this arc,
then you will recognize from your past experience that this forms a smiley.

So, to sum up, the Gestalt principles are these eight key points. These key principles
provide you with techniques to be used in your design or data visualization, so that you
can make your data visualization easily perceivable by your user. They can clearly see
the message you're trying to convey at first glance or while they are interacting with
the data visualization, and you can also avoid certain mistakes where your user might
perceive something in a different way than what you want to show them.

Prologue
It is time to troop towards the next topic of Data Visualization tools.

Neil Gershenfeld said:

Give ordinary people the right tools, and they will design and build the most
extraordinary things.

Tableau

 Tableau is well suited for handling massive and emerging datasets.


 It is used in:
o Big Data Operations
o Artificial intelligence applications
o Machine Learning applications
 It is integrated with advanced database solutions, namely:
o Hadoop
o SAP
o Teradata
o MySQL
 Creates effective visualizations.

QlikView
 QlikView offers powerful visualization capabilities.
 It provides a clean and interactive UI.
 It also provides:
o Powerful Business Intelligence operations
o Analytics
o Enterprise Reporting Capabilities
 Qlik Sense - A companion product from Qlik that handles data exploration and
discovery.
 QlikView has an active community to guide new users in tool integration.

FusionCharts

 FusionCharts is a JavaScript based visualization package.


 It produces 90 different chart types.
 FusionCharts framework provides a great deal of flexibility.

Plotly

 Plotly supports more complex and sophisticated data visualizations.


 Integrated with analytical-oriented programming languages, namely:
o R
o Python
o MATLAB
 Built on top of the open-source d3.js
 Integrated with Salesforce

Sisense

 Sisense provides a platform with full-stack analytical capability.


 Visualizations can be created with a simple drag and drop interface.
 Aids in the integration of data from multiple sources, which can be
queried when required.

Others
Other powerful Data Visualization tools include:

 D3.js
 R Charts (ggplot2 package)
 Pentaho
 SAP Lumira
 TIBCO Spotfire
 JasperSoft
 MicroStrategy

To Know More!
Check out this video to explore more on Data Visualization Tools.

Well folks, Gartner has released the 2017 Magic Quadrant for BI report. In this,
Microsoft and Tableau have the top points in terms of completeness of vision and
ability to execute. Please remember that the x-axis here is completeness of vision
and ability to execute is the y-axis. Microsoft has really gained a lot of momentum
after the success of the Microsoft Azure cloud and the Power BI reporting tool. Look
at its comparison from 2016: it was right here. Tableau was the leader in terms of
ability to execute, but on vision, Microsoft had the vision last year, as per Gartner.

One interesting point: look at the difference between Tableau and Qlik. Tableau was
the leader in ability to execute, while Qlik had the advantage on completeness of
vision. This year, Tableau is way, way ahead in terms of both completeness of vision
and ability to execute. One depressing point is Alteryx. It was assumed to be a leader
in this data mining and BI space last year, but Alteryx has come down this year. So
this was last year, 2016, when it was the leader, and it has come down in 2017.

The good thing about IBM is that it has gained. It was already a visionary, a leader
among the visionaries last year as well, but this year it has gained momentum in
ability to execute. So this is IBM here this year; last year it was even lower on the
scale of ability to execute. Another reporting tool to be discussed is MicroStrategy.
It was one of the most talked-about tools in BI, and this year it's kind of lagging,
but I would say it's just temporary, because MicroStrategy has released its version
10, the latest version of MicroStrategy, and it's really promising. So I hope it comes
back as a leader next year as well.

Now, a developing tool, and I would say this is a potential leader for 2018: Sisense.
Sisense has gained momentum in completeness of vision, and I see it gaining momentum
on ability to execute as well next year. Take a look at last year's Sisense: this
year, in 2017, it is here; in 2016 it wasn't even in consideration among the
visionaries. So this is good news for Sisense, and I have seen clients talk about
Sisense a lot, along with Tableau, Qlik, and Microsoft Power BI.

Another very important tool to keep an eye on is IBM Watson Analytics. I mean, Cognos
has been their evergreen tool in the BI space, but with the launch of Watson, I think
organizations are yet to mature enough to understand the kind of capabilities IBM
Watson APIs and Watson Analytics provide on the Bluemix platform. Once the market, or
the BI space, gets hold of Watson capabilities and understands and matures to that
level, I think IBM is going to be considered among the leaders again, as it was for so
many years. So I'm really hopeful that next year IBM moves up, MicroStrategy moves up,
and obviously Sisense, though it's going to take some time. I'm also hoping Qlik moves
up, and Tableau and Microsoft do even better. So obviously the space is crowded, and
there are some clearly defined leaders in this area, and I hope a few more tools get
into this space. Thank you.

Explore these Courses!


1. Tableau: The Sequel
2. QlikView
3. TIBCO Spotfire - Deuce
4. Explore with D3js
5. Data Visualization with R

Prelude
Applying the right technique to suitable data makes data representations pass with
flying colors.

Let us march towards the next destination: Data Visualization Techniques.

Right Type of Visualization


This video offers guidance on choosing a suitable visualization technique for the data.

Choosing the type of visualization for the data. Quite often, text and numbers by
themselves lack impact and do not express the story behind the data, or at least the
story that you're trying to tell. They may also not convey the data well. If there is a
pattern or some other aspect of the data that you're trying to communicate, numbers
and text alone are just not adequate in this case. You need to choose graphics and
visuals that match the data and look pleasing to the audience for maximum impact.

Simple, effective graphs use two axes to depict the data using x and y coordinates.
Typically, at least one coordinate is a quantitative value. The data must map to two
variables relating to your x and y coordinates: in this example, amount and time. The
amount is on the y axis, and it is in degrees Fahrenheit, and the time across the x
axis is a particular day. Graphs are particularly powerful for conveying a trend or
pattern in this type of data. For example, in this graph we can infer that there is a
warming trend occurring from day 1 to day 6.

Certain diagrams work particularly well with particular information or content types.
For information comprised of changing data over a time progression, a timeline
diagram is most effective. When the information comprises a guide or a plan to be
completed, a template diagram based on the guide or plan is typically called for.
When you have an ordered set of instructions to visualize, a flow chart is very
effective and provides a clear, coherent, and structured order that the audience can
follow. If the content consists of tasks to be crossed off as they are completed, a
checklist is a good candidate for visualizing the information.

A mind map is a diagramming technique used to visually present the linking of ideas.
A mind map is typically generated around a single topic, with the main topic being
central to the diagram. Major topics related to the main topic are connected directly
to the main topic, with subtopics and other concepts radiating outward. Mind maps can
incorporate words, images, numbers, and color, and can present an overview of a
central topic with large amounts of related information. So in this video you saw how
to choose the type of visualization for the data.
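
The "warming trend from day 1 to day 6" mentioned in the video can also be checked numerically. The following is a minimal sketch, not the video's method: the temperatures are hypothetical values invented for illustration, and `trend_slope` is an illustrative helper that fits an ordinary least-squares line through the points.

```python
def trend_slope(xs, ys):
    """Least-squares slope of ys against xs (change in y per unit of x)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Hypothetical readings: x axis is the day, y axis is degrees Fahrenheit.
days = [1, 2, 3, 4, 5, 6]
temps_f = [61, 63, 62, 66, 68, 70]

slope = trend_slope(days, temps_f)
print(f"Average change: {slope:+.2f} degrees F per day")
print("warming trend" if slope > 0 else "no warming trend")
```

A positive slope corresponds to the upward-sloping line a viewer would see on the graph.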

Charts and Plots


Never can we leave out charts and plots when the topic of Visualization comes
into the picture.
Let us explore a few to know more.

Line Charts

• Line Charts are well suited to analyzing a trend over a period of time.
• They help compare relative changes in quantities against the time variable.
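
As a minimal sketch of "relative changes in quantities against the time variable", the snippet below expresses each period's value as a percentage change from the first period, which is the kind of series a line chart would plot. The sales figures and the helper name `relative_changes` are assumptions made for illustration.

```python
def relative_changes(values):
    """Percentage change of each value relative to the first value."""
    base = values[0]
    return [(v - base) * 100 / base for v in values]


# Hypothetical monthly sales figures (illustrative only).
months = ["Jan", "Feb", "Mar", "Apr"]
sales = [200, 220, 210, 260]

for month, change in zip(months, relative_changes(sales)):
    print(f"{month}: {change:+.1f}%")
```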

Bar Plots

• Bar Plots are chosen to compare cumulative totals across several groups.
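
A bar plot starts from one total per group. The sketch below, with made-up records and an illustrative helper `totals_by_group`, accumulates those totals and prints a rough text bar for each group.

```python
from collections import defaultdict


def totals_by_group(records):
    """Sum the amounts for each group in a list of (group, amount) pairs."""
    totals = defaultdict(int)
    for group, amount in records:
        totals[group] += amount
    return dict(totals)


# Hypothetical (region, units sold) records.
records = [("North", 4), ("South", 7), ("North", 3), ("East", 5), ("South", 2)]

totals = totals_by_group(records)
for group, total in sorted(totals.items(), key=lambda item: -item[1]):
    # One '#' per unit, so bar length is directly proportional to the total.
    print(f"{group:<6} {'#' * total} {total}")
```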

Box Plots
• Box plots portray a five-number summary of the data, namely:
  o Minimum
  o 25th percentile
  o Median
  o 75th percentile
  o Maximum
• They aid in visualizing the range of the data and in deriving inferences
accordingly.
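
The five numbers behind a box plot can be computed directly. This is a sketch using Python's standard `statistics.quantiles`; the choice of `method="inclusive"` is an assumption, since several percentile conventions exist and plotting tools may differ slightly.

```python
import statistics


def five_number_summary(data):
    """Return (minimum, 25th percentile, median, 75th percentile, maximum)."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, median, q3, max(data)


# Hypothetical measurements.
values = [7, 1, 9, 3, 5, 2, 8, 4, 6]
low, q1, median, q3, high = five_number_summary(values)
print(f"min={low} q1={q1} median={median} q3={q3} max={high}")
```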

Scatter Plots
• Scatter plots help in inspecting multiple variables simultaneously by
color-coding.
• Scatter plots reveal the relationship or association between two
variables: the extent to which one variable changes along with the other.
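
One common way to put a number on the association a scatter plot shows is the Pearson correlation coefficient. The sketch below is an illustration rather than part of the original notes; the data are hypothetical, and the coefficient measures linear association only, not cause and effect.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical paired observations (hours studied, test score).
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]
print(f"r = {pearson_r(hours, scores):.3f}")
```

Values near +1 or -1 indicate points lying close to a straight line; values near 0 indicate little linear association.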

Decision Trees
• Decision Trees are excellent tools that help in choosing the right action
among several courses of action.
• They provide a highly effective structure to lay out options and
investigate the possible outcomes of choosing those options.
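
One simple way to "investigate the possible outcomes" of each option is to weight every outcome by its probability and compare expected values. This is only a sketch under that assumption: the notes do not prescribe a scoring rule, and the options, probabilities, and payoffs below are invented.

```python
def expected_value(outcomes):
    """Expected payoff of a list of (probability, payoff) outcomes."""
    return sum(probability * payoff for probability, payoff in outcomes)


def best_option(options):
    """Name of the option whose outcomes have the highest expected payoff."""
    return max(options, key=lambda name: expected_value(options[name]))


# Hypothetical decision: each option branches into possible outcomes.
options = {
    "launch now": [(0.6, 100), (0.4, -50)],
    "run a pilot first": [(0.9, 50), (0.1, -10)],
}

for name, outcomes in options.items():
    print(f"{name}: expected payoff {expected_value(outcomes):.1f}")
print("Chosen action:", best_option(options))
```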

Histograms
• Histograms plot quantitative data, with the range of the data grouped into
bins or intervals.
• Histograms show distributions of variables, while bar charts compare
variables.
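
Grouping a range of data into bins can be sketched in a few lines. The ages, bin width, and helper name `histogram` are assumptions for illustration; bins with no values are simply omitted from the result.

```python
def histogram(values, bin_width, start):
    """Count values per bin; each bin covers [low, low + bin_width)."""
    counts = {}
    for value in values:
        index = int((value - start) // bin_width)
        low = start + index * bin_width
        counts[low] = counts.get(low, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical ages, grouped into 10-year bins starting at 20.
ages = [21, 25, 29, 33, 34, 38, 41, 45, 47, 52]
for low, count in histogram(ages, bin_width=10, start=20).items():
    print(f"{low}-{low + 9}: {'#' * count}")
```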

Prelude
Explore the fruits of visualization here!

A Success Story
In 1854, a question emerged:
What is causing the cholera epidemic in London?

Data Visualization came to the rescue in this situation.

London Epidemic
The above figure is the one that saved thousands of lives.

• John Snow illustrated through his visualization that the cholera epidemic
was caused by a contaminated water pump.
• The red dots in the figure indicate the locations of deaths.

Unmask the Mystery!


Learn more about the London epidemic in this video.

ALYSSA GOODMAN: In 1854, cholera struck the city of London. Over 600 people died in

just a few weeks. Physician John Snow is often credited with discovering that cholera is

a waterborne disease and with ending the 1854 epidemic by removing the handle from

the contaminated Broad Street Pump. The full story is much more interesting. [MUSIC

PLAYING] So I've been reading about the John Snow Pub for years. There it is. Look at
that. Right here, on what used to be Broad Street, is this fine-looking pub. There's this

very inauspicious sign. It says here, "The Red Granite kerbstone," over there, "marks the

site of the historic Broad Street Pump associated with Dr. John Snow's discovery in 1854

that Cholera is conveyed by water." So that's a really simple summary of the whole John

Snow story. And to give you just an overview, here's the story. So it's 1854. It's really hot

and the end of the summer in London. We're here in Soho, which was a very crowded

neighborhood full of tenements with people not of a lot of means. And there were a lot

of people crammed into these buildings-- many, many people to an apartment. And it's

hot, and it's sweaty. And the part that you can't imagine, looking right now-- because it's

still pretty crowded; it's not that hot, but it's still pretty crowded-- here, the street now is

completely clean. We see street sweeping vehicles. We see drainage in the streets. We

have sewers. None of that was true in 1854. Instead, the people who lived in these

buildings, some of them had cesspools in these little front courtyards. And they would

take their human waste and other waste and just kind of throw it out the window into the

cesspool or bring it down to the cesspool. And then it would drain wherever it drained.

And the street-- whoa-- was not something that you could easily walk on and keep your

feet clean. Wow, that's slippery and disgusting. And so people were used to this kind of

somewhat disgusting ambiance. And people thought that when there were terrible

outbreaks of disease, especially cholera, that it was caused by the very poisonous air,

this miasma, as they called it. And the smell from all the poo and all the animal waste

was pretty horrible. And it was a pretty sensible theory to think that that could be

causing disease. But John Snow, who lived not far from here-- we'll go there later-- was a

physician in London, who was convinced that cholera, in particular, was a waterborne

disease. It was not carried in this smelly air. And what happened here on this spot in
1854 was that there was a baby who lived at number 40 Broad Street, which would have

been just about here, baby Lewis, who came down with some sort of terrible disease that

caused really terrible diarrhea. And I think pretty quickly, people realized that it was

cholera and that there was going to be another outbreak of this very terrible disease. And

people started dying. So what happens? What happened was the waste from baby Lewis

and other people who became ill with the cholera-- so human fecal matter-- mixed with

the water supply that was in a relatively shallow well. I think it was about 20 feet down

under the street right here. And then the cholera bacteria began to multiply. And then

people would ingest the water, which provides the human intestines a place where the

bacteria multiplied fast. They need a host like that. So anyway, when people ingest this

contaminated water, they come down with cholera, from which you die in hours to

days. And John Snow, who really was looking for a place where he could, unfortunate

circumstances as they were, have a very concentrated outbreak of cholera where he

could study the origins of the outbreak, was interested in helping the people but also

interested immediately in collecting information about what was going to happen in this

outbreak. So he would have come over here to this neighborhood and started

canvassing all the people around here to see who was dying and who needed help and

what they had done to possibly ingest water from the various water supplies in the area.

And right here, at a location right near this red kerbstone, was a source of what was

apparently some very clean-tasting-- sweet-tasting, actually, water for drinking. Right

here, it's called the Broad Street Pump. And it turned out, as he started canvassing the

neighborhood, that he realized that most of the people who had fallen ill and who were

continuing to fall ill had drunk water from the particular pump. So there were other water

problems in the neighborhood. And one of the things that we'll see later is that, in the
end, when he put all these data together and he made a map, he had to show that this

pump by walking was the closest to almost all of the people who eventually died-- 600-

something people died in the epidemic. So even though there are other pumps that are

geographically potentially nearer to those people, this one tasted good and was close to

the people who died by walking. So in epidemiology today, it's known that one way to

really make your case is to have exceptions to the rules-- so people who should have

died, who didn't die, and then people who did die for no apparent reason. And so John

Snow, in his work, actually found both of those kinds of exceptions. And so one case is

what people sometimes refer to as the people who were saved by the beer. And so these

were the brewery workers who worked a few blocks from here, and drank mostly beer,

and so had a clean supply of things to drink. And the other case was the workhouse that

was near here. And these were potentially the most indigent people, in the poorest of health.

And so why did they live? They systematically survived this epidemic. And part of the

reason-- the main and the most important reason-- is that the workhouse had a well, had

its own source of clean drinking water. So there are the exceptions of people who should

have died and didn't die. And then what happened was he also found out about a family

who would bring water to a member of that family who had moved far, a couple miles,

outside of this neighborhood. And they brought her some water, an elderly aunt. And she

and her niece drank it. And those people both died. So to us, looking at modern

epidemiological methods, John Snow had plenty of evidence to say that this water from

this well contained the contaminant that caused the disease. But it still took actually

many years until the locals believed his story. And another thing that we should mention

is that John Snow did not collect all this data himself. If you look around the street here,

you'll see that this is a very busy neighborhood with lots and lots of people. And I think it
was even busier in 1854. But it was a small contained community. And there were people,

for example like a curate, Henry Whitehead, who was a local, who kind of knew

everybody. And so John Snow worked with Whitehead and with other people in the area

to really canvass the information and collect the kind of sociological, demographic

information that he needed to know who lived where, when they died, where they likely

got their water, who they talked to, who they came in contact with, all kinds of other

details about their personal lives. And so that kind of human data collection, in this

tightly knit community, was also really, really important. [MUSIC PLAYING]
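
The core of the map argument, that the Broad Street Pump was the closest pump for almost all of the people who died, can be sketched as a nearest-pump tally. Everything below is hypothetical except the name "Broad Street": the coordinates and the other pump names are invented, and straight-line distance is used as a simplification where the video describes walking distance.

```python
import math
from collections import Counter


def nearest_pump(location, pumps):
    """Name of the pump closest to a location (straight-line distance)."""
    return min(pumps, key=lambda name: math.dist(location, pumps[name]))


def deaths_per_nearest_pump(deaths, pumps):
    """Count how many death locations have each pump as their nearest."""
    return Counter(nearest_pump(location, pumps) for location in deaths)


# Hypothetical grid coordinates; not taken from the historical map.
pumps = {"Broad Street": (0, 0), "Pump B": (10, 0), "Pump C": (0, 10)}
deaths = [(1, 1), (2, -1), (-1, 3), (4, 4), (9, 1)]

for pump, count in deaths_per_nearest_pump(deaths, pumps).most_common():
    print(f"{pump}: {count} deaths nearest to this pump")
```

A tally dominated by one pump is the numeric counterpart of the cluster of marks on the map.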

Future of Data Visualization


Listen to this talk on what the future of Data Visualization looks like!

In popular conversation, discussions of data visualization often invoke examples
such as these: custom, bespoke graphics, often intended to convey a story and
handcrafted by skilled visualization designers. Over my years working in
visualization research, my students and I have sought to create tools that enable
these kinds of sophisticated graphics, giving rise to popular tools that we've
developed such as Protovis, Vega, and D3. And while we've been very happy with
the success these tools have had, I think, you know, they're only a small part of the
larger ecosystem of visualizations. If you think more broadly, the vast majority of
visualizations are not hand-coded but rather built with end-user tools, often leading
to visualizations that look like this, or applications that look like this. And while
well-intentioned, many of today's end-user tools either lack consideration of
perceptual principles or fall short of fully supporting the process of visual analysis
and exploration. And so when I think about the future of data visualization and how
we chart a path forward, one consideration to think about is: how do we move from
tools that work well for designers to tools that really enable analysts and
decision makers to advance their causes, to enable better decision-making across
industry and government, for example?

So how can we advance the state of the art? Well, I think one way is to begin to bake
more design smarts into our visualization tools. To give you a sense of what I mean,
let's just conduct a quick experiment. I'll show you some shapes. I want you to
compare them. Don't yell out the answer; we'll take a quick poll on your response. So
here's two circles, and I want you to compare their area. How much larger is the
larger circle than the smaller circle? Raise your hand if you think the big circle is
four times bigger. You want to take a look around. OK, five? Six? Seven? Eight? Nine?
Ten? A lot of people. Eleven or more? Those others. OK, now let's try another
example. Compare the length of these two bars. How much larger is the big bar than
the smaller? Raise your hand if you think it's four times bigger. No one. Five? Six?
Some more. Seven? Even more. Eight? A lot of you. Nine? Much fewer. Ten? Eleven or
more? Almost no one.

So it turns out the answer in both cases is the same: it's seven times larger. But if
you looked around the room, what you'd have noticed is that, despite the, you know,
same difference, there was actually much more accuracy and less variance overall for
comparing length than area. And while it was hardly a scientific poll, my group and
others have conducted experiments like this in controlled settings to compare the
accuracy of different encodings. So for example, you see error rates comparing things
such as position, length, angle, and area. And by combining the results of these
experiments, we can actually build rankings of visual effectiveness for different
encodings for different types of perceptual tasks. So for example, when comparing
quantities, position and length outperform things like angle and area, which roundly
outperform color encodings. And this is useful to, one, help guide, though not
dictate, what human designers do, but it's also very helpful for guiding algorithms
that might automatically recommend effective and useful graphics.

So for example, at Trifacta, our product includes support for visual profiling of
large datasets. Here we see a visualization of a geographic field showing where
political contributions come from, right now using a common color encoding across
states. Do you notice the problem? Well, as we just discussed, you know, color is one
of the least accurate encoding channels for comparing quantitative data. But I'd
argue there's an even larger problem with this graphic, which is that our perception
of value is then correlated with the size and shape of the states themselves. So we
can't even see what's happening in Washington, DC, which, as it turns out, is a
source of many contributions. So what other visual encodings might we consider? For
example, we'd like to maintain the spatial context of the map, so position's already
spoken for, but we could move to something more accurate, like area, to enable more
effective comparisons. But as I also mentioned, position is one of the most effective
encodings, so we also augment this display with bar charts on the left so we can make
comparisons in that way, and then link these views together through interaction
that's enabling more exploration, allowing us to see more patterns and make
comparisons more accurately.

So baking better design smarts into our tools, I think, is one way we move forward,
but it's just a beginning. I think more broadly it's time to rethink some of our
basic user interfaces for data visualization. For example, most tools involve a
process of, you know, specifying charts. How do we move this to a richer process of
rapid exploration? Common end-user tools will have us select data subsets that we're
interested in and then choose from a set of chart types or specific visual encodings
to manually build up a display. But this can be overly tedious, and it also seems
like there's a lost opportunity to recommend interesting views. So for example, in
the Trifacta visual profiler, we don't have people build charts explicitly; we
provide them. On the left, for example, you see overviews where we automatically
present summaries of all the dimensions within a data set. Users can then drill down
for more detail. So for example, this panel on the right is showing when political
contributions occur. This is shown both as an overall timeline as well as summary
displays for time periods such as day of week, month, day of month, etc. And doing
this, we can see a number of patterns. We see that contributions occur with
increasing regularity as we get near an election, and they're also more likely on
weekdays or at the end of a month. So we learn something useful immediately without
having to specify the charts. But probably the most important point here is that
these displays were chosen automatically based on the data. So for example, there's
very little variation amongst things like hours, minutes, and seconds, so we don't
show those charts, to avoid distracting the user or wasting their time.

Simultaneously, over in my Interactive Data Lab at the University of Washington,
we're exploring new end-user exploration tools. So for example, our data Voyager
system actually searches over a space of thousands of visualizations and ranks them
according to statistical and perceptual measures to provide recommended
visualizations for a data set. And of course the user's in the loop as well. So for
example, you can steer the recommendations and update the display by indicating
fields of interest. So we're trying to change the way we interact with and explore
data to enable a broader and more rapid exploration. There's a lot of challenges that
come up in moving from specification to exploration that I'm not covering, and to do
this actually requires the combined talents of the Strata community: design,
statistics, and machine learning, as well as big data systems. So I'm very excited
about the challenges that are presented to us, but I also think we have to consider
the ways in which these tools will really augment analysis in the most productive
ways.

And so to illustrate that, I'd like to share with you a data set that I often give to
students in my visualization course. It's a classic data set, originally published in
the 1950s, comparing the effectiveness of antibiotics, which were at the time new
wonder drugs, against a variety of bacterial strains. So I give this data set to the
students and ask them to explore the data and create a graphic that answers an
interesting question. Here are some of the students' submissions, and already you can
see the great variety of designs that they explore. So that's at first really
interesting. But one thing that pops out as you start to dig in and really inspect
these graphics: you realize that while they have lots of design variation, they're
actually all addressing the exact same question: which antibiotics should one use?
And so this fixation on this one question reminds me of one of my favorite maxims
from the visualization expert Edward Tufte, who advises us to show data variation,
not just design variation. The idea is that while it can be useful to explore
multiple representations of the same subset of data, it's often even more vital to
explore different slices and transformations of the data that might spur new
questions or address different hypotheses. Or, rephrasing this, I think we should
consider how our tools might better spur us to exercise skepticism about data and
consider new questions.

So consider this alternative visualization. It's fairly simple. What we're seeing is
the different bacteria in a scatter plot, the bacteria colored by genus, and they're
plotted according to the effectiveness of two antibiotics, neomycin and penicillin.
In this case, the lower left corner is an area of very high resistance, and the upper
right corner is an area of very low resistance. And the idea here is that just by
suggesting this relatively simple visualization, we may still be prompted to consider
other questions. By putting the bacteria front and center, we might ask instead: what
does antibiotic response reveal about the biology of bacteria? Sort of flipping the
question. If we look back at this chart, you might have noticed something
interesting: that there are clusters of bacteria, but they span genus in perhaps
surprising ways. Like, wouldn't bacteria of the same family be more likely to group
together? Well, you'd be right to be skeptical here, because there are actually
errors in the data. The scientific community was originally wrong and had two
misclassifications in this data set. And so it actually took multiple decades after
the original publication of this data for the scientific establishment to overturn
these errors. And yet the initial evidence was here all along, if we had thought to
look at the data, you know, through this particular lens.

So it's interesting to think how our tools might prompt us to consider our data more
broadly. And so as we move from specification to exploration, we should also keep in
mind how this best enables analysis, for example enabling data variation over design
variation. And so in conclusion, I'd like to look forward to a future in which our
tools don't just help us build visualizations but help us much more richly explore
the data, in the end hopefully leading to better insights and better decisions. And
so now I look forward to all of us building this future together. Thank you.
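
The circle-versus-bar poll in the talk can be made concrete with a little arithmetic. To show a value seven times larger, a bar only needs to be seven times longer, but a circle encoding the value by area needs a radius scaled by the square root of seven. The sketch below uses illustrative helper names and is not taken from any of the tools named in the talk.

```python
import math


def radius_for_area_ratio(base_radius, ratio):
    """Radius of a circle whose area is `ratio` times that of the base circle."""
    return base_radius * math.sqrt(ratio)


def bar_length_for_ratio(base_length, ratio):
    """Length of a bar that is `ratio` times as long as the base bar."""
    return base_length * ratio


ratio = 7
print(f"Bar:    length grows by a factor of {bar_length_for_ratio(1.0, ratio):.2f}")
print(f"Circle: radius grows by a factor of {radius_for_area_ratio(1.0, ratio):.2f}")
```

Because the radius grows far more slowly than the area it encodes, this arithmetic is one way to see why the room judged the bars more accurately than the circles.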

End of the March


John Tukey said:
The greatest value of a picture is when it forces us to notice what we never
expected to see.

Conquer more of Data Visualization and brace yourselves!

A classic design solution that has intense data, simple design and word-sized graphics is
called ______________. -> Sparklines

Identify the classic design solution which has a series of the same small graphs repeated
in a visual. -> Small Multiples

_____________ principle aids in drawing the user's focus in a certain direction. ->
Movement

_________________ principle of design ensures that the important data is highlighted
with color, size or contrast. -> Emphasis

Representation of numbers, as physically measured on the surface of the graph itself,
should be ______________ proportional to the numerical quantities represented. ->
Directly

There are _______ types of Balances in design principle. -> 3

State True/False. Plotly is integrated with analytical oriented programming languages. ->
True

The number of principles stated by Edward Tufte is _____. -> 6

State True/False. Graphics must quote data out of context. -> False

FusionCharts produces ___________ different chart types. -> 90

FusionCharts is a __________________ based visualization package. -> JavaScript

The design principle Proportion deals with ________________. -> Size of the object

Good graphical representation ___________________ data-ink and erases as much
non-data-ink as possible. -> Maximizes

__________________ is integrated with Salesforce. -> Plotly

In ___________, ranges of data are grouped into bins and intervals. -> Histograms

________________ are used for analyzing trend over a period of time. -> Line Charts

__________________ design principle is closely associated with movement. -> Rhythm

__________________ principle helps in displaying the objects that are similar to each
other. -> Patterns

__________________ help in inspecting multiple variables simultaneously by color
coding. -> Scatter Plots

__________ takes in raw data and transforms it into charts, graphs and images that can
flawlessly explain numbers in a marvelous way to gain insights from it. -> Data Visualization

Maximizing data density and the size of the data matrix within reason is attained by
_____________________. -> Shrink Principle

The excessive and unnecessary use of graphical effects in graphs that are not necessary to
comprehend the information but to distract viewer's attention is known as
______________. -> Chart Junk

Harmony of the design is ensured by _______________ design principle. -> Theme

State True/False. QlikView is built on top of D3.js. -> False

______________ data visualization tool supports powerful business intelligence
operations, analytics and enterprise reporting capabilities. -> QlikView

_____________________ interface is provided with drag and drop utility. -> Sisense

A package in QlikView that handles data exploration and discovery is ______________. ->
Qlik Sense

Identify the element(s) of Design. -> All the options (Texture, Line, Color & Shape)

Which database solution is integrated with Tableau Data Visualization Tool? -> All the
options (MySQL, Teradata, SAP & Hadoop)

The following are all Edward Tufte's principles of Graphical Integrity, except? -> Show
design variation and not data variation.
