Courses/Computer Science/CPSC 203/CPSC 203 Template/Lecture Template/Lecture 7
Housekeeping
Required Reading
The required readings from the textbook fluency ... are:
- Chapter 14. Fill-in-the-Blank Computing. pp 374-406.
- Chapter 15. "What If" Thinking Helps. pp 411-437.
Also -- it is assumed you are up-to-date on online Lab Manual readings.
Note -- Material from Required Readings may show up on mid-term and final exams.
Introduction
The goal of today's lecture is to tie the lessons you have been learning in Tutorial on how to design visual displays to the reasons why different kinds of visual displays are effective. The effectiveness of a visual display depends on the capabilities and limitations of human visual perception.
At the end of this lecture you should:
- Have reviewed Tufte's 6 visual design principles
- Have examined how these principles apply in the context of 4 simple data visualization examples
- Have been introduced to a paradigm of graphical perception
- Have seen the paradigm illustrated through several simple charts
 
Glossary
- Quantitative Data -- Data which can be arranged on a specified quantitative distance scale. E.g. Height in cm., Weight in kg. Position of the data can be interpreted as a specified distance.
- Ordinal Data -- Data which can be arranged in an ascending or descending order. E.g. Small, Medium, Large. Comparisons can be made in terms of order, but not specific distances.
- Categorical Data -- Data in which different labels are not assumed to have any quantitative or ordinal interpretation. E.g. with respect to country of origin: Canada, USA, Mexico
In this lecture, we will focus mainly on the limits to our perception of quantitative data.
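The three data types differ in which comparisons are meaningful. The following is a minimal Python sketch of that difference, not a formal definition. All values are made up, and the names `SIZE_ORDER` and `size_rank` are hypothetical helpers introduced only for illustration.

```python
# Illustrative values only; the names below are hypothetical.

# Quantitative: differences are meaningful distances.
heights_cm = [150.0, 165.0, 180.0]
height_gap = heights_cm[2] - heights_cm[0]  # a real distance in cm

# Ordinal: order is meaningful, distances are not.
SIZE_ORDER = ["Small", "Medium", "Large"]

def size_rank(label):
    """Return the position of an ordinal label in its defined order."""
    return SIZE_ORDER.index(label)

is_larger = size_rank("Large") > size_rank("Small")  # a valid comparison

# Categorical: only equality is meaningful.
countries = ["Canada", "USA", "Mexico"]
same_country = countries[0] == countries[1]  # no order, no distance

print(height_gap, is_larger, same_country)
```

Subtracting two ranks (e.g. Large minus Small) would run without error, but the result would not be a meaningful distance. That is the practical difference between ordinal and quantitative data.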
Concepts
Some Landmark Folk in the Visual Display of Quantitative Information
-  William Playfair --- http://en.wikipedia.org/wiki/William_Playfair
- Scottish engineer and political economist of the late 18th century
- Gathered and improved many of the common visual patterns of his day, and essentially created the field of "information graphics".
- Incorporated principles from art -- e.g. the Golden Ratio.
 
-  John Tukey -- http://en.wikipedia.org/wiki/John_W._Tukey
- 20th century statistician/mathematician
- Essentially distinguished "confirmatory" from "exploratory" statistics.
- Confirmatory statistics tests hypotheses, whereas exploratory statistics generates hypotheses by looking for pattern and structure in data.
- Emphasized that graphic methods are important for generating hypotheses, and developed a number of graphical techniques.
- With colleagues, both created new data graphics and revised many existing data graphics.
 
-  Edward Tufte -- http://en.wikipedia.org/wiki/Edward_Tufte
- Key modern proponent of good graphic design in the visual display of information.
- Integrated both the older work of Playfair and the newer work of Tukey and colleagues in exploratory data analysis to define a small set of principles for the visual display of information.
 
Visual Display of Information
Cross Reference: see 203 Lab manual: [Charts And Visual Design Rules]
Two Critical Principles in the Visual Display of Information are:
- Statistical Accuracy (the numbers are the "right" numbers, correctly calculated given the data population/sample you are using).
- Cognitive Effect (the pattern in the data is made as clear as possible to the viewer).
Design Issues in the Visual Display of Information (or the World According to Tufte)
- Maximize Data Ink -- Ink that directly conveys information about data points
- Minimize Chart Junk -- All additional glyphs, bells, whistles, 3D effects that do not directly convey data information.
- Use Small Multiples to deal with Complexity -- Create a basis for comparison in large or complex data sets by creating simple diagrams with common axes or common design elements. Example: http://en.wikipedia.org/wiki/Small_multiple
- Data Density -- Very large or very complex data sets require visual techniques that preserve the content of the data while giving a "gestalt" view that cannot be obtained from reading a massive data table.
- Multiple Use -- If possible, put visual elements to multiple uses. Data points could also be numbers reflecting data values. Data glyphs could reflect relationships between the data attributes in frame and other data attributes.
- Aesthetics -- The same principles that make various art constructs effective also apply to the visualization of data. Example -- use of the "Golden Rectangle" for 2D displays. http://en.wikipedia.org/wiki/Golden_rectangle
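As a small worked example of the Aesthetics principle, a golden rectangle has sides in the ratio (1 + sqrt(5)) / 2, which is about 1.618. The Python sketch below computes the height of a landscape chart frame from a chosen width. The function name `golden_height` and the width of 8 units are illustrative assumptions, not part of Tufte's text.

```python
import math

# The golden ratio: long side divided by short side of a golden rectangle.
GOLDEN_RATIO = (1 + math.sqrt(5)) / 2  # about 1.618

def golden_height(width):
    """Height of a landscape golden rectangle with the given width."""
    return width / GOLDEN_RATIO

# Hypothetical chart frame that is 8 units wide.
print(round(golden_height(8.0), 2))
```

A frame 8 units wide would therefore be a little under 5 units tall.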
Bad Chart Examples
http://j-walkblog.com/index.php?/weblog/posts/bad_charts/
http://lilt.ilstu.edu/gmklass/pos138/datadisplay/badchart.htm
Good Chart Examples
http://lilt.ilstu.edu/gmklass/pos138/datadisplay/sections/goodcharts.htm
http://www.compassgr.com/sites/mark/index.htm
Visual Display of Information Examples
We will explore the use of the above principles through four examples of Visual Display in the context of Exploratory Data Analysis (EDA).
- US Supreme Court Votes (categorical data)
- DDT and the Bald Eagle Population (quantitative data)
- Homicide and Temperature (quantitative data)
- Smoking and Insurance Premiums
In each case, think about the following:
- What data is being highlighted?
- Is it raw data or transformed data?
- Which of Tufte's principles are being applied?
Graphical Perception Paradigm
Elementary Graphical Perception Tasks
This material is from:
- Cleveland, W.S. 1985. The Elements of Graphing Data. Wiley.
The elementary tasks, listed alphabetically, are:
- Angle
- Area
- Color hue
- Color saturation
- Density (amount of black)
- Length (Distance)
- Position along a common scale
- Position along identical, nonaligned scales
- Slope
- Volume
Graphical Perception Hierarchy
Experiments on these tasks show that humans perform some of them more accurately than others. The elementary tasks, listed from most to least accurate, are:
- Position along a common scale
- Position along identical, nonaligned scales.
- Length
- Angle -- Slope
- Area
- Volume
- Color hue -- Color saturation -- Density
The more a graphic encodes its information via tasks higher on this list, the easier it is for us to decode the graphic.
Cleveland's advice: encode data on a graph so that the visual decoding involves tasks as high as possible in the ordering.
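The ordering can be treated as a simple lookup table. The Python sketch below is an illustration only. The ranks follow the list above, but the mapping from chart type to perceptual task (`CHART_TASK`) is a simplifying assumption, since a real chart usually involves several tasks at once. In particular, the "bubble chart" entry is an assumed example of an area encoding and does not come from the lecture.

```python
# Ranks follow the ordering listed above (0 = most accurately judged).
# Tasks that share a line in the list share a rank.
ACCURACY_RANK = {
    "position_common_scale": 0,
    "position_nonaligned_scales": 1,
    "length": 2,
    "angle": 3,
    "slope": 3,
    "area": 4,
    "volume": 5,
    "color_hue": 6,
    "color_saturation": 6,
    "density": 6,
}

# Hypothetical mapping from chart type to the main task a viewer performs.
CHART_TASK = {
    "dot chart": "position_common_scale",
    "simple bar chart": "position_common_scale",
    "stacked bar chart": "length",
    "pie chart": "angle",
    "bubble chart": "area",  # assumed example of an area encoding
}

def easier_to_decode(chart_a, chart_b):
    """Return whichever chart relies on the more accurately judged task."""
    rank_a = ACCURACY_RANK[CHART_TASK[chart_a]]
    rank_b = ACCURACY_RANK[CHART_TASK[chart_b]]
    return chart_a if rank_a <= rank_b else chart_b

print(easier_to_decode("pie chart", "simple bar chart"))  # simple bar chart
```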
In practical terms:
-  converting slope judgements to position judgements increases accuracy of perception
- slope judgements are often contaminated by length. That is, 2 sets of lines may have the same ratio of slopes, but appear to have different slopes due to their lengths.
 
-  converting length judgements to position judgements (along common or aligned scales) increases accuracy of perception.
- corollary 1: stacked bar charts often hinder perception relative to simple bar charts (or dot charts) that allow position judgements.
- corollary 2: stacked bar charts often hinder perception relative to simple line charts that allow relative position judgements.
 
-  Angle judgements are particularly inaccurate, and it is difficult to order a set of angles if their magnitudes are similar.
- corollary 3: Pie charts are best replaced by simple bar charts.
 
- Area judgements are confounded by perimeters. Shapes with large perimeters appear to have greater area.
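Two of the points above can be made concrete with a little arithmetic. The Python sketch below uses made-up numbers, and the function names `stacked_baselines` and `pie_angles` are illustrative, not taken from Cleveland. The first function shows where each segment of a stacked bar starts. The second converts values into pie-slice angles.

```python
def stacked_baselines(values):
    """Return the starting height of each segment in one stacked bar."""
    baselines = []
    total = 0
    for v in values:
        baselines.append(total)  # this segment starts where the last ended
        total += v
    return baselines

def pie_angles(values):
    """Return the angle in degrees of each pie slice."""
    total = sum(values)
    return [360 * v / total for v in values]

# Made-up data: three categories measured in two groups.
group_a = [30, 20, 25]
group_b = [10, 22, 25]

print(stacked_baselines(group_a))  # [0, 30, 50]
print(stacked_baselines(group_b))  # [0, 10, 32]

# Made-up percentage shares that are close in size.
shares = [24, 25, 26, 25]
print([round(a, 1) for a in pie_angles(shares)])
```

Only the first segment of each stacked bar starts at 0. The third category has the same value (25) in both groups, but it begins at 50 in one bar and at 32 in the other, so the viewer must compare lengths rather than positions on a common scale. In the pie example, a difference of one percentage point is a difference of only 3.6 degrees, which is hard to judge by eye. The same shares drawn as simple bars would differ in position along a common scale.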
Summary
- There exists a small set of design principles for the visual display of data
- These principles ultimately rest upon our human abilities of graphic perception
-  Combining Tufte's 6 design principles with our knowledge of which graphical perception tasks humans perform most accurately allows us to meet our two key objectives in the visual display of information:
- Statistical Accuracy (the numbers are the "right" numbers, correctly calculated given the data population/sample you are using).
- Cognitive Effect (the pattern in the data is made as clear as possible to the viewer).
 
Text Readings
see "Required Readings" above
Resources
Cleveland, W.S. 1985. The Elements of Graphing Data. Wiley.
Wainer, H. 2005. Graphic Discovery.

