Courses/Computer Science/CPSC 203/CPSC 203 2007Summer L60/CPSC 203 2007Summer L60 Lectures/Lecture 3

Basic Statistics Glossary
Know both the meaning of these statistics and how to compute them in a spreadsheet:
 * Mean - the average of a set of values
 * =AVERAGE(Cell:Cell)
 * Median - the middle value in a set of values
 * =MEDIAN(Cell:Cell)
 * Mode - the most frequently occurring value in a set of values
 * =MODE(Cell:Cell)
 * Standard Deviation - a statistical measurement of the spread of a set of values on either side of the mean
 * =STDEV(Cell:Cell)
 * Count - the number of numeric data values in a range of cells
 * =COUNT(Cell:Cell)
 * Sum - adds all the numbers in a range of cells
 * =SUM(Cell:Cell)
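
As a cross-check on the spreadsheet functions above, the same six statistics can be sketched in Python with the standard `statistics` module. The data values and the range `A1:A8` are made up for illustration and are not from the lecture. `statistics.stdev` computes the sample standard deviation, which is also what `=STDEV` returns.

```python
import statistics

# Hypothetical sample data, standing in for a spreadsheet range such as A1:A8.
values = [4, 8, 15, 16, 23, 42, 8, 12]

mean = statistics.mean(values)      # like =AVERAGE(A1:A8)
median = statistics.median(values)  # like =MEDIAN(A1:A8)
mode = statistics.mode(values)      # like =MODE(A1:A8)
stdev = statistics.stdev(values)    # sample standard deviation, like =STDEV(A1:A8)
count = len(values)                 # like =COUNT(A1:A8) when every cell holds a number
total = sum(values)                 # like =SUM(A1:A8)

print("Mean:", mean)
print("Median:", median)
print("Mode:", mode)
print("Standard deviation:", round(stdev, 2))
print("Count:", count)
print("Sum:", total)
```

For this sample the mean is 16, the median is 13.5 (the average of the two middle values, 12 and 15), and the mode is 8. Note that the median and mode differ noticeably from the mean because the single large value 42 pulls the mean upward.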

The lecture focussed on giving a visual introduction to these statistics.

Visual Display of Quantitative Information
Our goals are twofold:
 * Maintain statistical accuracy (our analyses are valid).
 * Have cognitive impact (i.e. the pattern in the data shines through).

Some basic principles for the visual presentation of quantitative information (from Edward Tufte):
 * Data Ink (the ink on the graphic should focus on bringing out the data)
 * Chartjunk (any ink that is not directly focussed on bringing out the data)
 * Data Density (large amounts of data for a single variable, or data across numerous variables -- we can't take the numbers in easily without pictures).
 * Small multiples and complexity -- using a standardized frame, we can often deal with data density through multiple comparisons (for example, a series of scattergrams comparing Y variables against a common X variable).
 * Aesthetics. Basic design principles for visual information as they have developed through time in art -- for example, the golden ratio used to define X and Y axes. See: http://en.wikipedia.org/wiki/Golden_ratio
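
The golden ratio mentioned under Aesthetics can be made concrete with a little arithmetic. The sketch below assumes the X axis is the longer side of the chart and uses a hypothetical helper, `golden_chart_size`, to derive a width from a chosen height; neither assumption comes from the lecture.

```python
import math

# The golden ratio: (1 + sqrt(5)) / 2, approximately 1.618.
GOLDEN_RATIO = (1 + math.sqrt(5)) / 2


def golden_chart_size(height):
    """Return (width, height) so that width : height equals the golden ratio.

    Hypothetical helper for illustration; assumes the X axis is the longer side.
    """
    return (height * GOLDEN_RATIO, height)


# Example: a chart 4 units tall (the units are arbitrary).
width, height = golden_chart_size(4.0)
print("Width:", round(width, 2), "Height:", height)
```

The same proportion can be applied in a spreadsheet by resizing a chart so that its width is roughly 1.6 times its height.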

Resources
Edward R. Tufte. 1983. The Visual Display of Quantitative Information. See also the Tufte Web Site http://www.edwardtufte.com/tufte/

William S. Cleveland. 1985. The Elements of Graphing Data.

Stephen Few. 2006. Information Dashboard Design: The Effective Communication of Data.