Courses/Computer Science/CPSC 203/CPSC 203 2007Fall L04/CPSC 203 2007Fall L04 Lectures/Lecture 7


Lecture 7

We do a "visual review" of basic statistics, and how they are best used. Then we relate these statistics to their matching functions in a spreadsheet.


Last Week's Homework

Look up the basic functions and formats for:

  • Statistical Data Summarization: Summation, Mean, Standard Deviation, Mode, Median, Variance.
  • How "If Then" functions work.
  • How "Lookup" works.
  • Can you figure out the function to combine two pieces of text?


The objectives of today's class are:

  • House Keeping
    • Small Change to Thursday's "Design Principles" section.
    • What you need to know for the lab quiz next week (redux)
    • Update on Practice Quizzes and Lab IT
    • Assignment 1 Quick Review:
      • The Assignment
      • The associated spreadsheet will be on BlackBoard tonight.
      • You can use Excel, Open Office Spreadsheet, or Google Spreadsheet & Docs, but you must save the file in .xls format.
    • Homework from last Thursday



Lecture Glossary

Know both the meaning of these statistics, and how to access them in a spreadsheet:

  • Mean - the 'centre' of a set of values, aka the 'Average'
    • =AVERAGE(Cell:Cell)
  • Median - the middle value in a set of values
    • =MEDIAN(Cell:Cell)
  • Mode - the most frequently occurring value in a set of values
    • =MODE(Cell:Cell)
  • Standard Deviation - a statistical measure of how widely a set of values is spread on either side of the mean
    • =STDEV(Cell:Cell)
  • Count - a function that counts the number of numeric data values in a range of cells
    • =COUNT(Cell:Cell)
  • Sum - adds all the numbers in a range of cells
    • =SUM(Cell:Cell)
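  • Min - the minimum in a set of values
    • =MIN(Cell:Cell)
  • Max - the maximum in a set of values
    • =MAX(Cell:Cell)
  • Range - the difference between the maximum and the minimum (Max - Min)
    • =MAX(Cell:Cell)-MIN(Cell:Cell)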
  • Precision - the limits of our measuring instruments. The "box" within which all observations appear equal.
  • Scattergram - a two-dimensional display of data points on an X-Y plane.
  • Cartesian Plane - a rectangular coordinate system that associates each point with a pair of numbers. The basis of Scattergrams. See http://dl.uncw.edu/digilib/mathematics/algebra/mat111hb/functions/coordinates/coordinates.html
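
For readers who want to check spreadsheet results outside a spreadsheet, the glossary statistics can be sketched in Python. This is only an illustration, not part of the course material: the function name summarize and the sample values are made up for the example, and the comments name the spreadsheet function each line roughly corresponds to.

```python
import statistics

def summarize(values):
    """Return the glossary statistics for a list of numbers (hypothetical helper)."""
    return {
        "mean": statistics.mean(values),      # roughly =AVERAGE(Cell:Cell)
        "median": statistics.median(values),  # roughly =MEDIAN(Cell:Cell)
        "mode": statistics.mode(values),      # roughly =MODE(Cell:Cell)
        "stdev": statistics.stdev(values),    # sample standard deviation, roughly =STDEV(Cell:Cell)
        "count": len(values),                 # =COUNT counts numeric cells only; here every item is a number
        "sum": sum(values),                   # roughly =SUM(Cell:Cell)
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),   # Range = Max - Min
    }

# Made-up sample data standing in for a range of spreadsheet cells.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
summary = summarize(sample)
for name, value in summary.items():
    print(name, value)
```

For this sample the mean is 5, the median is 4.5, and the mode is 4, which shows that the three measures of 'location' need not agree.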

This lecture is focussed on giving a visual introduction to these statistics. Note: the Mean, Median, Mode are all measures of 'location'.

A Visual Introduction to Statistics

  • Our data 'lives' in the Cartesian Plane (or extensions thereof).
  • We can represent this in 2D as a 'Scattergram'.
  • The Min and Max set the boundaries for where data resides in our Scattergram.
  • The 'cells' in the Scattergram reflect the Precision of data.
  • The 'cell' with the most data is the Mode
  • The Mean and Median are two ways of estimating the central location of the data in a Scattergram.
  • The Standard Deviation is a measure of how data varies around the Mean. It can be imagined as an ellipse drawn in a Scattergram.
  • We can imagine a 'multivariate' data set as a matrix of scattergrams.
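
The visual picture above can be sketched in Python as a rough illustration. It assumes that the 'cells' are squares whose side equals the Precision, and that the Mode is the cell holding the most points. The names bounding_box and modal_cell and the sample points are hypothetical, chosen only for this example.

```python
from collections import Counter
import math

def bounding_box(points):
    """Return ((min_x, max_x), (min_y, max_y)): the Min and Max that bound the data."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), max(xs)), (min(ys), max(ys))

def modal_cell(points, precision):
    """Bin (x, y) points into square cells of side `precision`.

    Returns the (column, row) index of the fullest cell and how many points it holds.
    """
    cells = Counter(
        (math.floor(x / precision), math.floor(y / precision)) for x, y in points
    )
    return cells.most_common(1)[0]

# Made-up data points on an X-Y plane.
points = [(1.2, 3.4), (1.7, 3.9), (1.1, 3.1), (4.5, 0.5), (2.3, 3.6), (0.4, 6.8)]

x_bounds, y_bounds = bounding_box(points)
cell, count = modal_cell(points, precision=1.0)
print("x from", x_bounds[0], "to", x_bounds[1])
print("y from", y_bounds[0], "to", y_bounds[1])
print("fullest cell", cell, "holds", count, "points")
```

With a precision of 1.0, three of the six points fall in the same cell, so that cell plays the role of the Mode. Choosing a coarser or finer precision changes which points "appear equal" and can therefore change the Mode.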


We will follow up today's visual intro on Thursday with an introduction to Charting, and the principles by which we make our Scattergram (and other charts) both visually appealing and accurate in their communication of information.


Resources

See any "basic" statistics text for review, if you are unfamiliar with any of the statistical terms covered today.