Mini-Tutorial: Data Analysis Design
The following notes offer a disciplined approach to developing a data analysis. They are particularly suited to the case where much of the data organization and intermediate data processing is done via queries in a relational database.


 * 1) Begin with the End in mind and work Backwards. What is the final goal of your analysis?
 * 2) What are your data sources, and how do they need to be organized to achieve your analytical goal?
 * 3) What statistics and calculated variables do you need, including interim calculations?
 * 4) Draw a "path" from source data to final analysis, and use it to:
 * 5) Break the analysis down into small steps. In a relational DB each of these steps could be a query, so one ends up with a series of queries, one for each step in the analysis.
 * 6) Check the accuracy of each step of the analysis by:
 * 7) Confirming the data coming into that stage of the analysis.
 * 8) Confirming the data exiting that stage of the analysis.
 * 9) In particular, the data entering and exiting each stage must be in the form and range that you expect.
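
The steps above can be sketched in a few lines of Python using the standard-library sqlite3 module. This is only an illustration under stated assumptions, not a prescribed method: the grades table, the student_avg view, the 0 to 100 score range, and the check_range helper are all hypothetical names and values invented for the example. The end goal (a class average) is fixed first, the path is broken into one query per step, and the data entering and exiting each step is checked for form and range.

```python
import sqlite3


def check_range(rows, column_index, low, high, label):
    """Confirm a stage's rows are non-empty and that one column is in range.

    Hypothetical helper: returns the row count, raises ValueError otherwise.
    """
    if not rows:
        raise ValueError(f"{label}: no rows")
    for row in rows:
        value = row[column_index]
        if value is None or not (low <= value <= high):
            raise ValueError(f"{label}: value {value!r} outside [{low}, {high}]")
    return len(rows)


# Hypothetical source data: lab scores out of 100 (assumed range).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grades (student TEXT, lab INTEGER, score REAL)")
conn.executemany(
    "INSERT INTO grades VALUES (?, ?, ?)",
    [("ann", 1, 80.0), ("ann", 2, 90.0), ("bob", 1, 60.0), ("bob", 2, 70.0)],
)

# Step 1: confirm the data coming into the analysis.
source = conn.execute("SELECT student, lab, score FROM grades").fetchall()
check_range(source, 2, 0.0, 100.0, "source")

# Step 2: an interim calculation stored as its own query (a view).
conn.execute(
    "CREATE VIEW student_avg AS "
    "SELECT student, AVG(score) AS avg_score FROM grades GROUP BY student"
)
per_student = conn.execute(
    "SELECT student, avg_score FROM student_avg ORDER BY student"
).fetchall()
# Confirm the data exiting this step: averages must stay in the score range.
check_range(per_student, 1, 0.0, 100.0, "student_avg")

# Step 3: the final goal, built on the interim query rather than the raw table.
class_avg = conn.execute("SELECT AVG(avg_score) FROM student_avg").fetchone()[0]
print(per_student, class_avg)
```

Keeping each step as a separate query means a failed check points at one small stage of the path rather than at the whole analysis.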