Courses/Computer Science/CPSC 203/CPSC 203 2008Winter L03/CPSC 203 2008Winter L03 Lectures/Lecture 11

Lecture 11
In this lecture we introduce some basic definitions and concepts related to databases. In particular we focus on a triad of ideas:


 * 1) In A Database --- Everything is Abstacted to 2-D tables. ("It's Tables all the way down").
 * 2) A table represents is a set of values for some object -- identified by the Primary Key. Each row in a table specifies a set of attributes for an "instance" of an object (an individual).
 * 3) All operations in a database are operations on sets of values.


 * House Keeping
 *  MIDTERM: Tuesday February 26th  -- Will be up on BB for next Tuesday's class. An answer key will be available in CPSC office.
 * READING: Chapter 11: Databases and Information Systems (see TEXT READING section at bottom of today's lecture)
 * TBA1 Results: L03 students Averaged 7.7/10. Approximately 20 students had perfect scores. Well Done!
 * Skills List for TBA2 available on Wiki. A small "Books" database that illustrates the skills available in Course Documents section of Blackboard.
 * Group Projects: Should be well into developing your arguments.
 * "Who Are We In Excel" -- consider showing your preliminary analysis to your TAs next week so they can give you feedback.


 * Todays Topics
 * Review the Meta model for a relational database; as well as relational DB origins.
 * Introduce New Vocabulary for databases.
 * Look at the Notion of Tables as Sets (review of Venn Diagrams).
 * First look at "Books" Database

Review and Extension from Lecture 10

 * Table - A row (case) by column (variable) display. An entity that has a group of related records. In Relational Databases, a table if often called a "Relation". It is also called an "Entity".
 * Parent: A Parent-Table or Entity has it's primary key linked to one to many foreign keys in the Child table.
 * Child: A Child-Table or Entity has a foreign key(s) that link it back to one or more Parent Tables. The child tables foreign keys are primary keys in the parent tables.
 * Domain – “Data Type” = accepted values and operations. A set of values of a specific type with allowable operations that can apply to many attributes. Every field or column must be assigned a data type which is a domain (with specific rules) such as:
 * Text
 * Numeric
 * Integer
 * Date
 * Hyperlink
 * Attribute – a feature of an entity (a "variable")
 * Entity – an object in the world, which can have many relationships with other entities
 * Relationship – Intersection set of keys for 2 tables. A link between two entities.
 * Join – the relations between entities or Parent Table (on primary key) and Child Table (on foreign key)
 * One-to-One: Ex: a student can only borrow one book
 * One-to-Many: Ex a student can take many courses
 * Many-to-Many: Ex: many students can take many courses. Joins can break down into multiple One-to-Many joins

Additional Database Concepts and Terminology

 * Query – A way of asking questions of a database. In general a query of a table(s) returns a subset of the table(s) for which the query parameters are true. For example, which people in the class listed Google as their most frequently visited website. More formally, a query is a proposition about data extracting a subset of information based on specific requirements using the primary and foreign keys. A query language is used to set up these requests and  the query can be repeated at different times for updated information
 * Key – a unique identifier: data value that makes the row of information completely different from any other row
 * Primary: primary unique id of an instance (case)
 * Foreign: key linked to Primary key of parent table
 * Index - A look up table for a column with repeating values that makes queries on this column more efficient.
 * Referential Integrity -- Put simply, the restriction that no entry in a child table can be made unless the foreign key for that entry matches an existing primary key in the parent table(s). See Wikipedia: http://en.wikipedia.org/wiki/Referential_Integrity
 *  Entity Relationship Diagram (ER-D)  -- A diagramatic summary of the tables in a database and their relationships. We use a simple version of ER-D in this course -- for more information see http://en.wikipedia.org/wiki/Entity-relationship_diagram
 * Cardinality (of a set). A measure of the number of elements in a set. If set A had 4 elements, we would express its cardinality as |A| = 4. See: http://en.wikipedia.org/wiki/Cardinality
 * Cartesian Product  Muliplying each element in one set by every element in another set. In Relational Databases, a cartesian product is the default result of including two tables in a query without a join. Essentially, it is multiplying the cardinality of each of the tables. For those interested in the math side of this see, http://en.wikipedia.org/wiki/Cartesian_product

From the Relational Database MetaModel to Tables
We will cover the triad of core concepts:


 * 1) In A Database --- Everything is Abstacted to 2-D tables. ("It's Tables all the way down").
 * 2) A table represents a set of values for some object -- identified by the Primary Key. Each row in a table specifies a set of attributes for an "instance" of an object (an individual).
 * 3) All operations in a database are operations on sets of values.

largely using visual models and a small 'Books' database example.


 * The Relational Database Meta Model -- tells you 'what' a database is in the abstract. It is usually implemented as a set of tables. That is, there is an underlying database (from the meta-model) that keeps track of every specific database you build.
 * More practice on thinking about relational databases in terms of sets.
 * A first look at a very simple relational database on Books.

TEXT READINGS
TIA 3rd Edn: Chapter 11. 462 -- 503

TIA 4th Edn: Chapter 11. pp 484 -- 525

Resources
E.F. Codd's original proposal for the Relational Data Model can be found online at: http://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf  "A Relational Model of Data For Large Shared Data Banks"  created the theory behind the modern databases used in large organizations today. While technical -- the introductory parts of the article are accessible to the general lay reader and provide a good introduction to the thinking style needed to "Grok" databases.


 * The Database Relational Model. A Retrospective Review and Analysis. 2001. By C.J. Date
 * Practical Issue in Database Management -- A Reference for the thinking Practioner.2000. By Fabian Pascal
 * The Essence of SQL. A Guide to Learning the Most SQL in the Least Amount of Time. 1996. By David Rozenshtein
 * SQL Visual Quickstart Guide. 2005. By Chris Fehily.