Courses/Computer Science/CPSC 601.29.ISSA.F2011

This is the course wiki for CPSC 601.29 (Information Systems Security Analysis) for Fall 2011.

Course Logistics

When: Wed/Fri 10AM to 11:15AM

Where: Wed SS 115, Fri ICT 616

Exam: No final exam

Communication: Use the class mailing list

Course Description

This course focuses on the principles of analyzing, penetrating, and defending computer systems. This subject complements a course of study that examines the theory and practice of securing computer networks. It is appropriate for graduate students or advanced undergraduates who want to learn fundamental concepts in security architecture and tools for computer system attack and defense. The course begins with a brief review of assembly programming and operating systems internals. For concreteness, concepts are demonstrated relative to the x86/Linux platform (and Windows, Solaris, or OS X as appropriate).

The instructor will cover topics including shellcode disassembly, memory protection, debugging, sandboxing (isolation & virtualization), reverse engineering, and intrusion recovery. We stress to students that this course is not solely a How-To training guide for a particular tool chest. This course relies on underlying principles for thinking about how systems can be made to fail, and its central aim is to help students understand the following abstract concepts:

  • cross-layer interactions -- root of trust; hardware supporting software security
  • composition and trust -- how these concepts affect system assurance
  • execution analysis -- how to analyze programs by reversing or removing abstraction, encapsulation, and other system organization principles
  • flaws as programming models -- understanding vulnerabilities and exploits as de facto primitives of an unintended programming environment
  • countermeasure efficacy -- understanding the context and relative merits of protection measures

The course will start with an overview of the ethical considerations involved in adopting a security analysis mindset. Additional ethical considerations will be introduced as necessary. Students will be required to adhere to the Agreement and Ethical Statement documents.

Course Syllabus

This course provides an overview of system instrumentation techniques related to the analysis of the security properties of running code. The course will cover concepts in such analysis as well as some practical tools and related literature from the application of these tools and concepts in academic research.

The course will cover a selection of the following topics as time allows:

  • The Security Mindset: Principles of the "Hacker Curriculum"
  • IA-32 Architecture Overview
  • IA-32 Hardware Support for Security
  • x86 Assembly: Assembly programming, ELF toolchain
  • Common Vulnerability Classes
  • Operating Systems Security
  • Code Injection Countermeasures: Control Flow Integrity, Automated Diversity, Other Techniques
  • Shellcode Disassembly: Basic Tools and Samples, Advanced Techniques, Polymorphic Shellcode, English Shellcode
  • Program Supervision: Basic Mechanisms, Frameworks
  • Isolation: Isolation Primitives, Isolation Systems, Virtualization and Security
  • Reverse Engineering Protocols and File Formats, Reverse Engineering Protected Code
  • Intrusion Analysis: Classic Examples and the Invention of Honeypots, Intrusion Recovery
  • Language-theoretic security

Course Schedule

This lists the dates and topics for each day along with the reading assignments.

14 Sept 2011: Introduction (no class, read on own)

Prof. Locasto is away at a conference. Class will be rescheduled or appended to another session. Our first meeting is Friday, 16 Sept. at 10:00AM.

16 Sept 2011: Introduction and Security Definition

Topics

  • Offer an operational/constructive definition of security and trustworthiness
  • Discuss the semantics of the word "hacker"

Readings

Sept 21 An Introduction to x86

Topics

  • The "Big" Picture: Instrumentation levels
  • x86 fundamentals

Reading Assignment

Sept 23 Layers of Abstraction

Today we discussed the general concept of breaking through layers of abstraction. Our muse was the general question "How do I know what my process is doing?"

We covered a concrete example by writing a "hello, world" C program and examining the results of a compile to assembly, a compile to binary (ELF), and executing the program under strace. We also put into practice some of our knowledge about x86 assembly code.

Sept 28 "Instrumentation" Picture

Today we discussed the big picture in terms of developing the skill necessary to instrument a system at many different layers of abstraction. We started off class discussing the "big picture": the major difference (and relationship) between static analysis of the properties of binary images and dynamic analysis of running programs (i.e., processes). The production and processing of each artifact involves a number of other systems, including IDEs, the lexer/parser/compiler, assembler, linker, and the OS loader. Each of these introduces a location for instrumenting the system.

We tried to define the term "system".

We particularly discussed the issue of the semantic gap and how the concept of Deep Introspection might help address this gap by placing efficient data extraction and aggregation primitives at different layers of the system stack.

We hinted at various tools and techniques for doing so, in both the static and dynamic views of the system.

Sept 30 System Call Interface via Custom Assembly

Today we discussed the system call interface and how to invoke system calls via assembly code. We introduced the nasm assembler and some disassembly tools (udcli, objdump -d, and disasm).

We also discussed academic system security conferences.

Finally, we discussed class project requirements. Your midterm exam will be an oral presentation of your progress on the project. Your literature review will be either related to the project or (if you choose to do the bug-finding exercise) a survey of 5 to 10 papers on a specific systems security topic.

Links

Reading

Oct 5 Shellcode Sample

In this course, we will gain familiarity with x86 code, how the x86 chip executes such code, how assembly code can be crafted into "shellcode", and how that shellcode interacts with the OS. We will ask:

  • What is "shellcode"? Is it inherently "evil" or malicious?
  • How does shellcode interact with the OS?
  • What is the system calling convention?
  • How does shellcode get executed?

http://www.shell-storm.org/shellcode/files/shellcode-606.php

Oct 7 Programming in Assembly

In this class, we will have a hands-on exploration of how to write very small assembly programs. This material reinforces what we learned about shellcode and how to use the system call interface directly, without involving loads of compiler-inserted code and libraries.

This lesson is based on the very excellent cross-layer article about making very small ELF files:

http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html

This article, besides being a great tutorial on x86 and the Linux system call interface, is a fantastic exploration of the ELF file format.

Oct 12 Hardware Support for Trapping, Debugging, and Protection

In this class, we will consider some of the hardware support available for protection under x86. These mechanisms are the primitives that higher-level security and protection mechanisms depend on. Take careful note of their cost, granularity, and ability to express higher-level semantics (or lack thereof). This class session will also explore how trapping occurs on x86.

Oct 14 Polymorphic Shellcode

One recent (depending on how you count years) shift in attack and defense is the shift from trying to detect malicious code to detecting malicious computation; the former is hard, and the latter is impossible. Score one for the ba^H^Hcreative guys.

In this class we will cover a polymorphic shellcode example as well as English Shellcode. You should read the following articles before class:

Oct 19 Control Flow: Calling Conventions

This session will discuss x86 calling conventions, and how mixing control data and normal data in the same contiguous memory region entails risk.

This is preparation for understanding basic stack-based buffer overflows, but the larger lesson is that anywhere control information and pointers are mixed with writeable data, you have an opportunity for employing a "write primitive" as an attacker.

More importantly, this demonstrates how x86 systems fail to take advantage of segmentation support to differentiate between different types of memory. Systems need support for fine-grained separation of memory segments that can be efficiently enforced.

Readings

Oct 21 Program Supervision Basics (ptrace)

We will particularly look at how ptrace(2) is implemented and can be used by one process to trace another. We will pay particular attention to how ptrace support is expressed in the kernel.

Oct 26 Discussion of ptrace implementation (cont.)

Today we continued our discussion of the implementation of ptrace.

Oct 28 Implementing a Small Debugger

In this class session we will show how to implement a small, special-purpose debugger.

A tarball of the code from class:

snyfer ptrace-based syscall tracer

Nov 2 Introduction to GDB

In this session, we looked at some basic usage and commands in GDB.

Nov 4: Debugging Session of a Basic Code Injection Attack

Reminder: class will be in ICT 643, not ICT 616.

Today we will use gdb to look at a simple strcpy-based injection. Your machine likely has a number of countermeasures to basic code injection in place already. Doing basic exploit research to understand the core concepts (e.g., those presented in "Smashing the Stack for Fun and Profit" http://www.phrack.com/issues.html?issue=49&id=14&mode=txt ) requires you to turn them off to remove some complexity. The idea is to shut off the stack protector, ASLR, and the non-executable stack by:

  • compiling programs with -fno-stack-protector
  • turning off ASLR: as root, `echo 0 > /proc/sys/kernel/randomize_va_space'
  • marking executables as needing executable data areas: `execstack -s a.out'

We then write a small program that uses strcpy(3) unsafely:

http://pages.cpsc.ucalgary.ca/~locasto/teaching/2011/ISSA/code/scopy.c

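#include <stdio.h>
#include <string.h>
/* Deliberately vulnerable: copies src into a 10-byte stack buffer
   with no length check. */
int do_work(char* src){
  char dst[10];
  strcpy(dst, src);  /* overflows dst when src has 10 or more characters */
  return 0;
}
int main(int argc, char* argv[]){
  if(2==argc){
    do_work(argv[1]);
  }else{
    fprintf(stdout, "./scopy [arg]\n");  /* usage message */
    return -1;
  }
  return 0;
}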

and then proceed to look at the stack using GDB.

http://pages.cpsc.ucalgary.ca/~locasto/teaching/2011/ISSA/code/session.txt

We also looked at how to examine the stack through a C pre-processor macro: Courses/Computer Science/CPSC 601.29.ISSA/20110307CodeSession


Classic Reading

Nov 9 Analysis of a Real Vulnerability

In this session, we will use gdb to take a guided tour of the operation / execution of a real exploit on a real (but old) vulnerability. While the specific type of vulnerability is less likely to be a problem or easily exploitable for most current commodity systems, the principles involved are illustrative from both an attack and defense perspective.

  • Topics: the point of this lecture is to provide a hands-on assessment of an old stack-based buffer overflow vulnerability and the resulting exploit. This takes the previous class and cranks it up a notch: the vuln is in real, widely used software, and we get a richer address space than just a simple program copying a string from argv[1]
  • Notes: In this class, we looked at a wealth of information in both the source code (at the src-level definition of various functions) and at the assembly level. We first observed how rpng-x reacted when fed the proof-of-concept test case to exercise the vulnerability. We observed some output error messages and then a segfault. We started rpng-x in gdb and began to place breakpoints, starting from the png_handle_tRNS() function in the `pngrutil.c' file. We placed breakpoints at different related functions (e.g., png_crc_finish, png_read_data, png_default_read_data) and looked at them both at the source level and as disassembly within gdb. We were able to modify the PoC exploit to include shellcode that spawned a shell once we got the return address correct (via stack examination of where the NOP sled landed on the stack). The key problem seemed to be that the fixed-size local buffer (256 declared bytes; 300 bytes actually allocated on the stack) was smaller than the 512 bytes of data that fread(3) ultimately stuffs into it.
  • Links

Nov 11: Reading Week (no class)

Reading week. Remembrance Day / Veterans Day. University is closed Nov 10-13.

Nov 16: Heap-based Overwrites

Nov 18 Midterm Presentations

Today you will update the class on your progress with your project. You have 20 minutes. I'd like to see:

  • background (problem motivation)
  • the "gap" (specify what is wrong with the world that you'd like to address)
  • your approach
  • any related work
  • current state of implementation or investigation
  • any preliminary or current results
  • questions from the audience

Nov 23 Intrusion Analysis

Today we'll discuss some classic work in intrusion incident analysis and look at the debris left over from a real intrusion, discussed in my LISA 2009 paper:

Stories about post-mortem analysis of such incidents are rare. Here are a few links and pointers:

Nov 25: no class (US Thanksgiving)

Prof. Locasto will not be in Calgary.

We will double up the Nov 9 class. [done]

Nov 30: Countermeasures (basic and advanced)

Advanced Countermeasures

  • Topics: tainted dataflow analysis, Artificial diversity, control flow integrity
  • Notes
    • Artificial diversity
      • n version programming
      • anomaly detection based on system call sequences: "A Sense of Self for Unix Processes" Somayaji et al.
        • asynchronous system calls
        • convergence of a profile & calibration
        • mimicry attacks
        • looking at arguments, not just syscall sequences (UCSB work)
      • ASLR (last time)
      • Instruction Set Randomization (Barrantes et al., Kc et al., CCS 2003)
        • mysql randomization, postgresql randomization
      • program reproduction by Somayaji et al.
    • Tainted Dataflow Analysis
      • TaintCheck (CMU)
      • mark sources of input (i.e., from system calls such as read(2) and recv(2)) with a tag, propagate tag with each assembly instruction
      • attack detection conditions
        • tainted data enters the contents of EIP
        • EIP points to an address containing tainted data
        • tainted data enters a system call argument, such as a parameter to execve(2)
    • Control Flow Integrity:
      • model the control flow graph of an application statically; execution should not deviate from this static "complete" model, so newly injected control flow paths show up as violations; CFI paper
      • XFI: software guards
    • Advanced Topics: Signature Generation, Recovery
      • STEM
      • FLIPS, Shadow Honeypots, DIRA
      • Vigilante
      • VSEF: Vulnerability Specific Execution Filters
      • Application Communities
      • ShieldGen
      • Bouncer
      • Rx: execute through errors; see bugs as allergens, slightly change the environment and re-execute
      • ASSURE: model control flow, recover to a "safe" state
      • SEAD: Speculative Execution for Automated Defense: have an execution monitor that interprets "repair policy" (basically an after-production exception handling mechanism) when a code injection attack is detected
  • http://technet.microsoft.com/en-us/security/dd285253.aspx (Mitigations by Matt Miller)
  • Code
    • We wrote a small piece of code to gather data on the value of ESP (the position of the stack) in a randomized (i.e., ASLR) or non-randomized environment.
#include <stdio.h>
int main()
{
  int* x = 0;
  int y = 0xDEADBEEF;
  x = &y;
  fprintf(stdout, "%u\n", ((unsigned int)x));
  return ((int)x);
}

Run repeatedly, this produces a distribution of values showing limited (max 3 in our experiment) reuse of the same stack address. Note that the return of x from main is truncated: only the low 8 bits survive as the process exit status seen by the shell. We wrapped the binary in a bash shell script:

#!/bin/bash
for ((;;))
do
      ./a.out >> newstack.dat
done

and then passed newstack.dat through a chain of commands, first sorting -n the output, then passing to uniq -c, then awk to print $2 then $1, then sorting -n again. We directed that final output to a file newstack.sorted and plotted it with gnuplot (plot 'newstack.sorted' with impulses).

Turning ASLR off (via `echo '0' > /proc/sys/kernel/randomize_va_space') produced a file in which every line holds the same value (322122588).

Dec 2: Language-Theoretic Security

Last day of class. Literature Reviews and Projects Due

Topic: we will watch a video talk by Len Sassaman and Meredith L. Patterson related to their 2010 ph-neutral talk (link below)

Dec 7: No Class (held extra on 16 Nov)

Prof. Locasto will be at ACSAC 2011 in Orlando and LISA 2011 in Boston.

We will double up Nov 16's class on heap overflows. [done]

Dec 9: No Class (held extra on 23 Nov)

Prof. Locasto will be away at a conference.

We will double up Nov 23's class on Intrusion Analysis. [done]

Links