Courses/Computer Science/CPSC 355.W2014/Lecture Notes/Scribe
Contents
- 1 CPSC 355 Scribe Notes
- 1.1 Jan 13
- 1.2 Jan 15
- 1.3 Jan 17
- 1.4 Jan 20
- 1.5 Jan 22
- 1.6 Jan 24
- 1.7 Jan 27
- 1.8 Jan 29
- 1.9 Jan 31
- 1.10 Feb 3
- 1.11 Feb 5
- 1.12 Feb 7
- 1.13 Feb 10
- 1.14 Feb 12
- 1.15 Feb 14
- 1.16 Feb 17, 19, 21: Reading Week, no class
- 1.17 Feb 24 (guest lecture)
- 1.18 Feb 26
- 1.19 Feb 28
- 1.20 March 3
- 1.21 March 5
- 1.22 March 7
- 1.23 March 10
- 1.24 March 12
- 1.25 March 14
- 1.26 March 17
- 1.27 March 19
- 1.28 March 21
- 1.29 March 24
- 1.30 March 26
- 1.31 March 28
- 1.32 March 31
- 1.33 April 2
- 1.34 April 4
- 1.35 April 7
- 1.36 April 9
- 1.37 April 11
- 1.38 April 14 (Demos)
CPSC 355 Scribe Notes
These notes exist as a class resource and study aid.
Jan 13
Jan 15
[img]http://imgur.com/wYMgidN[/img]
Jan 17
Jan 20
Binary
- base 2 number system
- went over converting from decimal to binary (simple ways to do this can be found in textbook)
Binary Addition
0 + 0 = 0 0 + 1 = 1 1 + 0 = 1 1 + 1 = 10
More detailed examples
1 0 0 1 + 1 0 0 0 ---------- 1 0 0 0 1
1 1 1 + 1 1 1 ---------- 1 1 1 0
Hexadecimal
- base 16 number system
- 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f
- a = 10, b = 11, c = 12, d = 13, e = 14, f = 15
- went over converting from decimal to hexadecimal (textbook)
- also looked at converting from binary to hex, and vice versa
Hex Addition
- hex numbers usually begin with '0x' to avoid confusion
- Ex. 0x1 is hex 1
- Ex. 0x14 is hex 14, decimal 20
0xa + 0x1 = 0xb 0xf + 0x1 = 0x10 0xa + 0xa = 0x14 0x14 + 0xf = 0x23
Boolean Table
a b SUM (xor) COUT (AND) AND OR XOR ------------------------------------------------------ 0 0 0 0 0 0 0 0 1 1 0 0 1 1 1 0 1 0 0 1 1 1 1 0 1 1 1 0
Jan 22
Jan 24
Jan 27
Jan 29
Signed and unsigned representations
Signed Unsigned 0000 0 0 0001 1 1 0010 2 2 0011 3 3 0100 4 4 0101 5 5 0110 6 6 0111 7 7 1000 -8 8 1001 -7 9 1010 -6 10 1011 -5 11 1100 -4 12 1101 -3 13 1110 -2 14 1111 -1 15
Signed and Unsigned bit conversions are done using the two's compliment.
0010 one's compliment 1101 Flipped two's compliment 1 add 1 1110 => -2 (1 being the sign bit)
We need 5 bits to represent -15
01111 10000 NOT 10001 NEG
convert 0110 mov a, 0x6 1001 neg a 1010
Sign bit is extended all the way through a 32 bit reg. eg: 1111...1111 1010
Shift right divides by two. Usually divides last bit and adds a 0 to the start. Shift right arithmetic replicated the first "sign" bit.
Jan 31
Feb 3
Intro to ELF: Executable & Linkable Format
e.g. Design a program that finds the largest integer:
(refer [1] for C code)
INPUT (STORAGE): array to store integer values from command line, variable to store largest integer (MAX), loop control variable/index into array
PROCESS: loop over input (array)
OUTPUT: report value of MAX
"No dynamically sized arrays are available in C."
There is structure to an ELF file. Check out each command in the READELF:
$man readelf
Notice important sections: .INIT, .TEXT (code of the program), .RODATA, .DATA (initialized global data... AKA MAX), .BSS (uninitialized global data... AKA the integers array)
Feb 5
Midterm Exam
5 pages. 6 problems, mostly short answer. 100 points. 50 minutes. No text. No notes. No calculator, computer, etc. Write answers directly on the exam sheet. Be prepared to write both C and ASM code.
Feb 7
ELF Format
.INIT ==> program's start & automatically generated by the compiler
.RODATA ==> read-only data
.DATA ==> initialized global variables
.BSS ==> uninitialized variables (filled with zeroes)
Analyze ELF Flowchart: [2]
"first.asm" ==> First Assembly Code in x86 NASN:
Note: 1 instruction per line. Everything executes top to bottom, unless you call it to jump elsewhere.
>start program
;;our first assembly program BITS 32 GLOBAL _start SECTION .text _start: ;the colon at the end declares a label. "_start" is arbitrary. nop mov ebx, 0x10 add ebx, ebx sar ebx, 1 ;;divide by 2 sub ebx, 0xf push ebx pop eax ;;more instructions: int, xor, or, and, hlt
>end program
To create "first.o": $nasm -f elf first.asm
To link: $ld -o firstx first.o
To run: ./firstx
To run in GDB: $gdb ./firstx
Feb 10
Machine Instruction.
This week we are going to look at how we can translate C into assembly code. Basically a language between the compiler and the CPU.
For example:
- C:
x = x + 1
- Assembly language:
add eax, 1
- How this is express in the CPU.
- SPARC and x86.
- RISC vs CISC.
AMD-K6-2 processor (the processor we saw the first days in class). AMD-K6-2 manual is way simpler than modern processors. Therefore, it would be useful to take a look at the manual.
8088 8-bit HMOS microprocessor 8808/8088-2 The 8088 don't use the 32bit registers such as EAX,EBX etc. Instead it uses AX BX CX which are 16bit registers.
- Internals:
- memory interface
- instruction stream byte queue (instruction registers)
- execution unit control system (decoder)
- A-bus (getting inputs either from memory or register file)
- arithmetic logic unit
- execution unit
- flags.
The execution unit --> 8 bits registers each. AX(AH AL), BX (BH BL).
- Also known as Register file.
Communication between a compiler/assembler and the CPU. How do we encode this directives? such that they are easily unpacked and executed by the CPU.
- What do I put into particular positions in the sequence of bits to indicate the operand?
- Is 32bit enough for architecture to express all data move, operations in just 5 bits?
Operands are immediate values (numbers). memory location or addresses --> 8 memory location. registers = dedicate at least 3 bits. is it enough to specify?
- Opcode
- Operands
- How bits are distributed. between memory address, opcode, operands and so on.
- This is fundamental during the design decisions.
- Page 26 of the manual you can see the Intel 8088 Instruction set summary.
- 0x90 is equivalent to NOP.
- For 8088 the NOP does not exists. So the instruction " XCHG AX,AX " is literally equal to 0x90.
- Lets say OR
- Register with Register.
- You would turn that into the bit sequence 000010dw mod reg r/m.
INT3 Type 3 --> 11001100
ADM-K6-2 has around 90 instructions.
- For the final project:
- A = ~ 45instructions.
- B= ~ 30 instructions
- C= ~ 20 instructions
- Control flow
- Logic/arithmetic, etc.
Feb 12
Feb 14
Feb 17, 19, 21: Reading Week, no class
Feb 24 (guest lecture)
Feb 26
Feb 28
March 3
WEEK 8: CONTROL FLOW
Reminder:
Structure of assembly source line
|LABEL| (optional) |INSTRUCTIONS| |OPERANDS| ; COMMENTS (optional)
- Every line has an implicit address
Program counter (pc):
- RISC: pc = pc + fixed number of bytes (ie 4)
- CISC: pc = pc + variable number of bytes
Straight line control flow: increments pc to next physical address
Control Flow transfer: branches to another instruction in program. (For example, call and jmp operations).
- Side effect of Call instruction is to push return address on stack (location of next instruction after the call instruction)
It is rare to have more than 6-7 instructions in straight-line format. Instead, basic blocks(small sequences of logically related instructions, terminated by control transfer) are used.
Side note: jae is unsigned version of jge
March 5
Control Flow cont'd How does the first line get executed?
- There are many steps between _start and main:
Compiler automatically generates code to get control flow started.
- Advice: look at block view to help understand program, not just line by line.
Decisional Control Structure IF and SWITCH
In an if statement, put constant on left to avoid errors For example, use If (2==x), instead of if (x==2)
After each case in a switch statement, use 'break'
To jump to a destination address loaded into register, you can use (for example)
mov rax, QWORD PTR[rax*8+0x4006d8] jmp rax
(0x4006d8 is base location, rax*8 is index into it)
March 7
Control Flow Cont'd
In C, loop constructs are FOR, WHILE, DO-WHILE
Difference between while and do-while is that do-while does loop once first before checking loop condition.
FOR:
For (i=o ; i<N ; i++) { DO THIS };
- Assignment happens once
- Condition checked every time at start
- Increment happens every time at end
WHILE:
i=0; While (I < N) {DO THIS ; i++; }
DO WHILE
i=0; do { DO THIS; i++; }while (i<n);
SPARC
32 registers
- g0-g7 global , g0 reserved
- l0-l7 local , l6 and l7 reserved
- i0-i7 input , i6 and i7 reserved
- o0-o7 output , 06 and 07 reserved
- nop used in SPARC due to pipelining. Spot is called DELAY SLOT in fetch-decode-execute cycle
- Used after call or branch instruction
Example: to do 17x3
mov 17, %o0 mov 13, %o1 call .mul nop
SPARC syntax similar to ATT (destination register comes last) Note: cannot define memory address with [ ]; use load and store instead.
March 10
Advanced Control Flow
- Another control flow construct: calls to functions (or subroutines, names will be used interchangeably).
- Advantages of using functions instead of just using loops:
- Ability to identify them by a name
- Ability to supply input parameters
- Consider the following C code:
program.c
int foo(int a, int b int c){ int ret; ret = a+b+c; return ret; } int main(){ int x; x=foo(1,2,3); return x; }
- main needs to be able to tell foo what parameters to use, to pass control to foo, and to receive the return value once foo is complete
- All of these tasks are accomplished through use of the activation record
- Activation records are stored on the stack
- An activation record exists for each function that is currently in progress
- (Note: the function calling convention described in these notes pertains to the X86 architecture - other architectures may have different calling conventions)
Activation Record Structure
The activation record is a contract between the caller and the callee.
The following diagram represents the structure of an X86 activation record (using the call to foo() from main() as an example):
------------------------------------------- | | | local variable space | (4) | | -------------------------------------------| | saved ebp | (3) -------------------------------------------| | saved return address | (2) -------------------------------------------| | first parameter (a=1) | - -------------------------------------------| | | second parameter (b=2) | |- (1) -------------------------------------------| | | third parameter (c=3) | - -------------------------------------------
(1) Caller pushes the arguments onto the stack in reverse order
(2) The function foo is invoked with the asm CALL instruction
- The return address is pushed onto the stack
- Equal to the the address of the instruction in main immediately following the call to foo
(Control now passes from the caller (main) to the callee (foo))
(3)ebp is pushed to the stack (push ebp)
- we want to save the current value of the frame pointer
- current value of esp moved into ebp (mov ebp, esp)
- ebp is now pointing to the current location of esp
(4) space is allocated for any local variables in foo (sub esp, 0x4)
- note that this is accomplished by subtracting from the current location of esp because the stack grows 'up' (i.e. towards the minimum address)
- The amount of space allocated depends on the number and type of local variables in the callee
(Do the work of function foo)
- Once the work of foo is complete, control is transferred back to main with asm RET command
- The following actions are a side-effect of Return:
- Looks up stored ebp and points current ebp back to this location
- Looks up stored return address and point eip here
- Moves return value (if any) into eax
- An important note: once the function returns, it's memory space is no longer being used - but it is not 'cleared' or 'zeroed'. Any information that was written to these places in memory during execution of the function is still there!
- This is why it's dangerous to:
- Use an uninitialized variable (there might already be data in the location that was allocated for your variable)
- Return a local pointer (once the function is complete, this pointer points to a location that might be used to store other data -so you don't necessarily know what exactly it will point to in the future)
March 12
Advanced Control Flow Continued
- Recall the activation record diagram discussed on March 10
- Also recall the C program (program.c) discussed on March 10
- In today's lecture, we examine the asm code corresponding to program.c and try to connect the component instructions with what we know about the activation record
Look at section <main>:
main: push ebp mov ebp, esp sub esp, 28 mov DWORD PTR [ebp-4], 0 mov DWORD PTR [esp+8], 3 mov DWORD PTR [esp+4], 2 mov DWORD PTR [esp], 1 call foo mov DWORD PTR [ebp-4], eax mov eax, DWORD PTR [ebp-4] leave ret
- push ebp
- Save current frame pointer
- mov ebp, esp
- Snapshot esp, save in frame pointer
- sub esp 0x1c (=28)
- Why 0x1c? Shouldn't we only need 0x4 to store the single local integer?
- Explanation: in one step, the compiler is allocating space for foo's local variables and foo's parameters
- mov DWORD PTR [ebp-0x4],0x0
- Local variable x
- mov DWORD PTR [esp+0x8], 0x3
- third parameter to foo
- mov DWORD PTR [esp+0x4], 0x2
- second parameter to foo
- mov DWORD PTR [esp], 0x1
- first parameter to foo
- call foo
- Control is transferred from main to foo
- Side effect of call: return address is pushed to stack
Now: Examine asm for <foo>:
foo: push ebp mov ebp, esp sub esp, 16 mov eax, DWORD PTR [ebp+12] mov edx, DWORD PTR [ebp+8] lea eax, [edx+eax] add eax, DWORD PTR [ebp+16] mov DWORD PTR [ebp-4], eax mov eax, DWORD PTR [ebp-4] leave ret
- push ebp
- mov ebp, esp
- sub esp, 16
- (Same procedure as setup for main)
- mov eax, DWORD PTR [ebp+12]
- Move first argument to foo into eax
- (Other instructions perform the calculation in foo)
- mov eax, DWORD PTR [ebp-4]
- move return value into eax in preparation for control transfer back to main
- ret
- reads return address off of stack and stores in eip
- now since eip is pointing at the return address (which is the instruction mov DWORD PTR [ebp-4], eax in main), control is now transferred back to main and execution continues from this point.
- Take-home messages of the day:
- Correspondence between the asm code and the activation record format / x86 calling convention
- How control is transferred between functions
- How parameters are communicated to a function
- How data is passed back to the calling function
March 14
Final Discussions on Function Handling at the ISA level
- Recall again the activation record diagram discussed on March 10
Some miscellaneous notes:
- Consider the function call foo(b(),bar())
- This is legal!
- It is up to the compiler which of the two argument functions is evaluated first
- What makes a local variable local?
- The variable exists in a temporary context. That is - when the function returns, the variable is no longer available to you
- The ISA (Instruction Set Architecture) affects what a high-level programming language is capable of doing
- Today: run the program.c executable through gdb and examine the state of the stack and registers at various points during execution
- An extensive gdb session was analyzed during this lecture - the full session is not reproduced here. Some of the key gdb commands that were used are as follows:
x/32x $esp <--Display the bytes stored at each address in the stack disassemble <--Examine what instruction is currently being executed
- The take home message for today:
- You can see the correspondence between the asm and the state of the stack at each point during the execution
- The real truth about what your program is doing lies in series of hex values that make up the stack and registers - not in your high level source code.
March 17
ABI: Application Binary Interface
Internal text/data vs. External text/data
- internal/external to the process
- process: program in exection
- PAS: Process Address Space (internal), essential construct that defines what is in our program
- metadata: PCB (Process Control Block) - OS
- PAS: Process Address Space (internal), essential construct that defines what is in our program
- process: program in exection
- so, something in the PAS is considered internal
- PAS can be considered a big array (enumerable), drawing below
- stored-program computer
- the location of text, data, etc. sections are chosen arbitrarily, but must be consistent for everything to work
- ELF sections aren't in the PAS, only address spaces and ranges
- Heap: region for pointing to data that isn't pre-defined for you
- ptr = Malloc(4); // # of bytes is argument
- could use sizeof (struct bar);
- ptr now contains an address of some space on the heap
- ptr = Malloc(4); // # of bytes is argument
Summing up,
- Internal: something that is in the PAS
- External: calling to other programs, etc. that are outside the PAS
- Process: program in execution
- processes involve the programmer, compiler, CPU, etc. all working correctly
We looked at this towards the end of class
ps aux | grep yes cat /proc/6023/maps
where the programs was one that simply printed hello in an infinite loop
March 19
March 21
March 24
Today, we will be looking at the differences between x86 and SPARC.
Which ISA is a better architecture? x86 or SPARC?
Well, let's first look at an ideal arch:
1. Easier to understand. 2. Hardware costs. 3. High-level functionality natively supported: amount, time, and quality. 4. Speed. 5. Power and heat issues. (Intel tends to slow down during over-heating because their chips can detect that it is getting hot, so it slows down as a result. AMD tries to satisfy what the user wants - performance; therefore, it overheats). 6. Ability to address primary memory (RAM). 7. Number of registers. 8. OS Support: availability of OS, and legacy issues.
FEATURES:
- Number of general registers, roles: x86: 8 general registers. SPARC: 32 general registers.
- Type of Machine It Is: x86: CISC. SPARC: LOAD/STORE RISC - have a large number of general registers. Primarily uses registers to do computations.
- Use of Registers/Calling Convention: x86: Stack/Memory SPARC: Registers.
- Addressing Modes: x86: Various: most instruction can manipulate them. SPARC: Only LOAD and STORE can speak to memory.
- Burden on Assembly Programmers: x86: Provide operations that do computations (Implicit Operands, Complex Instructions, and it is light in some sense, but also heavy). SPARC: Burden is light, relatively few instructions.
- Instruction Set/Instruction Format: x86: Large, many hundreds of instructions. Multiple addressing modes. Variable Length: 15 bytes. SPARC: Every instruction is 32 bits. Number of instructions is LESS than the number of x86 instructions.
Overall, which one is better? SPARC is much cleaner, it solves more modern complexity. There are 32 fast locations that will help make the program run faster. No need to memorize 92 instructions. SPARC is well designed.
March 26
March 28
March 31
April 2
Looked at example on Piazza
- some kind of struct instr array is a great idea to use
- semi_pos is a pointer, it is not remembering the index, it is remembering the address
- in the line buffer, whitespace characters help determine the beginning and end of instructions.
- that is, helps set ins_name_start and ins_name_end pointers
- define what may be a valid instruction (you still have to confirm that)
Converting to Bytes
Example
POP AX
|0x20|P|O|P|0x20|A|X|0x20| <--- line array
- 0x20 is the space key
program[0].op_type = POP_TYPE; .address = 0; // added to 100h .op1_type = reg_type .op1 = REG_AX;
- note, struct instr program[10]
int translate(){ int i = 0; for (i=0, i<PROG_SIZE; i++) { switch(program[i].op_type) { // a bunch of case statements per instruction that is supported case POP_TYPE: trans_pop(i); // translate pop and store it somewhere break; default: yell_at_user(); } } }
int trans_pop(int pop){ // 0101 1reg since we're using AX, reg = 000 // 0x 58 char ins; ins = 0x58; }
struct instr program[10]; char* trans_ins[10]; // getting started with saving translated instructions
- finally, since translated instructions are stored in an array, write them out into a file. This allows branching too, assuming bytes are stored in the right order.
April 4
April 7
April 9
Let's first look at two instructions: 'IN' and 'OUT'.
Input/Output:
To What? External Devices.
Why? To support interactivity and to acquire meaningful data.
How? Instruction format.
For example:
IN 255
However, something is missing. Where is the DESTINATION?
IN AX, 255 IN AL, 255
IN AX, DX
Basically, the data is fetched from the port and then written into the destinations.
The 'OUT' instruction is different:
OUT 255, AX
(Where you are writing AX into that port).
The OS provides you with these facility.
Input/Output is typically wrapped by the OS.
- Via system call layer. - Open, read, write, close, seek.
We continued to look at the 32-Architecture Software Developer Instruction-Set Reference Manual: Page 456/1,479, under the 'Operation' section.
Interrupts vs. Polling:
The simple solution is polling, where you have to write the correct assembly code, just have a loop. Keep jumping back to the loop and check if it is ready, polling is basically checking over and over again. It is a really simple loop and model to write.
But what is wrong with this? This is extremely inefficient. You are basically wasting cycles to do useless work.
This is where interrupt comes in. We will set up all the ports and make sure that we are able to communicate, then you can go ahead and do other computations. You have the ability to write, this is known as hardware interrupt. Hardware interrupts are generated by the hardware.
Software interrupts are int and int3 - that interrupts the CPU and causes it to do something else. Software interrupts are written by you in your assembly programs.