CPSC 355 Scribe Notes

These notes exist as a class resource and study aid.

Jan 13

Jan 15

[img]http://imgur.com/wYMgidN[/img]

Jan 17

link title

Jan 20

Binary

base 2 number system
went over converting from decimal to binary (simple ways to do this can be found in textbook)

Binary Addition

More detailed examples

     1 0 0 1
   + 1 0 0 0
   ----------
   1 0 0 0 1

     1 1 1
   + 1 1 1
   ----------
   1 1 1 0

Hexadecimal

base 16 number system
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f
- a = 10, b = 11, c = 12, d = 13, e = 14, f = 15
went over converting from decimal to hexadecimal (textbook)
also looked at converting from binary to hex, and vice versa

Hex Addition

hex numbers usually begin with '0x' to avoid confusion
- Ex. 0x1 is hex 1
- Ex. 0x14 is hex 14, decimal 20

   0xa + 0x1 = 0xb
   0xf + 0x1 = 0x10
   0xa + 0xa = 0x14
   0x14 + 0xf = 0x23

Boolean Table

   a    b    SUM (xor)    COUT (AND)    AND    OR    XOR
   ------------------------------------------------------
   0    0       0             0          0      0     0  
   0    1       1             0          0      1     1
   1    0       1             0          0      1     1
   1    1       0             1          1      1     0

Jan 22

Jan 24

Jan 27

Jan 29

Signed and unsigned representations

              Signed        Unsigned
 0000          0                   0
 0001          1                   1
 0010          2                   2
 0011          3                   3
 0100          4                   4
 0101          5                   5
 0110          6                   6
 0111          7                   7
 
 1000         -8                   8
 1001         -7                   9
 1010         -6                  10  
 1011         -5                  11
 1100         -4                  12
 1101         -3                  13
 1110         -2                  14
 1111         -1                  15

Signed and Unsigned bit conversions are done using the two's compliment.

                                     0010                     
  one's compliment                   1101         Flipped
  two's compliment                      1         add 1
                                     1110          =>  -2    (1 being the sign bit)

Logic Tables

We need 5 bits to represent -15

  01111
  10000    NOT
  10001    NEG

convert 0110 mov a, 0x6 1001 neg a 1010

Sign bit is extended all the way through a 32 bit reg. eg: 1111...1111 1010

Shift right divides by two. Usually divides last bit and adds a 0 to the start. Shift right arithmetic replicated the first "sign" bit.

Jan 31

Feb 3

Intro to ELF: Executable & Linkable Format

e.g. Design a program that finds the largest integer:

(refer [1] for C code)

INPUT (STORAGE): array to store integer values from command line, variable to store largest integer (MAX), loop control variable/index into array

PROCESS: loop over input (array)

OUTPUT: report value of MAX

"No dynamically sized arrays are available in C."

There is structure to an ELF file. Check out each command in the READELF:

$man readelf

Notice important sections: .INIT, .TEXT (code of the program), .RODATA, .DATA (initialized global data... AKA MAX), .BSS (uninitialized global data... AKA the integers array)

Feb 5

Midterm Exam

5 pages. 6 problems, mostly short answer. 100 points. 50 minutes. No text. No notes. No calculator, computer, etc. Write answers directly on the exam sheet. Be prepared to write both C and ASM code.

Feb 7

ELF Format

.INIT ==> program's start & automatically generated by the compiler

.RODATA ==> read-only data

.DATA ==> initialized global variables

.BSS ==> uninitialized variables (filled with zeroes)

Analyze ELF Flowchart: [2]

"first.asm" ==> First Assembly Code in x86 NASN:

Note: 1 instruction per line. Everything executes top to bottom, unless you call it to jump elsewhere.

>start program

   ;;our first assembly program
   BITS 32
   GLOBAL _start
   SECTION .text
   _start:  ;the colon at the end declares a label. "_start" is arbitrary.
           nop
           mov ebx, 0x10
           add ebx, ebx
           sar ebx, 1 ;;divide by 2
           sub ebx, 0xf
           push ebx
           pop eax
           ;;more instructions: int, xor, or, and, hlt

>end program

To create "first.o": $nasm -f elf first.asm

To link: $ld -o firstx first.o

To run: ./firstx

To run in GDB: $gdb ./firstx

Feb 10

Machine Instruction.

This week we are going to look at how we can translate C into assembly code. Basically a language between the compiler and the CPU.

For example:

C:

  x = x + 1

Assembly language:

  add         eax,  1

How this is express in the CPU.
SPARC and x86.
RISC vs CISC.

AMD-K6-2 processor (the processor we saw the first days in class). AMD-K6-2 manual is way simpler than modern processors. Therefore, it would be useful to take a look at the manual.

8088 8-bit HMOS microprocessor 8808/8088-2 The 8088 don't use the 32bit registers such as EAX,EBX etc. Instead it uses AX BX CX which are 16bit registers.

Internals:
- memory interface
- instruction stream byte queue (instruction registers)
- execution unit control system (decoder)
- A-bus (getting inputs either from memory or register file)
- arithmetic logic unit
- execution unit
- flags.

The execution unit --> 8 bits registers each. AX(AH AL), BX (BH BL).

Also known as Register file.

Communication between a compiler/assembler and the CPU. How do we encode this directives? such that they are easily unpacked and executed by the CPU.

What do I put into particular positions in the sequence of bits to indicate the operand?

Is 32bit enough for architecture to express all data move, operations in just 5 bits?

Operands are immediate values (numbers). memory location or addresses --> 8 memory location. registers = dedicate at least 3 bits. is it enough to specify?

Opcode
Operands
How bits are distributed. between memory address, opcode, operands and so on.
This is fundamental during the design decisions.
Page 26 of the manual you can see the Intel 8088 Instruction set summary.
0x90 is equivalent to NOP.
For 8088 the NOP does not exists. So the instruction " XCHG AX,AX " is literally equal to 0x90.

Lets say OR
- Register with Register.
- You would turn that into the bit sequence 000010dw mod reg r/m.

INT3 Type 3 --> 11001100

ADM-K6-2 has around 90 instructions.

For the final project:
- A = ~ 45instructions.
- B= ~ 30 instructions
- C= ~ 20 instructions
- Control flow
- Logic/arithmetic, etc.

Feb 12

Feb 14

Feb 17, 19, 21: Reading Week, no class

Feb 24 (guest lecture)

Feb 26

Feb 28

March 3

WEEK 8: CONTROL FLOW

Reminder: Structure of assembly source line

Every line has an implicit address

Program counter (pc):

- RISC: pc = pc + fixed number of bytes (ie 4)

- CISC: pc = pc + variable number of bytes

Straight line control flow: increments pc to next physical address

Control Flow transfer: branches to another instruction in program. (For example, call and jmp operations).

- Side effect of Call instruction is to push return address on stack (location of next instruction after the call instruction)

It is rare to have more than 6-7 instructions in straight-line format. Instead, basic blocks(small sequences of logically related instructions, terminated by control transfer) are used.

Side note: jae is unsigned version of jge

March 5

Control Flow cont'd How does the first line get executed?

There are many steps between _start and main:

Compiler automatically generates code to get control flow started.

Advice: look at block view to help understand program, not just line by line.

Decisional Control Structure IF and SWITCH

In an if statement, put constant on left to avoid errors For example, use If (2==x), instead of if (x==2)

After each case in a switch statement, use 'break'

To jump to a destination address loaded into register, you can use (for example)

  mov rax, QWORD PTR[rax*8+0x4006d8]
  jmp rax

(0x4006d8 is base location, rax*8 is index into it)

March 7

Control Flow Cont'd

In C, loop constructs are FOR, WHILE, DO-WHILE

Difference between while and do-while is that do-while does loop once first before checking loop condition.

FOR:

    For (i=o ; i<N ; i++)
         { DO THIS };

Assignment happens once
Condition checked every time at start
Increment happens every time at end

WHILE:

    i=0;
    While (I < N)
       {DO THIS ;
          i++; }

DO WHILE

    i=0;
    do {
       DO THIS;
         i++;
        }while (i<n);

SPARC

32 registers

g0-g7 global , g0 reserved
l0-l7 local , l6 and l7 reserved
i0-i7 input , i6 and i7 reserved
o0-o7 output , 06 and 07 reserved

nop used in SPARC due to pipelining. Spot is called DELAY SLOT in fetch-decode-execute cycle
Used after call or branch instruction

Example: to do 17x3

           mov 17, %o0
           mov 13, %o1
           call .mul
           nop

SPARC syntax similar to ATT (destination register comes last) Note: cannot define memory address with [ ]; use load and store instead.

March 10

Advanced Control Flow

Another control flow construct: calls to functions (or subroutines, names will be used interchangeably).
Advantages of using functions instead of just using loops:
- Ability to identify them by a name
- Ability to supply input parameters

Consider the following C code:

program.c

   int foo(int a, int b int c){
       int ret;
       ret = a+b+c;
       return ret;
   }
   int main(){
       int x;
       x=foo(1,2,3);
       return x;
   }

main needs to be able to tell foo what parameters to use, to pass control to foo, and to receive the return value once foo is complete
All of these tasks are accomplished through use of the activation record
Activation records are stored on the stack
An activation record exists for each function that is currently in progress
(Note: the function calling convention described in these notes pertains to the X86 architecture - other architectures may have different calling conventions)

Activation Record Structure

The activation record is a contract between the caller and the callee.

The following diagram represents the structure of an X86 activation record (using the call to foo() from main() as an example):

   -------------------------------------------
   |                                          |
   |          local variable space            | (4)
   |                                          |
   -------------------------------------------|
   |               saved ebp                  | (3)      
   -------------------------------------------|
   |           saved return address           | (2)
   -------------------------------------------|
   |          first parameter (a=1)           | - 
   -------------------------------------------|  | 
   |          second parameter (b=2)          |  |- (1)
   -------------------------------------------|  |
   |           third parameter (c=3)          | -
   -------------------------------------------

(1) Caller pushes the arguments onto the stack in reverse order

(2) The function foo is invoked with the asm CALL instruction

The return address is pushed onto the stack
- Equal to the the address of the instruction in main immediately following the call to foo

(Control now passes from the caller (main) to the callee (foo))

(3)ebp is pushed to the stack (push ebp)

we want to save the current value of the frame pointer
current value of esp moved into ebp (mov ebp, esp)
- ebp is now pointing to the current location of esp

(4) space is allocated for any local variables in foo (sub esp, 0x4)

note that this is accomplished by subtracting from the current location of esp because the stack grows 'up' (i.e. towards the minimum address)
The amount of space allocated depends on the number and type of local variables in the callee

(Do the work of function foo)

Once the work of foo is complete, control is transferred back to main with asm RET command
The following actions are a side-effect of Return:
- Looks up stored ebp and points current ebp back to this location
- Looks up stored return address and point eip here
- Moves return value (if any) into eax

An important note: once the function returns, it's memory space is no longer being used - but it is not 'cleared' or 'zeroed'. Any information that was written to these places in memory during execution of the function is still there!
This is why it's dangerous to:
- Use an uninitialized variable (there might already be data in the location that was allocated for your variable)
- Return a local pointer (once the function is complete, this pointer points to a location that might be used to store other data -so you don't necessarily know what exactly it will point to in the future)

March 12

Advanced Control Flow Continued

Recall the activation record diagram discussed on March 10
Also recall the C program (program.c) discussed on March 10
In today's lecture, we examine the asm code corresponding to program.c and try to connect the component instructions with what we know about the activation record

Look at section <main>:

   main:
           push    ebp
           mov     ebp, esp
           sub     esp, 28
           mov     DWORD PTR [ebp-4], 0
           mov     DWORD PTR [esp+8], 3
           mov     DWORD PTR [esp+4], 2
           mov     DWORD PTR [esp], 1
           call    foo
           mov     DWORD PTR [ebp-4], eax
           mov     eax, DWORD PTR [ebp-4]
           leave
           ret

push ebp
- Save current frame pointer
mov ebp, esp
- Snapshot esp, save in frame pointer
sub esp 0x1c (=28)
- Why 0x1c? Shouldn't we only need 0x4 to store the single local integer?
- Explanation: in one step, the compiler is allocating space for foo's local variables and foo's parameters
mov DWORD PTR [ebp-0x4],0x0
- Local variable x
mov DWORD PTR [esp+0x8], 0x3
- third parameter to foo
mov DWORD PTR [esp+0x4], 0x2
- second parameter to foo
mov DWORD PTR [esp], 0x1
- first parameter to foo
call foo
- Control is transferred from main to foo
- Side effect of call: return address is pushed to stack

Now: Examine asm for <foo>:

   foo:
       push    ebp
       mov     ebp, esp
       sub     esp, 16
       mov     eax, DWORD PTR [ebp+12]
       mov     edx, DWORD PTR [ebp+8]
       lea     eax, [edx+eax]
       add     eax, DWORD PTR [ebp+16]
       mov     DWORD PTR [ebp-4], eax
       mov     eax, DWORD PTR [ebp-4]
       leave
       ret

push ebp
mov ebp, esp
sub esp, 16
- (Same procedure as setup for main)
mov eax, DWORD PTR [ebp+12]
- Move first argument to foo into eax
(Other instructions perform the calculation in foo)
mov eax, DWORD PTR [ebp-4]
- move return value into eax in preparation for control transfer back to main
ret
- reads return address off of stack and stores in eip
- now since eip is pointing at the return address (which is the instruction mov DWORD PTR [ebp-4], eax in main), control is now transferred back to main and execution continues from this point.

Take-home messages of the day:
- Correspondence between the asm code and the activation record format / x86 calling convention
- How control is transferred between functions
- How parameters are communicated to a function
- How data is passed back to the calling function

March 14

Final Discussions on Function Handling at the ISA level

Recall again the activation record diagram discussed on March 10

Some miscellaneous notes:

Consider the function call foo(b(),bar())
- This is legal!
- It is up to the compiler which of the two argument functions is evaluated first

What makes a local variable local?
- The variable exists in a temporary context. That is - when the function returns, the variable is no longer available to you

The ISA (Instruction Set Architecture) affects what a high-level programming language is capable of doing

Today: run the program.c executable through gdb and examine the state of the stack and registers at various points during execution

An extensive gdb session was analyzed during this lecture - the full session is not reproduced here. Some of the key gdb commands that were used are as follows:

   x/32x $esp     <--Display the bytes stored at each address in the stack
   disassemble    <--Examine what instruction is currently being executed

The take home message for today:
- You can see the correspondence between the asm and the state of the stack at each point during the execution
- The real truth about what your program is doing lies in series of hex values that make up the stack and registers - not in your high level source code.

March 17

ABI: Application Binary Interface

Internal text/data vs. External text/data

internal/external to the process
- process: program in exection
  - PAS: Process Address Space (internal), essential construct that defines what is in our program
    - metadata: PCB (Process Control Block) - OS

so, something in the PAS is considered internal

PAS can be considered a big array (enumerable), drawing below

PAS

stored-program computer

the location of text, data, etc. sections are chosen arbitrarily, but must be consistent for everything to work

ELF sections aren't in the PAS, only address spaces and ranges

Heap: region for pointing to data that isn't pre-defined for you
- ptr = Malloc(4); // # of bytes is argument
  - could use sizeof (struct bar);
  - ptr now contains an address of some space on the heap

Summing up,

Internal: something that is in the PAS

External: calling to other programs, etc. that are outside the PAS

Process: program in execution

processes involve the programmer, compiler, CPU, etc. all working correctly

We looked at this towards the end of class

ps aux | grep yes
cat /proc/6023/maps

where the programs was one that simply printed hello in an infinite loop

March 19

March 21

March 24

Today, we will be looking at the differences between x86 and SPARC.

Which ISA is a better architecture? x86 or SPARC?

Well, let's first look at an ideal arch:

1. Easier to understand.  

2. Hardware costs. 

3. High-level functionality natively supported: amount, time, and quality. 

4. Speed. 

5. Power and heat issues. 
     (Intel tends to slow down during over-heating because their chips can detect that it is getting hot,
     so it slows down as a result. AMD tries to satisfy what the user wants - performance; therefore, it overheats).

6. Ability to address primary memory (RAM). 

7. Number of registers. 

8. OS Support: availability of OS, and legacy issues.

FEATURES:

- Number of general registers, roles:
    x86: 8 general registers. 
    SPARC: 32 general registers.

- Type of Machine It Is:
    x86: CISC.
    SPARC: LOAD/STORE RISC - have a large number of general registers. Primarily uses registers to do computations.

- Use of Registers/Calling Convention:
    x86: Stack/Memory
    SPARC: Registers.

- Addressing Modes:
    x86: Various: most instruction can manipulate them. 
    SPARC: Only LOAD and STORE can speak to memory.

- Burden on Assembly Programmers:
    x86: Provide operations that do computations (Implicit Operands, Complex Instructions, and it is light in some sense, but also heavy).
    SPARC: Burden is light, relatively few instructions.

- Instruction Set/Instruction Format:
    x86: Large, many hundreds of instructions. Multiple addressing modes. Variable Length: 15 bytes.
    SPARC: Every instruction is 32 bits. Number of instructions is LESS than the number of x86 instructions.

Overall, which one is better? SPARC is much cleaner, it solves more modern complexity. There are 32 fast locations that will help make the program run faster. No need to memorize 92 instructions. SPARC is well designed.

March 26

March 28

March 31

April 2

Looked at example on Piazza

some kind of struct instr array is a great idea to use
semi_pos is a pointer, it is not remembering the index, it is remembering the address
in the line buffer, whitespace characters help determine the beginning and end of instructions.
that is, helps set ins_name_start and ins_name_end pointers
define what may be a valid instruction (you still have to confirm that)

Converting to Bytes

Example

POP AX

   |0x20|P|O|P|0x20|A|X|0x20| <--- line array

0x20 is the space key

   program[0].op_type = POP_TYPE;
             .address = 0; // added to 100h
             .op1_type = reg_type
             .op1 = REG_AX;

note, struct instr program[10]

int translate(){
   int i = 0;
   for (i=0, i<PROG_SIZE; i++)
   {
       switch(program[i].op_type)
       {
           // a bunch of case statements per instruction that is supported
           case POP_TYPE:
               trans_pop(i); // translate pop and store it somewhere
                   break;
               default:
                   yell_at_user();
       }
   }
}

int trans_pop(int pop){
   // 0101 1reg    since we're using AX, reg = 000
   // 0x 58
   char ins;
   ins = 0x58;
}

   struct instr program[10];
   char* trans_ins[10]; // getting started with saving translated instructions

finally, since translated instructions are stored in an array, write them out into a file. This allows branching too, assuming bytes are stored in the right order.

April 4

April 7

April 9

Let's first look at two instructions: 'IN' and 'OUT'.

Input/Output:

To What? External Devices.

Why? To support interactivity and to acquire meaningful data.

How? Instruction format.

For example:

   IN     255

However, something is missing. Where is the DESTINATION?

   IN     AX, 255
   IN     AL, 255

   IN     AX, DX

Basically, the data is fetched from the port and then written into the destinations.

The 'OUT' instruction is different:

    OUT    255, AX

    (Where you are writing AX into that port).

The OS provides you with these facility.

Input/Output is typically wrapped by the OS.

- Via system call layer.
- Open, read, write, close, seek.

We continued to look at the 32-Architecture Software Developer Instruction-Set Reference Manual: Page 456/1,479, under the 'Operation' section.

Interrupts vs. Polling:

The simple solution is polling, where you have to write the correct assembly code, just have a loop. Keep jumping back to the loop and check if it is ready, polling is basically checking over and over again. It is a really simple loop and model to write.

But what is wrong with this? This is extremely inefficient. You are basically wasting cycles to do useless work.

This is where interrupt comes in. We will set up all the ports and make sure that we are able to communicate, then you can go ahead and do other computations. You have the ability to write, this is known as hardware interrupt. Hardware interrupts are generated by the hardware.

Software interrupts are int and int3 - that interrupts the CPU and causes it to do something else. Software interrupts are written by you in your assembly programs.

Courses/Computer Science/CPSC 355.W2014/Lecture Notes/Scribe

Contents