Courses/Computer Science/CPSC 355.W2014/Lecture Notes/IntroC2ASM

(An Introduction To) The Translation of C code to x86 Assembly

The lesson from 13 Jan is really:

Program = Code + Data

or rather:

Program = Stored(Code) + Stored(Data)

Thus the internal workings of a CPU (i.e., a digital machine for symbol manipulation or numerical calculation) must support both the execution of code (e.g., an ALU plus an instruction decoder) and the manipulation of data (e.g., a set of registers and memory). These requirements imply the existence of components that can execute code and of components that can store, read, and write data.
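
As a rough illustration of the stored-program idea, here is a minimal sketch in C of a toy machine whose single memory array holds both code and data. The opcodes (OP_LOAD, OP_ADD, OP_STORE, OP_HALT), the two-byte instruction format, and the function names are invented for this example; they do not correspond to x86 or to any real instruction set.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opcodes for a toy machine; not a real instruction set. */
enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2, OP_STORE = 3 };

/* Run a program stored in `mem`, starting at address 0.
 * Each instruction is two bytes: an opcode, then a memory address.
 * Code and data live in the same array. */
void run(uint8_t mem[], unsigned size)
{
    uint8_t pc = 0;   /* program counter: address of the next instruction */
    uint8_t acc = 0;  /* accumulator: the machine's only data register */

    for (;;) {
        assert(pc + 1u < size);          /* stay inside memory */
        uint8_t opcode  = mem[pc];       /* fetch */
        uint8_t operand = mem[pc + 1];
        pc += 2;
        assert(operand < size);
        switch (opcode) {                /* decode and execute */
        case OP_LOAD:  acc = mem[operand];                  break;
        case OP_ADD:   acc = (uint8_t)(acc + mem[operand]); break;
        case OP_STORE: mem[operand] = acc;                  break;
        default:       return;           /* OP_HALT or an unknown opcode */
        }
    }
}

/* Build a small memory image, run it, and return the stored result. */
int demo_stored_program(void)
{
    uint8_t mem[16] = {
        OP_LOAD,  10,   /* acc = mem[10]       */
        OP_ADD,   11,   /* acc = acc + mem[11] */
        OP_STORE, 12,   /* mem[12] = acc       */
        OP_HALT,  0,
        0, 0,           /* addresses 8 and 9 are unused */
        3, 5, 0         /* data at addresses 10, 11, 12 */
    };
    run(mem, sizeof mem);
    return mem[12];
}
```

Calling demo_stored_program() runs the three-instruction program and returns the sum it stored at address 12 (3 + 5 = 8). Because instructions and data share mem, nothing in this sketch stops a program from overwriting its own code.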

Update "ChalkSim" pictures with:

  • memory access discipline:
    • operation: read/write in a prescribed way
    • MAR (memory address register)
    • MDR (memory data register)
    • stack discipline (e.g., x86)
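
The memory access and stack disciplines listed above can be sketched in C roughly as follows. This is a simplification under stated assumptions: memory is word-addressed (on x86 the stack pointer counts bytes and a push moves it by 4), and the struct layout and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 64u

/* Toy machine state; the field names are illustrative. */
struct machine {
    uint32_t mem[MEM_WORDS]; /* word-addressed memory, for simplicity */
    uint32_t mar;            /* memory address register: where to access */
    uint32_t mdr;            /* memory data register: value read or to be written */
    uint32_t sp;             /* stack pointer: index of the top-of-stack word */
};

/* Memory read: the address goes in MAR, the result appears in MDR. */
void mem_read(struct machine *m)
{
    assert(m->mar < MEM_WORDS);
    m->mdr = m->mem[m->mar];
}

/* Memory write: MAR holds the address, MDR holds the value. */
void mem_write(struct machine *m)
{
    assert(m->mar < MEM_WORDS);
    m->mem[m->mar] = m->mdr;
}

/* Push, x86 style: move the stack pointer down, then store. */
void push(struct machine *m, uint32_t value)
{
    assert(m->sp > 0);
    m->sp -= 1;
    m->mar = m->sp;
    m->mdr = value;
    mem_write(m);
}

/* Pop: load from the top of the stack, then move the pointer back up. */
uint32_t pop(struct machine *m)
{
    assert(m->sp < MEM_WORDS);
    m->mar = m->sp;
    mem_read(m);
    m->sp += 1;
    return m->mdr;
}

/* Push two values and pop them; returns 1 if they come back in LIFO order. */
int demo_stack(void)
{
    struct machine m = { .sp = MEM_WORDS };  /* an empty stack starts at the top */
    push(&m, 355);
    push(&m, 2014);
    uint32_t first = pop(&m);
    uint32_t second = pop(&m);
    return first == 2014 && second == 355 && m.sp == MEM_WORDS;
}
```

Every access goes through the MAR/MDR pair, and the stack grows toward lower addresses, which is the direction the x86 stack grows as well.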

Take a walk through writing and compiling a simple C program.

  • machine platform: VM or network host
  • editor: emacs, vim, nano, pico, gedit, etc.
  • compiler: gcc
  • assembler: nasm
  • debugger: gdb
  • tools: objdump, udcli

We wrote this program:

(eye@mordor l4)$ more greet.c
#include <stdio.h>
int main(int argc,
	 char* argv[])
{
  int x;
  x = fprintf(stdout,
	      "hello, 355\n");
  return x;
}
(eye@mordor l4)$ 

and compiled it via the command:

(eye@mordor l4)$ gcc -Wall -o gx greet.c
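
Note that greet.c returns the result of fprintf from main, so the process exit status is the number of characters written. A small sketch of the same call, wrapped in a helper (the name greet is hypothetical) so that the count can be inspected:

```c
#include <assert.h>
#include <stdio.h>

/* Same call as in greet.c, wrapped in a helper so the result can be
 * examined instead of being returned from main. */
int greet(FILE *out)
{
    assert(out != NULL);
    /* fprintf returns the number of characters written,
     * or a negative value on an output error. */
    int x = fprintf(out, "hello, 355\n");
    return x;
}
```

The string has 11 characters including the newline, so greet(stdout) should return 11 when the write succeeds; likewise, running ./gx and then echo $? in the shell should report 11.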

We also asked gcc to produce an assembly representation via the -S flag:

(eye@mordor l4)$ gcc -S greet.c

That command wrote the file greet.s, which uses AT&T syntax (gcc's default):

(eye@mordor l4)$ cat greet.s
	.file	"greet.c"
	.section	.rodata
.LC0:
	.string	"hello, 355\n"
	.text
.globl main
	.type	main, @function
main:
	pushl	%ebp
	movl	%esp, %ebp
	andl	$-16, %esp
	subl	$32, %esp
	movl	$.LC0, %edx
	movl	stdout, %eax
	movl	%edx, 4(%esp)
	movl	%eax, (%esp)
	call	fprintf
	movl	%eax, 28(%esp)
	movl	28(%esp), %eax
	leave
	ret
	.size	main, .-main
	.ident	"GCC: (GNU) 4.4.7 20120313 (Red Hat 4.4.7-3)"
	.section	.note.GNU-stack,"",@progbits
(eye@mordor l4)$ 
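
To see what the first four instructions of main do to the stack pointer, here is a sketch in C that repeats the same arithmetic on an ordinary integer. The struct, the function name, and the starting address used in the usage note are assumptions for illustration; the real value of %esp at entry depends on the system and the run.

```c
#include <assert.h>
#include <stdint.h>

/* Addresses of interest in main's stack frame, as byte addresses. */
struct frame {
    uint32_t esp;      /* stack pointer after the prologue */
    uint32_t arg0;     /* (%esp):   first argument to fprintf (stdout) */
    uint32_t arg1;     /* 4(%esp):  second argument (address of .LC0) */
    uint32_t local_x;  /* 28(%esp): the local variable x */
};

/* Mimic: pushl %ebp; movl %esp,%ebp; andl $-16,%esp; subl $32,%esp.
 * `esp_at_entry` is the stack pointer when main starts (a made-up input). */
struct frame prologue(uint32_t esp_at_entry)
{
    struct frame f;
    uint32_t esp = esp_at_entry;

    esp -= 4;               /* pushl %ebp: room for the saved frame pointer */
    /* movl %esp,%ebp copies esp into ebp; it does not change esp */
    esp &= (uint32_t)-16;   /* andl $-16: clear the low 4 bits (16-byte alignment) */
    esp -= 32;              /* subl $32: reserve 32 bytes for arguments and locals */

    assert(esp % 16 == 0);  /* still aligned, since 32 is a multiple of 16 */
    f.esp = esp;
    f.arg0 = esp;
    f.arg1 = esp + 4;
    f.local_x = esp + 28;
    return f;
}
```

For example, with an assumed entry value of 0xbffff3ac, prologue() yields esp = 0xbffff380 and places x at 0xbffff39c. The two movl instructions before the call fill arg1 and arg0, and the pair after the call stores fprintf's result (returned in %eax) into x and reloads it as main's return value.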

When examining the compiled 'gx' file via objdump:

(eye@mordor l4)$ objdump -d -M intel gx

we saw this disassembly of main (in Intel syntax, as requested by -M intel; the trailing nop bytes are padding after the end of the function):

080483e4 <main>:
 80483e4:	55                   	push   ebp
 80483e5:	89 e5                	mov    ebp,esp
 80483e7:	83 e4 f0             	and    esp,0xfffffff0
 80483ea:	83 ec 20             	sub    esp,0x20
 80483ed:	ba d4 84 04 08       	mov    edx,0x80484d4
 80483f2:	a1 80 96 04 08       	mov    eax,ds:0x8049680
 80483f7:	89 54 24 04          	mov    DWORD PTR [esp+0x4],edx
 80483fb:	89 04 24             	mov    DWORD PTR [esp],eax
 80483fe:	e8 15 ff ff ff       	call   8048318 <fprintf@plt>
 8048403:	89 44 24 1c          	mov    DWORD PTR [esp+0x1c],eax
 8048407:	8b 44 24 1c          	mov    eax,DWORD PTR [esp+0x1c]
 804840b:	c9                   	leave  
 804840c:	c3                   	ret    
 804840d:	90                   	nop
 804840e:	90                   	nop
 804840f:	90                   	nop
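
The raw bytes in the middle column can be checked by hand. The following C sketch (helper names are hypothetical) decodes two of them: the little-endian immediate in the mov edx instruction, and the relative displacement in the call instruction, which is measured from the address of the next instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Assemble a 32-bit value from four bytes stored least-significant first. */
uint32_t le32(const uint8_t b[4])
{
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Target of a 5-byte near call: opcode 0xe8 followed by a 32-bit
 * displacement relative to the address of the next instruction. */
uint32_t call_target(uint32_t insn_addr, const uint8_t insn[5])
{
    assert(insn[0] == 0xe8);
    uint32_t next = insn_addr + 5;
    /* Unsigned wraparound makes a negative displacement subtract correctly. */
    return next + le32(insn + 1);
}
```

Applied to the listing: the bytes d4 84 04 08 after opcode ba decode to 0x080484d4, the address of the string, and the bytes e8 15 ff ff ff at 0x80483fe give 0x8048403 + 0xffffff15 = 0x8048318 (modulo 2^32), matching the fprintf@plt target that objdump printed.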