Courses/Computer Science/CPSC 355.W2014/Lecture Notes/ComplexTypes
Collections of primitive types.
- Arrays
- Structures
- Unions
Subtopics
- sizeof
- declarations in C
- appearance in assembly
- padding by compiler, alignment
- access a position or field via xxx
- location / representation, global, on stack
- global, static, initialized, uninitialized
- pointers to structure instances
Arrays
Arrays are contiguous collections of data items of the same type.
    #include <stdio.h>

    int i[2];
    int data[16];
    char name[8] = "michael";
    char identifier[] = {0};
    short beef[] = { 0xD, 0xE, 0xA, 0xD, 0xB, 0xE, 0xE, 0xF };

    int main(int argc, char* argv[])
    {
      int whereami[64];

      // sizeof yields a size_t, so %zu is the portable conversion;
      // %d happens to work on this 32-bit build because both are 4 bytes.
      fprintf(stdout,
              "sizeof(i) = %d\n"
              "sizeof(data) = %d\n"
              "sizeof(name) = %d\n"
              "sizeof(identifier) = %d\n"
              "sizeof(beef) = %d\n"
              "sizeof(whereami) = %d\n",
              sizeof(i),
              sizeof(data),
              sizeof(name),
              sizeof(identifier),
              sizeof(beef),
              sizeof(whereami));

      whereami[60] = 0xaabb1000;
      whereami[0] = 0x41414141;
      whereami[63] = 0xffffffff;

      //is the following statement "legal"?
      //is this "possible"?
      //will this "work"?
      whereami[64] = 0x44eeeeee;

      return 0;
    }
This C code gets translated to:
    080483e4 <main>:
     80483e4:  55                      push   %ebp
     80483e5:  89 e5                   mov    %esp,%ebp
     80483e7:  83 e4 f0                and    $0xfffffff0,%esp
     80483ea:  81 ec 20 01 00 00       sub    $0x120,%esp
     80483f0:  ba 34 85 04 08          mov    $0x8048534,%edx
     80483f5:  a1 80 97 04 08          mov    0x8049780,%eax
     80483fa:  c7 44 24 1c 00 01 00    movl   $0x100,0x1c(%esp)
     8048401:  00
     8048402:  c7 44 24 18 10 00 00    movl   $0x10,0x18(%esp)
     8048409:  00
     804840a:  c7 44 24 14 01 00 00    movl   $0x1,0x14(%esp)
     8048411:  00
     8048412:  c7 44 24 10 08 00 00    movl   $0x8,0x10(%esp)
     8048419:  00
     804841a:  c7 44 24 0c 40 00 00    movl   $0x40,0xc(%esp)
     8048421:  00
     8048422:  c7 44 24 08 08 00 00    movl   $0x8,0x8(%esp)
     8048429:  00
     804842a:  89 54 24 04             mov    %edx,0x4(%esp)
     804842e:  89 04 24                mov    %eax,(%esp)
     8048431:  e8 e2 fe ff ff          call   8048318 <fprintf@plt>
     8048436:  c7 84 24 10 01 00 00    movl   $0xaabb1000,0x110(%esp)
     804843d:  00 10 bb aa
     8048441:  c7 44 24 20 41 41 41    movl   $0x41414141,0x20(%esp)
     8048448:  41
     8048449:  c7 84 24 1c 01 00 00    movl   $0xffffffff,0x11c(%esp)
     8048450:  ff ff ff ff
     8048454:  c7 84 24 20 01 00 00    movl   $0x44eeeeee,0x120(%esp)
     804845b:  ee ee ee 44
     804845f:  b8 00 00 00 00          mov    $0x0,%eax
     8048464:  c9                      leave
     8048465:  c3                      ret
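Note the difference in how global variables are treated versus the local array on main's stack: a global is referenced through a static absolute address (for example, stdout is loaded from 0x8049780), while whereami is addressed as an offset from the esp register (for example, 0x20(%esp) for whereami[0]).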
How many bytes long are the .bss and .data sections? Does this match our expectation about which global variables were initialized?
    (eye@mordor l14)$ readelf -S ax
    There are 30 section headers, starting at offset 0x8d0:

    Section Headers:
      [Nr] Name     Type      Addr     Off    Size   ES Flg Lk Inf Al
      ...
      [24] .data    PROGBITS  08049760 000760 00001c 00  WA  0   0  4
      [25] .bss     NOBITS    08049780 00077c 000068 00  WA  0   0 32
What is in the .data section? What do we learn about the initialized character array? (For example, we see the content in .data, not a "pointer" to some other data somewhere else.)
    (eye@mordor l14)$ readelf -x ".data" ax

    Hex dump of section '.data':
      0x08049760 00000000 6d696368 61656c00 0d000e00 ....michael.....
      0x08049770 0a000d00 0b000e00 0e000f00          ............

    (eye@mordor l14)$ readelf -x ".bss" ax
    Section '.bss' has no data to dump.
What does running the program tell us about the sizes of these variables (and any inserted padding)?
    (eye@mordor l14)$ ./ax
    sizeof(i) = 8
    sizeof(data) = 64
    sizeof(name) = 8
    sizeof(identifier) = 1
    sizeof(beef) = 16
    sizeof(whereami) = 256
Simple Structures (Structs)
    (eye@mordor l14)$ cat introstruct.c
    struct simple
    {
      char x;
      int data;
    };

    struct simple s;

    int main(int argc, char* argv[])
    {
      s.x = 0x41;
      s.data = 0xffffff00;
      return s.x;
    }
    (eye@mordor l14)$
Before looking at the generated code, check where 's' lives. nm says that 's' is located at:
08049624 B s
So is 's' in the .data section or in .bss? (Based on what we know about uninitialized global variables, where should it be? The "B" above is a hint.)
    [24] .data    PROGBITS  08049618 000618 000004 00  WA  0   0  4
    [25] .bss     NOBITS    0804961c 00061c 000010 00  WA  0   0  4
This is the assembly representation of the code:
    08048394 <main>:
     8048394:  55                      push   ebp
     8048395:  89 e5                   mov    ebp,esp
     8048397:  c6 05 24 96 04 08 41    mov    BYTE PTR ds:0x8049624,0x41
     804839e:  c7 05 28 96 04 08 00    mov    DWORD PTR ds:0x8049628,0xffffff00
     80483a5:  ff ff ff
     80483a8:  0f b6 05 24 96 04 08    movzx  eax,BYTE PTR ds:0x8049624
     80483af:  0f be c0                movsx  eax,al
     80483b2:  5d                      pop    ebp
     80483b3:  c3                      ret
Complex Structures and Unions
Let's get a bit more complicated. We will declare a number of structures.
Other topics
- structures of arrays
- arrays of structures
Invoking write(2)
This section also gives a quick overview of how to invoke the write(2) system call on Linux (needed for HW4).