Courses/Computer Science/CPSC 457.W2013/Assignments
A Note to Instructors
I hate having to write this kind of note due to a few bad apples. The material here is available (as it says at the bottom of the wiki page) under a Creative Commons Attribution/Share-Alike License. This includes the linked homework assignment descriptions (strictly speaking, the HW assignments are under an even more restrictive copyright, but with this declaration, I am hereby making them available under CCASA). This kind of license means you have certain responsibilities if you want to use this material in your course. Please abide by them. I'm happy to share content publicly (which is the whole reason this material is on the UofC wiki in the first place).
Submission Instructions
http://pages.cpsc.ucalgary.ca/~locasto/teaching/2013/CPSC457/SUBMIT-HOWTO
Assignment 1: ELF Modifications and Arbitrary System Calls
HW1 released 24 Jan 2013
Due: 8 February 2013 11:59pm MST
Assignment 2: LKMs and Process Memory Areas
Listing, modifying, and searching the process address space.
HW2 released 5 February 16:15 MST
Assignment Specific Submission Guideline
Due: 15 March 2013 23:59 MST
Assignment 3: Modifying the Kernel, Delegating Access, and Measuring Threads
Modifying the way the kernel handles and tracks processes. Measuring different approaches to threading.
HW3 released 19 February 16:00 MST
Assignment Specific Submission Guideline
Due: 12 April 2013 23:59 MST
HW2 Help / Hints
CPSC 457 Winter 2013
Thoughts on HW2
Michael E. Locasto
There are a variety of ways to tackle this problem (I count at least 4, and have implemented 3 of them myself). I will talk more about some of them in class on Thursday (tomorrow).
Key problem: ask your LKM what current->pid (and current->comm) are, and compare them with the pid and comm of the target process you are trying to modify. I bet they don't match. So the set of page tables and memory regions you're working with is the virtual address space of 'current', not of the target. All virtual addresses will be interpreted relative to that process address space (PAS) / namespace. Somehow you need to access the pages belonging to the mm_struct of the target process, not current.
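The point that a virtual address only means something relative to one address space can also be seen from user space. The following is a minimal sketch, not part of the assignment: `struct view`, `shared_looking`, and `compare_views` are names made up for illustration. After fork(), parent and child hold the same virtual address for a variable, yet a write in the child is invisible to the parent, because each process resolves that address through its own page tables.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* What one process sees: the variable's virtual address and its value. */
struct view {
    uintptr_t addr;
    int value;
};

/* Hypothetical variable; same virtual address in parent and child. */
static int shared_looking = 1;

/*
 * Fork a child that overwrites the variable and reports its own view
 * through a pipe; then record the parent's view. Returns 0 on success.
 */
int compare_views(struct view *parent, struct view *child)
{
    int fds[2];
    int status;
    ssize_t n;
    pid_t pid;

    if (pipe(fds) != 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: modify its private copy and send back what it sees. */
        struct view v;

        close(fds[0]);
        shared_looking = 2;
        v.addr = (uintptr_t)&shared_looking;
        v.value = shared_looking;
        n = write(fds[1], &v, sizeof v);
        _exit(n == (ssize_t)sizeof v ? 0 : 1);
    }

    /* Parent: read the child's view, then record its own. */
    close(fds[1]);
    n = read(fds[0], child, sizeof *child);
    close(fds[0]);
    if (waitpid(pid, &status, 0) < 0 || n != (ssize_t)sizeof *child)
        return -1;

    parent->addr = (uintptr_t)&shared_looking;
    parent->value = shared_looking;
    return 0;
}
```

Calling compare_views() should yield identical addr fields but different value fields. An LKM that dereferences a target-process address while 'current' is some other process (for example insmod) is in the same position as the parent here: the address resolves in the wrong address space.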
Look at the code for copy_to_user and copy_from_user. Why won't these work as expected for the scenario we're dealing with?
http://lxr.linux.no/#linux+v2.6.32/arch/x86/lib/usercopy_32.c#L852
A note on how the kernel manages to "borrow" the page tables of the current process in "process context." This is really just a backgrounder, but a vital one. Note I don't expect you to modify active_mm:
http://lxr.linux.no/#linux+v3.8.2/Documentation/vm/active_mm.txt
An overview of memory concepts:
http://lxr.linux.no/#linux+v3.8.2/Documentation/vm/highmem.txt
A mechanism for borrowing a process's mm_struct so that copy_to/from_user work as expected. For informational purposes only...don't just blindly call this ... read the comments:
http://lxr.linux.no/#linux+v2.6.32/mm/mmu_context.c
Look for code (i.e., a function) that already does what you need: that is, accesses another process's vm_areas. You may want to start with the ptrace(2) system call and how it is implemented, especially the functionality that lets a caller of this system call write into another process's address space (check the man page for the request type, then trace through the kernel code to see how that request type is serviced).
You could, in fact, write a user-space program that uses ptrace(2) to make the desired modification, but the assignment asks for an LKM. We will look at such a program in class tomorrow.
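A minimal sketch of what such a user-space ptrace(2) program might look like is given here; it is not necessarily the program shown in class. The helper names (`attach_and_wait`, `poke_word`, `peek_word`, `demo_poke_child`) and the `marker` variable are hypothetical. The demo forks its own child as a stand-in for the target, on the assumption that some systems restrict attaching to unrelated processes.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Attach to pid and wait until it has stopped. Returns 0 on success. */
static int attach_and_wait(pid_t pid)
{
    int status;

    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
        return -1;
    if (waitpid(pid, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;
    return 0;
}

/* Write one word into the target's address space at addr. */
int poke_word(pid_t pid, void *addr, long value)
{
    long rc;

    if (attach_and_wait(pid) != 0)
        return -1;
    rc = ptrace(PTRACE_POKEDATA, pid, addr, (void *)value);
    ptrace(PTRACE_DETACH, pid, NULL, NULL);
    return rc == -1 ? -1 : 0;
}

/* Read one word from the target's address space at addr. */
int peek_word(pid_t pid, void *addr, long *out)
{
    long word;
    int saved;

    if (attach_and_wait(pid) != 0)
        return -1;
    errno = 0; /* PEEKDATA returns the word itself, so -1 alone is ambiguous */
    word = ptrace(PTRACE_PEEKDATA, pid, addr, NULL);
    saved = errno;
    ptrace(PTRACE_DETACH, pid, NULL, NULL);
    if (word == -1 && saved != 0)
        return -1;
    *out = word;
    return 0;
}

/* Hypothetical target data: same virtual address in parent and child. */
static long marker = 0x1111;

/* Fork a stand-in target, overwrite its copy of marker, read it back. */
int demo_poke_child(void)
{
    long seen = 0;
    int ok;
    pid_t child = fork();

    if (child < 0)
        return -1;
    if (child == 0) {
        for (;;)
            pause(); /* idle until the parent kills us */
    }

    ok = poke_word(child, &marker, 0x2222) == 0 &&
         peek_word(child, &marker, &seen) == 0 &&
         seen == 0x2222 &&     /* the child's copy changed */
         marker == 0x1111;     /* our own copy did not */

    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return ok ? 0 : -1;
}
```

demo_poke_child() returns 0 when the write landed in the child's memory and left the parent's copy alone. Tracing how the kernel services PTRACE_POKEDATA is the route into the code that an LKM would need to imitate.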
Links
These links are provided for extra information; they are not a pointer to "the answer."
- compiling kernel modules: http://tldp.org/LDP/lkmpg/2.6/html/x181.html
- http://www.makelinux.net/ldd3/chp-15-sect-3
- Making System Calls from Kernel space: http://www.linux-mag.com/id/651/
- scheduling tasks: http://tldp.org/LDP/lkmpg/2.6/html/x1211.html
- read and write /proc files: http://www.tldp.org/LDP/lkmpg/2.6/html/x769.html
- syscalls in lkm: http://tldp.org/LDP/lkmpg/2.6/html/x978.html
General Directions / Approaches
I present these more as suggestions to warm up your thinking rather than recommendations of implementation paths.
- ptrace(2): can you write a userland program that uses ptrace(2) to "solve" this assignment?
- can you write a system call to change memory directly? (surprise: this is ptrace(2)!)
- have a user-space ptrace-based helper that the LKM calls out to, or that reads its instructions from an LKM pseudo-device
- look at how another module does this kind of memory manipulation
- hook something, then check whether current->pid matches the target pid, then modify memory
- use shared memory (create/force a shm between insmod and the target task, ts) (not recommended, this is messy)
- just save mm and active_mm, and replace with ts->mm, but will copy_to_user cooperate if CR3 is not updated?
- http://lxr.linux.no/linux+*/mm/mmu_context.c#L22 (use_mm() function)
- force insmod to become a "parent" of ./simple and ptrace it (PTRACE_POKETEXT / PTRACE_POKEDATA) (messy)
Relatively Easy Approach
Look at the function access_process_vm:
http://lxr.linux.no/#linux+v2.6.32/mm/memory.c#L3256
which uses get_user_pages() in the same file. Note that ptrace uses this function to service ptrace_readdata:
http://lxr.linux.no/#linux+v2.6.32/kernel/ptrace.c#L350
The key trouble is that the symbol is not accessible to your LKM, even though it is "public" (non-static) and kallsyms reports it: it is not exported for use by modules. You can patch your kernel to export that symbol, or create a function pointer to it.
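The function-pointer route needs two pieces: the symbol's address and a pointer type matching its prototype. The sketch below shows both in plain user-space C, under stated assumptions: `find_symbol` and `as_access_fn` are hypothetical helpers, the input is assumed to follow the kallsyms layout of "hex address, type letter, name", and the typedef reflects my reading of the 2.6.32 declaration of access_process_vm, which you should check against your own tree.

```c
#include <stdio.h>
#include <string.h>

/* Opaque here; the real definition lives in the kernel headers. */
struct task_struct;

/* Assumed prototype of access_process_vm in 2.6.32; verify in your tree. */
typedef int (*access_process_vm_fn)(struct task_struct *tsk,
                                    unsigned long addr, void *buf,
                                    int len, int write);

/*
 * Scan kallsyms-formatted text ("<hex address> <type> <name> ...") and
 * return the address of the named symbol, or 0 if it is not listed.
 */
unsigned long find_symbol(FILE *f, const char *name)
{
    char line[256];
    char sym[128];
    char type;
    unsigned long addr;

    while (fgets(line, sizeof line, f) != NULL) {
        if (sscanf(line, "%lx %c %127s", &addr, &type, sym) == 3 &&
            strcmp(sym, name) == 0)
            return addr;
    }
    return 0;
}

/*
 * Turn a looked-up address into a typed pointer. Calling through it is
 * only meaningful inside the kernel that the address came from.
 */
access_process_vm_fn as_access_fn(unsigned long addr)
{
    return (access_process_vm_fn)addr;
}
```

In an LKM the address could arrive as a module parameter filled in from /proc/kallsyms at insmod time, and the cast would then give you something to call in place of the unexported symbol. The address is only valid for the exact kernel it was read from, which is one reason patching the kernel to export the symbol is the more robust choice.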
The Big Picture
It's hard to do something "unnatural" in the kernel: hard to imagine, hard to re-use code, and hard to add totally new stuff.
You can "reuse" code by reimplementing a function you don't have access to, but this is a dangerous path because you wind up pulling on dependencies you may not be able to satisfy.
You should have learned something about the relative restrictions and utility of the EXPORT_SYMBOL macro.
You should have learned about the difference between "process context" and "kernel task context" while executing in kernel space: for a given task, there may or may not be a valid address space, and it is likely to be "borrowed" from the 'current' process, whatever that happens to be.
I hope we have gotten you interested in paging and memory management and the complexity thereof.
Links
- the "fork" implementation has some routines that create and manipulate the mm_struct, starting here: http://lxr.linux.no/#linux+v2.6.32/kernel/fork.c#L445
- http://lxr.linux.no/#linux+v2.6.32/arch/x86/include/asm/uaccess.h#L80
- http://lxr.linux.no/#linux+v2.6.32/arch/x86/lib/usercopy_32.c#L852
- what does copy_to_user do under the hood? http://lxr.linux.no/#linux+v2.6.32/arch/x86/lib/usercopy_32.c#L717
- http://lxr.linux.no/#linux+v2.6.32/include/linux/mm.h#L819
- http://lxr.linux.no/#linux+v2.6.32/mm/memory.c#L3256
- what does ptrace do? what does generic_ptrace_pokedata do? http://lxr.linux.no/#linux+v3.8.2/kernel/ptrace.c#L717
- the use_mm() function: http://lxr.linux.no/#linux+v2.6.32/mm/mmu_context.c (for clarifying concepts)
- from the Hacker Curriculum: http://www.thc.org/papers/LKM_HACKING.html
- An IBM article on "standard" user-to-kernel memory manipulation. We are not doing standard manipulation here. http://www.ibm.com/developerworks/linux/library/l-kernel-memory-access/index.html
- scheduling tasks in an LKM: http://tldp.org/LDP/lkmpg/2.6/html/x1211.html