Take away points
------------------
  - process vs. thread
  - different ways of implementing threads
  - SA: the m:n model plus communication, end-to-end management
  - how workload affects the performance of different designs

Background
------------------
process vs. thread
  - process: unit of resource management; helps improve the whole system
  - thread:  unit of execution; helps improve one application's throughput and latency

  a process' `local' resources include: file descriptors, page table, ...
  a thread's `local' resources include: stack, signal mask, ...

The implementation of processes in the kernel:
  - every process has one PCB (process control block)
  - a process is put on one or more queues (e.g., the run queue) in the kernel
  - a context switch occurs in several ways:
    (1) the application calls yield
    (2) the application blocks
    (3) control returns to the kernel some other way (e.g., a timer interrupt
        or I/O), and the kernel decides to preempt the running process

Implementation of Thread
---------------------
three major types:
  1:1 (one kernel thread maps to one user-level thread/task)
  1:n (one kernel thread maps to multiple user-level threads/tasks)
  m:n (multiple kernel threads map to multiple user-level threads/tasks)

How to implement a user-thread library (background)
----------------------------------------------
The user library keeps a `task' queue. Switching from one task to another is
essentially a pure user-level jump, plus saving the stack state and registers;
the saved state is put onto the user-level queue. When a task yields, it calls
a user-library routine (no kernel involvement); that routine selects one record
from the queue and jumps to it. Scheduling (selecting a record from the queue)
and synchronization are done entirely at user level.
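The switch-by-jumping idea above can be sketched with the POSIX ucontext calls.
This is only a minimal illustration of the mechanism, not the library discussed
in these notes: the names (task_t, task_spawn, task_yield) and the fixed
round-robin `queue' are made up, and a real library would also reclaim stacks,
handle task exit, and provide synchronization.

/* Minimal sketch of a cooperative user-level thread library (Linux/glibc).
 * All names here are invented for illustration. */
#include <stdio.h>
#include <stdlib.h>
#include <ucontext.h>

#define MAX_TASKS  8
#define STACK_SIZE (64 * 1024)

typedef struct {
    ucontext_t ctx;              /* saved registers + stack pointer */
    int        used;             /* is this slot a live task? */
} task_t;

static task_t tasks[MAX_TASKS];  /* the user-level "queue" (round-robin scan) */
static int    current = 0;       /* slot of the task that is running now */

/* Save the running task's state, pick the next live record, jump to it.
 * No system call is made: this is a pure user-level context switch. */
static void task_yield(void)
{
    int prev = current;
    do {
        current = (current + 1) % MAX_TASKS;
    } while (!tasks[current].used);
    swapcontext(&tasks[prev].ctx, &tasks[current].ctx);
}

static void worker(int id)
{
    for (int i = 0; i < 3; i++) {
        printf("task %d, iteration %d\n", id, i);
        task_yield();            /* cooperative: give up the processor */
    }
    tasks[current].used = 0;     /* mark ourselves finished (stack leaks; a
                                    real library would reclaim it) */
    task_yield();                /* switch away for good */
}

static void task_spawn(int slot, int id)
{
    task_t *t = &tasks[slot];
    getcontext(&t->ctx);
    t->ctx.uc_stack.ss_sp   = malloc(STACK_SIZE);
    t->ctx.uc_stack.ss_size = STACK_SIZE;
    t->ctx.uc_link          = NULL;   /* worker never returns; it yields away */
    makecontext(&t->ctx, (void (*)(void))worker, 1, id);
    t->used = 1;
}

int main(void)
{
    tasks[0].used = 1;           /* slot 0 stands for main() itself */
    task_spawn(1, 1);
    task_spawn(2, 2);
    while (tasks[1].used || tasks[2].used)
        task_yield();
    return 0;
}

Note that task_yield never enters the kernel: it is just swapcontext saving one
register/stack state and restoring another.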
Why is a user-level thread library better than kernel threading?
  1. more flexible
  2. context switches and synchronization do not need to go through the kernel,
     so they are faster

Why does a user-level thread library have problems?
  1. At user level, it has integration problems with some system events:
     (1) if one user task blocks (e.g., on I/O), the whole process blocks, even
         if the user level has many runnable tasks waiting to run.
         - one solution: use multiple kernel threads to support one user
           process (the m:n model)
     (2) if the kernel needs to make a scheduling decision, it can make a very
         bad one, because some priority/blocking information is kept at user
         level. E.g., when the kernel wants to take one kernel thread away,
         which should it take? Because of the two-level scheduling and the
         information disconnect, a low-priority task might be kept running
         instead of a high-priority one; a task spin-waiting on a lock might be
         kept instead of the thread that would notify it; an idle thread might
         be kept instead of a thread doing meaningful work.
  2. No preemption: a user thread might eat all the time without ever calling
     yield, and the user library cannot force it to stop.

Solution to the above problems (scheduler activations):
  - give a process another fresh kernel thread when one of its user tasks blocks
  - always let the user-level library decide which user task to run

Scheduler Activation
----------------------
m:n threading model with good communication between the kernel and the user library

The scheduling work is split between user and kernel:
  kernel:       manages physical processor resources; coordinates among processes
  user library: requests/releases processor resources from/to the kernel;
                manages the assignment of user threads to processors

two sets of communication APIs (a header-style sketch appears at the end of
these notes):
  kernel to user (upcalls):
    blocked;            // the thread's state is kept in the kernel now
    unblocked;          // returns the thread's state to the user library
    preempted;          // returns the thread's state to the user library
    add new processor;
  user to kernel (system calls):
    request new processors
    release idle processors

some `invariants' maintained by SA:
  * the user-level thread library always knows how many ACTIVE kernel threads
    (scheduler activations) it has
  * that number does not change at blocking system calls; it only changes at
    preemption or when new processors are added
  * thread state is kept in three places in an SA system:
    (1) blocked threads are kept in the kernel
    (2) running threads (one per processor)
    (3) runnable, but not running, threads are kept in the user-library queue

Example 1: what happens when a process gets forked?
Example 2: what happens at a blocking I/O?
Example 3: what happens at preemption?

Other implementation issues:
  - bulk upcalls
  - special handling of critical sections
  - the whole system uses `space sharing' instead of `time sharing'

Impact of SA:
  - used in Solaris and some other branches of Unix
  - later more or less discarded by Solaris, replaced by 1:1 kernel threading again
  - SA does have better flexibility
  - however, it is bad at signal handling/delivery; thread creation/destruction
    is serialized under SA; and kernel threading has made huge progress:
      faster synchronization (futex: the common case does not go through the kernel)
      faster thread creation and support for many more threads
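To make the futex point under `Impact of SA' concrete, here is a minimal sketch
of a futex-based mutex (Linux-specific, simplified along the lines of the
well-known three-state design): the uncontended lock and unlock are just atomic
instructions in user space, and the futex system call is made only under
contention. The names futex_mutex_t, futex_mutex_lock, etc. are made up for
illustration; glibc's real mutexes are considerably more involved.

#define _GNU_SOURCE
#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* state: 0 = unlocked, 1 = locked with no waiters, 2 = locked, maybe waiters */
typedef struct { atomic_int state; } futex_mutex_t;

static long futex(atomic_int *uaddr, int op, int val)
{
    /* glibc provides no futex() wrapper, so go through syscall(2). */
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

static void futex_mutex_lock(futex_mutex_t *m)
{
    int c = 0;
    /* Fast path: 0 -> 1 with a single compare-and-swap, no system call. */
    if (atomic_compare_exchange_strong(&m->state, &c, 1))
        return;
    /* Slow path: advertise waiters (state 2) and sleep in the kernel;
     * retry the acquisition on every wakeup. */
    if (c != 2)
        c = atomic_exchange(&m->state, 2);
    while (c != 0) {
        futex(&m->state, FUTEX_WAIT_PRIVATE, 2);
        c = atomic_exchange(&m->state, 2);
    }
}

static void futex_mutex_unlock(futex_mutex_t *m)
{
    /* Fast path: old value 1 means nobody waits, so no system call either. */
    if (atomic_exchange(&m->state, 0) == 2)
        futex(&m->state, FUTEX_WAKE_PRIVATE, 1);
}

/* Tiny demo: two kernel threads hammer one counter under the futex mutex. */
static futex_mutex_t lock;
static long counter;

static void *bump(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        futex_mutex_lock(&lock);
        counter++;
        futex_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, bump, NULL);
    pthread_create(&b, NULL, bump, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("counter = %ld\n", counter);   /* expect 200000 */
    return 0;
}

Compile with -pthread; the point is that under low contention neither lock nor
unlock ever crosses into the kernel.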
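Finally, the upcall/system-call lists in the Scheduler Activation section above
can be written down as a header-style sketch. Every name and signature below is
invented for these notes (the interface of the original SA work and of real
implementations differs); the sketch only shows what information crosses the
kernel / user-library boundary in each direction.

/* Hypothetical prototypes for the SA kernel <-> user-library interface.
 * None of these names are a real kernel's API; they mirror the lists above. */

/* Opaque saved user-thread context handed across the boundary. */
typedef struct sa_thread_state sa_thread_state_t;

/* ---- kernel -> user library: upcalls, each delivered on a fresh activation ---- */

/* A new processor was given to this address space; run a user thread on it. */
void sa_upcall_processor_added(int processor_id);

/* The user thread on some activation blocked in the kernel (e.g., I/O);
 * its state stays in the kernel until it unblocks. */
void sa_upcall_blocked(int blocked_activation_id);

/* A blocked thread can run again; its saved state comes back to the user
 * library, which puts it on the runnable queue. */
void sa_upcall_unblocked(int activation_id, sa_thread_state_t *state);

/* The kernel took a processor away; the preempted thread's state comes back
 * so the library can decide what to run on the remaining processors. */
void sa_upcall_preempted(int activation_id, sa_thread_state_t *state);

/* ---- user library -> kernel: system calls ---- */

/* There are more runnable user threads than processors: ask for more. */
void sa_request_processors(int how_many);

/* This processor is idle; the kernel may reallocate it to another process. */
void sa_release_processor(int processor_id);

This matches the invariants above: a blocked thread's state lives in the kernel
only between the blocked and unblocked upcalls, and the number of active
activations changes only at preemption or when a processor is added.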