Lecture #5
Administrivia
Previously on CSS 503A...
Key kernel responsibility: managing processes
- using fork, exec, open, close, pipe, wait, we know enough to write a rudimentary shell (see the sketch below)
- userland code (string processing) to determine program(s) to run; system calls to perform actions
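A minimal sketch of that core loop in C (run_command and args are illustrative names; the string processing that builds args is elided):

#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical core of a rudimentary shell: assume args[] is a
// NULL-terminated argv-style array built from the input line.
void run_command(char *args[]) {
    pid_t pid = fork();
    if (pid == 0) {              // child: become the requested program
        execvp(args[0], args);
        perror("execvp");        // reached only if exec fails
        _exit(127);
    } else if (pid > 0) {        // parent: wait for the child to finish
        int status;
        waitpid(pid, &status, 0);
    } else {
        perror("fork");
    }
}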
Open files stay open across fork/exec (unless, per file, you ask kernel not to)
Everything is a file
- programs are simpler, more flexible if they read from stdin, write to stdout
- design principle: separation between mechanism & policy
- decide at run time how to "wire up" input & output
- simpler program does not have to do anything different when reading from keyboard vs. reading from file
- shell syntax to specify fd 0, 1, 2 (see the redirection sketch below)
- most language runtimes (standard libraries) wrap raw file descriptors into corresponding file objects
- library provides text formatting (convenience) & buffering (efficiency)
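As a sketch of that wiring, here is how a shell might point fd 1 at a file before exec (helper name is hypothetical); the program being run does nothing different:

#include <fcntl.h>
#include <unistd.h>

// Redirect stdout (fd 1) to a file, the way a shell implements "cmd > out.txt".
void redirect_stdout(const char *path) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    dup2(fd, 1);     // fd 1 now refers to the file
    close(fd);       // drop the extra descriptor; fd 1 stays open
}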
Dataflow pipelines: simple "filter" programs can be combined to perform complex operations
- similar to functional style h(g(f(x)))
- or fluent style collection.map(f).fold(g).filter(h)
Executable file: passive data
- structured binary data: ELF (Executable & Linking Format), COFF (Common Object File Format), ...
- can be manipulated by any program that can read/write files (i.e. any program)
System loader (exec): loads data (i.e. your program) into process memory
- text (code)
- data
- heap: space for dynamic memory allocation (malloc/new)
- may be adjusted dynamically during program execution
- call stack
Running state of program includes registers (esp. PC {Program Counter}, SP {Stack Pointer})
Running program:
- registers
- text (code)
- static data: initialized & uninitialized (misnomer: initialized to 0)
- call stack (or just "stack")
- heap (arena): dynamic memory allocation
- open files (descriptors)
- argv/envp
- exit status (on program termination)
Kernel view of process: data structure (process control block)
- register save space
- memory map
- table of open files (indexed by file descriptors)
- pid, ppid, uid, gid, nice, ...
- signal vector
Process state: running vs. not running (ready or waiting)
TODO: diagram
System scheduler: picks ready process & runs until it makes a system call, an interrupt occurs, time slice expires (timer interrupt)
More complicated refinements
- short-term vs. long-term waiting
- fast system request vs. slow I/O operation
Signals: software analog of hardware interrupt
- h/w interrupt goes to kernel
- s/w interrupt: kernel manipulates userland state (data & registers)
Signal number: small system-defined integer (symbolic names defined in C header file)
- each signal has semantics & default action
- user makes system call to override (or restore) default behavior
- signal 9 (SIGKILL) cannot be caught
Typical usage: trap/catch signal to perform cleanup action before program termination (sketch below)
- datacenter-scale management: grace period between SIGINT & SIGTERM for individual server shutdown
- allow time for in-flight requests to run to completion without accepting new requests
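A minimal sketch of the trap-and-clean-up pattern (handler and flag names are illustrative):

#include <signal.h>
#include <string.h>
#include <unistd.h>

// Catch SIGTERM, set a flag, let the main loop clean up before exiting.
static volatile sig_atomic_t shutting_down = 0;

static void on_sigterm(int signo) {
    (void)signo;
    shutting_down = 1;           // async-signal-safe: just set a flag
}

int main(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigterm;
    sigaction(SIGTERM, &sa, NULL);
    while (!shutting_down)
        pause();                 // stand-in for "serve one request"
    /* cleanup: finish in-flight work, flush logs, close files, ... */
    return 0;
}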
Signals to notify process
- I/O is available (async I/O)
- child state change (call wait to reap zombie)
User-defined signals: SIGUSR1, SIGUSR2
- e.g. tell server to re-read config file
kill(2): send signal
- kill(1) command is thin wrapper around system call
- poor choice of name: doesn't necessarily kill signaled process
Cooperating processes
- simpler, more modular/flexible system designs
- latest incarnation: microservices
- we keep "rediscovering" same basic principles
Cooperating processes need to communicate
- argv/exit status
- only exchange information at program startup/termination
- limited bandwidth, but useful for many cases
- shell uses exit status as Boolean value (loops & conditionals)
- signals: wake up and do something!
- files: awkward & inefficient (klunky)
- but it has been done
- system calls to support this (e.g. async I/O)
- pipes
- file-like: use read/write system calls
- one-way communication (producer-consumer)
- extremely common paradigm
- but doesn't cover all cases
- must be created by common ancestor (one end may be creator; see pipe sketch after this list)
- named pipes
- create special entry in filesystem
- open by name (like any other file on filesystem)
- use read/write system calls
- works like regular pipe, except processes do not require common ancestor (see FIFO sketch after this list)
- more awkward than pipes for the most common use case
- probably only used today for legacy applications & work-arounds
- networking
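A minimal sketch of the common-ancestor pipe pattern: the parent creates the pipe, forks, and writes; the child reads:

#include <stdio.h>
#include <unistd.h>

// One-way communication: parent is producer, child is consumer.
int main(void) {
    int fds[2];
    pipe(fds);                    // fds[0]: read end, fds[1]: write end
    if (fork() == 0) {            // child inherits both ends
        char buf[64];
        close(fds[1]);            // close unused write end
        ssize_t n = read(fds[0], buf, sizeof buf - 1);
        if (n > 0) {
            buf[n] = '\0';
            printf("child read: %s\n", buf);
        }
    } else {
        close(fds[0]);            // close unused read end
        write(fds[1], "hello", 5);
        close(fds[1]);            // EOF for the reader
    }
    return 0;
}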
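And the named-pipe variant (path is illustrative): the writer below creates and opens the FIFO by name, so any unrelated process can open the same path O_RDONLY and read:

#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Writer side of a named pipe.
int main(void) {
    mkfifo("/tmp/demo_fifo", 0600);             // create filesystem entry
    int fd = open("/tmp/demo_fifo", O_WRONLY);  // blocks until a reader opens
    write(fd, "hello", 5);
    close(fd);
    return 0;
}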
2 basic paradigms for communication:
- message passing (aka "shared nothing")
- everything we've seen so far (esp. pipes)
- shared memory (stay tuned)
Message Passing
- Safer than shared memory
- Higher-level than mutexes
Send/receive
- Go (language): channels
- Erlang: mailboxes
- C.A.R. Hoare: Communicating Sequential Processes
Pipes: unstructured data
- pass sequence of bytes
- requires application-level structuring & synchronization
Issues: buffering & synchronization
Blocking vs. non-blocking send
Interprocess Communication (cont.)
System V Message Queues
int msgid = msgget(key_t key, int msgflg);
int msgsnd(int msgid, struct msgbuf *msgp, size_t msgsz, int msgflg);
struct msgbuf {
    long mtype;
    char mtext[];
};
C-style casting wizardry: superimpose msgbuf onto your own data structure.
mtype: use pattern matching on receive.
ssize_t msgrcv(
int msgid,
void * msgp, // pointer to user-allocated space
size_t msgsz,
long msgtype, // pattern matching: select message with this type
int msgflg
);
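A minimal send/receive round trip under this API (key, message type, and sizes are arbitrary choices for the demo):

#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>

// Superimpose the msgbuf layout on our own struct: long mtype first.
struct my_msg {
    long mtype;                     // must be > 0
    char text[64];
};

int main(void) {
    int msgid = msgget(IPC_PRIVATE, 0600);     // private queue for the demo
    struct my_msg out = { .mtype = 1 };
    strcpy(out.text, "hello");
    msgsnd(msgid, &out, sizeof out.text, 0);   // msgsz excludes mtype
    struct my_msg in;
    msgrcv(msgid, &in, sizeof in.text, 1, 0);  // select messages with mtype 1
    msgctl(msgid, IPC_RMID, NULL);             // remove the queue
    return 0;
}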
System V Semaphores
Semaphore (in real life): flag
- flag up: resource available
- flag down: wait for resource
Up/down (binary value): similar to mutex
Semaphores are more complicated: counter
- down: if counter > 0, decrement counter & take token
- otherwise, no tokens available: wait until counter > 0
- up: increment counter, release token
API:
int semid = semget(key_t key, int nsems, int semflag); // create/open semaphore set
int semop(
int semid,
struct sembuf *sops,
unsigned nsops
);
struct sembuf {
    unsigned short sem_num;  // index into semaphore set
    short sem_op;   // >0: increment; <0: decrement or wait; 0: wait for zero
    short sem_flg;  // IPC_NOWAIT: return immediately (failure if counter is 0)
};
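A minimal down/up pair under this API, assuming a binary semaphore (one token) around a critical section:

#include <sys/ipc.h>
#include <sys/sem.h>

// Per System V convention, the caller defines union semun for semctl.
union semun { int val; };

int main(void) {
    int semid = semget(IPC_PRIVATE, 1, 0600);  // set containing one semaphore
    union semun arg = { 1 };
    semctl(semid, 0, SETVAL, arg);             // counter = 1 (one token)

    struct sembuf down = { 0, -1, 0 };  // sem_num, sem_op, sem_flg
    struct sembuf up   = { 0, +1, 0 };

    semop(semid, &down, 1);   // take token (blocks if counter is 0)
    /* ... critical section ... */
    semop(semid, &up, 1);     // release token

    semctl(semid, 0, IPC_RMID);                // remove the set
    return 0;
}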
Shared Memory
process must explicitly request this
handled by kernel memory management
- memory map: data structure
2 virtual memory segments/pages mapped to same real memory (frame)
- kernel is doing this mapping anyway
- about the fastest you can do IPC: no double buffering
kernel data structure shmid_ds: similar to file structure (inode)
- permissions
- timestamps
- size (#pages)
- frames (array of pointers)
- ...
API:
#include <sys/ipc.h>
#include <sys/shm.h>
// create/access shared memory
int shmid = shmget(
key_t key,
size_t size,
int shmflg
);
shmid: small non-negative integer, like file descriptor (-1 for failure)
key_t: typedef for an integer type
- "identifier" -- like filename
- must be shared across all processes wishing to share that piece of memory
ftok(3): takes pathname & proj_id to generate key_t (hash)
shmflg: bitmapped flags
- symbolic constants, OR'ed together
- owner/group/other permissions: like file
IPC_CREAT | IPC_EXCL: create if it doesn't exist & fail if it does
So we have a shmid (descriptor), now what?
shmctl: get/set flags
shmat/shmdt: attach/detach
void* shmat(int shmid, const void *shmaddr, int shmflg);
- prefer shmaddr = NULL
- more portable
- let system decide where to put it
Example flag: SHM_RDONLY
- read-only segment
- consumer in producer-consumer
- subscriber in pub-sub
shmat returns a void * pointer
- can use like any other pointer in C
- be careful to avoid scribbling over memory at the same time another process is scribbling
- need synchronization mechanism to avoid clobbering data
- cast to any meaningful type
- pointer arithmetic (arrays)
- be careful not to scribble outside the lines
Usage (sketch below):
- application requests pointer to block of n bytes (like malloc)
- application must cast memory into appropriate data structures
- applications must synchronize concurrent access
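Putting the above together, a producer-side sketch (key derivation and segment size are arbitrary for the demo; a real consumer would also need synchronization, per the notes above):

#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

// Create & attach a 4 KB segment, then write a string into it.
// A consumer calling shmget with the same key sees the same bytes.
int main(void) {
    key_t key = ftok("/tmp", 'A');             // pathname + proj_id -> key
    int shmid = shmget(key, 4096, IPC_CREAT | 0600);
    if (shmid == -1) { perror("shmget"); return 1; }
    char *p = shmat(shmid, NULL, 0);           // NULL: let system pick address
    if (p == (void *)-1) { perror("shmat"); return 1; }
    strcpy(p, "hello from producer");
    shmdt(p);                                  // detach; segment persists
    return 0;
}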
Most common use case: both processes execute same program
- why not just use threads?
JVM: multiple programs in single VM session
Sample code on slide (bad example)
- spin lock (busy wait)
- assumes count++/count-- is atomic (we'll revisit this)
mmap(2): map file to memory
- alternative to read/write
- kernel still needs to manage block I/O
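A minimal read-only mapping sketch (error handling mostly elided): the file's bytes appear as ordinary memory, with no read() calls in the application:

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
    int fd = open(argv[1], O_RDONLY);
    struct stat st;
    fstat(fd, &st);                     // need the file size for the mapping
    char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }
    printf("first byte: %c\n", p[0]);   // access file contents via pointer
    munmap(p, st.st_size);
    close(fd);
    return 0;
}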
Shared libraries: .so (DLL in Windows)
- all programs do I/O:
- standard library: buffered & formatted
- each process has "copy" of library
- only need single copy in memory if all processes map virtual memory to same real memory
Dynamic linking: link at load time
- that's why linking & loading are often conflated
Threads
Processes are protected against stomping/clobbering each other
- separate memory spaces
- hardware controls set by kernel
Kernel has total access to process's address space
- context switch: kernel saves registers (register state)
Have you ever written a (C++) program with a wild pointer?
- imagine debugging the problem if your wild pointer scribbles over someone else's process--or vice versa
Threads are more lightweight than processes
- more responsive
- lower resource consumption
- faster context switching (between threads)
- only have to reload the registers
- useful for scaling algorithms
- better solution for assignment 1 (see my Go solution)
- deadlock avoidance (slides show 2 pipes for bidirectional communication)
- YMMV: in your instructor's humble opinion, threads make deadlock/race conditions more likely
- scalability: parallelization of algorithms on multi-core architectures
Process: single executable program
- all threads in process are running the same program
- at different execution points
Similar to processes (but different)
Concurrent execution paths within single process
- common memory/resources (e.g. open files)
- each thread has its own stack (local variables)
- smaller than system stack of single-threaded process
- data to manage threads (e.g. save registers)
Inter-thread communication: both easier & harder than inter-process communication
Downsides of processes
- fairly expensive to set up (machine resources)
- inter-process communication is klunky
Solution: abandon what makes processes great & have multiple "threads of control" within single process
- concurrent execution paths within single process
Thread may be implemented in
- kernel (system support)
- userland (library support)
- both
Deadlock Example
TODO: diagram
- IMHO, example looks a bit farfetched
User (Green) Threads vs. Kernel Threads
can implement threads via
- kernel
- e.g. Linux clone(2)
- extended kernel data structure
- user library
- Java
- pthreads (Posix)
- Erlang (calls them "processes" just to be confusing)
- Go (goroutines)
user library: may or may not have kernel support
- full support: 1-1 mapping between library threads & kernel threads
- library may be more convenient/portable (e.g. pthreads)
- no support
- library has access to its own address space (since it's part of the process)
- create application-level process control block data structure
- handle own scheduling
- non-preemptive (cooperative): explicit yield
- timer: SIGALRM
- textbook calls this many:1
- hybrid: multiplexing (many:many)
communication between threads (within process): both easier & harder
- shared memory
- must take care to avoid race conditions & deadlocks
message passing is becoming more popular: easier to reason about (safer)
- erlang/elixir
- Go
- Scala/Akka
functional style: immutable data
Threading Issues
interaction with fork/exec
thread cancellation
- what happens to shared resources (e.g. locks)?
- async vs deferred
signal handling
- synchronous (e.g. illegal memory access)
- received by currently-running thread
- asynchronous (e.g. SIGINT -- control-C)
thread pools: threads are relatively expensive to create, so pay the overhead cost once at program startup
- "worker pool"
- send unit-of-work to worker
Java threads
- method 1:
- subclass Thread
- override run method
- method 2:
- implement Runnable interface
public interface Runnable {public abstract void run();}
- create thread object initialized with Runnable instance
Thread t = new Thread(new RunnableImpl());
- start thread: t.start()
- wait for thread completion: t.join() (wrap in try/catch block in case of InterruptedException)
low-overhead threads (Erlang, Go, Scala/Akka): no need for worker pools
- just create as many threads as needed
- use other mechanisms for rate limiting (e.g. Go buffered channels)
other abstractions for concurrency
- futures/promises
- parallel container classes
Posix Thread (pthreads)
#include <pthread.h>
#include <unistd.h>
// User-defined thread function
// Takes pointer arg, returns pointer value.
// - most-general API: user-defined data
// - requires caution: not type-safe
void *thread_func(void *param);
int pthread_create(
pthread_t *thread,
const pthread_attr_t *attr,
void *(*start_routine)(void *),
void *arg
);
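A minimal create/join round trip using this API (thread function and argument are illustrative; link with -lpthread):

#include <pthread.h>
#include <stdio.h>

// The pointer returned by the thread function comes back through pthread_join.
void *thread_func(void *param) {
    printf("worker got: %s\n", (char *)param);
    return (void *)42;                   // exit status for join
}

int main(void) {
    pthread_t tid;
    void *status;
    if (pthread_create(&tid, NULL, thread_func, "hello") != 0) {
        fprintf(stderr, "pthread_create failed\n");
        return 1;
    }
    pthread_join(tid, &status);          // block until thread_func returns
    printf("thread returned %ld\n", (long)status);
    return 0;
}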
thread termination
- thread calls pthread_exit()
- exit status passed to pthread_join()
- start_routine() returns
- other thread calls pthread_cancel()
- any thread calls exit() (terminates whole process)
pthread attributes