CSS 390: Notes from Lecture 7 (DRAFT)

Administrivia

Refactoring

Informally, we've already seen an example of refactoring in the HTTP server shown in Lecture 2.

Architectural integrity: how the system is decomposed (into modules, packages, objects). Robust systems have high intra-modular cohesion and low inter-modular coupling. As software evolves, the cumulative effect of special cases, patches, work-arounds, hacks, and kludges degrades the system's architectural integrity.

Technical Debt represents the increased costs of evolving software systems with suboptimal design.

Why don't we build Snap-on quality software in the first place?

Code Smells: easily observed indications of possible problems. Numerous catalogued odors, including:

Refactoring: a change made to the internal structure of software to make it easier to understand and cheaper to modify without changing its external behavior.

Key idea: refactoring exists to support evolution of the codebase.

Martin Fowler, Refactoring: Improving the Design of Existing Code: exposition plus catalog of refactoring patterns

Cost of change:

Need to develop a dual mindset (angel on one shoulder, devil on the other):

  1. How can we make this better?
  2. Is this good enough?

Code that works is usually better than code that is pretty.

Key ideas:

Examples

Check out the version history of the top-ten word count at github.com. Note which changes are primarily refactoring and which are bugfixes or enhancements.

Example of a code smell from the diskmon manager package:

if utilization > thresholdRed {
	if thisAlert := alerts[disk]; thisAlert != nil {
		if thisAlert.Level() != alert.Red {
			thisAlert.Reset(alert.Red, fmt.Sprintf("%s: over %.1f%% (%.1f%%)",
				disk, thresholdRed, utilization))
		}
	} else {
		alerts[disk] = alert.New(alert.Red, fmt.Sprintf("%s: over %.1f%% (%.1f%%)",
			disk, thresholdRed, utilization))
	}
} else if utilization > thresholdYellow {
	if thisAlert := alerts[disk]; thisAlert != nil {
		if thisAlert.Level() != alert.Yellow {
			thisAlert.Reset(alert.Yellow, fmt.Sprintf("%s: over %.1f%% (%.1f%%)",
				disk, thresholdYellow, utilization))
		}
	} else {
		alerts[disk] = alert.New(alert.Yellow, fmt.Sprintf("%s: over %.1f%% (%.1f%%)",
			disk, thresholdYellow, utilization))
	}
}
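
One possible refactoring extracts the duplicated create-or-reset logic into a helper. This is only a sketch: the setAlert name and the parameter types (map[string]*alert.Alert, alert.Level, float64 thresholds) are guesses based on the snippet above, not the actual diskmon code.

// setAlert creates the alert for disk if none exists, or resets it if
// it exists at a different level.  (Sketch; types are assumed.)
func setAlert(alerts map[string]*alert.Alert, disk string, level alert.Level,
	threshold, utilization float64) {
	message := fmt.Sprintf("%s: over %.1f%% (%.1f%%)", disk, threshold, utilization)
	if thisAlert := alerts[disk]; thisAlert != nil {
		if thisAlert.Level() != level {
			thisAlert.Reset(level, message)
		}
		return
	}
	alerts[disk] = alert.New(level, message)
}

The caller then shrinks to the essential threshold decision:

if utilization > thresholdRed {
	setAlert(alerts, disk, alert.Red, thresholdRed, utilization)
} else if utilization > thresholdYellow {
	setAlert(alerts, disk, alert.Yellow, thresholdYellow, utilization)
}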

Simple patterns of change:

// replace temp with query:


// before:
basePrice := this.quantity * this.price
if basePrice > 1000 {
	return basePrice * 0.95
} else {
	return basePrice * 0.98
}


// after:
if this.basePrice() > 1000 {
	return this.basePrice() * 0.95
} else {
	return this.basePrice() * 0.98
}


// more cleanup:
if this.basePrice() > this.VolumeDiscountThreshold {
	return this.basePrice() * 0.95
} else {
	return this.basePrice() * 0.98
}


// idiomatic Go:
if this.basePrice() > this.VolumeDiscountThreshold {
	return this.basePrice() * 0.95
}
return this.basePrice() * 0.98
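

// The query itself is just a small method.  A sketch (the lineItem
// receiver type is made up; the quantity and price fields from the
// "before" snippet are assumed to be float64):
func (this *lineItem) basePrice() float64 {
	return this.quantity * this.price
}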


// A longer example: refactoring a price calculation step by step.

package order

import "time"

type Item struct {
	//...
	UnitPrice int
	//...
}

type Order struct {
	Quantity int
	Item     Item
	Date     time.Time
	//...
}

type Sale struct {
	Start    time.Time
	End      time.Time
	Discount int // percent
}


//-----------------------------------------------------------------------

// Version 1

func Price1(order *Order, sale *Sale) int {
	if order.Date.Before(sale.Start) || sale.End.Before(order.Date) {
		return order.Quantity * order.Item.UnitPrice
	}
	return order.Quantity * order.Item.UnitPrice * (100 - sale.Discount) / 100
}


// Version 2

func NotOnSale(order *Order, sale *Sale) bool {
	return order.Date.Before(sale.Start) || sale.End.Before(order.Date)
}

func Price2(order *Order, sale *Sale) int {
	if NotOnSale(order, sale) {
		return order.Quantity * order.Item.UnitPrice
	}
	return order.Quantity * order.Item.UnitPrice * (100 - sale.Discount) / 100
}


// Version 3

// OnSale must be the exact negation of NotOnSale; the !Before form
// keeps the behavior identical when order.Date falls exactly on a
// sale boundary.
func OnSale(order *Order, sale *Sale) bool {
	return !order.Date.Before(sale.Start) && !sale.End.Before(order.Date)
}

func Price3(order *Order, sale *Sale) int {
	if OnSale(order, sale) {
		return order.Quantity * order.Item.UnitPrice * (100 - sale.Discount) / 100
	}
	return order.Quantity * order.Item.UnitPrice
}


// Version 4

func Price4(order *Order, sale *Sale) int {
	price := order.Quantity * order.Item.UnitPrice
	if OnSale(order, sale) {
		price = price * (100 - sale.Discount) / 100
	}
	return price
}


// Version 5

func Discount(price int, discount int) int {
	return price * (100 - discount) / 100
}

func Price5(order *Order, sale *Sale) int {
	price := order.Quantity * order.Item.UnitPrice
	if OnSale(order, sale) {
		price = Discount(price, sale.Discount)
	}
	return price
}
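
Because each step is supposed to be a behavior-preserving refactoring, a cheap sanity check is a test asserting that all five versions agree. This is a sketch; the test name, dates, and prices are made up.

package order

import (
	"testing"
	"time"
)

func TestPricesAgree(t *testing.T) {
	sale := &Sale{
		Start:    time.Date(2024, 11, 25, 0, 0, 0, 0, time.UTC),
		End:      time.Date(2024, 11, 29, 0, 0, 0, 0, time.UTC),
		Discount: 20,
	}
	orders := []*Order{
		// One order during the sale, one after it.
		{Quantity: 3, Item: Item{UnitPrice: 100}, Date: time.Date(2024, 11, 27, 0, 0, 0, 0, time.UTC)},
		{Quantity: 3, Item: Item{UnitPrice: 100}, Date: time.Date(2024, 12, 25, 0, 0, 0, 0, time.UTC)},
	}
	for _, o := range orders {
		want := Price1(o, sale)
		others := []int{Price2(o, sale), Price3(o, sale), Price4(o, sale), Price5(o, sale)}
		for i, got := range others {
			if got != want {
				t.Errorf("Price%d(%s) = %d, want %d", i+2, o.Date, got, want)
			}
		}
	}
}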

More on Concurrency

Shared mutable memory is the main source of trouble in concurrent programs. Concurrent reads are okay, but modification requires exclusive access (no other reads or writes).

A mutex provides lock and unlock operations such that a thread (or goroutine) can only lock the mutex when no other thread is holding the lock. If another thread already holds the lock, the requesting thread waits until the holder releases it. When the thread holding the lock is finished with the protected data, it calls the unlock operation, allowing another waiting thread to take the lock.

A programmer wishing to protect a block of shared memory couples the memory with a mutex, acquiring the lock before using the memory and releasing the lock when done with the memory. This requires discipline on the part of the programmer: failure to use the lock at a single location will result in subtle, difficult-to-find bugs.
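
A sketch of that raw pattern (the balance and mu names are made up); every function that touches the shared variable must remember to take the lock:

var (
	mu      sync.Mutex
	balance int // shared; protected by mu
)

func deposit(amount int) {
	mu.Lock()
	balance += amount
	mu.Unlock()
}

func readBalance() int {
	mu.Lock()
	defer mu.Unlock()
	return balance
}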

It is less error-prone to make the mutex a field of a struct and to force access to the shared data exclusively through methods that handle the locking and unlocking, as sketched below.
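
A minimal sketch of that struct-based pattern (the package and type names here are illustrative, not from any lecture code):

package safecount

import "sync"

type Counter struct {
	mu   sync.Mutex
	data map[string]int
}

func New() *Counter {
	return &Counter{data: make(map[string]int)}
}

func (c *Counter) Incr(key string, delta int) {
	c.mu.Lock()
	defer c.mu.Unlock() // the deferred unlock runs even on an early return or panic
	c.data[key] += delta
}

func (c *Counter) Get(key string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.data[key]
}

Callers can only reach the map through these methods, so the locking discipline lives in one place.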

Sources of shared memory:

Lexical closures occur when a function literal makes references to variables defined in the outer enclosing (lexical) scope:

func VeryBad() int {
	// Concurrent access to n from two separate goroutines: not a
	// good idea.
	var n int
	go func() {
		for i := 0; i < 10; i++ {
			n = n + i
		}
	}()
	go func() {
		for i := 9; i >= 0; i-- {
			n = n - i
		}
	}()
	// Using sleep to wait for the goroutines isn't a particularly
	// good idea either.  sync.WaitGroup was designed exactly for
	// this.
	time.Sleep(30 * time.Second)
	return n
}
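
For comparison, a sketch of the sync.WaitGroup approach mentioned in the comment above. The WaitGroup fixes only the waiting; the updates to n still race unless they are also protected, here with a sync.Mutex. (The LessBad name is made up.)

func LessBad() int {
	var (
		n  int
		mu sync.Mutex
		wg sync.WaitGroup
	)
	wg.Add(2)
	go func() {
		defer wg.Done()
		for i := 0; i < 10; i++ {
			mu.Lock()
			n = n + i
			mu.Unlock()
		}
	}()
	go func() {
		defer wg.Done()
		for i := 9; i >= 0; i-- {
			mu.Lock()
			n = n - i
			mu.Unlock()
		}
	}()
	wg.Wait() // blocks until both goroutines have called Done
	return n
}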

Channels

A channel is primarily a communication mechanism, but it is also a synchronization mechanism. When sending to an unbuffered channel, the sending goroutine blocks until another goroutine receives from the channel. Similarly, if a goroutine tries to receive from the channel before another goroutine sends, the receiving goroutine blocks. This rendezvous forms a barrier mechanism.

A buffered channel has slightly different semantics: a receiving goroutine blocks when the buffer is empty and a sending goroutine blocks when the buffer is full.
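
A minimal sketch of the barrier behavior: the main goroutine blocks on the receive until the worker goroutine sends.

done := make(chan struct{})
go func() {
	// ... do the work ...
	done <- struct{}{} // unblocks the waiting receiver
}()
<-done // blocks until the worker sends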


// create channel of type T:
c := make(chan T)

// create buffered channel of type T
c := make(chan T, bufSize)

// send to channel
c <- t

// receive from channel
t := <-c

// select on several channels (first ready case wins)
select {
case t1 := <-c1:
	//...
case t2 := <-c2:
	//...
}

// poll several channels (default runs if no channel is ready)
select {
case t1 := <-c1:
	//...
case t2 := <-c2:
	//...
default:
	//...
}

// range over a channel (until close(c))
for t := range c {
	//...
}

A buffered channel of capacity 1 may be used as a mutex.
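
A sketch of the idea (the lock name is made up): the single-slot buffered channel holds the "lock"; sending acquires it and receiving releases it.

lock := make(chan struct{}, 1)

lock <- struct{}{} // acquire: blocks if another goroutine holds the lock
// ... critical section: touch the shared data ...
<-lock // release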

Communication by Message Passing

An alternate approach to concurrency is to avoid concurrent memory access entirely and communicate by message passing. Go channels allow this approach.

Go language motto: "don't communicate by sharing memory; share memory by communicating".

Concurrent access to a map, say for a counter, is easily implemented using sync.RWMutex, but an alternative approach is to have a single goroutine responsible for the map. All other goroutines send requests to that counter-manager goroutine:

package counter

type request struct {
	resp      chan int
	increment bool
	key       string
	delta     int
}

type Counter struct {
	req chan request
}

func New() *Counter {
	c := &Counter{
		req: make(chan request),
	}
	go counter(c)
	return c
}

func (c *Counter) Get(key string) int {
	resp := make(chan int)
	c.req <- request{
		resp:      resp,
		increment: false,
		key:       key,
	}
	return <-resp
}

func (c *Counter) Incr(key string, delta int) {
	c.req <- request{
		resp:      nil,
		increment: true,
		key:       key,
		delta:     delta,
	}
}

func counter(c *Counter) {
	data := make(map[string]int)
	for req := range c.req {
		if req.increment {
			data[req.key] = data[req.key] + req.delta
		} else {
			req.resp <- data[req.key]
		}
	}
}
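
A sketch of how a client uses the package (assuming it is importable as "counter"):

c := counter.New()
c.Incr("requests", 1)
c.Incr("requests", 1)
fmt.Println(c.Get("requests")) // prints 2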

Timeouts

A requirement for a robust distributed system is that all remote operations have timeouts. A failure in a back-end processor should not cause the front-end to time out. The front-end can handle retry logic or report back a failure message.

The http.Client has a Timeout field for controlling this. Unfortunately, exec.Cmd does not, but it does have the ability to kill the subprocess.
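
For the HTTP case, a sketch of the Timeout field in use (the URL is made up):

client := &http.Client{Timeout: 2 * time.Second}
resp, err := client.Get("http://backend:8080/stats") // returns an error if the request takes longer than 2s
if err == nil {
	defer resp.Body.Close()
	//...
}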

Realistically, the two external programs we looked at, uuidgen and df, are not going to hang, but the technique was demonstrated in the notes for Lecture 4.

Essentially, the technique is to run the external command in a separate goroutine, sending the status through a channel. The main thread also sets a timer (which runs in its own goroutine).

There are two possible outcomes, handled by the select statement: either the command completes normally (in which case we stop the timer), or the timer fires and we have to kill the external command.

On timeout, we also want to receive from the command's result channel so that the goroutine wrapping the command can complete rather than leak.

package main

import (
	"fmt"
	"net/http"
	"os/exec"
	"time"
)

const (
	timeout = 2 * time.Second
)

func stats(response http.ResponseWriter, request *http.Request) {
	response.Header().Set("Content-Type", "text/plain")

	var (
		done    = make(chan error)
		bytes   []byte
		err     error
		command = exec.Command("df", "-h")
	)

	// Execute the command in a separate goroutine so the main
	// thread can timeout gracefully.
	go func() {
		var err error
		bytes, err = command.Output()
		done <- err
	}()

	// Start the clock.
	timer := time.NewTimer(timeout)

	// Wait for the command to complete or time out.
	select {
	case err = <-done:
		// The goroutine completed or failed.  Stop the timer
		// so its goroutine can terminate.
		timer.Stop()
	case <-timer.C:
		// Too late.  Kill the process so the goroutine can
		// complete.  Otherwise, we'd have a potential leak.
		// Ignore any error from Kill; the process may have
		// already exited on its own.
		command.Process.Kill()
		// Catch the done message without blocking, so the
		// command's goroutine won't leak.
		go func() { <-done }()
		err = fmt.Errorf("timeout")
	}

	// Did it work?
	if err != nil {
		http.Error(response, err.Error(), http.StatusInternalServerError)
		return
	}
	response.Write(bytes)
}

func main() {
	http.HandleFunc("/stats", stats)
	err := http.ListenAndServe(":8080", nil)
	fmt.Printf("Server fail: %s\n", err)
}