CSS 343: Notes from Lecture 2 (DRAFT)

Administrivia

Coding Style

Abstraction

Model of Compilation

Pointer–Array Equivalence

Assignment 1

Casting

Linked List

Forgetting about the memory allocator for the moment, suppose we want to count how often each word occurs in a text. Your pointy-haired boss (PHB) has told you to use linked lists. You took CSS 342 and know what linked lists are, so you grudgingly accept the mission.

The payload of your linked list consists of two elements: a string representing the word and an int holding the tally.

You know that strings in C are stored as an array of char values terminated by the special NUL character (written '\0', and not to be confused with the NULL pointer). You ask your PHB how long the longest word in his vocabulary is, and he tells you: 19 characters.
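
A minimal sketch of the sizing rule, in case the off-by-one is not obvious: a word of n characters needs n + 1 bytes, because the terminating '\0' takes a slot too. The helper name bytes_needed is made up for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bytes required to store word as a C string,
   i.e. its characters plus the terminating NUL ('\0'). */
size_t bytes_needed(const char *word) {
   return strlen(word) + 1;
}
```

For a 19-letter word such as "otorhinolaryngology", bytes_needed() returns 20, which is where the 20 in the first design below comes from.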

You come up with an initial design for your linked list element:

struct ListNode {
   char data[20];
   int count;
   struct ListNode* next;
};
      

20 is a magic number so you decide to make it look a bit more sensible:

#define LONGEST_WORD 19
struct ListNode {
   char data[LONGEST_WORD + 1];
   int count;
   struct ListNode* next;
};
      

Now you write your function to allocate and construct a node:

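#include <stdlib.h>   /* malloc */
#include <string.h>   /* strcpy */
struct ListNode* allocate_node(const char *word) {
   struct ListNode* new_node = malloc(sizeof(struct ListNode));
   strcpy(new_node->data, word);   /* copies the word and its NUL */
   new_node->count = 1;
   new_node->next = NULL;
   return new_node;
}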
      
Clearly you weren't paying attention in your Software Engineering class because you didn't do anything about handling the (rare) case where malloc() fails and returns NULL.
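
Had you been paying attention, the function might look more like this sketch. The name allocate_node_checked is hypothetical, and returning NULL to the caller is only one of several reasonable policies (aborting with an error message is another).

```c
#include <stdlib.h>
#include <string.h>

#define LONGEST_WORD 19

struct ListNode {
   char data[LONGEST_WORD + 1];
   int count;
   struct ListNode* next;
};

/* Returns a new node holding a copy of word, or NULL if malloc() fails.
   The caller must check the result and eventually free() the node. */
struct ListNode* allocate_node_checked(const char *word) {
   struct ListNode* new_node = malloc(sizeof(struct ListNode));
   if (new_node == NULL) {
      return NULL;   /* out of memory: report it instead of crashing */
   }
   strcpy(new_node->data, word);   /* still assumes word fits in data */
   new_node->count = 1;
   new_node->next = NULL;
   return new_node;
}
```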

Of course, if your boss didn't have a phobia about C++, you might write this:

#include <cstring>
struct ListNode {
   ListNode (const char *word) : count(1), next(NULL) {strcpy(data, word);}

   static ListNode* allocate(const char *word) {
      return new ListNode(word);
   }

   char data[LONGEST_WORD + 1];
   int count;
   ListNode* next;
};

ListNode* node = ListNode::allocate("foobar");
      

Of course, your PHB forgot that some people know even longer words. What happens when you try to create a ListNode with a longer word? You overflow the array and start scribbling all over count and then next and then who-knows-what.
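
To see why count is the first casualty, you can ask the compiler where each field lives. This is a small sketch using offsetof; the helper name bytes_before_count is made up, and the exact offsets depend on your platform's padding rules.

```c
#include <stddef.h>

#define LONGEST_WORD 19

struct ListNode {
   char data[LONGEST_WORD + 1];
   int count;
   struct ListNode* next;
};

/* Hypothetical helper: how many bytes can be written starting at data
   before the write reaches the count field. */
size_t bytes_before_count(void) {
   return offsetof(struct ListNode, count) - offsetof(struct ListNode, data);
}
```

On a typical platform this is 20, so the 21st byte that strcpy() writes lands in count, and next follows soon after.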

You try to fix this by changing around the order of elements:

struct ListNode {
   struct ListNode* next;
   int count;
   char data[LONGEST_WORD + 1];
};
      

That way, you're only scribbling over who-knows-what. Okay, so that's not exactly an improvement. So you decide there's never going to be a 100-character word.

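#define NODE_SLOP 100
struct ListNode* allocate_node(const char *word) {
   /* over-allocate so a long word spills into slop we own */
   struct ListNode* new_node = malloc(sizeof(struct ListNode) + NODE_SLOP);
   strcpy(new_node->data, word);
   new_node->count = 1;
   new_node->next = NULL;
   return new_node;
}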
      

Now, as long as the word is fewer than 120 characters or so, you're only scribbling over your own slop.
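
Where does "120 or so" come from? Here is a sketch of the arithmetic, assuming the reordered struct with data last; the helper name bytes_available_for_word is hypothetical.

```c
#include <stddef.h>

#define LONGEST_WORD 19
#define NODE_SLOP 100

struct ListNode {
   struct ListNode* next;
   int count;
   char data[LONGEST_WORD + 1];
};

/* Hypothetical helper: bytes available from the start of data to the end
   of a block allocated as sizeof(struct ListNode) + NODE_SLOP. */
size_t bytes_available_for_word(void) {
   return sizeof(struct ListNode) + NODE_SLOP - offsetof(struct ListNode, data);
}
```

That is the 20 bytes of data plus the 100 bytes of slop, plus any trailing padding the compiler adds to the struct. One of those bytes goes to the terminating '\0', so the longest safe word is one character shorter than the number returned.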

Finally, we come up with this robust solution:

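struct ListNode {
   struct ListNode* next;
   int count;
   char data[];   /* flexible array member (C99); older code writes data[0] */
};

struct ListNode* allocate_node(const char *word) {
   /* room for the fixed fields, the word, and its terminating NUL */
   struct ListNode* new_node = malloc(sizeof(struct ListNode) + strlen(word) + 1);
   strcpy(new_node->data, word);
   new_node->count = 1;
   new_node->next = NULL;
   return new_node;
}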
      

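Much to your surprise, this is legal C. A trailing array declared with no size, as in char data[];, is a flexible array member, part of the standard since C99. The older spelling char data[0]; is a GNU extension that many compilers accept, and standard C++ has neither, although compilers commonly allow it as an extension. The array adds no elements to sizeof(struct ListNode) (the compiler may still add trailing padding), so the malloc() call must supply the room for the string itself.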
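
To close the loop on the original mission, here is one way the word tally could be built on top of this node layout. It is a sketch only: the function names tally_word and free_list are made up for illustration, the list is unsorted, and new words are simply pushed onto the front.

```c
#include <stdlib.h>
#include <string.h>

struct ListNode {
   struct ListNode* next;
   int count;
   char data[];   /* flexible array member (C99) */
};

/* Adds one occurrence of word to the list at *head.
   Returns the word's updated count, or -1 if malloc() fails. */
int tally_word(struct ListNode** head, const char *word) {
   struct ListNode* node;
   for (node = *head; node != NULL; node = node->next) {
      if (strcmp(node->data, word) == 0) {
         return ++node->count;   /* seen before: bump the tally */
      }
   }
   node = malloc(sizeof(struct ListNode) + strlen(word) + 1);
   if (node == NULL) {
      return -1;
   }
   strcpy(node->data, word);
   node->count = 1;
   node->next = *head;   /* push the new word onto the front */
   *head = node;
   return 1;
}

/* Frees every node in the list. */
void free_list(struct ListNode* head) {
   while (head != NULL) {
      struct ListNode* next = head->next;
      free(head);
      head = next;
   }
}
```

Start with struct ListNode* head = NULL; and call tally_word(&head, word) once per word in the text. Each node is exactly as big as its word requires, so a 28-letter word costs no more worry than a 3-letter one.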

Memory Allocator

Coming soon. Watch this space.

Java has new/... (garbage collection); C++ has new/delete; C has malloc/free.