Linked lists should be familiar to you from CSS 342, so this exercise is partially review. What is new in this assignment is the requirement for low-level memory management.
The purpose of this assignment is to implement a simple,
low-level, concrete data structure using
the C subset of C++. Do
not
use any standard library functions except for C or C++ style I/O
and (optionally) the C strings library functions
#include <cstring>
. You may use any standard
library data type (Plain Old Data, not C++ classes).
Unadorned output should be sent to standard output. Any logging/trace/diagnostic/status/verbose messages should go to standard error. The program will be run against the instructor's test data with the output compared to a reference file so standard output needs to be clean.
Submit a script named
Run
that will compile the source and run your test cases. Include any
data files needed to run your tests.
In addition to the executable program,
for for each source file
foo.c
,
Run
should generate
foo.i
foo.s
foo.o
Name the excutable
wordcount
so it will work with the instructor's test harness.
Optionally, submit a text file named
README
with any relevant notes.
The script should run on Linux. Bash shell scripts are an
appropriate choice. For extra credit, include a
Makefile
and have
Run
invoke
make
.
Implement a memory allocator per the functions declared in
allocator.h
.
Do not use
malloc()
or new
to allocate memory. Instead, define the array
Arena[]
to be of size
ARENA_SIZE
.
Memory will be allocated and deallocated from
Arena[]
.
You will have to keep track of the blocks of free memory, spliting them as memory is allocated. When memory is allocated, the allocator needs to retain metadata about the block so that it can be returned to the pool.
Allocated memory should be aligned with pointer boundaries.
If memory cannot be allocated for some request, terminate the program with an error exit status.
To implement the allocator, you will have to convert between
pointers and integer types in order to compute offsets. Ensure
that your integral type is large enough to hold a pointer value
(hint: consider the standard library types
ptrdiff_t
and
size_t
).
Write a program that will insert counters and C-style strings
(array of
char
ending with a NUL character) into a singly-linked circular list,
according to the functions declared in
list.h
.
Define the node structure like this:
struct ListNode {
struct ListNode* next;
int count;
char data[0];
};
Note the zero-length
data
field. In C, even empty strings contain a byte for the NUL
character. This means that any string will overflow
struct
.
When allocating a
ListNode
,
you need to make sure to allocate enough space for the overflow.
This also means you cannot copy or reuse nodes.
Use the memory allocator from part 1.
The driver program should read data from standard input.
Each line contains a code character
a
or
d
followed by a space, followed by a word.
a
,
check to see if it is already in the list. If found,
increment its counter. Otherwise, add the word to the list
with count 1.
d
,
find the word in the list and decrement its counter. If the
counter drops to zero, delete the word from the list. If the
word is not found, the program should terminate with an error
exit status.
The “word” and hence the input line could be arbitrarily long. In practice, it is probably going to be less than, say 127 characters. You may terminate the program with an error exit status if the word is 128 characters or greater.
You may want to write a helper function,
locate()
which will return the predecessor node of a given word (if found).
After the input is read, walk through the list, printing each word
and its count. The output format should be compatible with
uniq -c
(7-character right-justified count field, space, word)
but order does not matter.
Make sure you cover the following test cases:
realloc()
for this case.
malloc()
or
Doug Lea's memory allocator
.
malloc()
and
free()
.
This thin wrapper makes benchmarking and validation easier.
ARENA_SIZE
on the compiler command line (in gcc/g++)?
ARENA_SIZE
be defined only on the command
line? If ARENA_SIZE
is not defined on command
line, what is a clean way to force a compile-time error?
malloc()
returns a pointer to allocated memory which is passed back to
free()
.
How does free know the size of the block that was
returned? How does a negative pointer offset work? Is that
legal according to the C/C++ language standards?
struct
.
How does that work? How does a zero-length field affect the
size of the structure? Is this legal according to the language
standard?