CSS 432
FAQ on Program 3: TCP Analysis


Q1: In how much detail should I draw a timing chart for hw3?

A: As much as possible.

dumptcp.sh shows all IP packets exchanged between the local host and a given remote host. For each packet, indicate all the information it reports, including timestamps, message sequence numbers, IP addresses, port numbers, and flag notations such as S, ., P, and F.

Q2: Can I reuse the code from HW1?

A: Yes

Q3: I don't know how to use shutdown().

A: Type "man 2 shutdown".

Actually all you need is shutdown( serverSd, SHUT_WR ); where serverSd is a socket descriptor obtained from the server's accept() call.
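As a minimal sketch of that call in context (the helper names finishWriting and peerFinishedWriting are hypothetical, not part of the assignment), the following wraps shutdown() and shows how the other side can detect the half-close. It assumes a connected stream socket.

```cpp
#include <sys/socket.h>  // shutdown(), recv(), SHUT_WR, MSG_PEEK

// Half-close a connected socket: tell the peer that this side has no more
// data to send, while still allowing this side to receive.
// Returns true if shutdown() succeeded.
bool finishWriting(int serverSd) {
    return shutdown(serverSd, SHUT_WR) == 0;
}

// Block until the peer either sends data or finishes writing.
// Returns true only when the peer has finished (recv() reports end of stream
// by returning 0). MSG_PEEK leaves any pending data in the socket buffer.
bool peerFinishedWriting(int sd) {
    char probe;
    return recv(sd, &probe, 1, MSG_PEEK) == 0;
}
```

After finishWriting( serverSd ), the socket can still be read, so the descriptor should still be closed with close() once the remaining data has been received.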

Q4: Do I have to use TCP_NODELAY in hw3.cpp like in hw1?

A: No, you should not use TCP_NODELAY.

Q5: When does an IP packet have a "P" notation? When does an IP packet have a "." notation?

A: An IP packet that carries the last data of a given write() call has the P (Push) flag set.

If you try to send data whose size is more than the MSS, it is segmented into several packets. All those packets except the very last one have a "." notation, but the last one has a "P" notation, indicating that it carries the last data of a given write() call.
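The rule above can be sketched as a small function. This is a simplified model only: it assumes each write() is segmented on its own, whereas a real TCP stack may coalesce or split data differently. The name segmentFlags is hypothetical, and the MSS of 1448 used in the usage note is taken from the 1448-byte segment in the Q10 trace.

```cpp
#include <cstddef>
#include <string>

// Simplified model of the notation tcpdump would show for one write() call:
// one character per segment, '.' for every segment except the last, and 'P'
// for the segment carrying the last data of the write.
std::string segmentFlags(std::size_t writeLen, std::size_t mss) {
    if (writeLen == 0 || mss == 0) {
        return "";  // nothing to send, or an invalid MSS
    }
    std::size_t segments = (writeLen + mss - 1) / mss;  // ceiling division
    std::string flags(segments - 1, '.');               // all but the last
    flags.push_back('P');                               // last segment
    return flags;
}
```

Under this model, segmentFlags(4000, 1448) yields "..P" (three segments), and any write no larger than the MSS yields just "P".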

Q6: Parameters passed to ttcp

I have a question about ttcp. You stated that the format for the arguments for ttcp is as follows:
       -l#     length of bufs read from or written to network (default 8192)
       -b#     set the socket buffer size if supported (default is 16384)
       -p#     specify another service port (default is 5001)
Should we have a space between the argument character and the number?

A: No, the number should follow the option without a space, for example -l8192 rather than -l 8192.

Q7: For Analysis 1 and 3, we are to make a graph or table in terms of Mbps. Should we look at the client or server Mbps? I know the two do not greatly differ.

A: Focus on the client Mbps.

Q8: Does the nagle-off option slow things down?

I've got an issue with strace and I'm not sure if it's supposed to be like this or not. When I'm running strace -c -e trace=write ./ttcp -l 64 -n 104857 and tcpdump at the same time (test#4), the finish times are really slow. I'm getting Mbps from 6.78407 (nagle off) to 10.3316 (nagle on). It takes 18 microseconds per write for nagle on and 51 microseconds per write for nagle off. Should it be that slow (almost 1 minute to finish with nagle off)? If I don't run strace, it's faster (~90 Mbps for nagle on, ~15 Mbps for nagle off). So are the results truly accurate when I'm running strace?

A: With nagle off, the performance should be degraded, as you saw. Your results are correct.

This is because the OS needs a certain amount of time to hand data to the NIC. With nagle off, small writes are transmitted immediately instead of being combined into larger packets, so the OS hands data to the NIC more often, which increases this overhead.

Q9: Using strace, the times are considerably slower than if I don't use strace

A: Since strace traces every single OS system call and prints it out to the display, it causes more frequent interrupts and thus slows the program down.

Q10: There are two independent sequences of packet IDs.

I expected there would be packets of data being sent to my server, but since I'm only listening on port 5001, I'm confused as to why I'm seeing the packet IDs change as if there were two sockets on the same port:
1273986401.734004 IP (tos 0x0, ttl  64, id 9891, offset 0, flags [DF], proto: TCP (6), length: 52) uw1-320-10.uwb.edu.35853 > uw1-320-11.uwb.edu.commplex-link: ., cksum 0x63d0 (correct), ack 1 win 46 < nop,nop,timestamp 98161313 99209533>
1273986401.734020 IP (tos 0x0, ttl  64, id 9892, offset 0, flags [DF], proto: TCP (6), length: 56) uw1-320-10.uwb.edu.35853 > uw1-320-11.uwb.edu.commplex-link: P, cksum 0x6344 (correct), 1:5(4) ack 1 win 46 < nop,nop,timestamp 98161313 99209533>
1273986401.734028 IP (tos 0x0, ttl  64, id 11268, offset 0, flags [DF], proto: TCP (6), length: 52) uw1-320-11.uwb.edu.commplex-link > uw1-320-10.uwb.edu.35853: ., cksum 0x63cc (correct), ack 5 win 46 < nop,nop,timestamp 99209533 98161313>
1273986401.734272 IP (tos 0x0, ttl  64, id 9893, offset 0, flags [DF], proto: TCP (6), length: 1500) uw1-320-10.uwb.edu.35853 > uw1-320-11.uwb.edu.commplex-link: . 5:1453(1448) ack 1 win 46 < nop,nop,timestamp 98161313 99209533>
1273986401.734302 IP (tos 0x0, ttl  64, id 11269, offset 0, flags [DF], proto: TCP (6), length: 52) uw1-320-11.uwb.edu.commplex-link > uw1-320-10.uwb.edu.35853: ., cksum 0x5e0e (correct), ack 1453 win 68 < nop,nop,timestamp 99209533 98161313>

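A: Both the client and the corresponding server have their own packet IDs. The id field is the IP identification value, which each host assigns independently to the packets it sends. In your trace, ids 9891 to 9893 come from uw1-320-10 (the client) and ids 11268 and 11269 come from uw1-320-11 (the server), so you are seeing one connection, not two sockets.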

Q11: hw3 bad checksum question

I notice that at a certain point one packet always returns with a bad checksum; it's always the same packet in the same place. Are we supposed to hard code a bad checksum in our program? That seems like a weird thing to do so I wanted to make sure.

A: Just ignore this problem. :-)

As mentioned in the last class, this checksum error occurs on a packet returned from the server to the client. Note that tcpdump checks only the data correctness of packets from the client to the server. So, ignore this problem.

Q12: For the output from the strace -ttT ttcp call, I forgot how we are supposed to interpret the data

Here's some sample output:
------------------------------------------------------------------------------------------------------------------------------
 ...
19:16:29.574171 write(3, "\4\0\0\0\3245\212\0\270\305\266\277i>\204\0\3\0\0\0\1\0\0\0\4\0\0\0\0\0\0\0"..., 64) = 64 < 0.000018>
19:16:29.574265 write(3, "\4\0\0\0\3245\212\0\270\305\266\277i>\204\0\3\0\0\0\1\0\0\0\4\0\0\0\0\0\0\0"..., 64) = 64 < 0.000015>
19:16:29.574343 write(3, "\4\0\0\0\3245\212\0\270\305\266\277i>\204\0\3\0\0\0\1\0\0\0\4\0\0\0\0\0\0\0"..., 64) = 64 < 0.000014>
 ...
------------------------------------------------------------------------------------------------------------------------------

I remember hearing something about the timestamp in the beginning (i.e. 19:16:29.574171),
and something about the stuff at the end of the line (i.e. < 0.000018>)?

A: Each lap time is the time elapsed while handling that write() system call.

The timestamp at the beginning is the time at which the system call was invoked; the value at the right is the lap time, that is, the total time the system call took to complete. For example, for the first write() call:
start time: 19:16:29.574171
+lap time:         0.000018
end time:   19:16:29.574189
Notice that the end time of the first system call is slightly earlier than the start time of the next system call (19:16:29.574265). The additional 0.000076 seconds is the kernel doing other jobs (such as process scheduling). To get more accurate results, as I mentioned briefly in the last class, take the start time of the very first write(3, ...) call and the start time plus the lap time of the very last write(3, ...) call, then divide the difference by the number of write() calls to calculate the average.
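The calculation can be sketched as follows. The function names toMicros and averageWriteMicros are hypothetical, the sketch assumes strace -tt timestamps always carry six fractional digits (as in the sample output), and it does not handle a run that crosses midnight.

```cpp
#include <cstdio>

// Convert an strace -tt timestamp "HH:MM:SS.uuuuuu" into microseconds since
// midnight. Returns -1 if the string does not match that layout.
long long toMicros(const char* stamp) {
    int h = 0, m = 0, s = 0, us = 0;
    if (std::sscanf(stamp, "%d:%d:%d.%6d", &h, &m, &s, &us) != 4) {
        return -1;
    }
    return ((h * 60LL + m) * 60 + s) * 1000000LL + us;
}

// Average time per write() in microseconds, measured from the start of the
// first call to the end (start + lap time) of the last call. This includes
// the time the kernel spends on other jobs between calls.
double averageWriteMicros(const char* firstStart, const char* lastStart,
                          long long lastLapMicros, long long writeCount) {
    long long total = toMicros(lastStart) + lastLapMicros - toMicros(firstStart);
    return static_cast<double>(total) / writeCount;
}
```

For the three sample lines in the question, averageWriteMicros("19:16:29.574171", "19:16:29.574343", 14, 3) covers 186 microseconds in total, which averages to 62 microseconds per write.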