CSS 432
FAQ on Program 3: TCP Analysis
Q1: In how much detail should I draw the timing chart for hw3?
A: In as much detail as possible.
dumptcp.sh shows all IP packets exchanged between the local
host and a given remote host. For each packet, indicate all the
available information, including timestamps, message sequence numbers, IP
addresses, port numbers, and flag notations such as S, ., P, and F.
Q2: Can I reuse the code from HW1?
A: Yes
Q3: I don't know how to use shutdown().
A: Type "man 2 shutdown".
Actually all you need is shutdown( serverSd, SHUT_WR ); where
serverSd is a socket descriptor obtained from the server's
accept() call.
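As a rough illustration of what shutdown( serverSd, SHUT_WR ) does, here is a minimal sketch. It is not part of the hw3 solution: the names readUntilEof and halfCloseDemo are made up for this example, and a socketpair() stands in for a real client/server connection so the code runs on its own. On a real TCP connection the same call is what causes the server side to send its FIN.

```cpp
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <string>

// Reads from sd until the peer has shut down its write side (read() returns 0).
std::string readUntilEof(int sd) {
    assert(sd >= 0);
    std::string data;
    char buf[256];
    ssize_t n;
    while ((n = read(sd, buf, sizeof(buf))) > 0) {
        data.append(buf, static_cast<size_t>(n));
    }
    return data;
}

// Hypothetical demo: a connected socket pair stands in for the
// descriptors a real client and server would hold.
// Returns true if the half-close behaved as expected.
bool halfCloseDemo() {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return false;
    int serverSd = sv[0];
    int clientSd = sv[1];

    const std::string reply = "done";
    if (write(serverSd, reply.data(), reply.size()) !=
        static_cast<ssize_t>(reply.size())) return false;
    shutdown(serverSd, SHUT_WR);  // server will not write any more

    // The client still receives the data, followed by end-of-file.
    bool ok = (readUntilEof(clientSd) == reply);

    // SHUT_WR closes only one direction: the server can still read.
    const std::string more = "ack";
    if (write(clientSd, more.data(), more.size()) !=
        static_cast<ssize_t>(more.size())) ok = false;
    close(clientSd);
    ok = ok && (readUntilEof(serverSd) == more);
    close(serverSd);
    return ok;
}
```

Calling halfCloseDemo() from main() should return true. The point to notice is that after SHUT_WR the socket is only half closed, so the server can keep reading until the client finishes.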
Q4: Do I have to use TCP_NODELAY in hw3.cpp like in hw1?
A: No, you should not use TCP_NODELAY.
Q5: When does an IP packet have a "P" notation? When does an IP
packet have a "." notation?
A: An IP packet that carries the last data of a given write()
call has the P (Push) flag set.
If you try to send data whose size is more than the MSS, it is segmented
into several packets. All of those packets except the very last one have a
"." notation, but the last one has a "P" notation, indicating that it
carries the last data of that write() call.
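The rule above can be sketched as a small function. This is a simplified model only: segmentFlags is a hypothetical helper, it assumes every segment except the last is exactly one MSS long, and it ignores anything else the TCP stack may do (such as combining data from several writes). The value 1448 used in the example is an assumption taken from the payload size seen in the tcpdump sample in Q10.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns one character per segment that a single write() of
// writeBytes would produce under the simplified rule:
// '.' for every segment except the last, 'P' for the last one.
std::string segmentFlags(std::size_t writeBytes, std::size_t mss) {
    assert(mss > 0);
    std::string flags;
    while (writeBytes > 0) {
        // Each segment carries at most one MSS of data.
        std::size_t payload = writeBytes < mss ? writeBytes : mss;
        writeBytes -= payload;
        // Only the segment carrying the final bytes gets the Push flag.
        flags += (writeBytes == 0) ? 'P' : '.';
    }
    return flags;
}
```

For example, segmentFlags(4000, 1448) returns "..P" (1448 + 1448 + 1104 bytes), while segmentFlags(64, 1448) returns "P" because the whole write fits in one packet.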
Q6: Parameters passed to ttcp
I have a question about ttcp. You stated that the format for the
arguments for ttcp is as follows:
-l# length of bufs read from or written to network (default 8192)
-b# set the socket buffer size if supported (default is 16384)
-p# specify another service port (default is 5001)
Should we have a space between the argument character and the number?
A: No, the number should follow the option without a space.
Q7: For Analysis 1 and 3, we are to make a graph or table in terms
of Mbps. Should we look at the client or server Mbps? I know the two
do not greatly differ.
A: Focus on the client Mbps.
Q8: Does the nagle-off option slow things down?
I've got an issue with strace and I'm not sure if it's supposed to be
like this or not. When I'm running
strace -c -e trace=write ./ttcp -l 64 -n 104857
and tcpdump at the same time (test#4), the finish times
are really slow. I'm getting Mbps from 6.78407 (nagle off) to
10.3316 (nagle on). It takes 18 microseconds per write for nagle on
and 51 microseconds per write for nagle off. Should it be that
slow (almost 1 minute to finish with nagle off)? If I don't run
strace, it's faster (~90 Mbps for nagle on, ~15 Mbps for nagle off).
So are the results truly accurate when I'm running strace?
A: With Nagle's algorithm off, performance should degrade, as you
saw in your results, which are correct.
This is because the OS needs a certain amount of time to hand data to the NIC, and
with Nagle off it has to do so more frequently, which increases
this OS overhead.
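For reference only, turning Nagle's algorithm off on a socket is normally done with the TCP_NODELAY socket option. The sketch below shows that mechanism; setNagle and nagleEnabled are hypothetical helper names, and this is an assumption about what a "nagle off" switch does under the hood, not a description of ttcp's source. As stated in Q4, you should not add this to hw3.cpp.

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>

// Turns Nagle's algorithm on or off for a TCP socket.
// TCP_NODELAY = 1 means "Nagle off"; TCP_NODELAY = 0 means "Nagle on".
bool setNagle(int sd, bool enabled) {
    assert(sd >= 0);
    int noDelay = enabled ? 0 : 1;
    return setsockopt(sd, IPPROTO_TCP, TCP_NODELAY,
                      &noDelay, sizeof(noDelay)) == 0;
}

// Reports whether Nagle's algorithm is currently on for this socket.
bool nagleEnabled(int sd) {
    assert(sd >= 0);
    int noDelay = 0;
    socklen_t len = sizeof(noDelay);
    if (getsockopt(sd, IPPROTO_TCP, TCP_NODELAY, &noDelay, &len) != 0) {
        return false;  // treat a failed query as "unknown/off" in this sketch
    }
    return noDelay == 0;
}
```

A freshly created TCP socket has Nagle's algorithm on by default, so nagleEnabled() on a new socket returns true until setNagle(sd, false) is called.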
Q9: Using strace, the times are considerably slower than if I
don't use strace.
A: Since strace traces every single OS system call and prints it
out to the display, it causes more frequent interrupts and thus slows
the program down.
Q10: There are two independent sequences of packet IDs.
I expected to see packets of data being sent to my server,
but since I'm only listening on port 5001, I'm confused as to
why I'm seeing the packet IDs change as if there were two sockets on
the same port:
1273986401.734004 IP (tos 0x0, ttl 64, id 9891, offset 0, flags [DF], proto: TCP (6), length: 52) uw1-320-10.uwb.edu.35853 > uw1-320-11.uwb.edu.commplex-link: ., cksum 0x63d0 (correct), ack 1 win 46 < nop,nop,timestamp 98161313 99209533>
1273986401.734020 IP (tos 0x0, ttl 64, id 9892, offset 0, flags [DF], proto: TCP (6), length: 56) uw1-320-10.uwb.edu.35853 > uw1-320-11.uwb.edu.commplex-link: P, cksum 0x6344 (correct), 1:5(4) ack 1 win 46 < nop,nop,timestamp 98161313 99209533>
1273986401.734028 IP (tos 0x0, ttl 64, id 11268, offset 0, flags [DF], proto: TCP (6), length: 52) uw1-320-11.uwb.edu.commplex-link > uw1-320-10.uwb.edu.35853: ., cksum 0x63cc (correct), ack 5 win 46 < nop,nop,timestamp 99209533 98161313>
1273986401.734272 IP (tos 0x0, ttl 64, id 9893, offset 0, flags [DF], proto: TCP (6), length: 1500) uw1-320-10.uwb.edu.35853 > uw1-320-11.uwb.edu.commplex-link: . 5:1453(1448) ack 1 win 46 < nop,nop,timestamp 98161313 99209533>
1273986401.734302 IP (tos 0x0, ttl 64, id 11269, offset 0, flags [DF], proto: TCP (6), length: 52) uw1-320-11.uwb.edu.commplex-link > uw1-320-10.uwb.edu.35853: ., cksum 0x5e0e (correct), ack 1453 win 68 < nop,nop,timestamp 99209533 98161313>
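A: Both the client and the corresponding server have their own packet IDs.
Each host numbers the packets it sends independently: in the trace above,
ids 9891 to 9893 come from uw1-320-10, and ids 11268 and 11269 come from
uw1-320-11.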
Q11: hw3 bad checksum question
I notice that at a certain point one packet always returns with a bad
checksum; it's always the same packet in the same place. Are we supposed
to hard code a bad checksum in our program? That seems like a weird
thing to do so I wanted to make sure.
A: Just ignore this problem. :-)
As mentioned in the last class, this checksum error is reported for a
packet returned from the server to the client. Note that tcpdump is
checking only the data correctness of packets from the client to the
server. So, ignore this problem.
Q12: For the output from the strace -ttT ttcp call, I forgot how
we are supposed to interpret the data.
Here's some sample output:
------------------------------------------------------------------------------------------------------------------------------
...
19:16:29.574171 write(3, "\4\0\0\0\3245\212\0\270\305\266\277i>\204\0\3\0\0\0\1\0\0\0\4\0\0\0\0\0\0\0"..., 64) = 64 < 0.000018>
19:16:29.574265 write(3, "\4\0\0\0\3245\212\0\270\305\266\277i>\204\0\3\0\0\0\1\0\0\0\4\0\0\0\0\0\0\0"..., 64) = 64 < 0.000015>
19:16:29.574343 write(3, "\4\0\0\0\3245\212\0\270\305\266\277i>\204\0\3\0\0\0\1\0\0\0\4\0\0\0\0\0\0\0"..., 64) = 64 < 0.000014>
...
------------------------------------------------------------------------------------------------------------------------------
I remember hearing something about the timestamp in the beginning (i.e. 19:16:29.574171),
and something about the stuff at the end of the line (i.e. < 0.000018>)?
A: Each lap time is the time elapsed in handling that write system call.
The timestamp at the beginning is the time at which the system call
was made; the time at the right is a lap time that shows the
total time the system call took to complete. For example, for the first
write() call:
start time: 19:16:29.574171
+lap time: 0.000018
end time: 19:16:29.574189
Notice that the end time of the first system call is a bit earlier than the start
time of the next system call (19:16:29.574265). The additional 0.000076 seconds is the kernel doing
other jobs (such as process scheduling).
To get more accurate results, as I mentioned briefly in the last class,
you should take the start time of the very first write(3, ...)
call and the end time (start time plus lap time) of the very last
write(3, ...) call, and divide the difference by the number of write()
calls to get the average.
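The arithmetic in this answer can be sketched in code as follows. All function names here are hypothetical, and the averaging step reflects one reading of the advice above: elapsed time from the first call's start to the last call's end, divided by the number of write() calls. The parser assumes strace's six-digit microsecond format.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// "HH:MM:SS.uuuuuu" (strace -tt style) -> microseconds since midnight.
// Returns -1 if the string does not match.
long long toMicros(const std::string& ts) {
    int h, m, s, us;
    if (std::sscanf(ts.c_str(), "%d:%d:%d.%6d", &h, &m, &s, &us) != 4) return -1;
    return ((h * 60LL + m) * 60 + s) * 1000000 + us;
}

// Lap time such as "0.000018" (strace -T style, without the angle
// brackets) -> microseconds. Returns -1 if the string does not match.
long long lapToMicros(const std::string& lap) {
    int s, us;
    if (std::sscanf(lap.c_str(), "%d.%6d", &s, &us) != 2) return -1;
    return s * 1000000LL + us;
}

// Microseconds since midnight -> "HH:MM:SS.uuuuuu".
std::string formatMicros(long long t) {
    char buf[32];
    long long secs = t / 1000000;
    std::snprintf(buf, sizeof(buf), "%02lld:%02lld:%02lld.%06lld",
                  secs / 3600, (secs / 60) % 60, secs % 60, t % 1000000);
    return buf;
}

// Average microseconds per write: from the first call's start to the
// last call's end (start + lap), divided by the number of calls.
double averageMicrosPerWrite(long long firstStart, long long lastStart,
                             long long lastLap, long long writeCount) {
    assert(writeCount > 0);
    return static_cast<double>(lastStart + lastLap - firstStart) / writeCount;
}
```

With the sample lines above, toMicros("19:16:29.574171") plus lapToMicros("0.000018") formats back to 19:16:29.574189, and the gap to the next start time (19:16:29.574265) is 76 microseconds. Treating the three sample lines as if they were the whole run, purely for illustration, averageMicrosPerWrite gives 62 microseconds per write, which includes the gaps between calls as well as the lap times.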