

TCP Experiment


In this experiment you will compare the Reno and CUBIC TCP congestion control algorithms and observe how delay, loss, and reordering affect TCP goodput. We will use the following network topology for this experiment:
TCP topology


This tutorial expects that you have completed lab 0 and know how to create a new sliver with an existing rspec. If you have not completed lab 0 please do that first.


If you are using ExoGENI you may have to run
sudo apt-get install iperf
to install iperf. All other tools are already installed on your nodes.

Where to get help:

For any questions or problems with the tutorial, ask your TA or professor for help. If a GENI tool is not working correctly please email


Set Up

If you are using the Portal, create a sliver with the tcp-openvz rspec on InstaGENI. For ExoGENI, download the rspec here and pass -a eg-gpo to omni to select the aggregate manager.

Note: when you try to log in to the nodes, you might encounter errors like the following:

Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
Please contact your system administrator.
Add correct host key in /Users/shuang/.ssh/known_hosts to get rid of this message.
Offending RSA key in /Users/shuang/.ssh/known_hosts:20
RSA host key for has changed and you have requested strict checking.
Host key verification failed.

This type of error happens when you have previously logged in to a machine with the same IP address, and that machine is not the same one as the VM you are now trying to log in to.
As a result, this error can occur often, especially if you have run experiments on ExoGENI before (it always uses the same IP range for its VMs, and each VM has its own RSA key).
In this case, simply open ~/.ssh/known_hosts and remove the line corresponding to the IP address you are trying to log in to. The error message reports the offending line (line 20 of known_hosts in the example above).
Note that nodes being ready for login does not mean that their interfaces are up and configured. You may need to wait up to 5 minutes after the nodes come up.


Now that we have reserved the nodes, let's log on to each node and run the experiment.
You can find the nodes you reserved from the output of "createsliver" in omni.
Or you can use "" to show the topology as well as the log in commands.
Or, if you are a GENI Portal user, use the "details" button to check details of your slice.

Useful commands:
Change the congestion control algorithm in use:

echo reno | sudo tee /proc/sys/net/ipv4/tcp_congestion_control
echo cubic | sudo tee /proc/sys/net/ipv4/tcp_congestion_control

Change the delay/loss of a particular interface:

sudo /sbin/tc qdisc add dev eth1 root handle 1:0 netem delay 200ms loss 5%

Remove the emulated delay/loss from an interface:

sudo /sbin/tc qdisc del dev eth1 root

Configuring delay/loss on a virtual machine is a little trickier.
Step 1: find the qdisc handle numbers by executing "sudo /sbin/tc qdisc". Sample output looks like the following:

[shufeng@center ~]$ sudo /sbin/tc qdisc
qdisc htb 270: dev mv6.47 root refcnt 2 r2q 10 default 1 direct_packets_stat 0
qdisc netem 260: dev mv6.47 parent 270:1 limit 1000
qdisc htb 150: dev mv6.41 root refcnt 2 r2q 10 default 1 direct_packets_stat 0
qdisc netem 140: dev mv6.41 parent 150:1 limit 1000
qdisc htb 190: dev mv6.43 root refcnt 2 r2q 10 default 1 direct_packets_stat 0
qdisc netem 180: dev mv6.43 parent 190:1 limit 1000
qdisc htb 230: dev mv6.45 root refcnt 2 r2q 10 default 1 direct_packets_stat 0
qdisc netem 220: dev mv6.45 parent 230:1 limit 1000

Now, if the interface you want to change is mv6.43, find the following lines in the output:

qdisc htb 190: dev mv6.43 root refcnt 2 r2q 10 default 1 direct_packets_stat 0
qdisc netem 180: dev mv6.43 parent 190:1 limit 1000

You can then change the delay/loss with the first command below; the second command restores the original settings (no added delay or loss):

sudo /sbin/tc -s qdisc change dev mv6.43 parent 190:1 handle 180: netem limit 1000 delay 100ms loss 5%
sudo /sbin/tc -s qdisc change dev mv6.43 parent 190:1 handle 180: netem limit 1000
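
The lookup described above can also be scripted. The following is a minimal sketch, not a GENI tool: netem_change_cmd is a hypothetical helper that reads the output of "tc qdisc" on standard input, finds the netem line for the given device, and prints the corresponding change command so that you can review it before running it. It assumes the output has the same layout as the sample above.

```shell
#!/bin/sh
# Hypothetical helper (not part of GENI or tc): print the "tc qdisc change"
# command for one device, given `tc qdisc` output on standard input.
# Assumes netem lines look like the sample above:
#   qdisc netem <handle> dev <device> parent <parent> limit <n>
netem_change_cmd() {
    dev="$1"   # interface name, e.g. mv6.43
    shift      # remaining arguments are netem options, e.g. delay 100ms loss 5%
    awk -v dev="$dev" -v opts="$*" '
        $1 == "qdisc" && $2 == "netem" && $5 == dev {
            # $3 is the netem handle, $7 its parent class, $9 the current limit
            printf "sudo /sbin/tc qdisc change dev %s parent %s handle %s netem limit %s %s\n", dev, $7, $3, $9, opts
        }'
}

# Usage (prints the command, does not run it):
#   sudo /sbin/tc qdisc | netem_change_cmd mv6.43 delay 100ms loss 5%
```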

GENI nodes provide two TCP congestion control algorithms, CUBIC and Reno, that can be chosen at run time.
The available algorithms are listed in the file /proc/sys/net/ipv4/tcp_available_congestion_control.
The “Reno” congestion control provided by the Linux kernel is actually the NewReno algorithm, but we will refer to it as Reno here to be consistent with Linux terminology.
Note that congestion control actions are very similar between Reno and NewReno, but NewReno has a more nuanced approach to loss recovery.
These congestion control algorithms can be chosen by placing the keywords reno or cubic in the file /proc/sys/net/ipv4/tcp_congestion_control. For example, to configure a host to use the Reno algorithm, use:

echo reno | sudo tee /proc/sys/net/ipv4/tcp_congestion_control

If you get a permission error on InstaGENI, first run

sudo bash
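
Before selecting an algorithm, you can confirm that the kernel actually offers it. The following sketch checks a name against the list of available algorithms. cc_available is a hypothetical helper, and its optional second argument (an alternative list file) exists only to make the function easy to try out without touching /proc.

```shell
#!/bin/sh
# Hypothetical helper: succeed if algorithm $1 appears in the list file $2
# (default: the kernel's list of available congestion control algorithms).
cc_available() {
    algo="$1"
    list="${2:-/proc/sys/net/ipv4/tcp_available_congestion_control}"
    # The file holds one line of space-separated names, e.g. "cubic reno".
    # Split it into one name per line and look for an exact, literal match.
    tr ' ' '\n' < "$list" | grep -qxF "$algo"
}

# Usage:
#   if cc_available reno; then
#       echo reno | sudo tee /proc/sys/net/ipv4/tcp_congestion_control
#   fi
```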

The tc command will then be used to set up network conditions for observation and testing. For example, if eth1 is the physical interface representing the link L on the Center node, the following command on the Center node will add a 200 ms delay to all packets leaving the interface:

sudo /sbin/tc qdisc add dev eth1 root handle 1:0 netem delay 200ms

Specific network setup commands will be provided as needed.


3.1 Comparison of Reno and CUBIC:

Run an Iperf server on the Left node. On InstaGENI this will be

/usr/local/etc/emulab/emulab-iperf -s 

On ExoGENI the command is

 iperf -s

The Iperf client will be run on the Right node. On InstaGENI:

/usr/local/etc/emulab/emulab-iperf -c -t 60

or ExoGENI:

iperf -c -t 60

Note that the IP address specified in the command on the Right node should be the IP of the Left node and may be different from this example. The duration of an Iperf session (-t option) is 60 seconds unless otherwise mentioned. Note carefully that some exercises require a much longer duration. Ensure that your sliver lifetimes are long enough to cover the duration of your experiment. All of the experiments should be repeated at least 5 times (especially when the interfaces include random delays or losses) to ensure confidence in the results, as transient conditions can cause significant variations in any individual run.
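
Since every experiment is repeated several times, it helps to summarize the runs by their mean and spread. The following is a minimal sketch: goodput_stats is a hypothetical helper, and the numbers in the usage comment are illustrative placeholders, not measurements from this topology.

```shell
#!/bin/sh
# Hypothetical helper: print the mean and sample standard deviation of the
# goodput values given as arguments (one value per run, all in the same unit).
goodput_stats() {
    printf '%s\n' "$@" | awk '
        { n++; sum += $1; sumsq += $1 * $1 }
        END {
            mean = sum / n
            # Sample variance (n - 1 in the denominator); zero for a single run.
            var = (n > 1) ? (sumsq - n * mean * mean) / (n - 1) : 0
            if (var < 0) var = 0   # guard against floating-point rounding
            printf "mean=%.2f stddev=%.2f\n", mean, sqrt(var)
        }'
}

# Usage with placeholder values (Mbits/sec from five runs):
#   goodput_stats 90 92 94 96 98
#   prints: mean=94.00 stddev=3.16
```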

  1. Question: What are the goodputs (throughputs as seen in iperf) when the Reno and CUBIC algorithms are used on the network with no emulated delay or loss? Which is better?
  2. Question: Qualitatively, under what conditions does BIC/CUBIC perform better than Reno’s AIMD?
  3. Question: Change the delay of interface L to 300 ms using the following command, and run an Iperf session for 1800 seconds.
    sudo /sbin/tc qdisc add dev L root handle 1:0 netem limit 1000000000 delay 300ms
    where the interface L is the interface on the traffic controller connected to the Left node. What are the goodputs of Reno and CUBIC? Which performed better? What do you conclude?
  4. Question: Repeat the above experiment with 30 parallel connections and 1800 seconds for each algorithm by using the -P 30 option on iperf. How do CUBIC and Reno differ? What do you conclude?
  5. Question: Remove the netem queueing discipline which causes delay and add a loss of 5% by using the following commands on the center node. Replace L with the appropriate physical interface. Alternatively, one can change a queueing discipline instead of deleting and adding a new one.
    sudo /sbin/tc qdisc del dev L root
    sudo /sbin/tc qdisc add dev L root handle 1:0 netem loss 5%
    How do the goodputs of Reno and CUBIC differ under loss for 60 s Iperf sessions?
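
Question 3 raises the netem limit far above its default. The netem limit is counted in packets (the default is 1000), and a link with a long delay can have more packets in flight than that. The following sketch estimates the bandwidth-delay product. bdp is a hypothetical helper, and the 100 Mbit/s link rate and 1500-byte packet size used in the example are assumptions, not properties of your slice.

```shell
#!/bin/sh
# Hypothetical helper: bandwidth-delay product in bytes and in packets.
#   $1 = link rate in Mbit/s
#   $2 = delay in milliseconds
#   $3 = packet size in bytes (optional, assumed 1500 if omitted)
bdp() {
    awk -v rate="$1" -v delay="$2" -v pkt="${3:-1500}" 'BEGIN {
        # bits/s -> bytes/s, then multiply by the delay in seconds
        bytes = rate * 1000000 / 8 * delay / 1000
        printf "%d bytes, %d packets\n", bytes, bytes / pkt
    }'
}

# Example under the assumptions above (100 Mbit/s, 300 ms):
#   bdp 100 300
#   prints: 3750000 bytes, 2500 packets
```

Under these assumed numbers the pipe holds about 2500 packets, which is already more than the default netem limit of 1000.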

3.2 Ensuring Fairness Among Flows

Restore the network state with the following command:

sudo /sbin/tc qdisc del dev L root

Run an Iperf client on the Right node with 10 parallel TCP connections (use the -P option), connecting to an Iperf server on the Left node for 60 seconds. Simultaneously, run a 20 Mbps UDP Iperf client on the Top node connecting to a UDP Iperf server session running on the Left node for 60 seconds.

  1. Question: What are the throughputs shown by the UDP and TCP Iperf server sessions? Why are they what they are?
  2. Question: Provide the necessary steps and commands to enable queueing disciplines that enforce fairness among all the 11 flows in the network, and demonstrate that your solution is effective.

3.3 Reordering

Delete the previous queuing discipline and use the following netem configuration on interface L to create a 100 ms delay:

sudo /sbin/tc qdisc del dev L root
sudo /sbin/tc qdisc add dev L root handle 1:0 netem delay 100ms

As before, run a TCP Iperf client on the Right node connecting to an Iperf server on the Left node for 60 seconds.

  1. Question: What is the TCP goodput?
  2. Question: Introduce packet reordering, adding a 75 ms delay variance to the interface L with the following command:
     sudo /sbin/tc qdisc change dev L root handle 1:0 netem delay 100ms 75ms
    What is the TCP goodput now?
  3. Question: By tweaking the parameters in the file /proc/sys/net/ipv4/tcp_reordering, how much can the TCP goodput be improved? What is the best goodput you can show? Why is a value that is too high or too low bad for TCP?

3.4 Performance of SACK under Lossy Conditions

Using CUBIC as the congestion avoidance algorithm, set the loss characteristics on interface L using the following commands:

sudo /sbin/tc qdisc del dev L root
sudo /sbin/tc qdisc add dev L root handle 1:0 netem loss 10%
  1. Question: What kind of goodput do you get using CUBIC with SACK (the default configuration)? Why do you see this performance?
  2. Question: Disable SACK at the sender using this command:
      echo 0 | sudo tee /proc/sys/net/ipv4/tcp_sack
    What is the goodput without SACK? In what circumstances is SACK most beneficial? Remember that, due to the random nature of loss events, these experiments must be repeated at least five times to draw any conclusions.

3.5 An Experimental Congestion Avoidance module for Linux

Source code needed (to-be changed by you):

  • Makefile
  • tcp_exp.c
    In this exercise, you will develop and evaluate a TCP congestion control module for the Linux kernel. Linux provides a pluggable interface for TCP congestion control, which allows named congestion control modules to manipulate its sending rate and reaction to congestion events. You have already used the reno and cubic modules, and in this exercise you will create one named exp.
    Linux kernel modules must be compiled against kernel source that matches the kernel into which the module will be loaded. In order to prepare your ProtoGENI host for kernel module development, follow these steps:
  1. Comment out the line:
    exclude=mkinitrd* kernel*
    in the file /etc/yum.conf, to allow yum to install kernel headers.
  2. Install the required packages with this command:
    sudo yum install kernel-devel kernel-headers
  3. Fix up the kernel version in the installed headers to match the running kernel; this can be tricky, but these steps should handle it.
    (a) Find your kernel sources. They are in /usr/src/kernel, in a directory that depends on the installed version. As of the time this handout was created, that directory is We will call this directory $KERNELSRC.
    (b) Identify your running kernel version by running uname -r. It will be something like The first three dotted components (2.6.27, in this case) are the major, minor, and micro versions, respectively, and the remainder of the version string (.5-117.emulab.fc10.i686) is the extraversion. Note the extraversion of your kernel.
    (c) In $KERNELSRC/Makefile, find the line beginning with EXTRAVERSION. Replace its value with the extraversion of your kernel.
    (d) Update the kernel header tree to this new version by running the command:
    sudo make include/linux/utsrelease.h
    More details to handle version issues are provided at Building modules for a precompiled kernel.
    A Makefile for compiling the module and the source for a stub TCP congestion control module (tcp_exp.c) are provided in the source files listed above.
    The module is named tcp_exp (for experimental TCP), and the congestion control algorithm is named exp. Comments in the provided source file explain the relationship between the various functions, and more information can be found in Pluggable congestion avoidance modules.
    The compiled module (which is built with make and called tcp_exp.ko) can be inserted into the kernel using insmod. It can be removed using the command rmmod tcp_exp and reloaded with insmod if changes are required.
    Once the module is complete and loaded into the kernel, the algorithm implemented by the module can be selected in the same manner that reno and cubic were selected in previous exercises, by placing the keyword exp in /proc/sys/net/ipv4/tcp_congestion_control.

3.5.1 Algorithm Requirements
The experimental congestion control module is based on Reno, but has the following modifications:

  • It uses a Slow Start exponential factor of 3. Reno uses 2.
  • It cuts ssthresh to 3 × FlightSize/4 when entering loss recovery. Reno cuts to FlightSize/2.
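
To get a feel for what the two modifications change, the following sketch models them in user space. It is an idealized model (cwnd is counted in whole segments and multiplied once per round trip), not kernel code, and both helper names are hypothetical.

```shell
#!/bin/sh
# Idealized slow start: count the round trips needed for cwnd to grow from
# 1 segment to at least $2 segments when it is multiplied by $1 per round trip.
slow_start_rtts() {
    factor="$1"
    target="$2"
    cwnd=1
    rtts=0
    while [ "$cwnd" -lt "$target" ]; do
        cwnd=$((cwnd * factor))
        rtts=$((rtts + 1))
    done
    echo "$rtts"
}

# ssthresh on entering loss recovery: FlightSize * num / den, in whole segments.
#   $1 = FlightSize, $2 = numerator, $3 = denominator
ssthresh_after_loss() {
    echo $(( $1 * $2 / $3 ))
}

# Examples:
#   slow_start_rtts 2 64        -> 6   (Reno-style doubling)
#   slow_start_rtts 3 64        -> 4   (factor of 3)
#   ssthresh_after_loss 40 1 2  -> 20  (FlightSize/2)
#   ssthresh_after_loss 40 3 4  -> 30  (3 x FlightSize/4)
```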

3.5.2 Hints
These hints and suggestions may help you get started.

  • The existing congestion avoidance modules are a good start. See net/ipv4/tcp_cong.c in the Linux source for the Linux Reno implementation.
  • The file net/ipv4/tcp_input.c is a good place to learn how the congestion avoidance modules are used and invoked.
  • RFC 5681 specifies the Reno congestion control actions in detail, and may be helpful in understanding the kernel code.
  • The Linux Cross Reference at may be useful for navigating and understanding how the code fits together.
  • If one of the hosts becomes unresponsive due to a bug in your congestion control module, you can restart the sliver to reboot it.
  • The Linux Kernel Module Programming Guide provides a good introduction to kernel module programming in general.

3.5.3 Evaluation
Once you have implemented the algorithm described above, answer the following questions:

  1. Question: Discuss the impact of these algorithmic changes in the context of traditional Reno congestion control.
  2. Question: Compare the convergence time and fairness of your algorithm with Reno and CUBIC under (a) high delay (500 ms) and (b) high loss (5%) conditions. Use Jain’s fairness index, or some other quantitative measure of fairness, in your comparison.
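
Jain's fairness index for n flows with throughputs x_1, ..., x_n is (x_1 + ... + x_n)^2 / (n * (x_1^2 + ... + x_n^2)). It equals 1 when all flows get the same throughput and falls to 1/n when a single flow takes everything. A minimal sketch for computing it from per-flow throughputs follows; jain_index is a hypothetical helper name, and the example values are placeholders rather than results.

```shell
#!/bin/sh
# Hypothetical helper: Jain's fairness index of the per-flow throughputs
# given as arguments (all in the same unit).
jain_index() {
    printf '%s\n' "$@" | awk '
        { n++; sum += $1; sumsq += $1 * $1 }
        END {
            # The index is undefined when every flow has zero throughput.
            if (sumsq == 0) { print "undefined"; exit }
            printf "%.4f\n", (sum * sum) / (n * sumsq)
        }'
}

# Examples with placeholder values:
#   jain_index 10 10 10 10   -> 1.0000  (perfectly fair)
#   jain_index 20 0 0 0      -> 0.2500  (one flow takes everything, 1/n)
```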


Note: Before you delete your experiment, be sure to save all the experimental results and any source code that you may need in the future to your local machine!

If you are using omni, release the reserved resources by executing:

deletesliver -a pg-utah shufengTCP
deleteslice shufengTCP

For Portal users, simply click "delete resources" on your slice page and wait until all 7 aggregate managers are finished.
