Version 79 (modified by Ben Newton, 5 years ago)

Calibrating Experiments on GENI using the Tmix Traffic Generation Tool

This page describes how to run calibration experiments with Tmix on GENI nodes. It assumes that you are already familiar with basic experimentation on GENI and that you have access to a machine from which you can run ssh commands and log in to external machines. If you wish to reserve resources using omni, some version of Linux or Unix running on a PC or virtual machine is required. The tutorial also assumes basic familiarity with Linux or Unix, including the ability to use a terminal text editor such as emacs, vim, or nano.

Before starting, it helps to understand what calibration is and why it matters. At run time, Tmix "replays" the exchanges encoded in a set of connection vectors (or c-vecs). The connection vectors are extracted from traffic captured on a real network link. Since all of the traffic crossing a busy link cannot generally be replayed using a single pair of nodes, it is customary to split the connection vectors into N pairs of tcvec files. These can then be used on N pairs of machines to simulate the captured network traffic. If all of the traffic observed on a busy link cannot be simulated with just two nodes, what percentage of the traffic can we replay? How many pairs of nodes do we need to simulate all of the traffic? These are exactly the questions that calibration seeks to answer. Below, we walk through the process of determining how much traffic can be simulated by a single pair of nodes.

A. Reserve Resources on GENI Portal

Click the link below for instructions on generating a slice and reserving resources using the GENI Portal. For this tutorial, we will use the “Tmix 10 min experiment” resource specification (rspec). If you are completing this tutorial at a conference, add nodes from the aggregate you were assigned to use by the leader of the tutorial. Otherwise, you may use any of the InstaGENI or ProtoGENI aggregates. The image and setup are the same as in the basic Tmix Tutorial.

Reserve Resources using GENI Portal

B. Log in to nodes

Two nodes have been reserved with the hostnames "left" and "right". Open two new SSH terminals, one to each node. On Linux, you can use the following command to log in to each of the nodes reserved in the previous step.

ssh -i ~/.ssh/id_geni_ssh_rsa <username>@<hostname> -p <port>

Replace <username> with your GENI username, and replace <hostname> and <port> with the hostname and port of the reserved GENI resource noted in the previous step.

C. Run Tmix Script and view data files

  1. The image loaded on the nodes has the Tmix tools already installed and in your path. Each time the system boots, a kernel module is automatically inserted to assist in simulating the packet delays. List the contents of your home directory on either node.
    ls
    
  2. If you do not see a "tmix.conf" file in your home directory, type:
    /local/tmix-script.sh
    

This script generates a "tmix.conf" file and attempts to re-insert the kernel module, which it may report is already loaded.

  3. Repeat step 2 for the other node.
  4. Open the "tmix.conf" file with an editor such as vim, emacs, or nano, and browse its contents. The entries represent various settings that control a Tmix experiment. At the end of the file, notice the Crecv_Trace and Cinit_Trace lines. These are the only two lines you will need to change as we run our calibration experiments.
  5. List the contents of the Tmix data directory by typing the following on either node:
    ls /opt/tmix-1.2/data
    

A set of connection vectors is described by a pair of tcvec files (labeled cinit.tcvec and crecv.tcvec). These two files correspond to connections that are initiated on either side of the link on which the traffic data was originally captured. As noted above, the traffic crossing a busy link generally cannot be replayed using a single pair of nodes, so the connection vectors are split into N pairs of tcvec files. Most of the files in the listing follow the naming pattern 1ofN.cinit.tcvec and 1ofN.crecv.tcvec. Each such pair is obtained by evenly splitting an original set of connection vectors into N parts. For example, running Tmix with the pair of files 1of10.crecv.tcvec and 1of10.cinit.tcvec will replay about 1/10th of the traffic originally recorded on the link. Below, we perform a set of experiments in which we iterate to find the point at which our pair of nodes can no longer work fast enough to simulate all of the network traffic.
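
The effect of an even split on the per-pair load can be sketched with a little shell arithmetic. This is only an illustration: the helper name conns_per_split is hypothetical, the total of 515,240 connections is taken from the "1 of 01" row of the table in section E, and rounding up is an assumption about how the split was done.

```shell
# Hypothetical helper: number of connections in one part when `total`
# connections are split evenly into `n` parts (rounded up, an assumption).
conns_per_split() {
    total=$1; n=$2
    echo $(( (total + n - 1) / n ))
}

# Total taken from the "1 of 01" row of the table in section E.
for n in 2 3 10 30; do
    printf '1of%02d: %d connections\n' "$n" "$(conns_per_split 515240 "$n")"
done
```

The values printed for these four splits agree with the connection counts listed in the table in section E.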

D. Run a Tmix Experiment

  1. Make a new directory for this experiment, change to it, and copy "tmix.conf" there by typing:
    mkdir 1of10
    cd 1of10
    cp ../tmix.conf .
    
  2. Repeat for the other node.
  3. Edit the copied tmix.conf file on both nodes using a terminal text editor so that the Cinit_Trace and Crecv_Trace lines read as follows:
    Cinit_Trace = /opt/tmix-1.2/data/1of10.cinit.tcvec
    Crecv_Trace = /opt/tmix-1.2/data/1of10.crecv.tcvec
    

on the "left" node, and

    Cinit_Trace = /opt/tmix-1.2/data/1of10.crecv.tcvec
    Crecv_Trace = /opt/tmix-1.2/data/1of10.cinit.tcvec

on the "right" node. Note that the filenames are swapped on the "right" node. All you should need to change is what is after "data/" and before ".cinit" or ".crecv". You are now ready to run the Tmix experiment.
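
If you prefer not to edit the file by hand, the same change can be scripted. The following is a minimal sketch, not part of the Tmix tools: the function name set_traces is hypothetical, and it assumes that each trace setting sits on its own line beginning with Cinit_Trace or Crecv_Trace followed by "=".

```shell
# Hypothetical helper: point Cinit_Trace and Crecv_Trace at one split.
# Usage: set_traces <left|right> <split> [conf-file]
set_traces() {
    role=$1; split=$2; conf=${3:-tmix.conf}
    data=/opt/tmix-1.2/data
    case $role in
        left)  cinit=$data/$split.cinit.tcvec; crecv=$data/$split.crecv.tcvec ;;
        # The file names are swapped on the "right" node.
        right) cinit=$data/$split.crecv.tcvec; crecv=$data/$split.cinit.tcvec ;;
        *) echo "role must be left or right" >&2; return 1 ;;
    esac
    # Rewrite only the two trace lines; write a temp file, then move it into place.
    sed -e "s|^Cinit_Trace[[:space:]]*=.*|Cinit_Trace = $cinit|" \
        -e "s|^Crecv_Trace[[:space:]]*=.*|Crecv_Trace = $crecv|" \
        "$conf" > "$conf.tmp" && mv "$conf.tmp" "$conf"
}

# Example (run inside the experiment directory):
#   set_traces left 1of10      # on the "left" node
#   set_traces right 1of10     # on the "right" node
```

It is worth opening tmix.conf afterwards to confirm that both lines were actually changed.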

  4. Tmix relies upon a pre-determined start time to synchronize Tmix on the two nodes. On both nodes, run the following command to determine the time and date:
    date
    

Note that the time may be in a different time zone. Decide on a start time about a minute or two in the future, relative to the time displayed by the date command. It should be far enough in the future for you to issue the following command on both nodes and allow Tmix to initialize.

  5. Finally, execute the following command on both nodes:
    tmix -s HH:MM:SS tmix.conf
    

where HH:MM:SS is the chosen start time in hours, minutes, and seconds.
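
Working out a start time a couple of minutes ahead can also be scripted. The sketch below is an illustration only: add_seconds is a hypothetical helper, and it assumes the start time should be expressed in the same local time that the date command prints on the node.

```shell
# Hypothetical helper: add `delay` seconds to an HH:MM:SS time,
# wrapping around at midnight.
add_seconds() {
    delay=$2
    h=${1%%:*}; rest=${1#*:}; m=${rest%%:*}; s=${rest#*:}
    # Strip one leading zero so values such as 08 are not read as octal.
    t=$(( (${h#0} * 3600 + ${m#0} * 60 + ${s#0} + delay) % 86400 ))
    printf '%02d:%02d:%02d\n' $((t / 3600)) $((t % 3600 / 60)) $((t % 60))
}

# Start time two minutes from now, in the node's local time zone.
start=$(add_seconds "$(date +%H:%M:%S)" 120)
echo "tmix -s $start tmix.conf"
```

Run this on one node only and use the printed command on both nodes, so that both instances of Tmix are given the same start time.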

  6. On both hosts, Tmix will load the data files and then wait until the designated start time.

You will see "Running for 720 seconds" once Tmix is ready to go. When the start time arrives, Tmix runs for 12 minutes (a 10-minute experiment plus 2 minutes of buffer). You will not see any indication on the terminal that it has started running.

  7. One way to verify that Tmix is running is to open another SSH terminal to either node and type:
    top
    

This shows a listing of the most active processes. You should see "tmix" at the top of the list once it starts. The %CPU column also indicates what percentage of the CPU Tmix is using. If you see numbers near 80% or 90%, you are close to the maximum number of connections that this pair of nodes can simulate. Type 'q' to exit top.

  8. Once the experiment is complete, Tmix prints a list of statistics to the console and creates a set of log files with the extensions .ert, .trt, .unc, .rt, and .ts in the current directory. While Tmix runs, it is normal to see a few errors where connections fail to open or close, and several more at the end of the experiment indicating that some connections failed to close. If, however, you see a steady stream of errors, something is wrong or you are trying to simulate too much traffic. Press Ctrl-C in both terminals, check your "tmix.conf" file, and make sure you followed the directions above.
  9. To determine the number of bytes transferred during your experiment, type the following on the "right" SSH terminal:
    bytesTxfrd tmixTutorial.rt
    

Plotting this number against the number of connection vectors for several experiments, each using a different portion of the traffic, allows us to determine the point at which we can no longer reliably emulate traffic with a single pair of nodes. Running this command on the "left" node gives the number of bytes in the opposite direction. This is not what we want to plot, however, because fewer bytes are transferred in the opposite direction in the traffic we are replaying. We are instead interested in plotting data for the direction that places the most demand on system resources.

  10. To return to your home directory, type:
    cd ..
    

E. Iterate

[Figure: Calibration Graph]

Repeat steps 1 through 10 of section D for input files of your choosing (e.g. 1of02.cinit.tcvec), and plot the resulting number of bytes transferred. Select input files in an attempt to obtain two or three data points that show a linear trend. Then select an input file that attempts to replay so much traffic that the resources are exhausted and the linear growth stops. The calibration graph shows the result of calibrating two GENI nodes. It indicates that, given the environment when the experiment was run, at least three pairs of this type of node would be required to reliably replay all of the traffic. You can use the following table to record the calibration results for your nodes. Depending on the properties of the nodes you are using, the network traffic, and the resource utilization, your results will likely be different.

cvec file    number of connections    bytes transferred    Notes
1 of 01      515,240
1 of 02      257,620
1 of 03      171,747
1 of 04      128,810
1 of 05      103,048
1 of 06      85,874
1 of 07      73,606
1 of 08      64,405
1 of 09      57,249
1 of 10      51,524
1 of 15      34,350
1 of 20      25,762
1 of 25      20,610
1 of 30      17,175
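
Once a few rows of the table are filled in, the point where linear growth stops can also be estimated without a plot. The following is a rough sketch under stated assumptions: find_knee is a hypothetical helper, its input is one "label connections bytes" row per experiment (numbers written without commas, sorted by increasing number of connections), and the linear trend is assumed to pass through the origin, so that bytes per connection stays roughly constant until the nodes saturate. The 0.9 default threshold is arbitrary.

```shell
# Hypothetical helper: print each row's bytes-per-connection relative to the
# first row, then name the first row that falls below the threshold.
find_knee() {
    awk -v thr="${1:-0.9}" '
        NR == 1 { base = $3 / $2 }          # smallest experiment is the baseline
        {
            ratio = ($3 / $2) / base
            printf "%s %.2f\n", $1, ratio
            if (!found && ratio < thr) { found = 1; knee = $1 }
        }
        END { print "saturated at:", (found ? knee : "none") }
    '
}

# Example, assuming your own measurements are saved in a file named results.txt:
#   find_knee < results.txt
#   find_knee 0.8 < results.txt    # stricter threshold
```

This is a convenience check only; plotting the data as described above remains the more reliable way to judge where the trend breaks.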

F. Delete Resources

Once your experiment is complete and you have collected your results, you should return the reserved resources. To do so, follow the steps below.

  1. In the GENI Portal, click on Slices in the upper right-hand corner.
  2. Find your slice in the list and click on the corresponding Delete Resources button.
  3. Click "Delete Resources" again to confirm that you want to delete all reserved resources.
