Version 68 (modified by Ben Newton, 5 years ago)

--

Calibrating Experiments on GENI using the Tmix Traffic Generation Tool

This page describes how to run calibration experiments with Tmix on GENI nodes. This tutorial assumes that you are already familiar with basic experimentation with GENI, and that you have access to a machine from which you can execute ssh commands and log in to external machines. If you wish to reserve resources using omni, some version of Linux or Unix running on a PC or virtual machine is required. The tutorial also assumes basic familiarity with Linux or Unix, and the ability to use a terminal text editor such as emacs, vim or nano.

Before starting it is important to understand what calibration is, and why it is important. At run-time Tmix "replays" the exchanges encoded in a set of connection vectors (or c-vecs). The connection vectors are extracted from traffic captured on a real network link. Since all of the traffic crossing a busy link cannot generally be replayed using a single pair of nodes, it is customary to split the connection vectors into N pairs of tcvec files. These can then be used on N pairs of machines to simulate the captured network traffic. If all of the traffic observed on a busy link cannot be simulated with just two nodes, what percentage of the traffic can we replay? How many pairs of nodes do we need to use to simulate all the traffic? It is exactly these questions that calibration seeks to answer. Below we will walk through the process of determining how much traffic can be simulated by a single pair of nodes.

A Reserve Resources on GENI Portal

Click the link below for instructions on generating a slice and reserving resources using the GENI Portal. For this tutorial we will use the “Tmix 10 min experiment” resource specification (rspec). If you are completing this tutorial at a conference, add nodes from the aggregate assigned to you by the person in charge of the tutorial. Otherwise, you may use any of the InstaGENI or ProtoGENI aggregates. The image and setup are the same as in the basic Tmix Tutorial.

Reserve Resources using GENI Portal

B Login to nodes

Two nodes have been reserved with the hostnames "left" and "right". Open two new SSH terminals, one to each node. On Linux you may use the following command, once per node, to log in to the nodes reserved in the previous step.

ssh -i ~/.ssh/id_geni_ssh_rsa <username>@<hostname> -p <port>

As expected, <username> should be replaced with your GENI username, and <hostname> and <port> should be replaced with the hostname and port of the reserved GENI resource noted in the previous step.
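The login command above can be assembled by a small shell function, which helps when opening several terminals. This is only a sketch: the function name geni_ssh_command is hypothetical, and the username, hostname, and port in the usage line are placeholders, not real GENI values. The default key path is the one used in the command above.

```shell
# Hypothetical helper: print the ssh command for one reserved node.
# Arguments: username, hostname, port, and optionally the private key path.
geni_ssh_command() {
    user=$1
    host=$2
    port=$3
    key=${4:-$HOME/.ssh/id_geni_ssh_rsa}   # key path taken from the tutorial
    printf 'ssh -i %s %s@%s -p %s\n' "$key" "$user" "$host" "$port"
}
```

For example, `geni_ssh_command alice pc1.example.net 30010` prints the command to paste into a terminal; run it once with the details of "left" and once with those of "right".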

C Run Tmix Script and view data files

  1. The image loaded on the nodes has the tmix tools already installed and in your path. Each time the system boots, a kernel module is automatically inserted to assist in simulating the packet delays. List the contents of your home directory on either node.
    ls
    
  2. If you do not see a tmix.conf file in your home directory type:
    /local/tmix-script.sh
    

This script generates a tmix.conf file and attempts to reload the kernel module. Ignore the warning to that effect.

  3. Repeat step 2 for the other node.
  4. Open the tmix.conf file with an editor such as vim, emacs, or nano, and browse its contents. The entries represent various settings that control a Tmix experiment. At the end of the file notice the Crecv_Trace and Cinit_Trace lines. These are the only two lines you will need to change as we run our calibration experiments.
  5. List the contents of the tmix data directory by typing the following on either node:
    ls /opt/tmix-1.2/data
    

A set of connection vectors is described by a pair of tcvec files (labeled cinit.tcvec and crecv.tcvec). These two files correspond to connections initiated on either side of the link on which the traffic was originally captured. As noted earlier, the traffic crossing a busy link generally cannot be replayed by a single pair of nodes, so the connection vectors are split into N pairs of tcvec files. Most of the files in the listing are named 1ofN.cinit.tcvec or 1ofN.crecv.tcvec; each such pair is one of the N parts obtained by evenly splitting the original set of connection vectors. For example, running tmix with the pair 1of10.crecv.tcvec and 1of10.cinit.tcvec replays about 1/10th of the traffic originally recorded on the link. Below we perform a set of experiments, iterating to find the point at which our pair of nodes can no longer work fast enough to simulate all of the network traffic.
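To see at a glance which splits are available as complete pairs, the listing can be filtered with a short shell function. This is a sketch that assumes the file naming described above (1ofN.cinit.tcvec and 1ofN.crecv.tcvec); the function name list_split_pairs is hypothetical.

```shell
# Hypothetical helper: print the prefix (e.g. 1of10) of every split for which
# both the cinit and the crecv file exist in the data directory.
list_split_pairs() {
    data_dir=${1:-/opt/tmix-1.2/data}
    for cinit in "$data_dir"/1of*.cinit.tcvec; do
        [ -e "$cinit" ] || continue          # glob matched nothing
        base=${cinit%.cinit.tcvec}           # strip the suffix
        if [ -e "$base.crecv.tcvec" ]; then
            basename "$base"
        fi
    done
}
```

Running `list_split_pairs` on either node prints one prefix per line; each prefix is the part of the file name that changes between calibration runs.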

D Run a Tmix Experiment

  1. Make a new directory for this experiment, change to it, and copy tmix.conf there by typing:
    mkdir 1of10
    cd 1of10
    cp ../tmix.conf .
    
  1. Repeat for the other node.
  1. Edit the copied tmix.conf file on both nodes using a terminal text editor so that the Cinit_Trace and Crecv_Trace lines read as follows.
    Cinit_Trace = /opt/tmix-1.2/data/1of10.cinit.tcvec
    Crecv_Trace = /opt/tmix-1.2/data/1of10.crecv.tcvec
    

on the "left" node, and

    Cinit_Trace = /opt/tmix-1.2/data/1of10.crecv.tcvec
    Crecv_Trace = /opt/tmix-1.2/data/1of10.cinit.tcvec

on the "right" node. Note that the filenames are swapped on the "right" node. All you should need to change is what is after "data/" and before ".cinit" or ".crecv". You are now ready to run the tmix experiment.
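Because the same two lines are edited for every calibration run, the change can also be scripted instead of made by hand. The following is a sketch, not part of the Tmix tools: the function name set_trace_lines is hypothetical, and it assumes the two lines in tmix.conf start with Cinit_Trace and Crecv_Trace followed by an equals sign, as in the examples above.

```shell
# Hypothetical helper: rewrite the Cinit_Trace and Crecv_Trace lines of a
# tmix.conf copy. Arguments: config file, split prefix (e.g. 1of10), and
# the node name (left or right). The file names are swapped on "right".
set_trace_lines() {
    conf=$1
    split=$2
    side=$3
    data=/opt/tmix-1.2/data
    case $side in
        left)  cinit=$data/$split.cinit.tcvec; crecv=$data/$split.crecv.tcvec ;;
        right) cinit=$data/$split.crecv.tcvec; crecv=$data/$split.cinit.tcvec ;;
        *) echo "side must be left or right" >&2; return 1 ;;
    esac
    tmp=$(mktemp) || return 1
    # Assumed line format: "Cinit_Trace = <path>" at the start of a line.
    if sed -e "s|^Cinit_Trace[[:space:]]*=.*|Cinit_Trace = $cinit|" \
           -e "s|^Crecv_Trace[[:space:]]*=.*|Crecv_Trace = $crecv|" \
           "$conf" > "$tmp"; then
        cat "$tmp" > "$conf"                 # keep the original file's permissions
    fi
    rm -f "$tmp"
}
```

On the "left" node this would be called as `set_trace_lines tmix.conf 1of10 left` inside the experiment directory, and with `right` on the other node. Checking the result with an editor before starting tmix is still worthwhile, since the real file may be formatted differently than assumed here.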

  1. Tmix relies upon a pre-determined start time to synchronize tmix on the two nodes. On both nodes run the following command to determine the time and date:
    date
    

Note that the time may be in a different time zone. Decide on a start time about a minute or two in the future, relative to the time displayed by the date command. It should be far enough in the future for you to issue the following command on both nodes and allow tmix to initialize.

  1. Finally, execute the following command on both nodes:
    tmix -s HH:MM:SS tmix.conf
    

where HH:MM:SS is the chosen start time in hours, minutes, and seconds. Use the same start time on both nodes.
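Rather than adding minutes to the output of date by hand, the start time can be computed. This sketch assumes GNU date (the version found on most Linux systems), whose -d option accepts relative times; the function name start_time_in is hypothetical.

```shell
# Hypothetical helper: print the node's local time N minutes from now
# in the HH:MM:SS form expected by "tmix -s". Requires GNU date.
start_time_in() {
    minutes=${1:-2}
    date -d "+${minutes} minutes" +%H:%M:%S
}
```

Run `start_time_in 2` once on one node, then use the printed value in the tmix command on both nodes. Computing it separately on each node would give two different start times.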

  1. Tmix on both hosts will load the data files and then wait until the designated start time.

You will see "Running for 720 seconds" once tmix is ready to go. When the start time arrives, Tmix will start running for 12 minutes (a 10 minute experiment plus 2 minutes of buffer). You will not see any indication on the terminal that it has started running.

  1. One way to verify that tmix is running is to open another SSH terminal to either node and type:
    top
    

This shows a listing of the most active processes. You should see tmix at the top of the list once it starts. top also indicates what percentage of the CPU tmix is using (in the %CPU column). If you see numbers near 80% or 90%, you are near the limit of the number of connections that this pair of nodes can simulate. To exit top, type 'q'.
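The rule of thumb above can be written down as a tiny check. This is an illustrative sketch only: the function name tmix_cpu_note is hypothetical, and the cutoff of 80 is simply the lower of the two figures mentioned above, not a limit defined by Tmix.

```shell
# Hypothetical helper: classify a %CPU reading for tmix as shown by top.
# Readings of 80 or more are treated as "near capacity" (assumed cutoff).
tmix_cpu_note() {
    awk -v cpu="$1" 'BEGIN {
        if (cpu + 0 >= 80) print "near capacity"
        else print "headroom"
    }'
}
```

For example, `tmix_cpu_note 85.3` prints "near capacity". One way to read the value without the interactive display is top's batch mode (`top -b -n 1`), looking at the %CPU column of the tmix row.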

  1. Once the experiment is complete, a list of statistics is printed to the console, and a set of log files with the extensions .ert, .trt, .unc, .rt, and .ts is created in the directory. While tmix runs it is normal to see a few errors where connections fail to open or close, and several more at the end of the experiment indicating that some connections failed to close. If, however, you see a steady stream of errors, either something is misconfigured or you are trying to simulate too much traffic. In that case press Ctrl-C in both terminals, check your tmix.conf file, and make sure you followed the directions above.
  1. To determine the number of bytes transferred during your experiment type the following on the "right" SSH terminal:
    bytesTxfrd tmixTutorial.rt
    

Plotting this number against the number of connection vectors for several experiments that use different portions of the traffic lets us determine the point at which we can no longer reliably emulate traffic with a single pair of nodes. Running this command on the "left" node gives the number of bytes in the opposite direction. That is not what we want to plot, however: fewer bytes were transferred in that direction on the monitored link, so it is not the "weakest link" and will not be affected as much when resources are saturated.
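The point where linear growth stops can also be estimated without a plot. The sketch below assumes you have recorded your results in a text file with one line per experiment, containing the number of connections and the bytes transferred separated by a space and written without commas. The file layout, the function name find_saturation, and the 0.9 tolerance are all assumptions made for illustration.

```shell
# Hypothetical helper: report the first experiment whose bytes-per-connection
# ratio falls below a fraction (default 0.9) of the ratio of the smallest run.
# Input lines: "<connections> <bytes>", both positive and without commas.
find_saturation() {
    results=$1
    tolerance=${2:-0.9}
    sort -n "$results" | awk -v tol="$tolerance" '
        NR == 1 { base = $2 / $1; next }       # smallest run sets the baseline
        ($2 / $1) < tol * base { print $1; found = 1; exit }
        END { if (!found) print "none" }
    '
}
```

The function prints the connection count at which throughput first falls noticeably below the linear trend, or "none" if every run stays within the tolerance. It is a rough aid; inspecting the plotted points remains the more reliable check.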

  1. To return to your home directory type:
    cd ..
    

E Iterate

Figure: Calibration Graph

Repeat steps 1 through 10 of section D for various input files (e.g. 1of02.cinit.tcvec and 1of02.crecv.tcvec), and plot the resulting bytes transferred. You should reach a point where the linear growth stops. The calibration graph shows the result of calibrating two GENI nodes: at least 3 pairs of nodes would be needed to reliably replay all of the traffic on these nodes. You can use the following table to record your results.

cvec file    number of connections    bytes transferred    Notes
1 of 01      515,240
1 of 02      257,620
1 of 03      171,747
1 of 04      128,810
1 of 05      103,048
1 of 06       85,874
1 of 07       73,606
1 of 08       64,405
1 of 09       57,249
1 of 10       51,524
1 of 15       34,350
1 of 20       25,762
1 of 25       20,610
1 of 30       17,175
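The connection counts in the table match the full count of 515,240 divided by N and rounded up. That is an observation from the numbers themselves, not a documented property of how the files were split. Under that assumption, the count for any other split can be estimated with integer arithmetic; the function name connections_for_split is hypothetical.

```shell
# Hypothetical helper: estimate the number of connections in a 1ofN split
# as ceil(total / N), using shell integer arithmetic.
connections_for_split() {
    total=$1
    n=$2
    echo $(( (total + n - 1) / n ))
}
```

For example, `connections_for_split 515240 3` prints 171747, matching the "1 of 03" row.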

F Delete Resources

Once your experiment is complete and you have collected your results, you should return the reserved resources. To do so, follow the steps below.

  1. In the GENI Portal, click on Slices in the upper right-hand corner.
  1. Find your slice in the list, and click on the corresponding Delete Resources button.
  1. Click "Delete Resources" again to confirm that you want to delete all reserved resources.
