Tmix on ProtoGENI

Information about running experiments using the Tmix traffic generation system on ProtoGENI nodes.

What is Tmix?

To perform realistic network simulations, one needs a traffic generator that produces synthetic traffic in a closed-loop fashion, so that it "looks like" traffic found on an actual network.

The Tmix system takes as input a packet header trace file captured from a network link of interest (such as the link between the UNC campus and the rest of the Internet). This trace is reverse-compiled into a connection vector (or cvec) file, which is a source-level characterization of each TCP connection present in the trace. Tmix then uses this information to emulate the socket-level behavior of the source application that created the corresponding connection in the trace. The resulting generated traffic is statistically representative of the traffic measured on the real link.
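To make the idea of a connection vector more concrete, here is a minimal sketch in Python. It is not the actual Tmix cvec file format, which is not described here. It assumes that a connection can be summarized as a sequence of request/response exchanges, each followed by an idle period. The names Epoch, ConnectionVector, and replay_plan are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Epoch:
    """One request/response exchange (hypothetical field names)."""
    request_bytes: int    # data sent by the connection initiator
    response_bytes: int   # data sent back by the acceptor
    think_time_s: float   # idle time before the next exchange


@dataclass
class ConnectionVector:
    """Assumed source-level description of a single TCP connection."""
    start_time_s: float
    epochs: List[Epoch]

    def total_bytes(self) -> int:
        # Sum of application data moved in both directions.
        return sum(e.request_bytes + e.response_bytes for e in self.epochs)


def replay_plan(cvec: ConnectionVector) -> List[str]:
    """List, in order, the socket-level actions an emulator would perform."""
    plan = []
    for e in cvec.epochs:
        plan.append(f"initiator send {e.request_bytes}")
        plan.append(f"acceptor send {e.response_bytes}")
        if e.think_time_s > 0:
            plan.append(f"wait {e.think_time_s}")
    return plan


# A made-up connection with two exchanges, for illustration only.
cvec = ConnectionVector(0.0, [Epoch(300, 12000, 1.5), Epoch(450, 800, 0.0)])
for step in replay_plan(cvec):
    print(step)
```

The point of the sketch is that the description contains only application-level quantities (how much data each side writes, and how long it pauses). Packet sizes, retransmissions, and timing on the wire are left to the TCP stacks of the nodes that replay the connection, which is what makes the generated traffic closed-loop.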

Traffic Generation

One of the most complex components of empirical evaluations is modeling and generating realistic Internet traffic. The ever-changing and varied mix of applications that makes up actual Internet traffic makes this a daunting task. Moreover, Internet traffic differs when sampled at different times and in different parts of the globe. Networking researchers have grappled with this problem by taking snapshots of Internet traffic at different times and at various points in the network, and modeling those snapshots to generate traffic in the lab. The generally held belief is that the more realistic the traffic used, the more reliable the results of the empirical evaluations using that traffic. Practice, however, does not adhere to this principle. So, although laboratory testbeds and simulation methods have evolved over the years, the question of what constitutes the essential components for modeling realistic traffic remains open for debate.

For example, networking researchers agree that realistic traffic generation for empirical research is best accomplished by capturing traffic on a production link and then using source-level models to generate this traffic in the laboratory or simulator. Source-level models capture the application exchanges and application behavior at the ends (sources) of the TCP connections. But how do you go from the original captured traffic to an acceptable source-level model? Which of the several measures derived from the traffic sources should you model in the workload for your experiments? Would your modeling choices for traffic generation affect the outcome of your experiments? If so, how significant would the impact be? These remain open questions.


Sample Experiments

