wiki:NetworkRspecMiniWorkshopNotes

Version 5 (modified by Aaron Falk, 15 years ago)

--

See http://groups.geni.net/geni/wiki/NetworkRSpecMiniWorkshop for slides

Intro Aaron Falk

High-level motivation and goals for spiral 1. See Aaron Falk's slides. Demonstrate end-to-end slices across representative samples of the major substrates and technologies envisioned in GENI. Goal for each cluster is to demonstrate end-to-end via your control framework.

This is what the GPO is paying you to do, this is what we want demonstrated at the end of spiral 1

John Turner What is an "end"?

Loosely defined, but think from perspective of experimenter.

Larry Peterson Key bar is two or more aggregates sharing a packet?

No, aggregates, not in terms of packets

James Sterbenz Not single slice across all of them in year 1, pairwise is sufficient

Only pairwise would be a disappointment. Really want to show that it's possible to use multiple aggregates, more than two. Minimal is two end nodes and two aggregates, but that's really the absolute minimum.

For each cluster,

  • How does a network device or aggregate reserve resources?
  • How do network slivers join to form an end-to-end slice?

Nobody has articulated yet how they are going to do this.

In each cluster, what are the plans to support this, in each cluster, in spiral 1?

John Turner When you talk about an end-to-end slice, do you have a prescription for what the data path looks like?

You're in cluster B, a candidate slice would include Internet2 VLANs connecting spp nodes and GpENI and Stanford.

John Turner Sounds like Internet2 VLANs on an spp somehow ending up at Stanford

GENI will always have a hodge-podge of connectivity. For now we'll have a set of preconfigured fixed VLANs in Internet2.

John Turner VLANs in Internet2 are being provided by Rob Ricci, not Internet2. I don't want to pick on Stanford per se, but it's not clear how we're going to make any connection happen.

James Sterbenz Also what sort of experiments will tie all of these things together, given the disparate technologies

Larry Peterson this is what will come out of discussions today

What capabilities will we have from backbones in spiral 1?

Camilo Viecco are any clusters also using NLR?

Guido Appenzeller there is potentially more than one Internet2 backbone

Rob Ricci the ones we're handing out today are between individual sites. Internet2 is providing a 10Gb wavelength and we're putting ethernet switches on top of that wave. We'll be making VLANs on that wave without any Internet2 involvement.

Ivan Seskar also an issue when particular locations will be connected to the wave

There is a general problem here. Let's not get lost in the weeds.

Larry Peterson Simple observation. As diverse as these technologies are, we have IP addresses for all of them. We can fall back to using IP tunnels for everything.

End users can access GENI experimentation this way. But we have set as a goal non-IP, not layered over IP connectivity for GENI.

Rob Ricci Maybe we should do with GpENI -- get a fiber from the local Internet2 POP to our campus.

John Turner From my experience the Internet2 folks will push back, not make this easy

Yes, we've got some things to work out here. However, to be concrete, those who have direct connectivity in spiral 1 will be expected to demonstrate the ability to stitch together VLANs.

Ivan Seskar Can GPO be more involved with these discussions with Internet2 / NLR?

Yes, we have full time staff who would be happy to help work things out with Internet2, NLR, etc.

Ilya Baldine can we organize and get some shared stimulus money to wire up campuses?

Chip Elliott My understanding is that every campus is supposed to make a single proposal to NSF.

Network Configuration Use Case Aaron Falk

Sliver creation. First makes reservations of stuff around the edge, but now needs to interconnect aggregates. (Assumption is that physical connectivity between these aggregates exists.) Then researcher passes rspec requesting VLANs between aggregates, then asks for the topology to be set up.

  • do we need a standard method to describe these network coordinates, or are they just blobs?
  • does it go into the rspec?
  • are there now constraints on the order in which networks can be added to a slice?
  • how does it work with multiple networks in a series?
  • how are ordering constraints handled in the control framework?
  • how will tunnels work?

This is the discussion for this afternoon. How do we describe resources is this morning.

John Turner let's put this in as concrete terms as possible, I have a very difficult time connecting your abstract diagrams with my cluster or any other cluster.

We need to figure out what people need to do to support this by the fall. This is an engineering, not research, discussion. Different answers from different clusters is OK, different answers from single cluster smells funny; you need to explain how it'll work. Throwing code over the transom probably won't cut it. Needs to be collective cluster ownership of this goal. Entire cluster is going to be evaluated on getting this to work.

Enterprise GENI view Rob Sherwood

FlowVisor mostly feature complete, publicly released.

Aggregate manager: resource discovery, reports to CH as rspec, accepts reservations, converts rspec to flowvisor config.

Clearinghouse: implemented toy version for testing.

E-GENI rspec: switches, interfaces, "flowspace", opt-in, inter-aggregate connectivity

Chip Elliott have you thought about measurement yet?

Built into openflow -- byte and packet counters. With a controller you can redirect flows through a measurement box.

Guido Appenzeller We haven't thought it through in full detail, but you get a fair amount of control from OpenFlow, can look deep into a packet.

We don't really have nodes in a traditional sense, have a datapath ID (i.e. MAC addr of switch), list of interfaces. We don't "log into" a switch.

Guido Appenzeller as soon as you reserve a switch, the switch connects back to the URL of your controller and the switch starts asking your controller for instructions.

Camilo Viecco Do you have one user at a time, or multiple users?

You have one user at a time. Default rule is if we don't have a rule for a packet, a message gets sent to the controller.

Guido Appenzeller If you connect to something that is not part of your aggregate, it's represented differently. This describes internal references.

Can think of FlowSpace as header, "field=value" pairs plus actions. Packet classifier built-in. Header fields (ip_src, ip_dst, ethertype, etc.). Actions are allow, deny, listen-only.

Ex: all web traffic, except to main server:

    ip_src = 1.2.3.4 tcp_dport=80 :: DENY
    ip_src=1.2.3/24 tcp_dport=80 :: ALLOW
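One way to read the two rules above is as an ordered list where the first match decides. A minimal Python sketch of that reading follows. It assumes first-match semantics and a default of DENY, neither of which is stated in these notes; `Rule`, `classify` and the packet dictionary layout are hypothetical names, and the prefix `1.2.3/24` is written out as `1.2.3.0/24` so the standard library can parse it.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Rule:
    ip_src: str      # address or prefix in CIDR form, e.g. "1.2.3.0/24"
    tcp_dport: int
    action: str      # "ALLOW", "DENY" or "LISTEN-ONLY"

    def matches(self, pkt: dict) -> bool:
        net = ipaddress.ip_network(self.ip_src, strict=False)
        return (ipaddress.ip_address(pkt["ip_src"]) in net
                and pkt["tcp_dport"] == self.tcp_dport)


# The two rules from the example, in the same order.
RULES = [
    Rule("1.2.3.4/32", 80, "DENY"),
    Rule("1.2.3.0/24", 80, "ALLOW"),
]


def classify(pkt: dict, rules=RULES, default: str = "DENY") -> str:
    """Return the action of the first matching rule (assumed semantics)."""
    for rule in rules:
        if rule.matches(pkt):
            return rule.action
    return default


print(classify({"ip_src": "1.2.3.4", "tcp_dport": 80}))
print(classify({"ip_src": "1.2.3.9", "tcp_dport": 80}))
```

Under these assumptions rule order matters: swapping the two rules would let the /24 rule match first and the single-host exception would never fire.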

Guido Appenzeller can say "ipv6 goes to this controller, ipv4 goes to that controller."

Rspec - opt-in. How do we express what users/experimenters want to allow in? "All", "first 10", "only port 80 on switch 3".

How do we do this between slivers?

Use case: "give me our PlanetLab nodes and the E-GENI network that connects them." Need to know how to communicate that off of this switch, off of this node, is a point of attachment.

Aaron Falk if you've got multiple slivers on a single planetlab node, how do you assign them to an egeni node? what does planetlab demultiplex on?

Larry Peterson tcp ports. we've been lazy in how you lock down ports, you claim a port on a wiki.

Aaron Falk There is a bootstrapping problem with planetlab and E-GENI. We need to figure this out.

Chip Elliott do you have both openflow and planetlab nodes in the same room?

I do not.

(Ivan Seskar, Nick Feamster) have both planetlab nodes and openflow switches, but they are not connected

Larry Peterson we could have a global allocation of ports, tunnel numbers, etc., if we just have a global list.

Guido Appenzeller we want a dynamic mapping to slices

Ted Faber If you're going to define slices in the rspec, have to use globally understood parts of the flowspec; internally to the aggregate, switches may need to be topology aware

ProtoGENI CF View Rob Ricci

Working prototype rspec

Supports nodes, interfaces, links.

Used to allocate slivers -- raw PCs, VMs, VLANs, tunnels. Expressed in XML. Tunnels are cross-aggregate. Slice Embedding Service that understands it.

Under development: extensions using NVDL, cross-aggregate RSpecs.

In our view of the lifecycle of an rspec, we view it as progressive annotation. User creates request (bound or unbound), passes to a Slice Embedding Service, annotates with physical resources selected, maybe more than one.

Gives to CM, CM signs (generates ticket), Manifest returned by CM, adds details like access method, MACs, etc.

Four types, similar but not identical.

Advertisement, catalog (published by component manager)

Request, constructed by user (purchase order)

Ticket, receipt (signed, type of credential)

Manifest, packing slip (returned by CreateSliver())

Model we have now is that an individual component manager will accept or reject your request. This needs to be expanded, if it can only handle some of your request (e.g. 99 out of 100 requests).

We could make it more complicated, not sure what the right thing to do is.

Discussion of how to do what is essentially the travel agent problem.

Looking at the rspec as a mapping between the requested sliver and the physical resources.

Aaron Falk What does nick need to do with the bgpmux to use this?

We're always adding information, never removing information. Advertisements have component IDs, requests have virtual IDs, a bound request has both, creating a mapping. Identifiers are URNs (GMOC proposal).

A sliver is uniquely identified by (slice ID, virtual ID, CM ID).
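The "always adding, never removing" idea can be illustrated with a small Python sketch. This is not the ProtoGENI rspec format (which is XML); `RSpecNode`, `bind`, `sliver_key` and the `urn:example:` identifiers are hypothetical names used only to show how a bound request ends up carrying both IDs.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class RSpecNode:
    virtual_id: Optional[str] = None     # set by the user in a request
    component_id: Optional[str] = None   # set by the CM in an advertisement
    cm_id: Optional[str] = None          # component manager owning the component


def bind(request: RSpecNode, advert: RSpecNode) -> RSpecNode:
    """Annotate a request with the physical component chosen for it.

    Information is only added: the virtual ID is kept, and the component ID
    and CM ID are filled in, giving the virtual-to-physical mapping.
    """
    if request.virtual_id is None or advert.component_id is None:
        raise ValueError("need a virtual ID and a component ID to bind")
    return replace(request, component_id=advert.component_id, cm_id=advert.cm_id)


def sliver_key(slice_id: str, node: RSpecNode) -> tuple:
    """A sliver is identified by (slice ID, virtual ID, CM ID)."""
    return (slice_id, node.virtual_id, node.cm_id)


request = RSpecNode(virtual_id="urn:example:virt:n0")
advert = RSpecNode(component_id="urn:example:comp:pc1", cm_id="urn:example:cm:a")
bound = bind(request, advert)
print(sliver_key("urn:example:slice:s1", bound))
```

Note that the component ID is not part of the sliver key in this sketch, matching the triple given above.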

Aaron Falk If what I'm advertising is a collection of stuff, what do I advertise?

If you don't want to show me the details of your network, that is not our design center.

Aaron Falk but Internet2 won't run an AM, won't identify all the optical switches along the path

We'll advertise "here's an enet switch, here's another enet switch", and won't say anything about the topology beneath it, since it's dynamic and out of our control.

If you care whether you go across shared trunk links, etc., you can ask for that. The slice embedding service can do this to minimize cross-trunk latency, etc. To connect to Rob's talk, if openflow gives me an identifier that we need to pass back, it goes into the Manifest, any virtual identifier is my identifier (well, it has to be a URN).

Coordination problems: both ends may need to share information, e.g. tunnel endpoints. Ordering/timing may be important. Negotiation may be necessary (e.g. session key establishment). Some are transitive problems (e.g. VLAN #s, unless translation is possible), Assumption is that cross-aggregate

Nick Feamster VINI has rspec to create tunnels between virtual nodes, but need one to connect VINI to mux, neither VINI nor ProtoGENI

Nick Feamster is there one rspec that's going to say "I need a virtual node that is a tunnel to this mux"?

laughter

this is all typed, types are well-known device classes (e.g. openflow enabled ethernet)

this grows out of stuff we do for emulab. We have a node type "PlanetLab", its links are type "ipv4".

Guido Appenzeller You're assuming these connections are always layer 2? What if it's something else?

(Ted Faber) type might be "runs a routing protocol"

We describe at the lowest level - e.g. ethernet, not ipv4, or tcp. You need to know you can run ipv4 over ethernet.

Links can cross aggregate boundaries -- nodes may not.

Ivan Seskar Said "node cannot cross aggregate" -- but this is common in wireless, e.g. wifi and wimax.

Aaron Falk Ah, the node will be in one aggregate, will have two different kinds of links out (via different carriers, etc.)

Disconnect -- some people think in terms of nodes, some think in terms of links.

Coordination across aggregates: design space

  a. Client negotiates with each CM; rspec is the medium.
  b. CMs coordinate among themselves, using a new standardized control plane API. Rspec could be the medium.
  c. Untrusted intermediary negotiates for client; intermediary has no privs that client doesn't have. Rspec could be the medium.
  d. Trusted intermediary negotiates for client; pre-established trust between intermediary and CMs. Rspec could be the medium.

Aaron Falk Would Nick's dynamic tunnel server be an example of (b)?

Yes

Doing (a) and (b), going for a hybrid of (b) and (d). This is the plan, not done yet. CMs negotiate two-party arrangements directly, e.g. tunnels. A trusted intermediary negotiates multi-party arrangements (VLANs): the trusted authority picks the VLAN number, the client is oblivious, only CMs talk to the intermediary, and negotiation information is held by the CM.
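The multi-party VLAN case can be sketched as follows. This is only an illustration of the idea that, without VLAN translation, one number has to be free at every CM involved; `pick_vlan` and the CM names are hypothetical, and picking the lowest common number is an arbitrary choice rather than anything stated in the talk.

```python
def pick_vlan(free_by_cm: dict) -> int:
    """Pick one VLAN number that every CM on the path can still use.

    free_by_cm maps a CM name to the set of VLAN numbers it reports as free.
    Without VLAN translation the same number must be free everywhere, so the
    candidates are the intersection of all the sets.
    """
    if not free_by_cm:
        raise ValueError("no CMs supplied")
    common = set.intersection(*(set(v) for v in free_by_cm.values()))
    if not common:
        raise LookupError("no VLAN number is free at every CM")
    return min(common)  # lowest common number; the choice rule is arbitrary


offers = {
    "cm-a": {100, 101, 102},
    "cm-b": {101, 102, 200},
    "cm-c": {102, 101},
}
print(pick_vlan(offers))
```

In this sketch the client never sees the offers: only the CMs report their free sets to the intermediary, which matches the "client is oblivious" arrangement described above.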

Aaron Falk is this consistent with DRAGON approach?

Chris Tracy yes, generally -- I've got info in my slides.

PlanetLab: Resource Specifications / End-to-End Slices Larry Peterson

This is kind of high level.

I'm going to argue we have a bunch of nodes. It might be the case that some of these nodes are special -- e.g., underneath them they have a layer 2 technology they want to take advantage of.

Some of these nodes are going to be part of other aggregates that have special capabilities, e.g. OpenFlow aggregate nodes, VINI aggregate nodes.

Christopher Small (GPO) so a node is a member of an aggregate for each kind of network it is on?

A node is controlled by only one aggregate at a time.

My definition of a node is something I can dump code into. "Clouds" export an aggregate manager interface. I can say "set up a circuit between node A and node B".

VINI is a cloud of nodes. Enterprise GENI is a cloud.

The reason I want to look at the world this way, is that a bunch of nodes already have a functioning interconnect, the internet. The assumption that I need across two aggregate boundaries is that I have shared some dmux key across the boundary, so there's a global allocation of these keys.

The world is a whole lot simpler if everyone is reachable via a shared id space -- it can be ip addrs or something else.

We already assumed that everything was reachable via some mechanism in the control plane. I think we should do the same thing for the data plane to make this all work more easily.

Aaron Falk You're assuming that there is one of these things between each pair -- what if you have to go across three links?

It's complicating life a lot (for the researcher) to have to deal with all pairwise layer 2 possibilities. Give me some guarantees about latency, failure, bandwidth, independent of what layers of encapsulation I do, is the key to what the researcher wants to do.

I'm not removing the capability of working with different kinds of links, just abstracting it away.

Ted Faber Can you view this without having IP tunneling?

I view connecting via something lower than IP as sufficiently more difficult that it's an optimization. I can run GRE tunneling over layer 2 as easily as at IP.

I'm questioning the value to the research community to connect at layer 2. Jennifer (who is interested in IP networks) is happy with this.

We have built VINI and we have built PlanetLab. Nobody's coming to VINI to use layer 2 circuits. They use VINI for the guarantees, not layer 2-level hacking.

Rob Ricci I have a theory that the reason people aren't using VINI for this is that they're using emulab. A significant minority do experiments below IP. Playing around with ethernet bridging, alternatives to IP, ... Still a minority, but we have them.

There are a couple of reasons people aren't using VINI in large numbers, but there are an awful lot who are most interested in predictable link behavior.

Ivan Seskar Most of the orbit experimenters don't care about IP at all. But that's the edge.

That's a good point. I'd view ORBIT as another cloud (another aggregate).

Ted Faber If you try to tunnel a MAC over IP, well, it doesn't work very well. Once you go up to layer 3, you've disrupted layer 2 sufficiently that you can't necessarily run the experiments you want to run.

As a consequence of this, there may be aggregate-specific experiments. (Strong implication that this is not the common case.)

Two separable issues: interface negotiation (what kind of resources are available between aggregates), resource negotiation (which resources can I have)

Have WSDL version of the interface (program-heavy), also XSD version (data heavy)

Backing off of pushing massive nested rspec on you.

Adopted simple model:

	RSpec = GetResources()

returns list of all resources available

	SetResources(RSpec)

acquire all resources

Only way today is

     while (true)
       if SetResources(RSpec)
         break

Doesn't necessarily terminate.
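A minimal Python sketch of the same loop with a bound on the number of attempts, so the caller always gets an answer. The attempt limit, the delay, and re-reading the available resources before each attempt are assumptions added for illustration; `acquire` and its two callback parameters are hypothetical stand-ins for GetResources() and SetResources().

```python
import time


def acquire(get_resources, set_resources, max_attempts=5, delay_s=0.0):
    """Retry GetResources/SetResources until the request is accepted.

    get_resources() returns the currently available RSpec;
    set_resources(rspec) returns True if everything in it was acquired.
    Unlike the bare loop, this gives up after max_attempts.
    """
    for attempt in range(1, max_attempts + 1):
        rspec = get_resources()
        if set_resources(rspec):
            return rspec, attempt
        time.sleep(delay_s)
    return None, max_attempts


# Stub aggregate that refuses the first two requests.
calls = {"n": 0}


def fake_get():
    return ["node1", "node2"]


def fake_set(rspec):
    calls["n"] += 1
    return calls["n"] >= 3


print(acquire(fake_get, fake_set))
```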

Aggregate returns capacity (what it will say yes to in XSD) and policy (how to interpret the capacity in XSLT). P(Request, Capacity) -> True means request will be honored. P(Request, Capacity) -> False means request will not be honored. Examples:

VINI today

     P(R, C) -> true if R and C are the same graph

VINI tomorrow

     P(R, C) -> true if R is a subset of C

PlanetLab today

     P(R, C) -> true if R is a subset of C and site sliver count OK
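The three example policies can be written as predicates over a request R and a capacity C. The sketch below uses Python sets rather than the XSD/XSLT encoding mentioned above, purely for illustration. Treating a graph as a set of links, and "site sliver count OK" as a per-site cap, are assumptions; all function and parameter names are hypothetical.

```python
from collections import Counter


def vini_today(request: set, capacity: set) -> bool:
    """True only if the request is exactly the advertised graph."""
    return request == capacity


def vini_tomorrow(request: set, capacity: set) -> bool:
    """True if the request is a subset of the advertised graph."""
    return request <= capacity


def planetlab_today(request, capacity, site_of, in_use, limit) -> bool:
    """Subset test plus an assumed per-site cap on slivers.

    site_of maps node -> site, in_use maps site -> slivers already placed,
    limit is the maximum number of slivers allowed per site.
    """
    if not request <= capacity:
        return False
    wanted = Counter(site_of[node] for node in request)
    return all(in_use.get(site, 0) + n <= limit for site, n in wanted.items())


links = {("a", "b"), ("b", "c")}
print(vini_today({("a", "b")}, links), vini_tomorrow({("a", "b")}, links))
```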

Nick Feamster Is there a notion of time in an rspec?

Yes

Discussion of using Python vs XSLT for this.

Aaron Falk We're off track, you've gotten us off into rspec reconciliation.

ORCA, BEN, NDL-OWL Ilya Baldine

Ilya Baldine: experimenting with ontologies for multi-layer network slicing

Need a way to describe what we have (substrate), what we want (request), what we are given (slice spec). Need to map resources, configure resources, and know what to measure.

Problem is that we have many organically grown solutions that kind of work. Need a functional model utilizing formalized techniques that fully describe the context of an experiment.

Multi-layered networks, not a single graph, embedding of graphs of higher-level networks into graphs of lower-level networks.

We aren't the first to face this: Network Markup Language Working Group (NML-WG). Participants include Internet2, ESnet (perfSONAR model), DANTE/GN2, and University of Amsterdam (NDL).

NDL. Based on OWL/RDF, in use within GLIF. Can be used for RDF frameworks. SPARQL supported. Based on G.805 (generic functional architecture of transport networks).

Needs to be computer-readable network description. Human-readable is good, but computer-readable is critical. Describe state of multi-layer network.

What else do we need? Ability to describe requests (fuzzy), ability to describe specifications (precise).

Looked at some other options, this one seemed like the best option. It's a large search space.

NDL-OWL extends NDL into OWL. Richer semantics. BEN RDF describes BEN substrate. Developed a number of modules to assist in using it.

Have forked from original University of Amsterdam NDL; OWL has evolved, wanted to use better technology, better tools.

Goals -- more description languages, measurement, cloud, wireless, etc.

Ted Faber My concern is that it seems very detailed. More detail than we need?

Ivan Seskar Example: give me a linear topology of nodes

Aaron Falk Assume there are tools that translate high-level descr into this.

I don't think people will point and click their way to this.

Chip Elliott GLIF could be federated into GENI and vice versa

We're working on it.

MAX/DRAGON Chris Tracy

SOAP-based GENI aggregate manager.

End-to-end slices

Over the last few months we have built an aggregate manager in Java; it runs in Tomcat as an Apache service and uses WSDL (web services API).

Larry Peterson We have a SOAP interface now, too, should be able to interoperate.

On the back side talks to DRAGON-specific controller via SOAP. Or can go to a PlanetLab controller.

Chip Elliott is OpenFlow currently using same or different SOAP interface?

Larry Peterson It's a subset, we need to have the discussion and get them in sync

We've mostly tried to stick with what was in the slice facility architecture document. Been thinking of standing up a clearinghouse, but haven't done it yet.

We're using this to control any component at MAX: planetlab nodes, DRAGON, Eucalyptus, PASTA wireless, NetFPGA-based OpenFlow switches.

Putting NetFPGA cards in a machine, putting them out on the net somewhere.

We want this aggregate manager to be able to manage anything on the net.

Aaron Falk This aggregate manager box isn't just a bunch of functions, doing some work to make sure things are allocated in a coherent manner

You can go to this AM and run "list capabilities" or "get nodes" and pass in a filter spec (give me all the nodes that can do both dragon and planetlab). Returns a controller URL so you can go talk to the controller for more info.
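The filter-spec idea can be sketched in a few lines of Python. This is not the MAX aggregate manager's SOAP interface; the inventory, its field names, `get_nodes`, and the `controller.example` URLs are all made up for illustration of "give me all the nodes that can do both dragon and planetlab".

```python
def get_nodes(nodes: list, required: set) -> list:
    """Return the nodes whose capabilities include everything in `required`."""
    return [n for n in nodes if required <= set(n["capabilities"])]


# Hypothetical inventory; each entry carries the controller URL to ask for more info.
INVENTORY = [
    {"name": "node1", "capabilities": ["dragon", "planetlab"],
     "controller_url": "http://controller.example/node1"},
    {"name": "node2", "capabilities": ["planetlab"],
     "controller_url": "http://controller.example/node2"},
    {"name": "node3", "capabilities": ["dragon", "planetlab", "openflow"],
     "controller_url": "http://controller.example/node3"},
]

for node in get_nodes(INVENTORY, {"dragon", "planetlab"}):
    print(node["name"], node["controller_url"])
```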

Code is published on the website, instances will be site-specific (aggregate specific).

Wrote WSDL file by hand based on SFA. wsdl2java generated Java skeleton code.

(Opens http://geni.dragon.maxgigapop.net:8080/axis2/services/AggregateGENI?wsdl demo using a generic SOAP Client)

Chip Elliott Nick is this the AM you should be using?

Rob Ricci in our case we haven't described our interface as a WSDL

Rob Sherwood there are a lot of WSDL tools you can use

The code for this (svn repo) is pointed to in the slides (p. 11?)

We think the clearing house will handle ticketing. (Open issue of which things are in the AM, which are in the CH.)

We believe end-to-end slices will look like what we're already doing for interdomain circuit reservations for DRAGON, Internet2 DCN, ESnet, etc. We think it will look like our Path Computation Engine (PCE), but will be more like a Resource Computation Engine.

We use NM-WG control plane schema. Domains, nodes, ports, links.

Domain -- group of nodes.

Nodes: end systems, switches.

Ports: on each node.

Links: this is where we describe the switching capabilities of a link (VLAN ranges, etc)

It's point to point only -- not broadcast. No support for multipoint VLANs.
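The domain/node/port/link description above maps naturally onto a few record types. The Python sketch below is an illustration only, not the NM-WG schema itself: the class and field names are hypothetical, nodes are referred to by name, and VLAN ranges are stored as inclusive (low, high) pairs. A link has exactly two ports, which reflects the point-to-point restriction.

```python
from dataclasses import dataclass, field


@dataclass
class Port:
    node: str   # name of the end system or switch the port is on
    name: str


@dataclass
class Link:
    a: Port     # exactly two endpoints: point to point, no multipoint VLANs
    b: Port
    vlan_ranges: list = field(default_factory=list)  # inclusive (low, high) pairs

    def supports_vlan(self, vlan: int) -> bool:
        return any(low <= vlan <= high for low, high in self.vlan_ranges)


@dataclass
class Domain:
    name: str
    nodes: list = field(default_factory=list)
    links: list = field(default_factory=list)

    def links_for_vlan(self, vlan: int) -> list:
        """Links in this domain whose switching capability covers `vlan`."""
        return [link for link in self.links if link.supports_vlan(vlan)]


link1 = Link(Port("sw1", "p1"), Port("sw2", "p1"), [(100, 199)])
link2 = Link(Port("sw2", "p2"), Port("host1", "eth0"), [(150, 160), (300, 310)])
domain = Domain("example-domain", nodes=["sw1", "sw2", "host1"], links=[link1, link2])
print(len(domain.links_for_vlan(155)), len(domain.links_for_vlan(120)))
```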

<< switched slide decks >>

Assumption that at a domain boundary we only support VLANs. Restricted to layer 2, not cross-layer allocation.

Chip Elliott the GPO architecture would have a central clearinghouse, messages going up and down; in this the messages go across.

(OGF26 presentation NDL working group, Multilayer NDL presentation by Freek Dijkstra -- great explanation of NDL.)

This is GMPLS inspired, done with signaling via web services.

Yufeng Xin Who issues cross boundary configurations?

Once there's agreement over which VLAN we're terminating on, each aggregate will do it.

Chip Elliott how baked is this? Used 10 times a day?

Hundreds of times a month. Solid. Pretty much always works.

Planning Aaron Falk

What will each cluster be doing to reach the goal of cross-aggregate slices by the end of spiral 1? What are the inter-project dependencies?

Larry Peterson you're forcing us into realtime project negotiation here. I think we ought to do as much as we can assuming IP as the interconnect. It works for some, maybe not all -- GpENI? SPP boxes?

John Turner users can log into each and allocate by hand. No more explicit coordination is required to make it work.

What's needed to get there from here?

James Sterbenz From our perspective stitching together with IP works for now, but long term for GENI to succeed need to support more.

Is doing this by hand workable? Does this work for everyone?

Chris Tracy We can provision nodes via DRAGON

John Turner For both GpENI and DRAGON we can terminate a connection that we have VLANs defined on. There is a non-trivial amount of control software that we need to write but we have other things to do first, like getting systems deployed.

Goal is constructing end-to-end slices, not having it done automagically.

Guido, it sounds like you've got a little more work to do to connect the Stanford campus.

Guido Appenzeller I think IP is a good common denominator for connecting aggregates for now. If we want to scale this to hundreds of aggregates this won't work.

Proposal to Cluster B: draw a picture of this, show where things interconnect and at what layer, where there are tunnels, where there are lower layer connections.

John Turner Did a version of this for GEC3.

Yes, but we need this for the cluster. Nobody has put onto a single sheet of paper all the things that need to be done to do this.

John Turner We all connect to the outside world via Internet2.

We have our own wave on Internet2. There are no routers on it. Want to draw a distinction between our access-to-the-outside-world and the GENI backbone.

Goal is to demonstrate end-to-end slices across a range of technologies.

Chris Tracy Are you in DC yet?

Rob Ricci There will be within a small number of weeks.

Action: Chris Tracy will get into the Internet2 cage, and will pull a cable between DRAGON and SPP (?).

Internet2 has told us we may be able to get access so that people can get from DCN to the GENI wave.

Rob Sherwood What are the right interfaces? We are in LA, Houston, and NY, are adding DC.

Chris Tracy How is GpENI going to connect to Internet2 DCN or the GENI wave?

James Sterbenz to maxgigapop, our equipment is in the Internet2 pop in Kansas City.

Chip Elliott How much will this cost?

James Sterbenz Will take action item to find out how much this will cost.

Rob Ricci You need to determine this quickly, we have a switch going in in the next couple of weeks. We need to talk.

Action: Rob, James, Heidi coordinate on Kansas City Internet2 connection

Action: Aaron Falk will send email to Rob and James to make sure they can contact each other via email. (AF: Closed 6/25/09. Rob and James have connected via email.)

MAX connects to NLR.

Action: Guido Appenzeller, John Turner will write a one-page high level list of the actions one needs to take to configure a slice.

Rob Ricci We're already doing this, more or less. It's done with tunnels. Once we get set up in Internet2 POPs. Plan outlined on his last slide shows how to get VLANs from campus to campus.

Rob Ricci Kentucky, CMU are already in. UML PEN shouldn't be too hard.

The picture will be very helpful

Rob Ricci It was on my poster at the last GEC.

Action: Rob Ricci will pull together the picture.

Cluster D, the impression that I've got is that you're all pretty integrated with a common control framework. Do you understand how to connect UMass down to BEN?

Ilya Baldine Technically we know what some of the problems are. GPO committed us to being an NLR based cluster. UMass is working to get some VLANs, but they are a limited resource. We are trying to figure out how to get to Charlotte (Internet2 terminates there) from RTP. Maybe MPLS or VLAN from RENCI BEN POP to Internet2.

Kansei is not an Internet2 campus.

It's important to make sure that we don't overwhelm Internet2 with requests, go through the GPO (Heidi).

Let's get this picture so we can figure out where the gaps are.

Ilya Baldine My main problem is that there will be costs associated with connecting us to Internet2. We don't know how much.

Harry Mussman Does it make sense to draw a picture of BEN, NLR, and MAX?

Ilya Baldine I'd like to do this, let's talk about this.

Action: Harry and Ilya Baldine will talk about this.

Cluster E?

Ivan Seskar Hey, we're done. We're on the same campus. Except for the air gap; we need to get someone to pull a cable up six floors from where Internet2 terminates to where we are.

Cluster A?

Ted Faber We're trying to do some relevant end-to-end work. Hook up to the DCN, plugged into a DETERLab node on one end, ISI East on the other. Working on the expanded authorization work we have talked about at the last couple of GECs.