Changes between Initial Version and Version 1 of GENIRacksHome/InstageniUseCases


Timestamp:
03/08/12 13:02:57 (12 years ago)
Author:
lnevers@bbn.com
Comment:

--

  • GENIRacksHome/InstageniUseCases

    v1 v1  
     1[[PageOutline]]
     2
     3This is the list of InstaGENI Use Cases to be sent to instageni-design@geni.net. When information is exchanged on the list it will be captured here for each case. Also note that any document that is referred to with a URL is attached to the page for historical reference. The notes use the following attributions:
     4{{{
     5AH: Aaron Helsing     HD: Heidi Dempsey
     6}}}
     7
     8= InstaGENI Use Cases =
     9There are 9 InstaGENI Use Cases that are to be discussed in the InstaGENI design review. Note that the 7 experimenter use cases all assume that the experimenter is using the GENI AM API (e.g. with the OMNI client).
     10
     11== Experimenter Use Cases ==
     12
     13=== Use Case 1 ===
     14I want to reserve some resources within a single InstaGENI rack. "Resources" includes both compute and network resources in this use case and in the six that follow.
     15{{{
     16<AH> Will look like current PG. User uses Flack/Omni to PG CH for list of AMs. Get Ads from AMs.
     17     For single rack, use cert from the CH to talk to that AM to ask for resources. For this
     18     and all, there is now and when there is a GENI CH.
     19<HD> Use Flack or OMNI to ask clearinghouse for advertisements, would get a view of resources available
     20     on all racks.  If you want something on one rack, use cert from GENI clearinghouse to get resources.
     21}}}
     22
     23=== Use Case 2 === 
     24I want to reserve some resources in multiple InstaGENI racks, and have them all connected at layer 2.
     25
     26{{{
     27<AH> Looks like demo at GEC9? Client contacts a slice embedding service to find resources that fulfill
     28     this, get a marked up RSpec that tells you where you go - including maybe a backbone AM.
     29     You talk to all 3 to get the slice. They talk to each other to pick VLAN #s. Pairwise negotiations
     30     - chain mode. PG already running the slice embedding service at Utah.
     31     Rob: No VLAN translation on the racks - these switches do not do VLAN translation
     32     Rob: can also do stitching extension - just tell the AM which switch/touchpoint you need a VLAN on
     33     Rob: initially not all dynamic, yes - a set of static VLANs across campus. That will be the constraint
     34     in the AM pairwise negotiation. We'll have some small set of VLANs that go to the next point.
     35     Then we treat, say, everything from POP to campus, as being like a long wire with 20 VLAN #s
     36     we can use on it.
     37     Joe: 1 type dynamic is auto provisioning of VLAN. Another is you give a bundle and people
     38     switch from A to B
     39     Rick: Looking at first set, we can bridge via MAX to NLR. or go direct.
     40     Joe: Can get a set of static VLANs thru FrameNet
     41<HD> Looks like GEC9 demo.  Client contacts slice embedding service to ask where there are resources
     42     to match this.  They get back a marked up RSPEC that includes all resources needed (including backbone). 
     43     Have a very simple negotiation--chain mode. Do racks all do VLAN translation?  Rob doesn't think so. 
     44     These switches don't do VLAN translation.  When we're going through ION/Framenet or Starlight, those
     45     services do VLAN translation. 
     46     What we'll get on a lot of campuses is a set of static VLANs on a campus, which will constrain negotiation.
     47     VLANs may go to some particular point.  Point can change for different cases.  Can have different kinds of
     48     dynamic (autoVLAN vs. A vs B out of a bundle of ABCD). 
     49     10Gigs to Starlight via SOX.  DYNES will be supported at Starlight, along with many others.
     50}}}
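
The chain-mode negotiation described in the notes above can be illustrated with a minimal sketch. It assumes each segment of the path (a rack, or the static campus-to-POP "long wire") exposes a set of free VLAN numbers, and that no segment in the chain translates VLANs, so a single tag has to be free on every hop. The function name and the VLAN numbers are hypothetical and are not taken from any ProtoGENI or InstaGENI code.

```python
from typing import Iterable, Optional, Set


def pick_common_vlan(hops: Iterable[Set[int]]) -> Optional[int]:
    """Return the lowest VLAN ID free on every hop, or None if there is none.

    Each element of `hops` is the set of VLAN IDs still available on one
    segment of the path (hypothetical data for illustration only).
    """
    common: Optional[Set[int]] = None
    for available in hops:
        # Chain mode: each hop narrows the candidate set handed on by the previous hop.
        common = set(available) if common is None else common & available
        if not common:
            # No tag survives this hop, so the chain cannot be completed.
            return None
    return min(common) if common else None


# Hypothetical free-VLAN sets for two racks and the static campus-to-POP segment.
rack_a = {3001, 3002, 3005}
campus_to_pop = {3002, 3005, 3010}
rack_b = {3005, 3007}
chosen = pick_common_vlan([rack_a, campus_to_pop, rack_b])
```

Where a segment such as ION/FrameNet or Starlight does translate VLANs, the chain could be split at that segment and each side negotiated separately; this sketch does not model that case.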
     51=== Use Case 3 === 
     52I want to reserve some resources in one or more InstaGENI racks, AND one or more other resources that are connected to NLR or Internet2 at layer 2, and have them all connected to each other at layer 2. (These other resources might be GENI resources or might not be; let us know if this makes a difference.)
     53{{{
     54<AH> Rob: experimenter must specify -
     55<HD> This is covered by the same solutions as case 2.  Will run ProCurve 6600 in hybrid mode.  Turn on and off
     56     OF on individual VLANs.  Will run FV and FOAM in one VM on IG rack.  If someone doesn't ask for OF, they
     57     get a VLAN that doesn't have OF enabled.  If they just want OF control for a VLAN entirely in the rack,
     58     just tell them the port to point their controller at--don't need to go through the FOAM stuff.  Those
     59     ports don't have to be on a globally routable IP address.  Existing OF VLAN example, using a shared VLAN
     60     that may be shared with other slices--that is where FOAM and FV come in.  Will give Nick manifests with
     61     the slice-to-resource info, and he can get that into FOAM on the user's behalf.  PG won't try to allocate
     62     flowspace, just VLANs.  Experimenter has to submit RSPEC to FOAM too after giving it to PG?  Yes.  This
     63     seems a better simpler approach than a PG proxy for FOAM.  It will really be the experimenter tool that
     64     is doing this two-step process.
     65     OF supports either the switch connects to you or you connect to the switch mode.  Rob likes the second
     66     better because it gives you more flexibility for what you want to do.  Especially good if you are running
     67     controller inside slice.
     68}}}
     69=== Use Case 4 ===
     70I want to reserve only some !OpenFlow resources in the InstaGENI rack, to connect some non-InstaGENI resources at a site (which are connected at layer 2 to the InstaGENI rack dataplane switch) to some non-InstaGENI resources at another site (via NLR or Internet2). (Aka "how do I use just the !OpenFlow switch to connect a site network to an upstream network without using any InstaGENI compute resources".)
     71{{{
     72<HD> Will handle the same way we've been handling SPP and ShadowNet.  Create a fake node at that
     73     rack that has some info about who can request this thing in their slice, does it have a VLAN
     74     tag, can more than one person connect to it at the same time etc.  Looks like connecting to a
     75     node from PG's point of view.  PG just sets up ports and VLANs and doesn't try to configure the
     76     node.  Are the fake nodes shared or exclusive?  Can be either--we do different ones for different
     77     cases. PG backbone now is exclusive, SPPs are non-exclusive.
     78}}}
     79=== Use Case 5 ===
     80Same scenario as (2) above, but I want the relevant network resources to be !OpenFlow-controlled.
     81{{{
     82<HD> 5 and 6 are both covered by the same mechanism.
     83}}}
     84=== Use Case 6 === 
     85Same scenario as (3) above, but I want the relevant network resources to be !OpenFlow-controlled.
     {{{
     86<HD> 5 and 6 are both covered by the same mechanism.
     87}}}
     88=== Use Case 7 ===
     89I want to use kernel-level Click on InstaGENI. How does an experimenter run an experimental configuration that incorporates a kernel-level Click router?
     90{{{
     91<HD> Covered by same mechanism if it's an external resource.  Can't do kernel mods in either PG
     92     or PL containers.  PG has non-production level support for using Xen as a virtualization
     93     technology.  The bare metal node is still a PG node if they use it for the kernel mod resource.
     94}}}
     95
     9633. If I have two shared hosts (either through OpenVZ or with !PlanetLab) can I force my traffic to go through the !OpenFlow switch?
     97
     98== Operator Use Cases ==
     99=== Use Case 8 ===
     100An update is available for some part of the software/firmware in the rack; I want it to be installed. (Related question: How are currently-running slivers affected by updates to various components?)
     101{{{
     102<HD> For software on control node, PG does updates personally and Nick personally for FOAM.
     103     Site admins can do it if they want to take over all future updates.  Will take snapshots
     104     of VMs so that if something goes really wrong they can roll back (this would not apply
     105     to time periods like a week).  Updates in PG don't affect long-running slivers at all,
     106     so if PG does the IG ones, that won't be the case.  PG will also be responsible for
     107     updating firmware on switches.  Nick and Rob will have to coordinate on this.  Could
     108     affect running slivers of course.  Requires a planned maintenance window, with a week
     109     or more of notice.  PlanetLab nodes can update software without affecting slivers a lot
     110     of the time too.  Might interrupt when node needs to reboot to have it take effect. 
     111     OpenVZ updating process is not quite so nice says Rob.  OpenVZ puts a separate userland
     112     FS for each separate container.  Want to do very rarely and only with warning.   What
     113     is plan for updating experimental images?  After PG tests images on racks in Utah, will
     114     push out new versions to other racks.  If local admins add hardware to the racks, they
     115     will have to bear some of the responsibility for doing the image updates on their added
     116     hardware too.  Will be making the current PG non-production way of pushing out image
     117     updates become production for experimenters.
     118     Is it possible to have multiple OSes on a single shared node?  For OpenVZ there's a
     119     separate userland for each container.  Everybody shares the same kernel (probably Ubuntu
     120     and CentOS).  This already works on Utah Emulab.  Rob will try to send some more info
     121     on this.  PlanetLab--each slice has its own separate FS, and can load different OSes
     122     into that FS.  Doing Fedora 8-14 builds now.  Could do others in theory but aren't
     123     doing that now.
     124     Are there limits on the HP switch about how many OF controlled VLANs you can have? 
     125     Think there is a limit that is around a dozen.  Will PG throttle this?  Rob says they
     126     could--not sure if that is needed.
     127     When a rack fetches a new image from a central repository, will send out mail to the admin
     128     list, which lets local admins know when something new needs to be looked at for example
     129     for security reasons.
     130}}}
     131=== Use Case 9 ===
     132Something in the rack is misbehaving; I want to identify which resources are causing the symptoms, which sliver those resources are currently allocated to, which slice that sliver is a part of, and who the owner of that slice is. (Corollary: I receive a report of past misbehavior from a resource; I want to identify the sliver/slice/owner who had that resource at the time.)
     133{{{
     134<HD> PlanetFlow like interface.  Will be public.  On the more detailed level of a full manifest,
     135     will need access via an account.  If the identifier isn't an IP address it gets harder to do
     136     anything via this kind of tool.  VLAN numbers are probably the kind of thing you are next most
     137     likely to want.  That question should be an admin only interface (includes GMOC).  May be a
     138     case that Admins from other racks should be able to ask it too.  Rob will have to think about
     139     that.  Seems like it is likely a good idea.
     140     Rob says they keep the who-had-what-VLAN data around forever.  It's pretty small.  He just
     141     published a paper about it covering 20 years.
     142     Chaos says what happens if a switch gets wedged with ILOs.  There is a PDU in the rack--can
     143     it be used to reboot a switch in the rack?  If the top of rack switch gets wedged, the POC
     144     has to reboot it somehow because everything goes through there.  If the 6600 gets wedged,
     145     it would be good to use the serial port to reboot it.  Rob will look at this.  Thinks
     146     ProCurves are pretty good about rebooting as long as you can get to the console.  Rick
     147     says there are serial ports that they will wire up.
     148     Joe Mambretti.  What are interop expectations of GPO?  Expect full interop for OpenFlow. 
     149     Can be restrictions for some types of connections.  If they are willing to do VLAN connections,
     150     it will be a nice-to-have, not essential.  Starlight has a nice ORCA implementation that
     151     We'll follow up with some more email questions, but it looks like we have reasonable answers
     152     to all major questions and use cases.
     153}}}
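
The lookup that Use Case 9 asks for (resource to sliver to slice to owner, now or at some past time) can be sketched as a search over an allocation history. This is a minimal illustration that assumes the rack keeps timestamped records keyed by an identifier such as an IP address or a VLAN number; the `Allocation` record, the `who_had` function, and the sample data are hypothetical and do not describe the actual PlanetFlow-like interface.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Allocation:
    resource: str            # identifier, e.g. an IP address or "vlan:3005"
    sliver: str
    slice_name: str
    owner: str
    start: datetime
    end: Optional[datetime]  # None means the resource is still allocated


def who_had(history: List[Allocation], resource: str, when: datetime) -> Optional[Allocation]:
    """Return the allocation record that held `resource` at time `when`, if any."""
    for record in history:
        if record.resource != resource:
            continue
        # The interval is half-open: [start, end), or open-ended when end is None.
        if record.start <= when and (record.end is None or when < record.end):
            return record
    return None


# Hypothetical history: one VLAN held by two slices in succession.
history = [
    Allocation("vlan:3005", "sliver-1", "slice-a", "alice",
               datetime(2012, 3, 1), datetime(2012, 3, 5)),
    Allocation("vlan:3005", "sliver-2", "slice-b", "bob",
               datetime(2012, 3, 5), None),
]
past = who_had(history, "vlan:3005", datetime(2012, 3, 2))
current = who_had(history, "vlan:3005", datetime(2012, 3, 8))
```

Because the records keep an end time, the same query serves both the live case and the corollary about reports of past misbehavior.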
     154