Changes between Version 4 and Version 5 of WorryingSlivers

Timestamp: 12/07/11 23:59:18 (12 years ago)
Author: chase@cs.duke.edu
Comment: --

WorryingSlivers

-                      v4
+                      v5
 == Semantic Models ==
 So: there are many kinds of virtual resources, and the goal of GENI is to operate on them.  To operate on them we must describe them.  But there are so many kinds to describe.  And as Alice said, the question is whether you *can* make words mean so many different things.  But what do we want to say about these virtual resources?
+So: there are many kinds of virtual resources, and the goal of GENI is to operate on them.  To operate on them we must describe them.  But there are so many kinds to describe.  And as [http://www.fecundity.com/pmagnus/humpty.html Alice] said, the question is whether you *can* make words mean so many different things.  But what do we want to say about these virtual resources?
 When we talk about virtual resources, we can often identify distinct elements within them to talk about.   For example, there are virtual machines, and network pipes, and logical storage containers.    These examples suggest that our virtual resources tend to closely match the shapes and behaviors of actual substrate elements (components).   Indeed, a physical component might be allocated directly as a virtual resource (as in early Emulab).   But virtual resources are above the component layer: a component might host more than one virtual resource, or a virtual resource might span components.  And when we describe virtual resources we find other elements that do not map easily onto components.  There are VLAN tags and other labels allocated from namespaces, and running programs, and standing queries such as active OpenFlow rulesets.
+When we talk about virtual resources, we can often identify distinct elements within them to talk about.   For example, there are virtual machines, and network pipes, and logical storage containers.    These examples suggest that our virtual resources tend to closely match the shapes and behaviors of actual substrate elements (components).   Indeed, a physical component might be allocated directly as a virtual resource (as in early Emulab).   But virtual resources are above the component layer: a component might host more than one virtual resource, or a virtual resource might span components.  And when we describe virtual resources we find other elements that do not map easily onto components.  There are VLAN tags and other flowspace labels, and running programs, and standing queries such as active OpenFlow rulesets.
 We can describe these various entities independently in terms of their properties and their relationships.  That is good, because if our goal is to describe virtual resources precisely enough to process the descriptions automatically, then we are going to need a semantic model, and semantic models are almost by definition based on classifying entities and their relationships.   There is a rich literature on entity-relationship models going back 35 years.
 Indeed, a large part of the GENI challenge is in developing and processing declarative specifications of virtual resources using semantic models.
+Indeed, a large part of the GENI challenge is in developing and processing [GeniRspec declarative specifications of virtual resources] using semantic models.
 We can use these models to create documents that describe virtual resources in terms of elements and relationships.   When we request virtual resources or changes to virtual resources, we attach documents that describe the resources and changes we want.  If a request is granted, we receive documents describing the virtual resources we got.   These documents are called rspec.
 What is important here is that virtual resources have an internal structure and properties, and we describe these using declarative specifications.
 Originally we used simple resource types and property lists in ORCA.  But these descriptions have become significantly more advanced in the GENI project, and they are a key part of the GENI architecture challenge.
+Originally ORCA used simple resource types and property lists.  But these descriptions have become significantly more advanced in the GENI project, and they are a key part of the GENI architecture challenge.
 But what does such an rspec document describe?  Must it describe a complete slice?  Or can we describe different pieces of a slice in different documents?  As we will see, this is an essential question for understanding the role of slivers in GENI.
 …
 == Partitioning Virtual Resources Across Aggregates ==
 Another aspect of the GENI challenge is that virtual resources are distributed.  They span a "Federated International Infrastructure" in the words of my favorite GENI Vision slide.  We recognize that substrate resources are grouped into aggregates owned by infrastructure providers.    In general, we seem to be willing to presume that this grouping is a partitioning: each piece of infrastructure is controlled by exactly one aggregate.   I sometimes hear people talk about hierarchical "aggregates of aggregates", but I think even they would agree that each piece of infrastructure is controlled by exactly one leaf aggregate in the hierarchy.
+Another aspect of the GENI challenge is that virtual resources are distributed.  They span a "Federated International Infrastructure" in the words of my favorite [attachment:wiki:WorryingSlivers:geni-vision.pdf GENI Vision slide].  We recognize that substrate resources are grouped into aggregates owned by infrastructure providers.    In general, we seem to be willing to presume that this grouping is a partitioning: each piece of infrastructure is controlled by exactly one aggregate.   I sometimes hear people talk about hierarchical "aggregates of aggregates", but I think even they would agree that each piece of infrastructure is controlled by exactly one leaf aggregate in the hierarchy.
 We also seem to accept that we can partition virtual resources across aggregates in a way that mirrors the partitioning of the substrate resources.   That is, a virtual resource is provided by a single aggregate, and consumes substrate resources only on that aggregate.   To make changes to a virtual resource, we send requests about it to the aggregate that controls it.   Users and their tools can talk to aggregates independently of other aggregates.
 …
 The next question is, when we get a part from a supplier, do we need to tell the supplier about our parts from other suppliers?
 When we talk to an aggregate about our virtual resources there, it is only reasonable that we would want to limit our communication to the specific infrastructure services that aggregate provides.    Similarly, if a builder or manufacturer gets parts from a supplier, they do not have to show the supplier the blueprints for the entire project.   It is understood that various materials and parts are available to the customer from different suppliers, and that these pieces fit together in various ways.   The customer may select the parts and combinations and use them to build whatever the customer wants, without telling the parts suppliers about the overall assembly.  The materials and parts and means of assembling them may change with time, and we can't say in advance what they all are.  But it is understood that it improves efficiency to have interchangeable off-the-shelf parts with standard well-defined compatibilities.   This familiar idea was called
+When we talk to an aggregate about our virtual resources there, it is reasonable that we would want to limit the conversation to the specific infrastructure service that aggregate provides.    Similarly, if a builder or manufacturer gets parts from a supplier, they do not have to show the supplier the blueprints for the entire project.   It is understood that various materials and parts are available to the customer from different suppliers, and that these pieces fit together in various ways.   The customer may select the parts and combinations and use them to build whatever the customer wants, without telling the parts suppliers about the overall assembly.  The materials and parts and means of assembling them may change with time, and we can't say in advance what they all are.  But it is understood that it improves efficiency to have interchangeable off-the-shelf parts with standard well-defined compatibilities.   This familiar idea was called
 [http://www.digitalhistory.uh.edu/database/article_display.cfm?HHID=604 the American System] in the 1830s.
 …
 == Stitching ==
 The key challenge to overcome is that there are relationships among virtual resource elements.  And to the extent that we have these relationships among virtual resources on different aggregates, those aggregates may need to interact, perhaps through some intermediary.   In GENI we call these interactions "stitching".    Many of the driving use cases for stitching involve interconnecting virtual resources within a slice.  For example, we use stitching to connect virtual network pipes into paths or networks terminating at virtual nodes.
+The key challenge to overcome for partitioning our resource descriptions is that there are relationships among the virtual resource elements.  And to the extent that we have these relationships among virtual resources on different aggregates, those aggregates may need to interact, perhaps through some intermediary.   In GENI we call these interactions [GeniNetworkStitching stitching].    Many of the driving use cases for stitching involve interconnecting virtual resources within a slice.  For example, we use stitching to connect virtual network pipes into paths or networks terminating at virtual nodes.
 Can our semantic models describe all the relationships that might require such interactions?   If the answer is yes, then we can talk to each aggregate about the relationships that cross its borders, without it having to be aware of any virtual resources that are unrelated to what we want that aggregate to do.  The graph is partitionable.   If the answer is no, then we might need to tell every aggregate about every virtual resource, in case there is some important relationship that we missed.  Perhaps there is some relationship that is not represented explicitly in the description, but that an aggregate can infer from the descriptions of other virtual resources at other aggregates.  In that case, the aggregate must have all of those descriptions available to it.  The graph is not partitionable.
 It seems logical that we would seek to describe all such relationships in our semantic resource descriptions.  If we discover that we have missed an important relationship, that means our semantic model is insufficient, and we should go back and extend it or rethink it.
+It is logical that we would seek to describe all such relationships in our semantic resource descriptions.  If we discover that we have missed an important relationship, that means our semantic model is insufficient, and we should go back and extend it or rethink it.
 I believe that these interactions are relatively easy to describe for network virtual resources.   What is difficult is to describe the service that a network virtual resource provides.  But once we describe the service, a relationship is almost always a binding of one virtual resource to the service provided by another.   In networked systems those service endpoints always have names or labels allocated from some network namespace: VLAN tags, IP addresses, ports, DNS names, URLs, LUNs, lambdas, pathnames, alone or in combinations with other identifiers.   What is needed is to describe which virtual resources are providing a service and which are consuming that service.  Then we can bind the consumer to the producer by passing the producer's label to the consumer.  In essence, the graph becomes a directed dependency DAG, with directed edges from producers to consumers.   A stitching agent traverses the DAG, instantiating resources and propagating labels to their successors as the labels become available.  This is how ORCA does stitching.
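The traversal described above can be sketched in a few lines of Python. This is only an illustration of the idea, not ORCA's actual code: the function `stitch`, the callback `fake_instantiate`, and the resource names are all hypothetical, and the ordering comes from the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def stitch(depends_on, instantiate):
    """Instantiate resources in dependency order, passing producer labels along.

    depends_on:  dict mapping each consumer to the set of producers it binds to.
    instantiate: hypothetical callback (resource, producer_labels) -> label.
    """
    labels = {}
    order = []
    # static_order() yields every producer before any of its consumers and
    # raises graphlib.CycleError if the dependencies do not form a DAG.
    for resource in TopologicalSorter(depends_on).static_order():
        producer_labels = {p: labels[p] for p in depends_on.get(resource, ())}
        labels[resource] = instantiate(resource, producer_labels)
        order.append(resource)
    return order, labels


def fake_instantiate(resource, producer_labels):
    # Stand-in for a real aggregate call (an assumption for this sketch):
    # a producer invents a label, a consumer reuses the label it was handed.
    if not producer_labels:
        return f"tag-for-{resource}"
    return sorted(producer_labels.values())[0]


# Two virtual nodes that both bind to the label chosen by one VLAN.
example = {"node_a": {"vlan_1"}, "node_b": {"vlan_1"}, "vlan_1": set()}
order, labels = stitch(example, fake_instantiate)
print(order)
print(labels)
```

The point of the sketch is that the agent needs only the directed edges and the labels; it never needs to understand what the producer is.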
 We should be able to describe these relationships using our semantic models, and propagate labels by querying descriptions based on those models.  We don't need to write any new code for stitching.  We don't need to describe the producer to the consumer if the consumer already understands what kind of resource or service it wants to bind to.  If it does not, then the configuration of virtual resources is malformed.
-For some symmetric services---such as basic network connectivity through VLANs---it may not be clear from the service description who is the producer and who is the consumer.  In that case, the only distinction is which one chooses or allocates the label.  This may also be difficult to determine when the choice of label requires consensus among multiple aggregates.   But then the problem is not in describing the relationship, but only in describing the process by which the producer chooses the label.   I believe that this is a corner case that is relevant only on legacy networks, and in any case can be handled by introducing intermediaries into the graph, and reverse links propagating labels chosen by those intermediaries in a second stitching pass.  And I don't worry about it much.
 Let us suppose that we succeed in describing the relevant information as pairwise directed relationships in our models.  This is equivalent to saying that we can represent a configuration of virtual resources or virtual resource elements as an entity-relationship graph.  Moreover, we can partition the graph by aggregates, so that each node (vertex) of the graph resides in the partition for the aggregate controlling that node's virtual resource.  Some edges in the graph cross partition boundaries.  These edges require coordination among a pair of aggregates, i.e., stitching.
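As a rough illustration of this partitioning, the sketch below splits the edges of a small graph into those that stay inside one aggregate and those that cross a boundary. The names `partition_edges`, `owner`, and the example resources are assumptions made for the example, not part of any GENI or ORCA interface.

```python
from collections import defaultdict


def partition_edges(owner, edges):
    """Split directed edges by whether they cross an aggregate boundary.

    owner: dict mapping each graph node to the aggregate that controls it.
    edges: iterable of (producer, consumer) pairs.
    Returns (local, cross): local maps aggregate -> edges inside it,
    cross lists the edges that would require stitching.
    """
    local = defaultdict(list)
    cross = []
    for src, dst in edges:
        if owner[src] == owner[dst]:
            local[owner[src]].append((src, dst))
        else:
            cross.append((src, dst))
    return dict(local), cross


# Hypothetical slice: two VMs at a cloud site, one pipe at a network provider.
owner = {"vm1": "cloud", "vm2": "cloud", "pipe": "net"}
edges = [("vm1", "vm2"), ("pipe", "vm1")]
local, cross = partition_edges(owner, edges)
print(local)
print(cross)
```

Only the edges in `cross` involve a pair of aggregates; everything in `local` can be handled by one aggregate on its own.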
 …
 Suppose then that we can describe virtual resources of a slice by a graph of elements (entities) and relationships, using a semantic model.  Suppose further that we can partition the graph across aggregates as I have described, so that we talk to each aggregate only about the virtual resource elements that it hosts, and any adjacent edges.
 Now the question is: how does the aggregate expose the graph through its API, so that a slice owner can operate on the graph?   In the current (or near future) AM-API there are simple calls to operate on the graph: create, destroy, and (soon) update.  The create and update operations take as an argument an rspec document describing at least the entire partition of the graph residing at that aggregate.  The requester says what region of the graph they want to operate on somewhere in the rspec document attached to the request, and not in the API.
+Now the question is: how does the aggregate expose the graph through its API, so that a slice owner can operate on the graph?   In the current (or near future) [DRAFT_GAPI_AM_API AM-API] there are simple calls to operate on the graph: create, destroy, and (soon) update.  The create and update operations take as an argument an rspec document describing at least the entire partition of the graph residing at that aggregate.  The requester says what region of the graph they want to operate on somewhere in the rspec document attached to the request, and not in the API.
 The AM-API offers no way to talk to an aggregate about some regions of the graph independently of other regions of the graph.   If we want to add resources, we must pass an rspec for the entire graph, with the new parts added.  If we want to remove resources, we must pass a description for the entire graph, with some parts removed.
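One consequence is that the aggregate has to work out what changed by comparing the new description with what it already holds. A minimal sketch of that comparison, assuming each element of the partition can be reduced to a hashable identifier (the function `diff_partition` and the element names are hypothetical, and real rspecs are far richer than sets of names):

```python
def diff_partition(current, requested):
    """Compare two full descriptions of an aggregate's partition.

    current, requested: sets of element identifiers.
    Returns the elements to create, to destroy, and to leave alone.
    """
    to_create = requested - current
    to_destroy = current - requested
    unchanged = current & requested
    return to_create, to_destroy, unchanged


# Adding one VM still means resending the two that already exist.
current = {"vm1", "vm2"}
requested = {"vm1", "vm2", "vm3"}
to_create, to_destroy, unchanged = diff_partition(current, requested)
print(sorted(to_create), sorted(to_destroy), sorted(unchanged))
```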
 …
 == Slivers ==
 Finally we come to slivers.  What is a sliver?  A sliver is a region of the virtual resource graph describing a slice.  Either the graph is partitionable, or it is not.  If the graph is partitionable, then we need a name for the partitions.  Sliver is a fine name.  But perhaps the community will insist on a different name.
+Finally we come to slivers.  What is a sliver?  A sliver is a region of a slice's virtual resource graph.  Either the graph is partitionable, or it is not.  If the graph is partitionable, then we need a name for the partitions.  Sliver is a fine name.  But perhaps the community will insist on a different name.
 If the graph is not partitionable, then it is not partitionable across aggregates.  Then there are only slices, and each aggregate must receive the entire graph for the entire slice.   If we change any part of the graph, we must pass the new slice rspec to at least all aggregates participating in the slice.  (Why stop there?  Why not pass it to all aggregates in case some aggregate is involved in a way we don't understand?)  We can make this approach work for demos, but it will not scale to large slices, and it will not succeed in accommodating dynamic slices.  And then somebody in another project will figure out how to describe a virtual resource graph in a way that makes it partitionable, and we will move forward again from there.
+If the graph is not partitionable, then it is not partitionable across aggregates.  Then there are only slices, and each aggregate must receive the entire graph for the entire slice.   If we change any part of the graph, we must pass the new slice rspec to at least all aggregates participating in the slice.  We can make this approach work for demos, but it will not scale to large slices, and it will not succeed in accommodating dynamic slices.  And then somebody in another project will figure out how to describe a virtual resource graph in a way that makes it partitionable, and we will move forward again from there.
 I believe that we know how to describe the resources we care about as partitionable graphs.
 …
 We also know that this abstract notion of "regions" must cover the virtual resource cases we already understand.  For example, a cloud site offers an infrastructure service that allows us to instantiate graphs of related virtual resource elements such as virtual CPU cores, memories, network interfaces, storage volumes, and virtual networks.  We can take a pen and draw regions around parts of this richly connected graph of virtual resource elements.   We can decide that the collection of elements adjacent to a memory constitute a useful grouping.  We can draw a region encompassing all of those virtual elements and call it a "virtual machine" or "instance".  That is a reasonable choice of a region: it is the choice made by EC2-like cloud sites.  EC2 also draws regions around VLANs and calls them "security groups".  It considers storage volumes separately from virtual machine instances.
 The [https://ben.renci.org/ Breakable Experimental Network] is another interesting case.  BEN is a network substrate with a multi-layer topology.   We can allocate virtual network topologies from BEN.  Given a virtual network topology that is planar (forget about layers for now), there are many reasonable ways to partition the network into connected regions.   What is important about a region is that it offers some connectivity service among a set of locations.   The aggregate might choose to expose more or less information about its internal structure. People who understand network description languages call this topology aggregation.  But it is up to the network aggregate whether it allows its clients to create subnetworks separately and then stitch them together.  Most advanced networks today permit only creation of paths from point A to point B.  If the aggregate does support multi-point network topologies, then it is up to the client how it chooses to use those primitives.   It may be useful for a client to build and evolve a virtual network one piece at a time, or it might be simpler to create a static network in one shot and then leave it alone.
+The [https://ben.renci.org/ Breakable Experimental Network] is another interesting case.  BEN is a network substrate with a multi-layer topology.   We can allocate virtual network topologies from BEN.  Given a virtual network topology that is planar (forget about layers for now), there are many reasonable ways to partition the network into connected regions.   What is important about a region is that it offers some connectivity service among a set of locations.   The aggregate might choose to expose more or less information about its internal structure.  [http://www.science.uva.nl/research/sne/ndl People who understand network description languages] call this topology aggregation.  But it is up to the network aggregate whether it allows its clients to create subnetworks separately and then stitch them together.  Most advanced networks today permit only creation of paths from point A to point B.  If the aggregate does support multi-point network topologies, then it is up to the client how it chooses to use those primitives.   It may be useful for a client to build and evolve a virtual network one piece at a time, or it might be simpler to create a static network in one shot and then leave it alone.
 == Sliver Types ==
 …
 These examples show that a given virtual resource service incorporates its own groupings of the virtual resource element graph into regions (slivers), and these groupings may allow useful operations on a sliver other than creating it and releasing it.    Thus virtual resources have types that define what we can say about them and do to them.    An aggregate could provide supplementary type-specific operations on slivers, in addition to common operations supported by the base sliver API.   In the past, some have seemed to argue that the impracticality of a one-size-fits-all sliver API undermines the whole dream of GENI.  But the notion of subtyping has been proven in many other contexts and should be comfortable here as well.
 In addition, some virtual resources are programmable, and programs running on them may also expose interfaces and operations.  But in general those interfaces are above the virtual resource management layer and are outside our scope of concern.
+Of course, some virtual resources are programmable, and programs running on them may also expose interfaces and operations.  But in general those interfaces are above the virtual resource management layer and are outside our scope of concern.
 == The Boundary Between Software and Semantic Specifications ==
 …
 Another view of slivers might be "that which the API allows us to name and operate on".  If we want to operate on a virtual resource element that isn't named through the API, then we must name it and operate on it in the rspec for its containing sliver or slice, or whatever the granularity of that rspec is.  If we don't enable type-specific operations on a sliver through the sliver API, then these operations must be represented somehow as verbs in the rspec, or (worse) they won't be supported at all.   Putting verbs in a semantic resource description is a bad idea: if we want to use a language for imperative programming, then we should use an imperative programming language.
 These choices will drive the balance of focus on the API vs. declarative specifications.  In one direction we have a system that uses a few simple API calls to pass around large resource descriptions that are diffed and acted upon in different ways at multiple aggregates.   In the other direction we have a system that uses many calls to a diversity of APIs on a diversity of sliver objects, with each call carrying a small rspec document pertaining to the object being operated on.   The aggregates determine the grouping of virtual resource elements into slivers and the operations supported on the slivers.
+These choices will drive the balance of focus on the API vs. declarative specifications.  In one direction we have a system that uses a few simple API calls to pass around large resource descriptions that are diffed and acted upon in different ways at multiple aggregates.   In the other direction we have a system that uses many calls to a diversity of APIs on a diversity of sliver objects, with each call carrying a small rspec document pertaining to the sliver object being operated on.
 == A Footnote on ORCA ==
 …
 We traveled this line of reasoning some time ago in developing the ORCA system.  And yet ORCA has nothing that we call "slivers".
 ORCA AM calls operate on objects called resource leases.  Leases are time-bounded contracts for one or more units of typed virtual resources.  The units in a lease must have the same type and parameters (e.g., sizes).  These units are the closest analogue to slivers, so let us call them slivers.  The canonical example of a lease is something like "get me 20 large virtual machines for an hour".   (But that is just an example.)
+ORCA AM calls operate on objects called resource leases.  Leases are time-bounded contracts for one or more units of typed virtual resources.  The units in a lease must have the same type and parameters (e.g., sizes).  These units are the closest analogue to slivers, so let us call them slivers.  The canonical example of a resource lease request is something like "get me 20 large virtual machines for an hour".   (But that is just an example.)
 Leases have states and state machine transitions that are independent of the resource type.  (E.g., initializing, active, closing, closed.)  The resource-specific code (setup, teardown) is implemented in pluggable back-end handler scripts that interact with some underlying virtual sliver service, e.g., a cloud middleware system or a network provisioning system.  An aggregate may have many such handlers for different sliver types: an ORCA aggregate is not limited to one type of virtual resource.
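The separation between the type-independent state machine and the pluggable handler might look roughly like the sketch below. It is not ORCA's API: the class names, method names, and the `RecordingHandler` stand-in are invented for illustration, only the four example states named above are modeled, and the lease term is left out for brevity.

```python
from enum import Enum


class LeaseState(Enum):
    INITIALIZING = "initializing"
    ACTIVE = "active"
    CLOSING = "closing"
    CLOSED = "closed"


class Lease:
    """State machine that knows nothing about the resource type it manages."""

    def __init__(self, resource_type, units, handler):
        self.resource_type = resource_type
        self.units = units
        self.handler = handler  # resource-specific setup/teardown plug-in
        self.state = LeaseState.INITIALIZING

    def _expect(self, state):
        if self.state is not state:
            raise RuntimeError(
                f"lease is {self.state.value}, expected {state.value}")

    def activate(self):
        self._expect(LeaseState.INITIALIZING)
        self.handler.setup(self)
        self.state = LeaseState.ACTIVE

    def close(self):
        self._expect(LeaseState.ACTIVE)
        self.state = LeaseState.CLOSING
        self.handler.teardown(self)
        self.state = LeaseState.CLOSED


class RecordingHandler:
    """Hypothetical handler; a real one would call cloud or network middleware."""

    def __init__(self):
        self.calls = []

    def setup(self, lease):
        self.calls.append(("setup", lease.resource_type, lease.units))

    def teardown(self, lease):
        self.calls.append(("teardown", lease.resource_type, lease.units))


handler = RecordingHandler()
lease = Lease("vm.large", 20, handler)
lease.activate()
lease.close()
print(lease.state, handler.calls)
```

Supporting a new sliver type in this arrangement means writing another handler; the `Lease` class itself does not change.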