Changes between Version 3 and Version 4 of AaronHelsinger/GAPI_AM_API_DRAFT


Timestamp:
03/22/12 14:05:45 (12 years ago)
Author:
Aaron Helsinger
Comment:

--

  • AaronHelsinger/GAPI_AM_API_DRAFT

    v3 v4  
    425425== Change Set F3: Sliver Allocation States and methods ==
    426426'''This change was discussed and adopted at the GEC13 Coding Sprint.'''
     427
    427428For meeting minutes, see: [wiki:GEC13Agenda/CodingSprint the GEC13 Coding Sprint agenda page].
    428429
    429  - We agreed to use two kinds of states: allocation states, and operational states. We put off discussion of operational states (ie is the node booted), noting however that this is critical.
     430 - We agreed to use two kinds of states: allocation states and operational states. We put off discussion of operational states (i.e., whether the node is booted), while noting that this topic is critical. See Change Set F4.
    430431 - We debated whether the API should specify a limited number of states, or allow for aggregate or resource specific states. We agreed that for allocation states, the API should define a limited set of states, while operational states might be more permissive.
    431  - We discussed the pros and cons of including a single all-in-one method to change allocation states, or a single method per desired transition. Rob Ricci noted at least 1 case where there are 2 paths between the same 2 allocation states with very different meaning. As a result, we agreed to use a separate method per allocation state change.
    432 
    433 The key result of the discussion was agreement on 3 allocation states for slivers and enumeration of methods for transitioning between those states. We did not select names for the states or the methods. Here is a diagram with placeholder labels for methods and states, which illustrates the decisions described below.
     432 - We discussed the pros and cons of a single all-in-one method for changing allocation states versus one method per desired transition. There is at least one case in which two paths between the same two allocation states have very different meanings. As a result, we agreed to use a separate method per allocation state change.
     433
     434We agreed on 3 allocation states for slivers and an enumeration of methods for transitioning between those states.
    434435
    435436[[Image(sliver-alloc-states.jpg)]]
    436437
    437 We spent a long time discussing allocation states of slivers. For reference, we looked at [https://groups.geni.net/geni/attachment/wiki/GEC13Agenda/AMAPIRevisions/JDuerig-AMAPI-TransactionsAndUpdate.pdf Jon Duerig's slides] from the AM API session (unpresented). We finally agreed there are 2 or 3 or 4 allocation states for slivers, depending on how you count.
    438  1. Start (alternatively called 'null' or 'unallocated'). The sliver does not exist. This is the small black circle in typical state diagrams.
    439  2. Allocated (alternatively called 'offered' or 'promised'). The sliver exists, defines particular resources, and is in a sliver. The aggregate has not (if possible) done any time consuming or expensive work to instantiate the resources, provision them, or make it difficult to revert the slice to the state prior to allocating this sliver. This state is what the aggregate is offering the experimenter.
    440  X. ~~Accepted.~~ We chose NOT to include this intermediary state, occasionally called 'accepted', where the experimenter has accepted the aggregate's offer of resources, but the resources have still not been provisioned.
    441  3. Provisioned. The aggregate has started instantiating resources, and otherwise making changes to resources and the slice to make the resources available to the experimenter. At this point, operational states are valid to specify further when the resources are available for experimenter use.
    442 
    443 Having ruled out the 'accepted' state as unnecessary, we were left with 3 states, the first being the 'null' state. We spent a long time clarifying the semantics of each state, but could not quite agree on names for these states. We took to referring to the states by number, leaving the honor of naming the states to the API documenter.
    444 
    445 The key change is the addition of state 2, representing resources that have been allocated to a slice without provisioning the resources. This represents a cheap and un-doable resource allocation, such as we previously discussed in the context of tickets. This compares reasonably well to the 'transaction' proposal written up by Gary Wong (http://www.protogeni.net/trac/protogeni/wiki/AM_API_proposals). When a sliver is created and moved into state 2, the aggregate produces a manifest RSpec identifying which resources are included in the sliver. This is something like the current !CreateSlivers, except that it does not provision nor start the resources. These resources are exclusively available to the containing sliver, but are not ready for use. In particular, allocating a sliver should be a cheap and quick operation, which the aggregate can readily un-do without impacting the state of slivers which are fully provisioned. For some aggregates, transitioning to this state may be a no-op.
    446 
    447 States 2 and 3 have aggregate and possibly resource specific timeouts. By convention the state 2 timeout is typically short, like the {{{redeem_before}}} in ProtoGENI tickets, or the {{{commit_by}}} in Gary's transactions proposal. The state 3 timeout is the existing sliver expiration. If the client does not transition the sliver from state 2 to 3 before the end of the state 2 timeout, the sliver reverts to unallocated. If the experimenter needs more time, the experimenter should be allowed to request a renewal of either timeout.  Note that typically the sliver expiration time (timeout for state 3, provisioned) will be notably longer than the timeout for state 2, allocated.
    448 
    449 The AM API does not yet have a method for moving from state 2, to state 3. State 3 is the state of the sliver allocation after the aggregate begins to instantiate the sliver. Note that fully provisioning a sliver may take noticeable time. This state also includes a timeout - the sliver expiration time (which is not necessarily related to the time it takes to provision a resource). !RenewSlivers extends this timeout. For some aggregates and resource types, moving to this state from state 2 (allocated) may be a no-op.
     438Allocation states:
     439 1. `geni_unallocated` (alternatively called 'null'). The sliver does not exist. This is the small black circle in typical state diagrams.
     440 2. `geni_allocated` (alternatively called 'offered' or 'promised'). The sliver exists, defines particular resources, and is in a slice. The aggregate has not (if possible) done any time-consuming or expensive work to instantiate the resources, provision them, or make it difficult to revert the slice to the state prior to allocating this sliver. This state is what the aggregate is offering the experimenter.
     441 3. `geni_provisioned`. The aggregate has started instantiating resources, and otherwise making changes to resources and the slice to make the resources available to the experimenter. At this point, operational states are valid to specify further when the resources are available for experimenter use.
     442
     443The key change is the addition of state 2, representing resources that have been allocated to a slice without provisioning the resources. This represents a cheap resource allocation that can be undone, such as we previously discussed in the context of tickets. This compares reasonably well to the 'transaction' proposal written up by Gary Wong (http://www.protogeni.net/trac/protogeni/wiki/AM_API_proposals). When a sliver is created and moved into state 2 (`geni_allocated`), the aggregate produces a manifest RSpec identifying which resources are included in the sliver. This is something like the current !CreateSlivers, except that it neither provisions nor starts the resources. These resources are exclusively available to the containing sliver, but are not ready for use. In particular, allocating a sliver should be a cheap and quick operation, which the aggregate can readily undo without impacting the state of slivers that are fully provisioned. For some aggregates, transitioning to this state may be a no-op.
     444
     445States 2 and 3 (`geni_allocated` and `geni_provisioned`) have aggregate-specific and possibly resource-specific timeouts. By convention the `geni_allocated` state timeout is typically short, like the {{{redeem_before}}} in ProtoGENI tickets, or the {{{commit_by}}} in Gary's transactions proposal. The `geni_provisioned` state timeout is the existing sliver expiration. If the client does not transition the sliver from `geni_allocated` to `geni_provisioned` before the end of the `geni_allocated` state timeout, the sliver reverts to `geni_unallocated`. If the experimenter needs more time, the experimenter should be allowed to request a renewal of either timeout. Note that the sliver expiration time (the timeout for state 3, `geni_provisioned`) will typically be notably longer than the timeout for state 2, `geni_allocated`.
     446
     447State 3, `geni_provisioned`, is the state of the sliver allocation after the aggregate begins to instantiate the sliver. Note that fully provisioning a sliver may take noticeable time. This state also includes a timeout - the sliver expiration time (which is not necessarily related to the time it takes to provision a resource). !RenewSlivers extends this timeout. For some aggregates and resource types, moving to this state from state 2 (`geni_allocated`) may be a no-op.
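
The timeout rule described above can be sketched as a small helper. This is only an illustration, not part of the API: the function name `effective_state` and its arguments are hypothetical, and treating an expired `geni_provisioned` sliver as `geni_unallocated` is an assumption drawn from the definition of `geni_unallocated` ("the sliver does not exist").

```python
from datetime import datetime, timedelta

GENI_UNALLOCATED = "geni_unallocated"
GENI_ALLOCATED = "geni_allocated"
GENI_PROVISIONED = "geni_provisioned"


def effective_state(state, expires, now):
    """Return the allocation state of a sliver once its timeout is applied.

    `expires` is the timeout of the sliver's current state: the short
    allocation timeout for geni_allocated, or the sliver expiration for
    geni_provisioned. A sliver whose timeout has passed no longer exists.
    """
    if state in (GENI_ALLOCATED, GENI_PROVISIONED) and now >= expires:
        return GENI_UNALLOCATED
    return state


# Example: an allocation held for 10 minutes (an arbitrary, hypothetical value).
allocated_at = datetime(2012, 3, 22, 14, 0, 0)
allocation_expires = allocated_at + timedelta(minutes=10)
state_after_5_min = effective_state(
    GENI_ALLOCATED, allocation_expires, allocated_at + timedelta(minutes=5))
state_after_15_min = effective_state(
    GENI_ALLOCATED, allocation_expires, allocated_at + timedelta(minutes=15))
```

In this sketch a renewal simply means supplying a later `expires` value; how an aggregate bounds or rejects renewal requests is not specified here.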
    450448
    451449If the transition from one state to another fails, the sliver shall remain in its original state.
    452450
    453 These are the only operational states supported by this API. Since the state transitions are finite, but include potentially multiple transitions between the same 2 states, this API uses separate methods to perform each state transition, rather than a single method for requesting a new state for the sliver. We did not agree on method names for these transitions (we agreed to leave it as an exercise for the API documenter). Logically however these methods are something like:
    454  1. !CreateSlivers moves 1+ slivers from unallocated (state 1)  to allocated (state 2). This method can be described as creating an instance of the state machine for each sliver. If the aggregate cannot fully satisfy the request, the whole request fails. This is a change from the version 2 !CreateSliver, which also provisioned the resources, and 'started' them. That is !CreateSlivers does 1 of the 3 things that it did previously.
    455  2. !DeleteSlivers moves 1+ slivers from either state 2 or 3, back to state 1. This is similar to the AM API version 2 !DeleteSliver.
    456  3. !RenewSomething (name TBD) requests an extended timeout for slivers in state 2 - the allocated but not provisioned state.
    457  4. !RenewSlivers requests an extended timeout for slivers in state 3 - the provisioned state, as before.
    458  5. !SomethingSlivers (name TBD) moves 1+ slivers from state 2 (allocated) to state 3 (provisioned). This is some of what version 2 !CreateSliver did. Note however that this does not 'start' the resources, or otherwise change their operational state. This method only fully instantiates the resources in the slice. This may be a no-op for some aggregates or resources.
    459 
    460 These states apply to each sliver individually. Logically, the state transition methods then take a single sliver URN. For convenience, we agreed to allow a list of sliver URNs, or a slice URN as a simple alias for all slivers in this slice at this aggregate.
     451These are the only allocation states supported by this API. Since the state transitions are finite, but include potentially multiple transitions between the same two states, this API uses separate methods to perform each state transition, rather than a single method for requesting a new state for the sliver. The methods are:
     452 1. !CreateSlivers moves 1+ slivers from `geni_unallocated` (state 1)  to `geni_allocated` (state 2). This method can be described as creating an instance of the state machine for each sliver. If the aggregate cannot fully satisfy the request, the whole request fails. This is a change from the version 2 !CreateSliver, which also provisioned the resources, and 'started' them. That is !CreateSlivers does 1 of the 3 things that it did previously. Note the method name change, consistent with change set D.
     453 2. !DeleteSlivers moves 1+ slivers from either state 2 or 3 (`geni_allocated` or `geni_provisioned`), back to state 1 (`geni_unallocated`). This is similar to the AM API version 2 !DeleteSliver. Note the method name change, consistent with change set D.
     454 3. !RenewAllocations requests an extended timeout for slivers in state 2 (`geni_allocated`).
     455 4. !RenewSlivers requests an extended timeout for slivers in state 3 - the `geni_provisioned` state. That is, this method's semantics do not change. Note the method name change, consistent with change set D.
     456 5. !ProvisionSlivers moves 1+ slivers from state 2 (`geni_allocated`) to state 3 (`geni_provisioned`). This is some of what version 2 !CreateSliver did. Note however that this does not 'start' the resources, or otherwise change their operational state. This method only fully instantiates the resources in the slice. This may be a no-op for some aggregates or resources.
     457
     458These states apply to each sliver individually. Logically, the state transition methods then take a single sliver URN. For convenience, these methods accept a list of sliver URNs, or a slice URN as a simple alias for all slivers in this slice at this aggregate.
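
The five methods and three states listed above amount to a small transition table. The following is a minimal sketch of that table, not an aggregate implementation: the names `TRANSITIONS` and `next_state` are hypothetical, and raising `ValueError` for an undefined transition is an assumption (the text only requires that a failed transition leave the sliver in its original state).

```python
GENI_UNALLOCATED = "geni_unallocated"
GENI_ALLOCATED = "geni_allocated"
GENI_PROVISIONED = "geni_provisioned"

# (method, current state) -> resulting state, as enumerated in the list above.
# The two renew methods only extend a timeout, so the state is unchanged.
TRANSITIONS = {
    ("CreateSlivers", GENI_UNALLOCATED): GENI_ALLOCATED,
    ("DeleteSlivers", GENI_ALLOCATED): GENI_UNALLOCATED,
    ("DeleteSlivers", GENI_PROVISIONED): GENI_UNALLOCATED,
    ("RenewAllocations", GENI_ALLOCATED): GENI_ALLOCATED,
    ("RenewSlivers", GENI_PROVISIONED): GENI_PROVISIONED,
    ("ProvisionSlivers", GENI_ALLOCATED): GENI_PROVISIONED,
}


def next_state(method, current):
    """Return the allocation state that `method` moves a sliver into.

    Raises ValueError when the method is not defined for the current state;
    the caller keeps the sliver in `current` in that case.
    """
    try:
        return TRANSITIONS[(method, current)]
    except KeyError:
        raise ValueError(
            "%s is not valid for a sliver in state %s" % (method, current))


# Walk one sliver through its full lifecycle.
state = GENI_UNALLOCATED
for method in ("CreateSlivers", "RenewAllocations",
               "ProvisionSlivers", "RenewSlivers", "DeleteSlivers"):
    state = next_state(method, state)
```

Keying the table on the method as well as the current state mirrors the decision to use one method per transition rather than a single "set state" call.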
    461459
    462460Since each method may operate on multiple slivers, each of these methods returns a list of structs as the value:
     
    468466   geni_expires <time when the sliver expires from its current state>,
    469467   <others AM or method specific>
    470    <Method 5 SomethingSlivers returns geni_operational_status>
     468   <ProvisionSlivers returns geni_operational_status>
    471469  },
    472470  ...
     
    476474!CreateSlivers returns a single manifest RSpec, plus the above list of structs.
    477475
    478 We spent a while discussing what it means for an aggregate to operate on multiple slivers at once. Can the aggregate partially succeed? What will experimenters want? We agreed that aggregates must be consistent across all these methods whether they are all or nothing, or support partial success.
     476Aggregates must be consistent across all these methods whether they are all or nothing, or support partial success.
    479477
    480478These methods all take a new option (aggregates must support it, clients do not need to supply it):
     
    489487'''Note''': !CreateSlivers remains all or nothing (either the aggregate can allocate all desired resources as requested, or the call fails).
    490488
    491 '''Note''': These calls are synchronous - when they return, the slivers shall be in their final state. In particular, the transition from state 2 to 3 (allocated to provisioned) should be quick. The resource that is now in the 'provisioned' state may take a long time to actually be ready for operational use (e.g. imaging and booting the node) -- this remains true as in version 2 after !CreateSliver.
     489'''Note''': These calls are synchronous - when they return, the slivers shall be in their final state. In particular, the transition from state 2 to 3 (`geni_allocated` to `geni_provisioned`) should be quick. The resource that is now in the 'provisioned' state may take a long time to actually be ready for operational use (e.g. imaging and booting the node) -- this remains true as in version 2 after !CreateSliver.
    492490
    493491!SliverStatus, where it currently includes {{{geni_status}}}, shall now return {{{geni_allocation_status}}} with one of the above defined values, and {{{geni_operational_status}}}. The values of {{{geni_operational_status}}} are still under discussion.
    494492
    495493Currently, !SliverStatus returns a single {{{geni_status}}} for the entire slice at this aggregate. With this change, the top-level allocation status for the slice is not defined, and that field is not required.
    496 
    497494
    498495== Change Set F4: Sliver Operational States and methods ==
     
    500497
    501498The canonical source for documentation on this proposal is here: https://openflow.stanford.edu/display/FOAM/GENI+-+PerformOperationalAction
     499
     500Open questions include:
     501 - Do we need a state discovery mechanism? A method? An RSpec ad extension?
     502 - Should the method default to all or nothing? Or to partial success?
     503 - What are the defined operational states, and GENI standard actions for transitions?
     504 - Does the API specify how operational state of slivers rolls up to the state of the slice at the aggregate?
     505 - Make option names and returns consistent between this change set and change set F3
     506
     507{{{
     508PerformOperationalAction (string urn[], string credentials[], string action, struct options={})
     509}}}
     510
     511{{{
     512{
     513  struct code = {
     514       int geni_code;
     515       [optional: string am_type;]
     516       [optional: int am_code;]
     517         }
     518  struct value = [ {
     519        'sliver_urn' : string,
     520        'geni_operational_status' : string,
     521        [optional: 'geni_resource_status' : string]
     522        }, ... ];
     523  string output;
     524}
     525}}}
     526
     527Performs the given action on the given sliver_urn(s) (or slice_urn as a proxy for "all slivers").  Actions are constrained to the set of default GENI actions, as well as resource-specific actions which reasonably perform operational resource tasks as defined by the aggregate manager for the given resource type.  This method is not intended to allow for reconfiguration of options found in the request RSpec.  Aggregate Managers SHOULD return an error code of `13` (`UNSUPPORTED`) if they do not support a given action for a given resource. Actions are performed on all slivers, or none - if an action cannot be performed on any one of the given slivers, the entire operation MUST fail. Passing the option `geni_best_effort` allows for partial success.
     528
     529An AM SHOULD constrain actions based on the current operational state of the resource, such that attempting to perform the action `geni_stop` on a resource that is `geni_busy` will fail, but SHOULD also be idempotent for all actions which result in a steady state.
     530
     531`geni_operational_status` MUST be the current operational status of the sliver after this action (as would be returned by !SliverStatus). An optional `geni_resource_status` field MAY be returned for each sliver which contains a resource-specific status that may be more nuanced than the options for `geni_operational_status`.
     532
     533Calling this method with a slice_urn functions as if all of the child sliver_urns had been passed in - specifically, the action is performed on all slivers, and all sliver_urns and their statuses are returned. No status is returned for the slice as a whole.
     534
     535This is a fast synchronous operation, and MAY start long-running sliver transitions whose status can be queried using !SliverStatus.
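
The all-or-nothing default and the `geni_best_effort` option described above can be illustrated with a short sketch. Everything here other than `geni_stop`, `geni_busy`, `geni_best_effort`, the code `13` (`UNSUPPORTED`), and the return field names is an assumption: the status names `example_ready` and `example_stopped`, the `ACTION_TABLE` structure, the placeholder codes `0` and `2`, and reporting a success code on partial success are all hypothetical, since the operational states and error codes for these cases are still open questions.

```python
UNSUPPORTED = 13    # error code named in the text above
SUCCESS = 0         # assumed success code
GENERIC_ERROR = 2   # placeholder; the text does not say which code applies

# Hypothetical table: action -> {current status: resulting status}.
# A status with no entry means the action cannot be performed from it.
ACTION_TABLE = {
    "geni_stop": {
        "example_ready": "example_stopped",
        "example_stopped": "example_stopped",  # idempotent on a steady state
        # no entry for "geni_busy": stopping a busy resource fails
    },
}


def perform_operational_action(slivers, action, options=None):
    """Apply `action` to `slivers`, a dict of sliver URN -> operational status.

    Updates `slivers` in place only for slivers the action is applied to and
    returns a dict shaped like the code/value/output struct shown above.
    """
    best_effort = bool((options or {}).get("geni_best_effort", False))
    table = ACTION_TABLE.get(action)
    if table is None:
        return {"code": {"geni_code": UNSUPPORTED}, "value": [],
                "output": "unsupported action: " + action}

    # Work out every result first so that nothing changes if the call fails.
    planned = {urn: table.get(status) for urn, status in slivers.items()}
    failed = [urn for urn, new_status in planned.items() if new_status is None]
    if failed and not best_effort:
        return {"code": {"geni_code": GENERIC_ERROR}, "value": [],
                "output": "cannot perform %s on: %s" % (action, ", ".join(failed))}

    value = []
    for urn, new_status in planned.items():
        if new_status is not None:
            slivers[urn] = new_status
        value.append({"sliver_urn": urn,
                      "geni_operational_status": slivers[urn]})
    return {"code": {"geni_code": SUCCESS}, "value": value, "output": ""}


slivers = {"urn:example:sliver1": "example_ready",
           "urn:example:sliver2": "geni_busy"}
strict = perform_operational_action(slivers, "geni_stop")
after_strict = dict(slivers)
lenient = perform_operational_action(
    slivers, "geni_stop", {"geni_best_effort": True})
```

Computing the planned results before applying any of them is one simple way to keep the default mode all or nothing; a real aggregate would also need to undo side effects on the underlying resources.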
    502536
    503537= Adopted: Change Set G: Credentials are general authorization tokens. =
     
    735769
    736770= Change Set L: Change SFA credentials' privileges =
    737 '''This proposal has not been fully discussed.'''
     771'''This proposal has not been fully discussed and is likely to be postponed.'''
    738772
    739773Our goal is to simplify and standardize privilege strings used in SFA credentials. Currently there are wildcards, bind, embed, and others. They are confusing. We also want extensibility to use these credentials elsewhere in future.
     
    789823
    790824 - Split !ListResources with {{{slice_urn}}} from !ListResources without. !ListResources with {{{slice_urn}}} we call Resolve(slice_urn).
    791  - Add {{{geni_credential_types}}} to !GetVersion return?
    792825 - Add {{{geni_am_info}}} block to !GetVersion return (name, id, url, location, description, is_proxy, proxy: {(a geni_am_info block)}, proxy_for[] (list of geni_am_info blocks))
    793  - Define a new option {{{geni_am_id}}} for all methods. Proxy AMs may use this ID to look up the URL of the real AM, and pass the call along.
    794   - Instead just encode this am_id in the URL that the experimenter accesses, so that the proxy AM knows where to direct the call.
    795826 - Allow the update methods to take a generic rspec argument, allowing AMs to accept full or diff RSpecs
    796827 - Tickets
     
    806837 - Advertise available VLANs, attempt to honor requested VLANs, reserve VLANs with tickets, and report reserved or instantiated VLANs in manifests
    807838 - Use schema http://hpn.east.isi.edu/rspec/ext/stitch/0.1/stitch-schema.xsd
     839
     840Update stitching schema per changes here: https://geni.maxgigapop.net/twiki/bin/view/GENI/NetworkStitchingGeniApiAndRspec
    808841
    809842----