414 | | {{{geni_status}}} legal values (case insensitive): |
415 | | - {{{uninitialized}}}: This is the state before any AM-local operation for this slice. |
416 | | - {{{ticketed}}}: The resources are reserved for the slice, but not currently provisioned for the slice. Slivers are {{{ticketed}}} after !GetTicket, !UpdateTicket, or after !UpdateSlivers. Note in particular that a slice may have some resources that are {{{ready}}} and others which are {{{ticketed}}} after an !UpdateSlivers call: we call the whole slice {{{ticketed}}} in this case. |
417 | | - {{{allocated}}}: The sliver(s) are currently provisioned for the slice, but not necessarily fully ready for experimental use (e.g. not booted). This is the state after !RedeemTicket, or after !CreateSlivers with the {{{geni_donotstart}}} option. |
418 | | - {{{ready}}}: The resources are ready for experimental use, as in after !CreateSlivers completes any booting or starting. Similarly after !ActOnSlivers with the {{{start}}} command. Note that each of those methods starts a process that may take significant time to complete. During that time the sliver will not yet be {{{ready}}}. |
419 | | - {{{closed}}}: When the slice was previously provisioned resources, which have now expired or been de-allocated with !DeleteSlivers, we call the sliver {{{closed}}}. Note that this state is rarely seen in practice - aggregates do not respond in this API to queries about slices that do not currently have outstanding allocations or tickets. |
420 | | - {{{changing}}}: This is the state of a sliver in transition. For example, while a machine is booting (changing from {{{allocated}}} to {{{ready}}}). This state used to be known as {{{configuring}}}. |
421 | | - {{{shutdown}}}: This is the state of a sliver after the Shutdown operation - the sliver is still allocated to the slice, possibly still booted and configured for the slice, but is not available for experimental use. An administrator must intervene to recover or delete the slivers. |
422 | | - {{{failed}}}: When an operation fails leaving the sliver unusable and requiring administrative intervention, it will be marked {{{failed}}}. |
423 | | - {{{unknown}}}: If the aggregate does not know the state of a sliver, it will be marked {{{unknown}}}. This state may be transient, or may require an admin to recover. |
424 | | |
425 | | As in previous versions of this API, the state of the full set of slivers in a slice at an aggregate is a roll-up of the states of each sliver. For each of {{{ticketed}}}, {{{allocated}}}, and {{{ready}}}, the set of slivers is only in that state if all individual slivers are in that state. If any sliver is {{{shutdown}}} or {{{failed}}} or {{{changing}}} (in order of decreasing precedence), then the set of slivers is in that state. If all slivers are {{{unknown}}} or {{{closed}}}, then the slice at this aggregate is {{{unknown}}} or {{{closed}}}. |
426 | | - If not all resources in the sliver/slice can be moved to the desired next state, then the call fails. |
427 | | - When moving from state 1 to 2, the slice is in state 1 until all slivers are in state 2 (e.g. moving from {{{ticketed}}}->{{{allocated}}}). |
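
The roll-up rule above can be sketched in Python as follows. This is only an illustration, not part of the API: the name `rollup_status` is hypothetical, and the results for an empty sliver set and for mixtures the text does not cover (for example {{{allocated}}} together with {{{ready}}}) are assumptions.

```python
# Sketch of the slice-level roll-up of per-sliver geni_status values.
# The function name and the two fallback cases are assumptions for illustration.

# Any one sliver in these states decides the slice state, highest precedence first.
OVERRIDING = ("shutdown", "failed", "changing")
# A ticketed sliver mixed with provisioned slivers makes the whole slice ticketed.
TICKET_MIX = {"ticketed", "allocated", "ready"}


def rollup_status(sliver_states):
    """Return the slice-level status for an iterable of sliver statuses."""
    states = {s.lower() for s in sliver_states}  # values are case insensitive
    if not states:
        return "uninitialized"  # assumption: no slivers means nothing has happened yet
    for state in OVERRIDING:
        if state in states:
            return state
    if len(states) == 1:
        # All slivers agree (ticketed, allocated, ready, unknown, closed, ...).
        return next(iter(states))
    if "ticketed" in states and states <= TICKET_MIX:
        return "ticketed"
    return "unknown"  # assumption: mixtures the text does not define
```

For instance, a slice with one {{{ready}}} and one {{{ticketed}}} sliver rolls up to {{{ticketed}}}, while any {{{shutdown}}} sliver makes the whole slice {{{shutdown}}}.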
428 | | |
429 | | Aggregates are free to ''also'' return an aggregate specific status - either in an AM-specifically-named entry, or in {{{am_specific_status}}}. Such values should be thought of as sub-states within the GENI state. For example, where the GENI state might be {{{changing}}}, the AM specific state might also be {{{imaging}}} or {{{booting}}}. Methods which accept a state (!ActOnSlivers) may accept either one of the {{{geni_status}}} values, or an aggregate specific value. Aggregates must document the meaning and use of aggregate specific status values. |
430 | | |
431 | | State changes by method: |
432 | | - !GetTicket: From {{{uninitialized}}} to {{{ticketed}}} |
433 | | - !UpdateTicket: From {{{ticketed}}} to {{{ticketed}}} |
434 | | - !ReleaseTicket: From {{{ticketed}}} to {{{uninitialized}}} (or {{{allocated}}} if this was an update) |
435 | | - !RedeemTicket: From {{{ticketed}}} to {{{allocated}}} |
436 | | - !CreateSlivers: From {{{uninitialized}}} to {{{allocated}}} |
437 | | - And then to {{{ready}}} via {{{changing}}} if the {{{geni_donotstart}}} option is not supplied |
438 | | - !UpdateSlivers: From {{{allocated}}} or {{{ready}}} to {{{ticketed}}} |
439 | | - !DeleteSlivers: From {{{ready}}} or {{{allocated}}} (or {{{changing}}}, etc) to {{{closed}}} (not {{{ticketed}}}) |
440 | | - Shutdown: From {{{allocated}}} or {{{ready}}} to {{{shutdown}}} |
441 | | Note that {{{changing}}} or {{{unknown}}} may be a source state for any of these methods. Operations may fail, leaving a sliver {{{failed}}}, and operations may take time, leaving a sliver {{{changing}}} until they complete. |
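
The per-method transitions listed above can be written down as a lookup table. The following Python sketch is illustrative only: `TRANSITIONS` and `next_state` are hypothetical names, the table records only the nominal target state (it ignores failures, the intermediate {{{changing}}} state, and the !ReleaseTicket-after-update case that returns to {{{allocated}}}), and the "etc" in the !DeleteSlivers entry is covered only by the {{{changing}}}/{{{unknown}}} rule.

```python
# Sketch of the "state changes by method" list as a lookup table.
# All names here are hypothetical; only the nominal target state is modelled.

# The text says these may be a source state for any method.
ANY_METHOD_SOURCES = {"changing", "unknown"}

# method name -> (allowed source states, nominal target state)
TRANSITIONS = {
    "GetTicket": ({"uninitialized"}, "ticketed"),
    "UpdateTicket": ({"ticketed"}, "ticketed"),
    "ReleaseTicket": ({"ticketed"}, "uninitialized"),
    "RedeemTicket": ({"ticketed"}, "allocated"),
    "CreateSlivers": ({"uninitialized"}, "allocated"),
    "UpdateSlivers": ({"allocated", "ready"}, "ticketed"),
    "DeleteSlivers": ({"allocated", "ready"}, "closed"),
    "Shutdown": ({"allocated", "ready"}, "shutdown"),
}


def next_state(method, current):
    """Return the nominal state after `method`, or raise if it is not defined."""
    sources, target = TRANSITIONS[method]
    current = current.lower()  # geni_status values are case insensitive
    if current not in sources and current not in ANY_METHOD_SOURCES:
        raise ValueError(f"{method} is not defined from state {current!r}")
    return target
```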
442 | | |
443 | | Methods and state transitions as a picture: |
444 | | |
445 | | [[Image(sliver-states.jpg)]] |
446 | | |
447 | | {{{ |
448 | | #!comment |
449 | | - {{{uninitialized}}} -> (!GetTicket) -> {{{ticketed}}} (you have a ticket) |
450 | | - and back via !ReleaseTicket |
451 | | - {{{ticketed}}} -> (!UpdateTicket) -> {{{ticketed}}}
452 | | - {{{ticketed}}} -> (!RedeemTicket) -> {{{allocated}}} (you have slivers)
453 | | - {{{uninitialized}}} -> (!CreateSlivers) -> {{{allocated}}} and then via {{{changing}}} to {{{ready}}} |
454 | | - {{{allocated}}} (or {{{ticketed}}} when you also have {{{allocated}}} slivers)->(!DeleteSlivers) -> {{{closed}}} |
455 | | - {{{allocated}}} (or some {{{allocated}}} and some {{{ticketed}}}) -> (Shutdown) -> {{{shutdown}}}
456 | | - {{{shutdown}}} -> [some operator action] -> {{{closed}}} or {{{allocated}}} |
457 | | - {{{allocated}}} -> (!UpdateSlivers) -> whole is called {{{ticketed}}}, some slivers are {{{allocated}}} and some {{{ticketed}}} |
458 | | - Some slivers {{{allocated}}} and some {{{ticketed}}} -> (!UpdateTicket) -> {{{allocated}}}+{{{ticketed}}}
459 | | - {{{allocated}}}+{{{ticketed}}} -> (!ReleaseTicket) -> {{{allocated}}} |
460 | | - {{{allocated}}}+{{{ticketed}}} -> (!RedeemTicket) -> {{{allocated}}} |
461 | | }}} |
462 | | |
463 | | Note: some resources may not require an explicit 'start' operation. In this case !CreateSlivers may leave some slivers {{{ready}}}, skipping right past {{{allocated}}}. |
464 | | |
465 | | Summary of changes: |
466 | | - {{{configuring}}} becomes {{{changing}}}, which can be used in many other cases, in returns from !SliversStatus |
467 | | - New states {{{uninitialized}}}, {{{ticketed}}}, {{{allocated}}}, {{{closed}}}, and {{{shutdown}}} are added |
468 | | - State transitions for each method are defined |
469 | | - {{{am_specific_status}}} optional return defined |
470 | | - {{{geni_status}}} is returned by !SliversStatus, !RedeemTicket, !UpdateSlivers, !CreateSlivers, and !ActOnSlivers (if all relevant change sets are adopted). |
471 | | |
475 | | !ActOnSlivers takes a {{{command}}}, {{{urn}}}, {{{state}}}, and {{{options}}}. The method return is a struct that includes the {{{urn}}}, {{{geni_status}}} of the sliver(s), and any other AM and operation specific options. The URN may be a slice URN, meaning all slivers in that slice at this AM are affected, or it may be a particular sliver URN. The {{{state}}} argument is one of the {{{geni_status}}} values, or an AM-specific value. The {{{state}}} meaning depends on the {{{command}}}, but typically indicates the desired or resulting new state of the sliver(s). If the AM wishes to return an aggregate specific sliver status, it should still return a valid {{{geni_status}}}, and use an additional entry to also return the aggregate specific state. The {{{command}}} argument is aggregate defined. This API does not specify how aggregates advertise valid commands. |
476 | | |
477 | | Three particular commands are specified, however: {{{start}}}, {{{stop}}}, and {{{restart}}} (case insensitive). If an aggregate provides resources which require an explicit action to make {{{allocated}}} resources {{{ready}}} for experimenter use (booting, applying a configuration change) then the aggregate must make that operation available using these commands. These commands are used, for example, after !RedeemTicket or when the {{{geni_donotstart}}} option is supplied to !CreateSlivers. |
478 | | |
479 | | For example, to start allocated resources: |
480 | | {{{ |
481 | | Arguments: |
482 | | command = start |
483 | | urn = <slice or sliver urn> |
484 | | state = ready |
485 | | options = <none required> |
486 | | Result: |
487 | | urn = <same as input> |
488 | | geni_status = changing or ready on success |
489 | | }}} |
490 | | |
491 | | FIXME: After !UpdateSlivers, does {{{start}}} on the slice start only new stuff? How do changes to existing resources take effect? Does {{{restart}}} on the slice restart everything or only changed things? Must the experimenter selectively {{{restart}}} changed things and use {{{start}}} to start new things? |
492 | | |
493 | | Method signature: |
494 | | {{{ |
495 | | struct ActOnSlivers(string command, string credentials[], string urn, string state, struct options) |
496 | | }}} |
497 | | |
498 | | Return struct: |
499 | | {{{ |
500 | | { |
501 | | string urn=<urn of sliver or slice>, |
502 | | string geni_status=<new state of the slivers>, |
503 | | <other entries specific to the AM or resources - specifically am_specific_status> |
504 | | } |
505 | | }}} |
| 425 | == Change Set F3: Sliver Allocation States and methods == |
| 426 | '''This change was discussed and adopted at the GEC13 Coding Sprint.''' |
| 427 | For meeting minutes, see: [wiki:GEC13Agenda/CodingSprint the GEC13 Coding Sprint agenda page]. |
| 428 | |
| 429 | - We agreed to use two kinds of states: allocation states, and operational states. We put off discussion of operational states (i.e. whether the node is booted), noting however that this is critical.
| 430 | - We debated whether the API should specify a limited number of states, or allow for aggregate or resource specific states. We agreed that for allocation states, the API should define a limited set of states, while operational states might be more permissive. |
| 431 | - We discussed the pros and cons of including a single all-in-one method to change allocation states, or a single method per desired transition. Rob Ricci noted at least 1 case where there are 2 paths between the same 2 allocation states with very different meaning. As a result, we agreed to use a separate method per allocation state change. |
| 432 | |
| 433 | The key result of the discussion was agreement on 3 allocation states for slivers and enumeration of methods for transitioning between those states. We did not select names for the states or the methods. Here is a diagram with placeholder labels for methods and states, which illustrates the decisions described below. |
| 434 | |
| 435 | [[Image(sliver-alloc-states.jpg)]] |
| 436 | |
| 437 | We spent a long time discussing allocation states of slivers. For reference, we looked at [https://groups.geni.net/geni/attachment/wiki/GEC13Agenda/AMAPIRevisions/JDuerig-AMAPI-TransactionsAndUpdate.pdf Jon Duerig's slides] from the AM API session (unpresented). We finally agreed there are 2 or 3 or 4 allocation states for slivers, depending on how you count. |
| 438 | 1. Start (alternatively called 'null' or 'unallocated'). The sliver does not exist. This is the small black circle in typical state diagrams. |
| 439 | 2. Allocated (alternatively called 'offered' or 'promised'). The sliver exists, defines particular resources, and is in a slice. The aggregate has not (if possible) done any time consuming or expensive work to instantiate the resources, provision them, or make it difficult to revert the slice to the state prior to allocating this sliver. This state is what the aggregate is offering the experimenter.
| 440 | X. ~~Accepted.~~ We chose NOT to include this intermediary state, occasionally called 'accepted', where the experimenter has accepted the aggregate's offer of resources, but the resources have still not been provisioned. |
| 441 | 3. Provisioned. The aggregate has started instantiating resources, and otherwise making changes to resources and the slice to make the resources available to the experimenter. At this point, operational states are valid to specify further when the resources are available for experimenter use. |
| 442 | |
| 443 | Having ruled out the 'accepted' state as unnecessary, we were left with 3 states, the first being the 'null' state. We spent a long time clarifying the semantics of each state, but could not quite agree on names for these states. We took to referring to the states by number, leaving the honor of naming the states to the API documenter. |
| 444 | |
| 445 | The key change is the addition of state 2, representing resources that have been allocated to a slice without provisioning the resources. This represents a cheap and un-doable resource allocation, such as we previously discussed in the context of tickets. This compares reasonably well to the 'transaction' proposal written up by Gary Wong (http://www.protogeni.net/trac/protogeni/wiki/AM_API_proposals). When a sliver is created and moved into state 2, the aggregate produces a manifest RSpec identifying which resources are included in the sliver. This is something like the current !CreateSlivers, except that it does not provision nor start the resources. These resources are exclusively available to the containing sliver, but are not ready for use. In particular, allocating a sliver should be a cheap and quick operation, which the aggregate can readily un-do without impacting the state of slivers which are fully provisioned. For some aggregates, transitioning to this state may be a no-op. |
| 446 | |
| 447 | States 2 and 3 have aggregate and possibly resource specific timeouts. By convention the state 2 timeout is typically short, like the {{{redeem_before}}} in ProtoGENI tickets, or the {{{commit_by}}} in Gary's transactions proposal. The state 3 timeout is the existing sliver expiration. If the client does not transition the sliver from state 2 to 3 before the end of the state 2 timeout, the sliver reverts to unallocated. If the experimenter needs more time, the experimenter should be allowed to request a renewal of either timeout. Note that typically the sliver expiration time (timeout for state 3, provisioned) will be notably longer than the timeout for state 2, allocated. |
| 448 | |
| 449 | The AM API does not yet have a method for moving from state 2 to state 3. State 3 is the state of the sliver allocation after the aggregate begins to instantiate the sliver. Note that fully provisioning a sliver may take noticeable time. This state also includes a timeout - the sliver expiration time (which is not necessarily related to the time it takes to provision a resource). !RenewSlivers extends this timeout. For some aggregates and resource types, moving to this state from state 2 (allocated) may be a no-op.
| 450 | |
| 451 | If the transition from one state to another fails, the sliver shall remain in its original state. |
| 452 | |
| 453 | These are the only allocation states supported by this API. Since the state transitions are finite, but include potentially multiple transitions between the same 2 states, this API uses separate methods to perform each state transition, rather than a single method for requesting a new state for the sliver. We did not agree on method names for these transitions (we agreed to leave it as an exercise for the API documenter). Logically however these methods are something like:
| 454 | 1. !CreateSlivers moves 1+ slivers from unallocated (state 1) to allocated (state 2). This method can be described as creating an instance of the state machine for each sliver. If the aggregate cannot fully satisfy the request, the whole request fails. This is a change from the version 2 !CreateSliver, which also provisioned the resources, and 'started' them. That is !CreateSlivers does 1 of the 3 things that it did previously. |
| 455 | 2. !DeleteSlivers moves 1+ slivers from either state 2 or 3, back to state 1. This is similar to the AM API version 2 !DeleteSliver. |
| 456 | 3. !RenewSomething (name TBD) requests an extended timeout for slivers in state 2 - the allocated but not provisioned state. |
| 457 | 4. !RenewSlivers requests an extended timeout for slivers in state 3 - the provisioned state, as before. |
| 458 | 5. !SomethingSlivers (name TBD) moves 1+ slivers from state 2 (allocated) to state 3 (provisioned). This is some of what version 2 !CreateSliver did. Note however that this does not 'start' the resources, or otherwise change their operational state. This method only fully instantiates the resources in the slice. This may be a no-op for some aggregates or resources. |
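
The three allocation states and the state-changing methods above form a small state machine, sketched here in Python. Since neither the states nor the methods were named, everything in the sketch is a placeholder: `AllocState`, its member names, `ALLOC_TRANSITIONS`, `transition`, and the `succeeded` flag are all hypothetical. The two renew methods only extend timeouts and are not modelled.

```python
# Sketch of the allocation state machine. Every name is a placeholder,
# since the discussion did not settle on state or method names.
from enum import Enum


class AllocState(Enum):
    UNALLOCATED = 1   # state 1: the sliver does not exist
    ALLOCATED = 2     # state 2: resources held, nothing expensive done yet
    PROVISIONED = 3   # state 3: the aggregate has begun instantiating resources


# method -> (allowed source states, target state)
ALLOC_TRANSITIONS = {
    "CreateSlivers": ({AllocState.UNALLOCATED}, AllocState.ALLOCATED),
    "DeleteSlivers": ({AllocState.ALLOCATED, AllocState.PROVISIONED},
                      AllocState.UNALLOCATED),
    "SomethingSlivers": ({AllocState.ALLOCATED}, AllocState.PROVISIONED),
}


def transition(state, method, succeeded=True):
    """Apply `method` to one sliver; a failed transition leaves it unchanged."""
    sources, target = ALLOC_TRANSITIONS[method]
    if state not in sources:
        raise ValueError(f"{method} is not valid from {state.name}")
    return target if succeeded else state
```

Keeping one table entry per method, rather than one "set state" call, mirrors the decision to use a separate method per allocation state change.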
| 459 | |
| 460 | These states apply to each sliver individually. Logically, the state transition methods then take a single sliver URN. For convenience, we agreed to allow a list of sliver URNs, or a slice URN as a simple alias for all slivers in this slice at this aggregate. |
| 461 | |
| 462 | Since each method may operate on multiple slivers, each of these methods returns a list of structs as the value: |
| 463 | {{{ |
| 464 | value = [ |
| 465 | { |
| 466 | geni_sliver_urn, |
| 467 | geni_allocation_status, |
| 468 | geni_expires <time when the sliver expires from its current state>, |
| 469 | <others AM or method specific> |
| 470 | <Method 5 SomethingSlivers returns geni_operational_status> |
| 471 | }, |
| 472 | ... |
| 473 | ] |
| 474 | }}} |
| 475 | |
| 476 | !CreateSlivers returns a single manifest RSpec, plus the above list of structs. |
| 477 | |
| 478 | We spent a while discussing what it means for an aggregate to operate on multiple slivers at once. Can the aggregate partially succeed? What will experimenters want? We agreed that aggregates must be consistent across all these methods whether they are all or nothing, or support partial success. |
| 479 | |
| 480 | These methods all take a new option (aggregates must support it, clients do not need to supply it): |
| 481 | {{{ |
| 482 | geni_atomic = True/False, default True |
| 483 | }}} |
| 484 | If true, the client is requesting that the aggregate either fully satisfy the request, moving all listed slivers to the desired state, or fully fail the request, leaving all slivers in their original state. |
| 485 | If the aggregate cannot guarantee all or nothing success or failure given the included slivers and resource types, the aggregate shall fail the request, returning an appropriate error code. If this option is false, then some slivers may transition to the new state, and some not. Clients must examine the return closely to know the state of their slivers.
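
One way to read the {{{geni_atomic}}} semantics is sketched below in Python. It is an illustration under assumptions, not an agreed design: `apply_to_slivers`, `AtomicityError`, and the `can_move` callback are hypothetical names, and the exception stands in for whatever error code an aggregate would actually return.

```python
# Sketch of all-or-nothing versus partial success for a multi-sliver call.
# All names are hypothetical; the exception stands in for an error return.


class AtomicityError(Exception):
    """Raised when geni_atomic is true and not every sliver can move."""


def apply_to_slivers(states, target, can_move, geni_atomic=True):
    """Return a new {urn: state} dict after attempting to move slivers to `target`.

    states   -- dict mapping sliver URN to its current allocation state
    can_move -- callable(urn) -> bool, True if that sliver can make the transition
    """
    movable = {urn for urn in states if can_move(urn)}
    if geni_atomic and len(movable) != len(states):
        # All or nothing: fail the request and leave every sliver untouched.
        raise AtomicityError("not all slivers can move to " + target)
    # Non-atomic (or fully successful) case: movable slivers change, others stay.
    return {urn: (target if urn in movable else state)
            for urn, state in states.items()}
```

With {{{geni_atomic}}} false, the caller has to inspect each entry of the result, since some slivers may still be in their original state.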
| 486 | |
| 487 | [FIXME: Is this what we agreed to? Or should the default be False? Should there be a !GetVersion option for advertising the AM default?] |
| 488 | |
| 489 | '''Note''': !CreateSlivers remains all or nothing (either the aggregate can allocate all desired resources as requested, or the call fails). |
| 490 | |
| 491 | '''Note''': These calls are synchronous - when they return, the slivers shall be in their final state. In particular, the transition from state 2 to 3 (allocated to provisioned) should be quick. The resource that is now in the 'provisioned' state may take a long time to actually be ready for operational use (e.g. imaging and booting the node) -- this remains true as in version 2 after !CreateSliver. |
| 492 | |
| 493 | !SliverStatus, where it currently includes {{{geni_status}}}, shall now return {{{geni_allocation_status}}} with one of the above defined values, and {{{geni_operational_status}}}. The values of {{{geni_operational_status}}} are still under discussion. |
| 494 | |
| 495 | Currently, !SliverStatus returns a single {{{geni_status}}} for the entire slice at this aggregate. With this change, the top-level allocation status for the slice is not defined, and that field is not required. |
| 496 | |
| 497 | |
| 498 | == Change Set F4: Sliver Operational States and methods == |
| 499 | This proposal was discussed on the geni-dev mailing list: http://lists.geni.net/pipermail/dev/2012-March/000743.html |
| 500 | |
| 501 | The canonical source for documentation on this proposal is here: https://openflow.stanford.edu/display/FOAM/GENI+-+PerformOperationalAction |