| 96 | |
| 97 | === Return Struct === |
| 98 | |
| 99 | {{{code}}}, {{{value}}}, and {{{output}}} together provide the standard return from all AM API methods. |
| 100 | |
| 101 | `code`:: |
| 102 | A struct indicating the success or failure of this call at the Aggregate Manager. It consists of 1 required field and 2 optional fields. |
| 103 | {{{ |
| 104 | struct code = { |
| 105 | int geni_code; |
| 106 | [optional: string am_type;] |
| 107 | [optional: int am_code;] |
| 108 | } |
| 109 | }}} |
| 110 | |
| 111 | `value`:: |
| 112 | Method-specific. Required on success. Optional on error. |
| 113 | |
| 114 | `output`:: |
| 115 | On failure or error, this is required. Optional on success. This is an XML-RPC string with a human readable message explaining the result. Specifically, this might include an error string, a stacktrace, or other useful messages to help the experimenter resolve or report the failure or error. It is not defined on success, though aggregates are free to use it. |
| 116 | |
| 117 | Implementations can add additional members to the return struct as desired. The prefix {{{geni_}}} is reserved for members that are part of this API specification. Implementations should choose an appropriate prefix to avoid conflicts. Aggregates should [#DocumentingAggregateAdditions document any additional return values]. |
| 118 | |
| 119 | |
| 120 | Aggregates shall return consistent values for {{{geni_code}}} as described here. Aggregates wishing to be more specific may use the {{{am_type}}} and {{{am_code}}} values. |
| 121 | |
| 122 | Success is always indicated using a {{{geni_code}}} value of {{{0}}}. |
| 123 | |
| 124 | On one of the error or failure cases listed in the table below, aggregates shall return the indicated error code. |
| 125 | |
| 126 | |
| 127 | ==== Elements in {{{code}}} ==== |
| 128 | `geni_code`:: |
| 129 | An integer supplying the GENI standard return code indicating the success or failure of this call. Error codes are standardized and defined [attachment:geni-error-codes.xml in the attached XML document]. Codes may be negative. A success return is defined as {{{geni_code}}} of {{{0}}}. |
| 130 | |
| 131 | `am_type`:: |
| 132 | Optional. A (case insensitive) string indicating the type of Aggregate Manager running locally. When an aggregate wants to return an aggregate specific return code in the {{{am_code}}} field, they supply an {{{am_type}}} to qualify the kind of aggregate specific return code they are supplying. This is the namespace of the aggregate specific return code. This field is optional: aggregates are not required to supply an aggregate specific return code, and clients need not look at it. This code further qualifies the kind of error or success that the aggregate is returning, as primarily defined by the value of {{{geni_code}}}. Standard values for {{{am_type}}} are defined [attachment:geni-am-types.xml in the attached XML document]. |
| 133 | |
| 134 | `am_code`:: |
| 135 | An integer supplying the more specific return code, relative to the aggregate type specified in {{{am_type}}}. This integer may be negative. Aggregates should document these codes publicly. This API does not specify how or where that documentation should be provided. |
| 136 | |
| 137 | Aggregates are encouraged to use {{{code}}} values and {{{output}}} messages that help experimenters and tools distinguish between bad input, other experimenter error, temporary server errors, or server bugs. |
| 138 | |
| 139 | GENI standard error codes are documented in the [attachment:geni-error-codes.xml attached XML document], and listed below. |
| 140 | |
| 141 | || 0 || SUCCESS || "Success" || |
| 142 | || 1 || BADARGS || "Bad Arguments: malformed arguments" || |
| 143 | || 2 || ERROR || "Error (other)" || |
| 144 | || 3 || FORBIDDEN || "Operation Forbidden: eg supplied credentials do not provide sufficient privileges (on given slice)" || |
| 145 | || 4 || BADVERSION || "Bad Version (eg of RSpec)" || |
| 146 | || 5 || SERVERERROR || "Server Error" || |
| 147 | || 6 || TOOBIG || "Too Big (eg request RSpec)" || |
| 148 | || 7 || REFUSED || "Operation Refused" || |
| 149 | || 8 || TIMEDOUT || "Operation Timed Out" || |
| 150 | || 9 || DBERROR || "Database Error" || |
| 151 | || 10 || RPCERROR || "RPC Error" || |
| 152 | || 11 || UNAVAILABLE || "Unavailable (eg server in lockdown)" || |
| 153 | || 12 || SEARCHFAILED || "Search Failed (eg for slice)" || |
| 154 | || 13 || UNSUPPORTED || "Operation Unsupported" || |
| 155 | || 14 || BUSY || "Busy (resource, slice); try again later" || |
| 156 | || 15 || EXPIRED || "Expired (eg slice)" || |
| 157 | || 16 || INPROGRESS || "In Progress" || |
| 158 | || 17 || ALREADYEXISTS || "Already Exists (eg the slice)" || |
| 159 | |
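As an illustration only, the return struct and the standard codes listed above could be assembled on the server side roughly as follows. The names {{{GeniCode}}} and {{{make_return}}} are hypothetical and not part of this API. The checks reflect one reading of the rules above: {{{value}}} is required on success, {{{output}}} is required on error, and an {{{am_code}}} is qualified by an {{{am_type}}}.

```python
from enum import IntEnum
from typing import Any, Dict, Optional


class GeniCode(IntEnum):
    """GENI standard return codes, copied from the table above."""
    SUCCESS = 0
    BADARGS = 1
    ERROR = 2
    FORBIDDEN = 3
    BADVERSION = 4
    SERVERERROR = 5
    TOOBIG = 6
    REFUSED = 7
    TIMEDOUT = 8
    DBERROR = 9
    RPCERROR = 10
    UNAVAILABLE = 11
    SEARCHFAILED = 12
    UNSUPPORTED = 13
    BUSY = 14
    EXPIRED = 15
    INPROGRESS = 16
    ALREADYEXISTS = 17


def make_return(geni_code: int,
                value: Any = None,
                output: Optional[str] = None,
                am_type: Optional[str] = None,
                am_code: Optional[int] = None) -> Dict[str, Any]:
    """Build the code/value/output struct (hypothetical helper, not part of the API)."""
    code: Dict[str, Any] = {"geni_code": int(geni_code)}
    if am_code is not None:
        # Assumption: an am_code is only meaningful with an am_type namespace.
        if am_type is None:
            raise ValueError("am_code requires am_type")
        code["am_code"] = int(am_code)
    if am_type is not None:
        code["am_type"] = am_type

    if int(geni_code) == GeniCode.SUCCESS:
        if value is None:
            raise ValueError("value is required on success")
    elif not output:
        raise ValueError("output is required on failure or error")

    # Optional members are omitted rather than sent as None, since plain
    # XML-RPC has no null type.
    result: Dict[str, Any] = {"code": code}
    if value is not None:
        result["value"] = value
    if output is not None:
        result["output"] = output
    return result
```

For example, {{{make_return(GeniCode.BUSY, output="slice is busy; try again later")}}} yields a struct whose {{{code}}} member carries {{{geni_code}}} 14 and no {{{value}}}.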
| 160 | Aggregates are similarly encouraged to use the {{{value}}} entry to give experimenters hints on how to fix a bad request when an error or failure occurs. For example, a !RenewSliver call that failed because the requested expiration is further in the future than the aggregate allows might return, in the {{{value}}} field, a date string that would be allowed. Similarly, a failed !CreateSliver call might return a modified request RSpec in the {{{value}}} field. |
| 161 | |
| 162 | Note that a malformed XML-RPC request should still raise an XML-RPC Fault, and other Faults dictated by the XML-RPC specification should still be raised. Aggregates should avoid raising an error (XML-RPC Fault) for application layer errors or any other cases where the XML-RPC specification does not require a Fault, but rather should attempt to return this struct, providing any error messages and stack traces in the {{{output}}} field or other additional fields. Certain XML-RPC errors may be returned using Faults or otherwise by the XML-RPC layer, or may more properly be returned using this struct in the application layer. In such cases, servers should use error codes with negative values. Selected such errors are listed below: |
| 163 | |
| 164 | || -32001 || SERVERBUSY || "Server is (temporarily) too busy; try again later" || |
| 165 | |
| 166 | Note also that servers may respond with other HTTP error codes, and clients must be prepared to deal with those situations. Specifically, a server that is busy might return HTTP code 503, or just refuse the connection. |
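A client therefore has to cope with three kinds of outcome: the return struct, an XML-RPC Fault, and a transport-level failure. The following is a minimal sketch of that handling using Python's standard {{{xmlrpc.client}}} module. The helper names {{{call_am}}} and {{{should_retry}}} are hypothetical, and treating {{{BUSY}}} and {{{SERVERBUSY}}} as retryable is an interpretation of their "try again later" descriptions, not a rule of this API.

```python
import xmlrpc.client

BUSY = 14            # GENI standard code: "Busy (resource, slice); try again later"
SERVERBUSY = -32001  # "Server is (temporarily) too busy; try again later"


def call_am(method, *args):
    """Invoke a callable XML-RPC method; return (geni_code, value, output).

    geni_code is None when no return struct was received at all, that is,
    on an XML-RPC Fault, an HTTP-level error, or a refused connection.
    """
    try:
        result = method(*args)
    except xmlrpc.client.Fault as fault:
        return None, None, "XML-RPC Fault %s: %s" % (fault.faultCode, fault.faultString)
    except xmlrpc.client.ProtocolError as err:
        # Raised for non-200 HTTP responses, for example a 503 from a busy server.
        return None, None, "HTTP error %s: %s" % (err.errcode, err.errmsg)
    except OSError as err:
        # Covers a refused connection and similar socket-level failures.
        return None, None, "connection failed: %s" % err

    geni_code = result["code"]["geni_code"]
    # value is optional on error and output is optional on success.
    return geni_code, result.get("value"), result.get("output", "")


def should_retry(geni_code):
    """Assumption: only the two 'try again later' codes are worth retrying."""
    return geni_code in (BUSY, SERVERBUSY)
```

With a real server, {{{method}}} would be a bound method of an {{{xmlrpc.client.ServerProxy}}}; any callable that returns the struct works for testing.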
| 167 | |
| 168 | ----- |
| 169 | === Documenting Aggregate Additions === |
| 170 | |
| 171 | Aggregates are free to add additional return values or input {{{options}}} to support aggregate or resource specific functionality, or to innovate within the bounds of the AM API. Aggregates are encouraged to document any new return values or {{{options}}} arguments they add, to bootstrap coordination with clients and to provide documentation for human experimenters. One way to provide partial documentation is to implement [http://xmlrpc-c.sourceforge.net/introspection.html XML-RPC introspection]. Through the use of method help, aggregates can provide human readable text describing return values. Alternatively or additionally, aggregates may document return values as part of their return from !GetVersion. This API does not specify the format for advertising those extra return values in !GetVersion. |
| 172 | |
| 173 | ----- |
| 174 | === Supporting Multiple API Versions === |
| 175 | |
| 176 | Aggregates are free to support multiple versions of the AM API. They do so by providing different URLs for each version of the API that they support. Aggregates should have a 'default' URL (the one typically advertised). That URL runs whichever version of the API the server chooses (it could be the latest, or it could be something else). |
| 177 | |
| 178 | When aggregates start supporting a new version of the API, they should keep running the old version of the API for a suitable transition period. |
| 179 | |
| 180 | Aggregates running multiple versions of the API must advertise the URLs and versions of the API supported using the new !GetVersion return as part of the {{{value}}} entry: |
| 181 | {{{ |
| 182 | geni_api_versions: an XML-RPC struct containing 1+ entries of: |
| 183 | Name: Integer - supported GENI AM API version |
| 184 | Value: String - URL to the XML-RPC server implementing that version of the GENI AM API |
| 185 | }}} |
| 186 | |
| 187 | For example: |
| 188 | {{{ |
| 189 | geni_api_versions: { |
| 190 | 1: <URL>, |
| 191 | 2: <Local URL, as this is API version 2>, |
| 192 | ... |
| 193 | } |
| 194 | }}} |
| 195 | |
| 196 | The entries indicate versions of the API that are supported, and URLs are absolute URLs where that version of the API is supported. |
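A client might use {{{geni_api_versions}}} to pick the newest version that both sides support. The sketch below is one possible approach, not something this API prescribes. The function name {{{choose_api_url}}} and the example URLs are hypothetical. Version keys are converted with {{{int()}}} because XML-RPC struct member names arrive as strings.

```python
from typing import Iterable, Mapping, Optional, Tuple


def choose_api_url(geni_api_versions: Mapping[object, str],
                   client_supported: Iterable[int]) -> Optional[Tuple[int, str]]:
    """Return (version, url) for the highest version both sides support.

    Returns None when the aggregate and the client share no version.
    """
    # XML-RPC struct names are strings, so normalise keys such as "2" to 2.
    offered = {int(version): url for version, url in geni_api_versions.items()}
    common = set(offered) & set(client_supported)
    if not common:
        return None
    best = max(common)
    return best, offered[best]


# Hypothetical GetVersion fragment; the URLs are placeholders.
advertised = {
    "1": "https://am.example.net/xmlrpc/am/1.0",
    "2": "https://am.example.net/xmlrpc/am/2.0",
}
selection = choose_api_url(advertised, client_supported=[2, 3])
```

Here {{{selection}}} holds version 2 and its URL, since version 3 is not advertised by the aggregate.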
| 197 | |
| 198 | === Sliver Allocation States === |
| 199 | Many operations in this API create slivers or change the allocation status of slivers, and often return the current allocation status of each sliver. |
| 200 | Valid sliver allocation states are: |
| 201 | 1. `geni_unallocated` (alternatively called 'null'). The sliver does not exist. This is the small black circle in typical state diagrams. |
| 202 | 2. `geni_allocated` (alternatively called 'offered' or 'promised'). The sliver exists, defines particular resources, and is in a slice. The aggregate has not (if possible) done any time consuming or expensive work to instantiate the resources, provision them, or make it difficult to revert the slice to the state prior to allocating this sliver. This state is what the aggregate is offering the experimenter. |
| 203 | 3. `geni_provisioned`. The aggregate has started instantiating resources, and otherwise making changes to resources and the slice to make the resources available to the experimenter. At this point, operational states are valid to specify further when the resources are available for experimenter use. |
| 204 | |
| 205 | `geni_allocated` represents resources that have been allocated to a slice without provisioning the resources. This represents a cheap and un-doable resource allocation. When a sliver is created and moved into state 2 (`geni_allocated`), the aggregate produces a manifest RSpec identifying which resources are included in the sliver. These resources are exclusively available to the containing sliver, but are not ready for use. In particular, allocating a sliver should be a cheap and quick operation, which the aggregate can readily un-do without impacting the state of slivers which are fully provisioned. For some aggregates, transitioning to this state may be a no-op. |
| 206 | |
| 207 | States 2 and 3 (`geni_allocated` and `geni_provisioned`) have aggregate and possibly resource specific timeouts. By convention the `geni_allocated` state timeout is typically short. The `geni_provisioned` state timeout is the sliver expiration. If the client does not transition the sliver from `geni_allocated` to `geni_provisioned` before the end of the `geni_allocated` state timeout, the sliver reverts to `geni_unallocated`. If the experimenter needs more time, the experimenter should be allowed to request a renewal of either timeout. Note that typically the sliver expiration time (timeout for state 3, `geni_provisioned`) will be notably longer than the timeout for state 2, `geni_allocated`. |
| 208 | |
| 209 | State 3, `geni_provisioned`, is the state of the sliver allocation after the aggregate begins to instantiate the sliver. Note that fully provisioning a sliver may take noticeable time. This state also includes a timeout - the sliver expiration time (which is not necessarily related to the time it takes to provision a resource). Renew extends this timeout. For some aggregates and resource types, moving to this state from state 2 (`geni_allocated`) may be a no-op. |
| 210 | |
| 211 | If the transition from one state to another fails, the sliver shall remain in its original state. |
| 212 | |
| 213 | 1. Allocate moves 1+ slivers from `geni_unallocated` (state 1) to `geni_allocated` (state 2). This method can be described as creating an instance of the state machine for each sliver. If the aggregate cannot fully satisfy the request, the whole request fails. This is a change from the version 2 !CreateSliver, which also provisioned the resources, and 'started' them. That is, Allocate does 1 of the 3 things that !CreateSliver did previously. |
| 214 | 2. Delete moves 1+ slivers from either state 2 or 3 (`geni_allocated` or `geni_provisioned`), back to state 1 (`geni_unallocated`). This is similar to the AM API version 2 !DeleteSliver. |
| 215 | 3. Renew, when given allocated slivers, requests an extended timeout for slivers in state 2 (`geni_allocated`). |
| 216 | 4. Renew can also be used to request an extended timeout for slivers in state 3 - the `geni_provisioned` state. That is, this method's semantics can be the same as !RenewSliver from AM API v2. |
| 217 | 5. Provision moves 1+ slivers from state 2 (`geni_allocated`) to state 3 (`geni_provisioned`). This is some of what version 2 !CreateSliver did. Note however that this does not 'start' the resources, or otherwise change their operational state. This method only fully instantiates the resources in the slice. This may be a no-op for some aggregates or resources. |
| 218 | |
| 219 | These states apply to each sliver individually. Logically, the state transition methods then take a single sliver URN. For convenience, these methods accept a list of sliver URNs, or a slice URN as a simple alias for all slivers in this slice at this aggregate. |
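The allocation states and transitions described above can be summarized as a small per-sliver state machine. The following is an illustrative sketch only. The names {{{TRANSITIONS}}}, {{{apply_method}}}, and {{{on_allocation_timeout}}} are hypothetical, and the sketch ignores how an aggregate actually performs the work behind each method.

```python
UNALLOCATED = "geni_unallocated"
ALLOCATED = "geni_allocated"
PROVISIONED = "geni_provisioned"

# (method, current state) -> state after a successful call.
TRANSITIONS = {
    ("Allocate", UNALLOCATED): ALLOCATED,
    ("Provision", ALLOCATED): PROVISIONED,
    ("Delete", ALLOCATED): UNALLOCATED,
    ("Delete", PROVISIONED): UNALLOCATED,
    # Renew extends a timeout and leaves the allocation state unchanged.
    ("Renew", ALLOCATED): ALLOCATED,
    ("Renew", PROVISIONED): PROVISIONED,
}


def apply_method(method, state, succeeded=True):
    """Return the allocation state of one sliver after calling `method`.

    Raises ValueError if the method is not valid in the current state.
    A failed transition leaves the sliver in its original state.
    """
    try:
        target = TRANSITIONS[(method, state)]
    except KeyError:
        raise ValueError("%s is not valid in state %s" % (method, state))
    return target if succeeded else state


def on_allocation_timeout(state):
    """A sliver still in geni_allocated when that timeout ends reverts to unallocated."""
    return UNALLOCATED if state == ALLOCATED else state
```

Since the methods accept a list of sliver URNs or a slice URN, a caller would apply such a function to each sliver in turn.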
| 220 | |
| 221 | '''FIXME''': Add picture |
| 222 | |
| 223 | === Sliver Operational States === |
| 224 | The AM API defines a few operational states with particular semantics. AMs are not required to support them for a given set of resources, but if they use them, they must follow the given semantics. AMs are however STRONGLY encouraged to support them, to provide maximum utility. |
| 225 | |
| 226 | AMs may have their own operational states/state-machine internally. AMs are required to advertise such states and actions that experimenters may see or use, by using Ad RSpec extensions. Operational states which the experimenter never sees, need not be advertised. Operational states and actions are generally by resource type. The standard RSpec extension attaches such definitions to the `sliver_type` element of RSpecs. |
| 227 | |
| 228 | States should be defined in terms of (a) whether the resource is accessible to the experimenter on the data or control planes, (b) whether an experimenter action is required to change from this state, |
| 229 | and if so, (c) what action or actions are useful. If the resource will change states without explicit experimenter action, the definition should also give the expected next state on success. |
| 230 | |
| 231 | Note that states represent the AM's view of the operational condition of the resource. This state represents what the AM has done or learned about the resource, but experimenter actions may cause failures that the AM does not know about. |
| 232 | |
| 233 | There is no `busy` state. Instead, AMs are encouraged to define a separate transition state for each transition path, allowing experimenters to distinguish the start and end states of each transition. |
| 234 | |
| 235 | `shutdown` is not an operational state for a sliver. The Shutdown() API method applies to an entire slice. |
| 236 | |
| 237 | GENI defined operational states: |
| 238 | - `geni_notready`: A final state. The resource is not usable / accessible by the experimenter, and requires explicit experimenter action before it is usable/accessible by the experimenter. For some resources, `geni_start` will move the resource out of this state and towards `geni_ready`. |
| 239 | - `geni_configuring`: A wait state. The resource is in process of changing to `geni_ready`, and on success will do so without additional experimenter action. For example, the resource may be powering on. |
| 240 | - `geni_stopping`: A wait state. The resource is in process of changing to `geni_notready`, and on success will do so without additional experimenter action. For example, the resource may be powering off. |
| 241 | - `geni_ready`: A final state. The resource is usable/accessible by the experimenter, and ready for slice operations. |
| 242 | - `geni_ready_busy`: A wait state. The resource is performing some operational action, but remains accessible/usable by the experimenter. Upon completion of the action, the resource will return to `geni_ready`. |
| 243 | - `geni_failed`: A final state. Some operational action failed, rendering the resource unusable. An administrator action, undefined by this API, may be required to return the resource to another operational state. |
| 244 | |
| 245 | === Sliver Operational Actions === |
| 246 | The API defines a few operational actions: these need not be supported. AMs are encouraged to support these if possible, but only if they can be supported following the defined semantics. |
| 247 | |
| 248 | AMs may have their own operational states/state-machine internally. AMs are required to advertise such states and actions that experimenters may see or use, by using Ad RSpec extensions. Operational states which the experimenter never sees, need not be advertised. Operational states and actions are generally by resource type. The standard RSpec extension attaches such definitions to the `sliver_type` element of RSpecs. |
| 249 | |
| 250 | Tools must use the operational states and actions advertisement to determine what operational actions to offer to experimenters, and what actions to perform for the experimenter. Tools may choose to offer actions which the tool does not understand, relying on the experimenter to understand the meaning of the new action. |
| 251 | |
| 252 | Any operational action may fail. When this happens, the API method should return an error code. The sliver may remain in the original state. In some cases, the sliver may transition to the `geni_failed` state. |
| 253 | |
| 254 | GENI defined operational actions: |
| 255 | - `geni_start`: This action results in the sliver becoming `geni_ready` eventually. The operation may fail (move to `geni_failed`), or move through some number of transition states. For example, booting a VM. |
| 256 | - `geni_restart`: This action results in the sliver becoming `geni_ready` eventually. The operation may fail (move to `geni_failed`), or move through some number of transition states. During this operation, the resource may or may not remain accessible. Dynamic state associated with this resource may be lost by performing this operation. For example, re-booting a VM. |
| 257 | - `geni_stop`: This action results in the sliver becoming `geni_notready` eventually. The operation may fail (move to `geni_failed`), or move through some number of transition states. For example, powering down a VM. |
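A tool could encode the GENI defined operational states and actions as simple lookup tables, as in the sketch below. This covers only the states and actions defined in this document. The table and function names are hypothetical, and AM-specific states advertised through Ad RSpec extensions would have to be added from the advertisement. Failure outcomes are deliberately not modeled, since a failed action may leave the sliver in its original state or move it to `geni_failed`.

```python
# Wait states and the state each one reaches on success,
# without further experimenter action.
WAIT_STATES = {
    "geni_configuring": "geni_ready",
    "geni_stopping": "geni_notready",
    "geni_ready_busy": "geni_ready",
}

# Final states: the resource stays here until some action is taken.
FINAL_STATES = {"geni_notready", "geni_ready", "geni_failed"}

# The state each GENI defined action eventually produces on success.
ACTION_TARGET = {
    "geni_start": "geni_ready",
    "geni_restart": "geni_ready",
    "geni_stop": "geni_notready",
}


def expected_next_state(state):
    """Return the state expected on success, or the same state if it is final."""
    if state in FINAL_STATES:
        return state
    if state in WAIT_STATES:
        return WAIT_STATES[state]
    raise ValueError("unknown or AM-specific operational state: %s" % state)


def is_usable(state):
    """True for the states described above as usable/accessible by the experimenter."""
    return state in ("geni_ready", "geni_ready_busy")
```

A polling tool might, for instance, keep checking sliver status while the reported state is a key of {{{WAIT_STATES}}}.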
| 258 | |
| 259 | ---- |