65 | | ----- |
66 | | |
67 | | = Change Set C: !UpdateSlivers = |
68 | | Add an ability for experimenters to modify their allocated resources at an aggregate without deleting (and possibly losing) existing resource allocations. |
69 | | |
70 | | This change was briefly discussed at GEC13, and is a topic for ongoing discussion. |
71 | | |
72 | | == Motivation == |
73 | | A common complaint among experimenters about the current AM API is that there is no way to add or remove resources from a slice at an aggregate, without completely deleting the slice at that aggregate and recreating it, possibly losing resources to another experimenter and certainly losing state. This proposal aims to address that, by introducing a method to update the slice at an aggregate. |
74 | | |
75 | | The [http://svn.planet-lab.org/attachment/wiki/WikiStart/sfa.pdf SFA] calls for an !UpdateSlice method, "to request that additional resources—as specified in the RSpec—be allocated to the slice". |
76 | | |
77 | | In the !PlanetLab implementation of the SFA, !UpdateSliver is in fact a synonym for !CreateSliver - the server will ensure that your allocated resources match your request RSpec, adding, removing and modifying resources as needed. It immediately allocates and boots nodes to match the request RSpec. |
78 | | |
79 | | The ProtoGENI CMV2 API has [http://www.protogeni.net/trac/protogeni/wiki/ComponentManagerAPIV2#UpdateSliver UpdateSliver], which is described as the way to "Request a change of resources for an existing sliver. The new set of resources that are desired are specified in the rspec." At ProtoGENI as at !PlanetLab, this method takes the full RSpec description of resources the experimenter wants, and the server computes the difference with what the experimenter already has. At ProtoGENI though, this method returns a ticket. The experimenter must then redeem the ticket to actually acquire the resources. |
80 | | |
81 | | This topic was discussed at the [http://groups.geni.net/geni/wiki/GEC12GeniAmAPI GEC12 AM API session] and on the GENI dev mailing list (in [http://lists.geni.net/pipermail/dev/2011-October/000433.html October] and [http://lists.geni.net/pipermail/dev/2011-November/000531.html November]). |
82 | | |
83 | | == !UpdateSlivers == |
84 | | This change would add a new method !UpdateSlivers, which takes a full request RSpec of the desired final state of the slice at this aggregate. This proposal calls for adding this functionality in the context of tickets: the !UpdateSlivers method returns a ticket. See [#ChangeSetE:Tickets Change Set E: Tickets] for details. If tickets are not adopted, consider the alternative proposal below, where !UpdateSlivers immediately allocates and starts the requested resources, as in !CreateSlivers. |
85 | | |
86 | | Some points about this change: |
87 | | - The method takes a full request RSpec - not a diff. |
88 | | - Note that we want the manifest to be readily modifiable to be a request (include component_ids and sliver_ids), but it is not yet. |
89 | | - AMs may, as always, return {{{UNSUPPORTED}}} - EG if they are incapable of determining what changes to apply (computing a diff). |
90 | | - The request is either fully satisfied, or fails (returns an error code). |
91 | | - AMs must document the level of service they provide: will any state be lost on existing resources? |
92 | | - Typically this would be a per node or resource-type specification. |
93 | | - Use the levels of guarantee in {{{geni_state_guarantee}}} below. |
94 | | - Default is to provide no guarantee. |
95 | | - This API does not define where AMs provide detailed documentation, but AMs must return this value for the entire change as part of the return struct. |
96 | | - Experimenters may specify what level of disruption they can tolerate, using the {{{geni_state_guarantee}}} below. AMs are expected to fail a request with a specified service guarantee that they cannot satisfy. Default is to request no guarantee. |
97 | | - Options include {{{geni_end_time}}}, an RFC3339 requested end time for the reservation. See below for details. |
98 | | - If omitted, the AM may reserve the resources for the default sliver duration. |
99 | | - AMs should follow the logic of !RenewSlivers to determine if the requested duration of the sliver is acceptable. |
100 | | - The request should fail (return an error code) if the resources cannot be reserved until the requested time. |
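Because the method takes a full request rather than a diff, the aggregate has to work out what to add, remove, or modify. A minimal sketch of that comparison follows, assuming each resource can be keyed by some stable identifier (for example a `component_id`). The function name `plan_update` and the dictionary representation are illustrative assumptions, not part of this proposal.

```python
def plan_update(current, requested):
    """Compare two {resource_id: description} mappings.

    Both mappings are hypothetical stand-ins for a parsed manifest and a
    parsed request RSpec; the proposal does not define this representation.
    """
    current_ids = set(current)
    requested_ids = set(requested)
    return {
        # Resources in the request that the slice does not hold yet.
        "add": sorted(requested_ids - current_ids),
        # Resources the slice holds that the request no longer lists.
        "remove": sorted(current_ids - requested_ids),
        # Resources present in both whose description changed.
        "modify": sorted(
            rid for rid in current_ids & requested_ids
            if current[rid] != requested[rid]
        ),
    }


current = {"node1": {"sliver_type": "raw-pc"}, "node2": {"sliver_type": "raw-pc"}}
requested = {"node2": {"sliver_type": "emulab-openvz"}, "node3": {"sliver_type": "raw-pc"}}
plan = plan_update(current, requested)
```

An aggregate that cannot produce such a plan for its resource types would return {{{UNSUPPORTED}}}, as noted in the list above.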
101 | | |
102 | | This change adds a new option {{{geni_state_guarantee}}} with these possible values (case insensitive string or integer): |
103 | | - 0=NO_GUARANTEE (default: all state ''may'' be lost) |
104 | | - 1=SAVE_DISK (disk state will be preserved but running processes will be lost) |
105 | | - 2=SAVE_DISK_AND_PROCESSES (both disk state and running processes will be preserved, like migrating a VM) |
106 | | - 3=NO_DISRUPTION (no noticeable service disruption) |
107 | | |
108 | | AMs which cannot meet the implied limit to service disruption should fail the request (return an error code). |
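The guarantee levels are ordered, so an aggregate could check a request by comparing the level it can offer against the level the experimenter asked for. The following is a sketch only: the enum mirrors the four values listed above, while the helper names `parse_guarantee` and `can_satisfy` are assumptions made for illustration.

```python
from enum import IntEnum


class StateGuarantee(IntEnum):
    NO_GUARANTEE = 0
    SAVE_DISK = 1
    SAVE_DISK_AND_PROCESSES = 2
    NO_DISRUPTION = 3


def parse_guarantee(value=None):
    """Accept an integer or a case-insensitive name; default is NO_GUARANTEE."""
    if value is None:
        return StateGuarantee.NO_GUARANTEE
    if isinstance(value, int):
        return StateGuarantee(value)       # ValueError if out of range
    text = str(value).strip()
    if text.isdigit():
        return StateGuarantee(int(text))
    return StateGuarantee[text.upper()]    # KeyError if the name is unknown


def can_satisfy(requested, offered):
    """True if the level the AM offers is at least the level requested."""
    return parse_guarantee(offered) >= parse_guarantee(requested)
```

An aggregate for which `can_satisfy(requested, offered)` is false would fail the request with an error code rather than silently provide a weaker guarantee.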
109 | | |
110 | | ''However'' there is an alternative more widely accepted proposal, to use an RSpec extension allowing this guarantee to be per-resource: |
111 | | EG: |
112 | | {{{ |
113 | | <node ... sliver_id="urn:publicid:IDN+jonlab.tbres.emulab.net+sliver+250"> |
114 | | <sliver_type name="raw-pc"/> |
115 | | ... |
116 | | <preserve:preserve guarantee="persistent-state" /> |
117 | | </node> |
118 | | }}} |
119 | | |
120 | | This uses the RNC schema: |
121 | | {{{ |
122 | | default namespace = "http://www.protogeni.net/resources/rspec/ext/preserve/1" |
123 | | |
124 | | # This is meant to extend a node or link |
125 | | Preserved = element preserve { |
126 | | attribute guarantee { "none" | "persistent-state" | |
127 | | "dynamic-state" | "no-disruption" } |
128 | | } |
129 | | |
130 | | start = Preserved |
131 | | }}} |
132 | | |
133 | | In the above schema, the states represent increasing levels of state preservation guarantee. |
134 | | |
135 | | {{{ |
136 | | struct UpdateSlivers(string slice_urn, string credentials[], string rspec, |
137 | | struct options) |
138 | | }}} |
139 | | |
140 | | Returns a struct: |
141 | | {{{ |
142 | | { |
143 | | string ticket=<ticket> |
144 | | string geni_status=<sliver state - ticketed>, |
145 | | string geni_state_guarantee=<promise from AM of what experimenter state will be lost on trying to 'start' this allocation>, |
146 | | <others that are AM specific> |
147 | | } |
148 | | }}} |
149 | | |
150 | | See [#ChangeSetE:Tickets Change Set E: Tickets] for details on the ticket returned and ticket semantics. |
151 | | |
152 | | == Alternative proposal: !UpdateSlivers with immediate allocation == |
153 | | An alternative proposal would add two versions of !UpdateSlivers. Method 1 would return a ticket as above. Method 2 would immediately allocate the resources, as with !CreateSlivers. |
154 | | |
155 | | This proposed version of !UpdateSlivers is substantially the same as the main proposal above, with a few differences: |
156 | | |
157 | | - Under this alternative proposal, on success the new resources are allocated to the slice. As with !CreateSlivers, by default those new resources are initialized or booted or started, such that they will shortly become available for experimenter use. |
158 | | - As with !CreateSlivers, the AM should start/restart resources immediately, as necessary. |
159 | | - This change introduces a new option {{{geni_donotstart}}}. When supplied and true (boolean: 0 or 1 in XML-RPC), aggregates should allocate the resources but not start them. Experimenters will have to explicitly use !ActOnSlivers to start or restart resources as necessary. Note that there may be no such distinction for some resources. |
160 | | - Removed resources are stopped by the aggregate automatically. |
161 | | - Note that the tickets proposal includes a method that updates the resource reservation without allocating them. |
162 | | - This method moves the overall sliver state to {{{allocated}}}, and then (if the experimenter did not specify {{{geni_donotstart}}}) {{{configuring}}} and then {{{ready}}} if it succeeds. |
163 | | |
164 | | Proposed method signature: |
165 | | {{{ |
166 | | struct UpdateSlivers(string slice_urn, |
167 | | string credentials[], |
168 | | <GENI request RSpec schema compliant XML string> rspec, |
169 | | struct users[], |
170 | | struct options) |
171 | | Return value |
172 | | { |
173 | | string rspec=<manifest>, |
174 | | string geni_start_time=<optional (may be omitted altogether) RFC3339 start time for the allocation: now if not specified>, |
175 | | string geni_expires=<RFC3339 sliver expiration>, |
176 | | string geni_status=<sliver state - allocated or changing or ready>, |
177 | | string geni_state_guarantee=<promise from AM of what experimenter state will be lost on trying to 'start' this allocation>, |
178 | | <others that are AM specific> |
179 | | } |
180 | | }}} |
| 78 | ---- |
219 | | - AMs should return an error message. Clients may often use !UpdateSlivers instead to similar effect. |
220 | | |
221 | | = Change Set E: Tickets = |
222 | | AM APIv3 adds support for negotiated reservations or two-phase commit, by '''add'''ing methods that allow an experimenter to reserve resources for a slice without committing to using them, or forcing the AM to incur the cost of instantiating them. |
223 | | |
224 | | This change is actively under discussion. |
225 | | |
226 | | For an alternative proposal, see: http://www.protogeni.net/trac/protogeni/wiki/AM_API_proposals |
227 | | |
228 | | == Motivation == |
229 | | This possible change was discussed at the [http://groups.geni.net/geni/wiki/GEC12GeniAmAPI GEC12 AM API session]. |
230 | | |
231 | | The [http://svn.planet-lab.org/attachment/wiki/WikiStart/sfa.pdf SFA] defines the concept of a ticket. SFA1.0 section 5.3 says "A component signs an RSpec to produce a ticket, indicating a promise by the component to bind resources to the ticket-holder at some point in time." Tickets are promises to allocate resources. |
232 | | |
233 | | Tickets are used in the [http://www.protogeni.net/trac/protogeni/wiki/ComponentManagerAPIV2 ProtoGENI CMV2 interface], and are discussed on [http://www.protogeni.net/trac/protogeni/wiki/Tickets the PG wiki]. Tickets with a slightly different semantics (and leases) are also used extensively in Orca. For details on the use of leases and tickets in Orca, see [https://geni-orca.renci.org/trac/attachment/wiki/WikiStart/ORCA%20Book.pdf 'the Orca Book']. However, each of these uses of the notion of tickets differs. |
234 | | |
235 | | Tickets would potentially enable a number of useful and possibly critical features: |
236 | | - Coordinated or negotiated reservations: reserving resources from aggregate B only if aggregate A can give you a complementary resource. For example, a VLAN tag. This is related to stitching, both network stitching and the more general form. |
237 | | - Two phase commit reservations (similar to the above). |
238 | | - Scheduled reservations in the future. |
239 | | - Brokers: 3rd parties consolidating, scheduling and allocating resources on behalf of a number of other aggregates |
240 | | - Lending resources to other experimenters |
241 | | - Giving experimenters explicit control over when resources are started, stopped, and restarted (see the discussion on !UpdateSliver). |
242 | | |
243 | | == Tickets semantics == |
244 | | This proposal would add tickets to the existing AM API, allowing experimenters to reserve (hold) resources temporarily and cheaply. Tickets represent a promise to the named slice to allocate the specified resources, if the ticket is 'redeemed' while the ticket is still valid. Tickets describe a complete specification of the resources that will be allocated to the slice at this aggregate, if the ticket is redeemed. |
245 | | |
246 | | Some key properties of the tickets proposed here: |
247 | | - Tickets are IOUs from an AM to a slice (not to an experimenter - no delegation is necessary or possible). |
248 | | - Experimenters do not need to use tickets to reserve resources: existing methods without tickets still work. |
249 | | - A ticket is a promise by the AM to give the specified resources to the slice if an authorized slice member requests them. |
250 | | - The aggregate is saying they will not give away these resources to other slices, but only to this slice. |
251 | | - AMs must document how firm their promises are. See the attribute {{{geni_ticket_service_class}}}. |
252 | | - Some aggregates may only offer soft promises, as in PlanetLab. |
253 | | - Tickets are signed by the AM: non-repudiable. |
254 | | - Tickets are bound to a slice: they contain the slice certificate. |
255 | | - Tickets may be passed from one researcher on a slice to another freely - no explicit delegation is required. |
256 | | - Indeed, any experimenter with appropriate slice credentials can retrieve the ticket from the aggregate. |
257 | | - Tickets may not be delegated to another slice or other entity; these tickets do not support brokers. |
258 | | - Tickets promise a particular set of resources: they include an RSpec. Note that this may be an unbound RSpec. |
259 | | - Note that we do not currently have unbound manifest RSpecs. For now we specify only that this is an RSpec. |
260 | | - Tickets are good for a limited time. |
261 | | - They must be redeemed by a specified time, {{{redeem_before}}}, after which the aggregate is free to assign the resources elsewhere. |
262 | | - Aggregates determine {{{redeem_before}}}, which is some epsilon in the near future. |
263 | | - Aggregates may accept a new option {{{geni_reserve_until}}} which is a request for a particular {{{redeem_before}}}, but are not required to support this (they may ignore the option). |
264 | | - Tickets specify when the resources will be available from ({{{starts}}}, typically essentially now), and when they will be available until (typically now plus the aggregate-local default sliver expiration time). |
265 | | - The resources may be available even longer, but that would require a separate !RenewSlivers call. |
266 | | - Tickets specify the full final state of the slice after applying this ticket. |
267 | | - Tickets are not incremental changes, and are not additive. |
268 | | - The implication is that there may be only 1 ticket outstanding for a slice per aggregate (except for scheduled reservations, see below). |
269 | | - This also implies that these tickets are not suitable for use by brokers. |
270 | | - Aggregates must attempt to honor their promises. As a result, aggregates must remember all outstanding tickets until they are redeemed or expire. |
271 | | - All ticket related timestamps must be in the format of RFC3339 (http://www.ietf.org/rfc/rfc3339.txt) |
272 | | - Full date and time with explicit timezone: offset from UTC or in UTC. |
273 | | - eg: {{{1985-04-12T23:20:50.52Z}}} or {{{1996-12-19T16:39:57-08:00}}} |
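As an illustration of handling these timestamps, the sketch below parses the two example values with the Python standard library and checks whether a ticket is still within its {{{redeem_before}}} deadline. It assumes uppercase `T` and `Z` separators as in the examples above; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_rfc3339(text):
    """Parse a timestamp such as 1985-04-12T23:20:50.52Z into an aware datetime."""
    # Fractional seconds are optional, so pick the matching format string.
    fmt = "%Y-%m-%dT%H:%M:%S.%f%z" if "." in text else "%Y-%m-%dT%H:%M:%S%z"
    return datetime.strptime(text, fmt)


def is_redeemable(redeem_before, now=None):
    """True while the current time is earlier than the redeem_before deadline."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now < parse_rfc3339(redeem_before)


first = parse_rfc3339("1985-04-12T23:20:50.52Z")
second = parse_rfc3339("1996-12-19T16:39:57-08:00")
```

Because both values carry an explicit offset, they can be compared directly even when they were written in different timezones.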
274 | | |
275 | | == Ticket contents == |
276 | | Tickets have an ID, the certificate of the slice to whom the resources are promised, an RSpec representing the promised resources, several timestamps, other attributes, and a signature by the aggregate (including the aggregate's certificate). |
277 | | |
278 | | Tickets are externally represented as signed XML documents following the [http://www.w3.org/TR/xmldsig-core/ XML Digital Signatures specification]. |
279 | | |
280 | | Tickets contain: |
281 | | - {{{owner_gid}}} = the certificate of the experimenter who requested the ticket |
282 | | - {{{target_gid}}} = slice certificate |
283 | | - {{{uuid}}} |
284 | | - Unique ID for the ticket, in the hexadecimal digit string format given in [http://www.ietf.org/rfc/rfc4122.txt RFC 4122] |
285 | | - {{{expires}}} - RFC3339 compliant Date/Time when the resources will no longer be yours per this reservation (eg sliver duration+now) |
286 | | - {{{redeem_before}}}: RFC3339 compliant Date/Time when you must redeem this reservation, or your resources will be returned to the available pool (eg now+epsilon) |
287 | | - {{{starts}}} - RFC3339 compliant Date/Time when the resources will be yours, per this reservation (eg now) |
288 | | - RSpec (not specified as request or manifest) |
289 | | - Attributes (AM/resource-type specific name/value pairs) |
290 | | - Including optionally {{{geni_state_guarantee}}} as defined below, to indicate if existing slivers will be disrupted (default is no guarantee). |
291 | | - Including {{{geni_ticket_service_class}}} as defined below, to indicate the firmness of the promise this ticket represents |
292 | | - signature including issuing AM's certificate |
293 | | |
294 | | More formally: |
295 | | {{{ |
296 | | { |
297 | | owner_gid = <the certificate of the experimenter who requested the ticket>, |
298 | | target_gid = <slice certificate, following GENI AM API certificate specification>, |
299 | | uuid = <RFC 4122 compliant string>, |
300 | | expires = <RFC3339 compliant Date/Time when the resources will no longer be yours per this reservation (eg sliver duration+now)>, |
301 | | redeem_before = <RFC3339 compliant Date/Time when you must redeem this reservation, or your resources will be returned to the available pool (eg now+epsilon)>, |
302 | | starts = <RFC3339 compliant Date/Time when the resources will be yours, per this reservation (eg now)>, |
303 | | rspec = <RSpec (not specified as request or manifest)>, |
304 | | attributes = { |
305 | | geni_state_guarantee = <string>, |
306 | | geni_ticket_service_class = <string>, |
307 | | <others> |
308 | | }, |
309 | | signature |
310 | | } |
311 | | }}} |
312 | | |
313 | | Tickets may include in the {{{attributes}}} element the attribute {{{geni_state_guarantee}}}, indicating whether the AM will preserve the state of any existing resources (case insensitive string or integer). |
314 | | - 0=NO_GUARANTEE (Default: all state ''may'' be lost) |
315 | | - 1=SAVE_DISK (disk state will be preserved but running processes will be lost) |
316 | | - 2=SAVE_DISK_AND_PROCESSES (both disk state and running processes will be preserved, like migrating a VM) |
317 | | - 3=NO_DISRUPTION (no noticeable service disruption) |
318 | | |
319 | | ''However'', see above for a proposed RSpec extension to instead make the state guarantee per resource. |
320 | | |
321 | | Tickets should include the {{{geni_ticket_service_class}}} attribute for advertising the firmness of the promise that a ticket represents (case insensitive string or integer). |
322 | | - FIXME: Provide definitions for these service classes. |
323 | | - 1=WEAK_EFFORT |
324 | | - 2=BEST_EFFORT |
325 | | - 3=ELASTIC_RESERVATION |
326 | | - 4=HARD_RESERVATION |
327 | | |
328 | | Tickets will follow a defined schema, to be published on geni.net. |
329 | | |
330 | | Tickets logically have a URN (not included in the ticket): {{{urn:publicid:IDN+<AM name>+ticket+<uuid>}}} |
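A small sketch of deriving that URN from a ticket's {{{uuid}}} follows. The aggregate name used in the example is made up, and the helper name `ticket_urn` is an assumption; only the URN pattern itself comes from the text above.

```python
import uuid


def ticket_urn(am_name, ticket_uuid):
    """Build urn:publicid:IDN+<AM name>+ticket+<uuid>."""
    # uuid.UUID validates the string and normalizes it to the canonical
    # lowercase, hyphenated RFC 4122 form.
    canonical = uuid.UUID(str(ticket_uuid))
    return "urn:publicid:IDN+{}+ticket+{}".format(am_name, canonical)


urn = ticket_urn("example-am", "6BA7B810-9DAD-11D1-80B4-00C04FD430C8")
```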
331 | | |
332 | | For a similar structure in ProtoGENI, see https://www.protogeni.net/trac/protogeni/attachment/wiki/Authentication/credential.rnc |
333 | | |
334 | | == Methods == |
335 | | 1. !GetTicket |
336 | | {{{ |
337 | | struct GetTicket (string slice_urn, string credentials[], string requestRSpec, |
338 | | struct options) |
339 | | }}} |
340 | | - Get a ticket promising resources requested in the rspec. |
341 | | - If there is already an outstanding ticket for the slice, an error is returned. |
342 | | - Return: ticket |
343 | | - Result State: {{{ticketed}}} |
344 | | - Options may include {{{geni_start_time}}} and {{{geni_end_time}}} (see below) |
345 | | 2. !RedeemTicket |
346 | | {{{ |
347 | | struct RedeemTicket(string slice_urn, string credentials[], string ticket, |
348 | | struct users[] (as in CreateSlivers), struct options) |
349 | | }}} |
350 | | - Return: |
351 | | {{{ |
352 | | { |
353 | | string rspec=<manifest>, |
354 | | string geni_start_time=<optional (may be omitted altogether): now if not specified>, |
355 | | string geni_expires=<RFC3339 sliver expiration>, |
356 | | string geni_status=<sliver state - allocated (or optionally changing or ready)>, |
357 | | string geni_state_guarantee=<promise from AM of what experimenter state will be lost on trying to 'start' this allocation>, |
358 | | <others that are AM specific> |
359 | | } |
360 | | }}} |
361 | | - Begin allocating the resources promised in the ticket. |
362 | | - Option {{{geni_auto_start}}}: |
363 | | - If supplied and true (boolean: 0 or 1 in XML-RPC), the aggregate automatically starts/restarts resources as necessary, as though the experimenter called !ActOnSlivers(start). |
364 | | - State will be {{{changing}}} and then {{{ready}}} |
365 | | - If omitted the aggregate does not start resources (default behavior). The final state is {{{allocated}}}, and the experimenter must explicitly start or restart resources using !ActOnSlivers |
366 | | - Note that resources which do not require a 'start' may already be {{{ready}}} on normal return from !RedeemTicket. |
367 | | - Omitting the ticket causes the aggregate to redeem the outstanding ticket for this slice if any. If none, return an error code. |
368 | | - The ticket must be valid: not expired or previously redeemed or replaced. Otherwise, an error is returned. |
369 | | |
370 | | 3. !ReleaseTicket |
371 | | {{{ |
372 | | struct ReleaseTicket(string slice_urn, string credentials[], string ticket, struct options) |
373 | | }}} |
374 | | - Give up the reservation for resources. |
375 | | - Return: True or error |
376 | | - Omitting the ticket causes the aggregate to release the 0 or 1 outstanding tickets for this slice. |
377 | | - If this ticket was from !UpdateSlivers, then the sliver returns to the {{{allocated}}} state and existing resources are not modified. |
378 | | |
379 | | 4. !UpdateTicket |
380 | | (atomic release/get) |
381 | | {{{ |
382 | | struct UpdateTicket(string slice_urn, string credentials[], string requestRSpec, |
383 | | string ticket, struct options) |
384 | | }}} |
385 | | - For updating a reservation in place, replacing one ticket with a new one. On success, the old ticket is invalid. |
386 | | - Return: Ticket |
387 | | - Result State: {{{ticketed}}} |
388 | | - Options may include {{{geni_start_time}}} and {{{geni_end_time}}} (see below) |
389 | | - The ticket must be valid: not expired or previously redeemed or replaced. Otherwise, an error is returned. |
390 | | |
391 | | 5. !UpdateSlivers |
392 | | {{{ |
393 | | struct UpdateSlivers(string slice_urn, string credentials[], string requestRSpec, |
394 | | struct options) |
395 | | }}} |
396 | | - Returns a struct: |
397 | | {{{ |
398 | | { |
399 | | string ticket=<ticket> |
400 | | string geni_status=<sliver state - ticketed>, |
401 | | string geni_state_guarantee=<promise from AM of what experimenter state will be lost on trying to 'start' this allocation>, |
402 | | <others that are AM specific> |
403 | | } |
404 | | }}} |
405 | | - Get a promise for resources that would replace currently allocated resources, as defined in [#ChangeSetC:UpdateSlivers Change Set C]. |
406 | | - Result State: {{{ticketed}}} |
407 | | - On completion, the slice has both a ticket and a set of slivers at this aggregate. Overall it is both {{{allocated}}} and {{{ticketed}}}, which is called {{{ticketed}}}. |
408 | | - Options may include {{{geni_start_time}}} and {{{geni_end_time}}}, the RFC3339 requested start and end times for the reservation (options not required). |
409 | | - The request should fail (return an error code) if the resources cannot be reserved from or until the requested time. |
410 | | - The method takes a full request RSpec - not a diff. |
411 | | - AMs may, as always, return {{{UNSUPPORTED}}} - EG if they are incapable of determining what changes to apply (computing a diff). |
412 | | - The request is either fully satisfied, or fails (returns an error code). |
413 | | - AMs must document the level of service they provide using levels from {{{geni_state_guarantee}}}: will any state be lost on existing resources? |
414 | | - Default is to provide no guarantee. |
415 | | - Experimenters may specify what level of disruption they can tolerate, using the {{{geni_state_guarantee}}} option. |
416 | | - AMs are expected to fail a request with a specified service guarantee that they cannot satisfy. Default is to request no guarantee. |
417 | | - For further details on the !UpdateSlivers semantics, see [#ChangeSetC:UpdateSlivers Change Set C]. |
418 | | |
419 | | For a similar set of functions in ProtoGENI, see: https://www.protogeni.net/trac/protogeni/wiki/ComponentManagerAPIV2 |
420 | | |
421 | | == Other changes to support tickets == |
422 | | - !CreateSlivers remains the first call: do not use it to add resources to the slice. |
423 | | - !ListResources return value changes to be: |
424 | | {{{ |
425 | | { |
426 | | string rspec (ad or Manifest - may be empty though) |
427 | | string tickets[] (required but may be an empty list) |
428 | | } |
429 | | }}} |
430 | | - For !ListResources with no {{{slice_urn}}}, {{{tickets}}} shall be an empty list, and rspec shall be an ad RSpec. |
431 | | - For !ListResources with a {{{slice_urn}}}, {{{rspec}}} is the manifest RSpec for everything belonging to that slice at this AM, if anything is currently allocated (not just a ticket). {{{tickets}}} is then any outstanding ticket(s) for this slice. |
432 | | |
433 | | == Scheduling support using Tickets == |
434 | | This ticket structure and these methods, with small additions, support using tickets for scheduling. This proposal does not require support for scheduling at aggregates. |
435 | | |
436 | | - We are not explicitly supporting scheduling, but the timestamps here should be sufficient. |
437 | | - !GetTicket, !CreateSlivers, !ListResources, !UpdateTicket, !UpdateSlivers all accept new RFC3339 compliant {{{geni_start_time}}} and {{{geni_end_time}}} options to support scheduling in the future. |
438 | | - For !GetTicket and !CreateSlivers, if left out then the reservation start is 'now or really soon' and the end is start plus the default sliver duration. |
439 | | - AMs that do not support scheduling return {{{UNSUPPORTED}}} when passed {{{geni_start_time}}}. |
440 | | - AMs should still support {{{geni_end_time}}}, following the logic of !RenewSlivers to determine if the requested duration of the sliver is acceptable. |
441 | | - That is, at !CreateSlivers, !GetTicket, and !UpdateSlivers in particular. |
442 | | - The request should Fail (return an error code) if the resources cannot be reserved until the requested time. |
443 | | - {{{redeem_before}}} in tickets should be {{{starts}}}+epsilon. That epsilon is AM specific, but typically a small number of minutes. |
444 | | - Multiple tickets may be outstanding for a single slice at a single AM only for non overlapping time intervals. |
445 | | - IE you could request 2 tickets: 1 for machines 1-3 on Tuesday and simultaneously request 1 for machines 4-6 on Thursday. |
446 | | - These options are accepted in !ListResources as well. |
447 | | - Specifying {{{geni_start_time}}} means tell me what will be available at that time. Default is now. |
448 | | - Specifying both {{{geni_end_time}}} and {{{geni_start_time}}} means show me only things available for that entire duration. |
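The non-overlap rule for multiple outstanding tickets can be sketched as a simple interval check. The sketch assumes each ticket reserves the half-open interval from {{{starts}}} to {{{expires}}}; the proposal does not say whether the endpoints are inclusive, and the function names are hypothetical.

```python
from datetime import datetime, timezone


def intervals_overlap(start_a, end_a, start_b, end_b):
    """Half-open intervals [start, end) overlap if each starts before the other ends."""
    return start_a < end_b and start_b < end_a


def may_issue_ticket(outstanding, new_start, new_end):
    """outstanding: list of (starts, expires) pairs for this slice at this AM."""
    if new_start >= new_end:
        raise ValueError("start must precede end")
    return not any(
        intervals_overlap(start, end, new_start, new_end)
        for start, end in outstanding
    )


utc = timezone.utc
# One outstanding ticket covering 09:00-17:00 UTC on 6 March 2012.
held = [(datetime(2012, 3, 6, 9, tzinfo=utc), datetime(2012, 3, 6, 17, tzinfo=utc))]
later_ok = may_issue_ticket(
    held, datetime(2012, 3, 8, 9, tzinfo=utc), datetime(2012, 3, 8, 17, tzinfo=utc)
)
clash = may_issue_ticket(
    held, datetime(2012, 3, 6, 16, tzinfo=utc), datetime(2012, 3, 6, 18, tzinfo=utc)
)
```

With half-open intervals, a ticket that starts exactly when another expires does not count as overlapping; an aggregate could choose a stricter rule.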
449 | | |
450 | | == An Alternative: Provide two !UpdateSlivers methods == |
451 | | One alternative would be to define two versions of !UpdateSlivers, with and without an intermediate ticket. The no-ticket version of this method would behave like !CreateSlivers, immediately allocating requested resources. For details on this proposal, see above. |
452 | | |
453 | | ----- |
| 119 | - AMs should return an error message if the operation is not supported. |
| 120 | - See below for ways that aggregates advertise their supported behavior. |
| 121 | |
| 122 | 3. Define new returns from !GetVersion, for specifying the semantics of operating on individual slivers. |
| 123 | These returns are only required if the aggregate supports non-standard behavior. Aggregates that support the default behavior may omit these !GetVersion returns.
| 124 | |
| 125 | - `geni_single_allocation`: <XML-RPC boolean 1/0, default 0>: When performing one of (Describe, Allocate, Renew, Provision, Delete), the AM requires you to include either the slice urn or the urn of all the slivers in the same state. If you attempt to run one of those operations on just some slivers in a given state, the AM will return an error. |
| 126 | For example, you must Provision all `geni_allocated` slivers at once: At an aggregate with `geni_single_allocation` true, if you supply a list of sliver URNs to Provision that is only 'some' of the `geni_allocated` slivers for this slice at this AM, then the AM will return an error. |
| 127 | Similarly, such an aggregate would return an error from Describe if you request a set of sliver URNs that is only some of the `geni_provisioned` slivers. |
| 128 | |
| 129 | - `geni_allocate`: A string, one of a fixed set of possible values. Default is `geni_single`. This option defines whether this AM allows adding slivers to slices at an AM (i.e. calling Allocate() multiple times, without first deleting the allocated slivers). Possible values:
| 130 | - `geni_single`: Performing multiple Allocates without a delete is an error condition because the aggregate only supports a single sliver per slice or does not allow incrementally adding new slivers. This is the AM API v2 behavior. |
| 131 | - `geni_disjoint`: Additional calls to Allocate must be disjoint from slivers allocated with previous calls (no references or dependencies on existing slivers). The topologies must be disjoint in that there can be no connection or other reference from one topology to the other. |
| 132 | - `geni_many`: Multiple slivers can exist and be incrementally added, including those which connect or overlap in some way. New aggregates should strive for this capability. |
| 133 | |
| 134 | Note that these options interact with `geni_best_effort` defined in Change Set F3, defining whether operations on a set of slivers (or whole slice) should either all fail/succeed together, or if some slivers can succeed and others fail. Default behavior is false - all slivers succeed or all fail. |
| 135 | |
| 136 | It is expected that many aggregates will implement one of the following combinations of options: |
| 137 | - `geni_best_effort` = true, `geni_allocate` = `geni_many`, `geni_single_allocation` = false (E.G. FOAM, !PlanetLab) |
| 138 | - `geni_best_effort` = false, `geni_allocate` = `geni_disjoint`, `geni_single_allocation` = true (E.G. ProtoGENI) |
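The `geni_single_allocation` rule can be sketched as a check that a request naming individual slivers covers every sliver sharing a state with any named sliver. The mapping of sliver URN to state and the function name below are illustrative assumptions, not a defined data structure.

```python
def covers_whole_state_groups(requested_urns, state_by_urn):
    """True if, for each state touched by the request, all slivers in that
    state are included in the request.

    state_by_urn is a hypothetical {sliver_urn: state} map for one slice
    at one aggregate.
    """
    requested = set(requested_urns)
    touched_states = {state_by_urn[urn] for urn in requested}
    for state in touched_states:
        group = {urn for urn, s in state_by_urn.items() if s == state}
        if not group <= requested:
            return False  # only some slivers in this state were named
    return True


slivers = {
    "sliver1": "geni_allocated",
    "sliver2": "geni_allocated",
    "sliver3": "geni_provisioned",
}
partial = covers_whole_state_groups(["sliver1"], slivers)
complete = covers_whole_state_groups(["sliver1", "sliver2"], slivers)
```

An aggregate advertising `geni_single_allocation` as true would return an error whenever this check fails; one advertising false would skip the check.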
| 139 | |
464 | | == Change Set F1: Define Sliver States == |
465 | | Currently the AM API defines several possible states as valid returns in !SliversStatus: {{{configuring}}}, {{{ready}}}, {{{unknown}}}, and {{{failed}}}. This change changes and expands that list of valid states, and explicitly defines the expected states after each AM API method call. Additionally, this change provides a mechanism for aggregates to supply their own states. |
466 | | |
467 | | The GENI AM API can be thought of as manipulating slivers. As such, each method potentially changes the state of 1 or more slivers. With the changes proposed here, several of the methods return a new {{{geni_status}}} field, whose value is one of the standard GENI sliver status values. Aggregates must use one of the standard GENI values for that return. |
468 | | |
469 | | {{{geni_status}}} legal values (case insensitive): |
470 | | - {{{uninitialized}}}: This is the state before any AM-local operation for this slice. |
471 | | - {{{ticketed}}}: The resources are reserved for the slice, but not currently provisioned for the slice. Slivers are {{{ticketed}}} after !GetTicket, !UpdateTicket, or after !UpdateSlivers. Note in particular that a slice may have some resources that are {{{ready}}} and others which are {{{ticketed}}} after an !UpdateSlivers call: we call the whole slice {{{ticketed}}} in this case. |
472 | | - {{{allocated}}}: The sliver(s) are currently provisioned for the slice, but not necessarily fully ready for experimental use (eg, not booted). This is the state after !RedeemTicket, or after !CreateSlivers with the {{{geni_donotstart}}} option. |
473 | | - {{{ready}}}: The resources are ready for experimental use, as in after !CreateSlivers completes any booting or starting. Similarly after !ActOnSlivers with the {{{start}}} command. Note that each of those methods starts a process that may take significant time to complete. During that time the sliver will not yet be {{{ready}}}. |
474 | | - {{{closed}}}: When the slice was previously provisioned resources, which have now expired or been de-allocated with !DeleteSlivers, we call the sliver {{{closed}}}. Note that this state is rarely seen in practice - aggregates do not respond in this API to queries about slices that do not currently have outstanding allocations or tickets. |
475 | | - {{{changing}}}: This is the state of a sliver in transition. For example, while a machine is booting (changing from {{{allocated}}} to {{{ready}}}). This state used to be known as {{{configuring}}}. |
476 | | - {{{shutdown}}}: This is the state of a sliver after the Shutdown operation - the sliver is still allocated to the slice, possibly still booted and configured for the slice, but is not available for experimental use. An administrator must intervene to recover or delete the slivers. |
477 | | - {{{failed}}}: When an operation fails leaving the sliver unusable and requiring administrative intervention, it will be marked {{{failed}}}. |
478 | | - {{{unknown}}}: If the aggregate does not know the state of a sliver, it will be marked {{{unknown}}}. This state may be transient, or may require an administrator to recover. |
479 | | |
480 | | As in previous versions of this API, the state of the full set of slivers in a slice at an aggregate is a roll-up of the states of each sliver. For each of {{{ticketed}}}, {{{allocated}}}, and {{{ready}}}, the set of slivers is only in that state if all individual slivers are in that state. If any sliver is {{{shutdown}}} or {{{failed}}} or {{{changing}}} (in order of decreasing precedence), then the set of slivers is in that state. If all slivers are {{{unknown}}} or {{{closed}}}, then the slice at this aggregate is {{{unknown}}} or {{{closed}}}. |
481 | | - If not all resources in the sliver/slice can be moved to the desired next state, then the call fails. |
482 | | - When moving from state 1 to 2, the slice is in state 1 until all slivers are in state 2 (EG moving from {{{ticketed}}}->{{{allocated}}}). |
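The roll-up rules above can be sketched in a few lines of Python. This is a minimal illustration, not part of the proposal: the function name `roll_up` is hypothetical, the empty-list result and the `unknown` fallback for mixes the text does not cover are assumptions, and mixed {{{ticketed}}}/{{{allocated}}}/{{{ready}}} sets report the earliest state, following the "state 1 until all slivers are in state 2" rule.

```python
# Sketch of the slice-level roll-up of per-sliver geni_status values.
# Names and fallbacks are illustrative assumptions, not defined by the API.

DOMINANT = ("shutdown", "failed", "changing")   # decreasing precedence
LIFECYCLE = ("ticketed", "allocated", "ready")  # earliest to latest


def roll_up(sliver_states):
    """Return the status of the whole slice at this aggregate."""
    states = {s.lower() for s in sliver_states}  # values are case insensitive
    if not states:
        return "uninitialized"  # assumption: no slivers means nothing has happened yet
    # Any sliver in a dominant state puts the whole set in that state.
    for dominant in DOMINANT:
        if dominant in states:
            return dominant
    # All slivers agree: the set is in that state.
    if len(states) == 1:
        return next(iter(states))
    # Mixed lifecycle states: the set stays in the earliest state
    # until every sliver has moved on.
    if states <= set(LIFECYCLE):
        return next(s for s in LIFECYCLE if s in states)
    # Assumption: any other mix (e.g. ready + closed) is reported as unknown.
    return "unknown"
```

For instance, a slice with one {{{ready}}} and one {{{ticketed}}} sliver rolls up to {{{ticketed}}}, matching the note in the {{{ticketed}}} definition.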
483 | | |
484 | | Aggregates are free to ''also'' return an aggregate specific status - either in an AM-specifically-named entry, or in {{{am_specific_status}}}. Such values should be thought of as sub-states within the GENI state. For example, where the GENI state might be {{{changing}}}, the AM specific state might also be {{{imaging}}} or {{{booting}}}. Methods which accept a state (!ActOnSlivers) may accept either one of the {{{geni_status}}} values, or an aggregate specific value. Aggregates must document the meaning and use of aggregate specific status values. |
485 | | |
486 | | State changes by method: |
487 | | - !GetTicket: From {{{uninitialized}}} to {{{ticketed}}} |
488 | | - !UpdateTicket: From {{{ticketed}}} to {{{ticketed}}} |
489 | | - !ReleaseTicket: From {{{ticketed}}} to {{{uninitialized}}} (or {{{allocated}}} if this was an update) |
490 | | - !RedeemTicket: From {{{ticketed}}} to {{{allocated}}} |
491 | | - !CreateSlivers: From {{{uninitialized}}} to {{{allocated}}} |
492 | | - And then to {{{ready}}} via {{{changing}}} if the {{{geni_donotstart}}} option is not supplied |
493 | | - !UpdateSlivers: From {{{allocated}}} or {{{ready}}} to {{{ticketed}}} |
494 | | - !DeleteSlivers: From {{{ready}}} or {{{allocated}}} (or {{{changing}}}, etc) to {{{closed}}} (not {{{ticketed}}}) |
495 | | - Shutdown: From {{{allocated}}} or {{{ready}}} to {{{shutdown}}} |
496 | | Note that {{{changing}}} or {{{unknown}}} may be a source state for any of these methods. Operations may fail, leaving a sliver {{{failed}}}, and operations may take time leaving a sliver {{{changing}}} for some time. |
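As a rough illustration, the per-method state changes listed above can be written as a lookup table. The sketch below only models the nominal target state of each call. The helper name `next_state` and the `was_update` flag are hypothetical, and failures (which may leave a sliver {{{failed}}}) and the intermediate {{{changing}}} state are not modelled.

```python
# Sketch: nominal source and target states for each method, per the list above.
# The helper and its parameters are illustrative, not part of the API.

TRANSITIONS = {
    "GetTicket": ({"uninitialized"}, "ticketed"),
    "UpdateTicket": ({"ticketed"}, "ticketed"),
    "RedeemTicket": ({"ticketed"}, "allocated"),
    "CreateSlivers": ({"uninitialized"}, "allocated"),
    "UpdateSlivers": ({"allocated", "ready"}, "ticketed"),
    "DeleteSlivers": ({"ready", "allocated"}, "closed"),
    "Shutdown": ({"allocated", "ready"}, "shutdown"),
}

# changing and unknown may be a source state for any method.
ANY_SOURCE = {"changing", "unknown"}


def next_state(method, current, was_update=False):
    """Return the nominal state after `method` is called in state `current`."""
    if method == "ReleaseTicket":
        # Releasing an update ticket returns to allocated; otherwise to uninitialized.
        sources, target = {"ticketed"}, ("allocated" if was_update else "uninitialized")
    else:
        sources, target = TRANSITIONS[method]
    if current not in sources | ANY_SOURCE:
        raise ValueError(f"{method} is not valid from state {current!r}")
    return target
```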
497 | | |
498 | | Methods and state transitions as a picture: |
499 | | |
500 | | [[Image(sliver-states.jpg)]] |
501 | | |
502 | | {{{ |
503 | | #!comment |
504 | | - {{{uninitialized}}} -> (!GetTicket) -> {{{ticketed}}} (you have a ticket) |
505 | | - and back via !ReleaseTicket |
506 | | - {{{ticketed}}} -> (!UpdateTicket) -> {{{ticketed}}} |
507 | | - {{{ticketed}}} -> (!RedeemTicket) -> {{{allocated}}} (you have slivers) |
508 | | - {{{uninitialized}}} -> (!CreateSlivers) -> {{{allocated}}} and then via {{{changing}}} to {{{ready}}} |
509 | | - {{{allocated}}} (or {{{ticketed}}} when you also have {{{allocated}}} slivers)->(!DeleteSlivers) -> {{{closed}}} |
510 | | - {{{allocated}}} (or some {{{allocated}}} and some {{{ticketed}}}) -> (Shutdown) -> {{{shutdown}}} |
511 | | - {{{shutdown}}} -> [some operator action] -> {{{closed}}} or {{{allocated}}} |
512 | | - {{{allocated}}} -> (!UpdateSlivers) -> whole is called {{{ticketed}}}, some slivers are {{{allocated}}} and some {{{ticketed}}} |
513 | | - Some slivers {{{allocated}}} and some {{{ticketed}}} -> (!UpdateTicket) -> {{{allocated}}}+{{{ticketed}}} |
514 | | - {{{allocated}}}+{{{ticketed}}} -> (!ReleaseTicket) -> {{{allocated}}} |
515 | | - {{{allocated}}}+{{{ticketed}}} -> (!RedeemTicket) -> {{{allocated}}} |
516 | | }}} |
517 | | |
518 | | Note: some resources may not require an explicit 'start' operation. In this case !CreateSlivers may leave some slivers {{{ready}}}, skipping right past {{{allocated}}}. |
519 | | |
520 | | Summary of changes: |
521 | | - {{{configuring}}} becomes {{{changing}}}, which can be used in many other cases, in returns from !SliversStatus |
522 | | - New states {{{uninitialized}}}, {{{ticketed}}}, {{{allocated}}}, {{{closed}}}, and {{{shutdown}}} are added |
523 | | - State transitions for each method are defined |
524 | | - {{{am_specific_status}}} optional return defined |
525 | | - {{{geni_status}}} is returned by !SliversStatus, !RedeemTicket, !UpdateSlivers, !CreateSlivers, and !ActOnSlivers (if all relevant change sets are adopted). |
526 | | |
527 | | == Change Set F2: !ActOnSlivers == |
528 | | This change introduces a new method, providing a generic way to act on slivers in an AM or resource type specific way. This method shall be used to 'start' or 'stop' or 'restart' resources that have been allocated but not started by !CreateSlivers or !RedeemTicket. It may also be used to change the state of slivers (or their contained resources) in an aggregate or resource specific way. Some aggregates may use this method to change configuration details of allocated resources. This might include changing acceptable login keys. |
529 | | |
530 | | !ActOnSlivers takes a {{{command}}}, {{{urn}}}, {{{state}}}, and {{{options}}}. The method return is a struct that includes the {{{urn}}}, {{{geni_status}}} of the sliver(s), and any other AM and operation specific options. The URN may be a slice URN, meaning all slivers in that slice at this AM are affected, or it may be a particular sliver URN. The {{{state}}} argument is one of the {{{geni_status}}} values, or an AM-specific value. The {{{state}}} meaning depends on the {{{command}}}, but typically indicates the desired or resulting new state of the sliver(s). If the AM wishes to return an aggregate specific sliver status, it should still return a valid {{{geni_status}}}, and use an additional entry to also return the aggregate specific state. The {{{command}}} argument is aggregate defined. This API does not specify how aggregates advertise valid commands. |
531 | | |
532 | | Three particular commands are specified however: {{{start}}}, {{{stop}}}, and {{{restart}}} (case insensitive). If an aggregate provides resources which require an explicit action to make {{{allocated}}} resources {{{ready}}} for experimenter use (booting, applying a configuration change) then the aggregate must make that operation available using these commands. These commands are used, for example, after !RedeemTicket or when the {{{geni_donotstart}}} option is supplied to !CreateSlivers. |
533 | | |
534 | | For example, to start allocated resources: |
535 | | {{{ |
536 | | Arguments: |
537 | | command = start |
538 | | urn = <slice or sliver urn> |
539 | | state = ready |
540 | | options = <none required> |
541 | | Result: |
542 | | urn = <same as input> |
543 | | geni_status = changing or ready on success |
544 | | }}} |
545 | | |
546 | | FIXME: After !UpdateSlivers, does {{{start}}} on the slice start only new stuff? How do changes to existing resources take effect? Does {{{restart}}} on the slice restart everything or only changed things? Must the experimenter selectively {{{restart}}} changed things and use {{{start}}} to start new things? |
547 | | |
548 | | Method signature: |
549 | | {{{ |
550 | | struct ActOnSlivers(string command, string credentials[], string urn, string state, struct options) |
551 | | }}} |
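To make the argument order concrete, the following sketch marshals an !ActOnSlivers request with Python's standard `xmlrpc.client` module, without contacting any server. The helper name, the slice URN, and the credential placeholder are hypothetical. Lower-casing the three standard commands is an assumption about a convenient client behaviour, since the proposal only says they are case insensitive.

```python
import xmlrpc.client

# The three commands this proposal specifies; others are aggregate defined.
STANDARD_COMMANDS = {"start", "stop", "restart"}


def build_act_on_slivers_request(command, credentials, urn, state, options=None):
    """Marshal an ActOnSlivers call in the order given by the method signature."""
    if command.lower() in STANDARD_COMMANDS:
        command = command.lower()  # assumption: normalise the case-insensitive standard commands
    params = (command, list(credentials), urn, state, dict(options or {}))
    return xmlrpc.client.dumps(params, methodname="ActOnSlivers")


# Hypothetical values, for illustration only.
payload = build_act_on_slivers_request(
    "Start",
    ["<slice credential XML>"],
    "urn:publicid:IDN+example+slice+demo",
    "ready",
)
```

A real client would send the same arguments through an `xmlrpc.client.ServerProxy` over an SSL connection authenticated with the experimenter's certificate.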
552 | | |
553 | | Return struct: |
| 154 | |
| 155 | == Change Set F3: Sliver Allocation States and methods == |
| 156 | '''This change was discussed and adopted at the GEC13 Coding Sprint.''' |
| 157 | |
| 158 | For meeting minutes, see: [wiki:GEC13Agenda/CodingSprint the GEC13 Coding Sprint agenda page]. |
| 159 | |
| 160 | - We agreed to use two kinds of states: allocation states, and operational states. We put off discussion of operational states (i.e. is the node booted), noting however that this is critical. See Change Set F4. |
| 161 | - We debated whether the API should specify a limited number of states, or allow for aggregate or resource specific states. We agreed that for allocation states, the API should define a limited set of states, while operational states might be more permissive. |
| 162 | - We discussed the pros and cons of including a single all-in-one method to change allocation states, or a single method per desired transition. There is at least 1 case where there are 2 paths between the same 2 allocation states with very different meaning. As a result, we agreed to use a separate method per allocation state change. |
| 163 | |
| 164 | We agreed on 3 allocation states for slivers and an enumeration of methods for transitioning between those states. |
| 165 | |
| 166 | [[Image(sliver-alloc-states3.jpg)]] |
| 167 | |
| 168 | Allocation states: |
| 169 | 1. `geni_unallocated` (alternatively called 'null'). The sliver does not exist. This is the small black circle in typical state diagrams. |
| 170 | 2. `geni_allocated` (alternatively called 'offered' or 'promised'). The sliver exists, defines particular resources, and is in a slice. The aggregate has not (if possible) done any time consuming or expensive work to instantiate the resources, provision them, or make it difficult to revert the slice to the state prior to allocating this sliver. This state is what the aggregate is offering the experimenter. |
| 171 | 3. `geni_provisioned`. The aggregate has started instantiating resources, and otherwise making changes to resources and the slice to make the resources available to the experimenter. At this point, operational states are valid to specify further when the resources are available for experimenter use. |
| 172 | |
| 173 | The key change is the addition of state 2, representing resources that have been allocated to a slice without provisioning the resources. This represents a cheap and easily undone resource allocation, such as we previously discussed in the context of tickets. This compares reasonably well to the 'transaction' proposal written up by Gary Wong (http://www.protogeni.net/trac/protogeni/wiki/AM_API_proposals). When a sliver is created and moved into state 2 (`geni_allocated`), the aggregate produces a manifest RSpec identifying which resources are included in the sliver. This is something like the current !CreateSliver, except that it neither provisions nor starts the resources. These resources are exclusively available to the containing sliver, but are not ready for use. In particular, allocating a sliver should be a cheap and quick operation, which the aggregate can readily undo without impacting the state of slivers which are fully provisioned. For some aggregates, transitioning to this state may be a no-op. |
| 174 | |
| 175 | States 2 and 3 (`geni_allocated` and `geni_provisioned`) have aggregate and possibly resource specific timeouts. By convention the `geni_allocated` state timeout is typically short, like the {{{redeem_before}}} in ProtoGENI tickets, or the {{{commit_by}}} in Gary's transactions proposal. The `geni_provisioned` state timeout is the existing sliver expiration. If the client does not transition the sliver from `geni_allocated` to `geni_provisioned` before the end of the `geni_allocated` state timeout, the sliver reverts to `geni_unallocated`. If the experimenter needs more time, the experimenter should be allowed to request a renewal of either timeout. Note that typically the sliver expiration time (timeout for state 3, `geni_provisioned`) will be notably longer than the timeout for state 2, `geni_allocated`. |
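The timeout rule for `geni_allocated` can be illustrated with a small helper. This is a sketch under stated assumptions: the function name is hypothetical, and only the reversion that the text spells out (an expired `geni_allocated` sliver reverts to `geni_unallocated`) is modelled. What happens when a `geni_provisioned` sliver expires is left untouched here.

```python
from datetime import datetime, timedelta, timezone


def effective_allocation_state(state, expires, now):
    """Apply the geni_allocated timeout: an unredeemed allocation lapses."""
    if state == "geni_allocated" and now >= expires:
        return "geni_unallocated"
    # Other states are returned unchanged in this sketch.
    return state


# Hypothetical timeline: allocated at t0 with a ten minute timeout.
t0 = datetime(2012, 1, 1, tzinfo=timezone.utc)
expires = t0 + timedelta(minutes=10)
```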
| 176 | |
| 177 | State 3, `geni_provisioned`, is the state of the sliver allocation after the aggregate begins to instantiate the sliver. Note that fully provisioning a sliver may take noticeable time. This state also includes a timeout - the sliver expiration time (which is not necessarily related to the time it takes to provision a resource). !RenewSliver extends this timeout. For some aggregates and resource types, moving to this state from state 2 (`geni_allocated`) may be a no-op. |
| 178 | |
| 179 | If the transition from one state to another fails, the sliver shall remain in its original state. |
| 180 | |
| 181 | These are the only allocation states supported by this API. Since the state transitions are finite, but include potentially multiple transitions between the same two states, this API uses separate methods to perform each state transition, rather than a single method for requesting a new state for the sliver. |
| 182 | 1. Allocate moves 1+ slivers from `geni_unallocated` (state 1) to `geni_allocated` (state 2). This method can be described as creating an instance of the state machine for each sliver. If the aggregate cannot fully satisfy the request, the whole request fails. This is a change from the version 2 !CreateSliver, which also provisioned the resources, and 'started' them. That is, Allocate does 1 of the 3 things that !CreateSliver did previously. |
| 183 | 2. Delete moves 1+ slivers from either state 2 or 3 (`geni_allocated` or `geni_provisioned`), back to state 1 (`geni_unallocated`). This is similar to the AM API version 2 !DeleteSliver. |
| 184 | 3. Renew, when given allocated slivers, requests an extended timeout for slivers in state 2 (`geni_allocated`). |
| 185 | 4. Renew can also be used to request an extended timeout for slivers in state 3 - the `geni_provisioned` state. That is, this method's semantics can be the same as !RenewSliver from AM API v2. |
| 186 | 5. Provision moves 1+ slivers from state 2 (`geni_allocated`) to state 3 (`geni_provisioned`). This is some of what version 2 !CreateSliver did. Note however that this does not 'start' the resources, or otherwise change their operational state. This method only fully instantiates the resources in the slice. This may be a no-op for some aggregates or resources. |
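The five transitions above amount to a small state machine per sliver. The sketch below encodes them as a table. The function name `apply_method` is hypothetical, and raising an exception stands in for the rule that a failed transition leaves the sliver in its original state.

```python
# Sketch of the per-sliver allocation state machine described above.
# A target of None means the method keeps the current state (Renew).

ALLOCATION_METHODS = {
    "Allocate": ({"geni_unallocated"}, "geni_allocated"),
    "Provision": ({"geni_allocated"}, "geni_provisioned"),
    "Delete": ({"geni_allocated", "geni_provisioned"}, "geni_unallocated"),
    "Renew": ({"geni_allocated", "geni_provisioned"}, None),
}


def apply_method(method, state):
    """Return the allocation state after `method`, or raise if it is not valid."""
    sources, target = ALLOCATION_METHODS[method]
    if state not in sources:
        # The caller keeps the original state when the transition fails.
        raise ValueError(f"{method} is not valid from {state}")
    return state if target is None else target
```

Walking a sliver through Allocate, Provision and Delete returns it to `geni_unallocated`, while Renew only extends the timeout of the state the sliver is already in.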
| 187 | |
| 188 | These states apply to each sliver individually. Logically, the state transition methods then take a single sliver URN. For convenience, these methods accept a list of sliver URNs, or a slice URN as a simple alias for all slivers in this slice at this aggregate. |
| 189 | |
| 190 | Since each method may operate on multiple slivers, each of these methods returns a list of structs as the value: |
| 191 | {{{ |
| 192 | value = [ |
| 193 | { |
| 194 | geni_sliver_urn: <string>, |
| 195 | geni_allocation_status: <string>, |
| 196 | geni_expires: <time when the sliver expires from its current state>, |
| 197 | <others AM or method specific> |
| 198 | <Provision returns geni_operational_status> |
| 199 | }, |
| 200 | ... |
| 201 | ] |
| 202 | }}} |
| 203 | |
| 204 | Allocate returns a single manifest RSpec, plus the above list of structs. |
| 205 | |
| 206 | Aggregates must be consistent across all these methods whether they are all or nothing, or support partial success. |
| 207 | |
| 208 | These methods all take a new option (aggregates must support it, clients do not need to supply it): |
| 209 | {{{ |
| 210 | geni_best_effort: <XML-RPC boolean 1/0, default 0> |
| 211 | }}} |
| 212 | If false, the client is requesting that the aggregate either fully satisfy the request, moving all listed slivers to the desired state, or fully fail the request, leaving all slivers in their original state. |
| 213 | If the aggregate cannot guarantee all or nothing success or failure given the included slivers and resource types, the aggregate shall fail the request, returning an appropriate error code. If this option is true, then some slivers may transition to the new state, and some not. Clients must examine the return closely to know the state of their slivers. |
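The difference between the two settings of `geni_best_effort` can be sketched as follows. The function name and its arguments are hypothetical. A sliver is treated as able to make the transition simply when its current state is a valid source state, which is a simplification of what a real aggregate would check.

```python
def transition_slivers(slivers, sources, target, best_effort=False):
    """Move slivers (a dict of URN -> allocation state) to `target`.

    Returns one struct per sliver, in the shape of the return value above.
    """
    eligible = {urn for urn, state in slivers.items() if state in sources}
    if not best_effort and len(eligible) != len(slivers):
        # All or nothing: fail the request and leave every sliver unchanged.
        raise RuntimeError("request failed; no sliver changed state")
    return [
        {
            "geni_sliver_urn": urn,
            "geni_allocation_status": target if urn in eligible else state,
        }
        for urn, state in slivers.items()
    ]
```

With `best_effort=True` the caller has to inspect each returned struct, since some slivers may still report their original state.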
| 214 | |
| 215 | '''Note''': Allocate remains (like v2 !CreateSliver) all or nothing (either the aggregate can allocate all desired resources as requested, or the call fails). |
| 216 | |
| 217 | '''Note''': These calls are synchronous - when they return, the slivers shall be in their final state. In particular, the transition from state 2 to 3 (`geni_allocated` to `geni_provisioned`) should be quick. The resource that is now in the 'provisioned' state may take a long time to actually be ready for operational use (e.g. imaging and booting the node) -- this remains true as in version 2 after !CreateSliver. Note that the `geni_allocated` state is by definition cheap, such that transitioning to this state should also be quick. |
| 218 | |
| 219 | !SliverStatus, where it currently includes {{{geni_status}}} for each `geni_resource`, shall now return {{{geni_allocation_status}}} with one of the above defined values, and {{{geni_operational_status}}}. The values of {{{geni_operational_status}}} are still under discussion. |
| 220 | |
| 221 | Currently, !SliverStatus returns a single {{{geni_status}}} for the entire slice at this aggregate. With this change, the top-level allocation status for the slice is not defined, and that field is not required. |
| 222 | |
| 223 | Open Questions: |
| 224 | - What about an !UpdateAllocations method, similar to !UpdateTickets or !UpdateTransactions from other similar proposals, for modifying allocated resources in place, without losing allocated resources? |
| 225 | |
| 226 | == Change Set F4: Sliver Operations Method == |
| 227 | This proposal was discussed on the geni-dev mailing list: http://lists.geni.net/pipermail/dev/2012-March/000743.html |
| 228 | |
| 229 | The canonical source for documentation on this proposal is here: https://openflow.stanford.edu/display/FOAM/GENI+-+PerformOperationalAction |
| 230 | |
| 231 | See Change Set F5 for a companion proposal for aggregates to advertise legal operational states and actions. |
| 232 | |
| 233 | {{{ |
| 234 | struct PerformOperationalAction (string urn[], struct credentials[], string action, |
| 235 | struct options={}) |
| 236 | }}} |
| 237 | |
| 255 | |
| 256 | Performs the given action on the given sliver_urn(s) (or slice_urn as a proxy for "all slivers"). Actions are constrained to the set of default GENI actions, as well as resource-specific actions which reasonably perform operational resource tasks as defined by the aggregate manager for the given resource type. This method is not intended to allow for reconfiguration of options found in the request rspec. Aggregate Managers SHOULD return an error code of `13` (`UNSUPPORTED`) if they do not support a given action for a given resource. Actions are performed on all slivers, or none - if an action cannot be performed on a sliver given, the entire operation MUST fail. Passing the option `geni_best_effort` with a value of true allows for partial success (this option defaults to false if not supplied). |
| 257 | |
| 258 | An AM SHOULD constrain actions based on the current operational state of the resource, such that for example attempting to perform the action `geni_stop` on a resource that is `geni_ready_busy` or `geni_configuring` or `geni_stopping` will fail, but SHOULD also be idempotent for all actions which result in a steady state. |
| 259 | |
| 260 | `geni_operational_status` MUST be the current operational status of the sliver after this action (as would be returned by !SliverStatus). An optional `geni_resource_status` field MAY be returned for each sliver which contains a resource-specific status that may be more nuanced than the options for `geni_operational_status`. |
| 261 | |
| 262 | Calling this method with a slice_urn functions as if all the child sliver_urns had been passed in: the action is performed on all slivers, and every sliver_urn is returned with its status. No status is returned for the slice as a whole. |
| 263 | |
| 264 | This is a fast synchronous operation, and MAY start long-running sliver transitions whose status can be queried using !SliverStatus. |
| 265 | |
| 266 | This method should only be called, and is only valid, when the sliver is fully allocated. In particular, if Change Set F5 is adopted, this method is only applicable for slivers not in the `geni_pending_allocation` state. |
| 267 | |
| 268 | == Change Set F5: Sliver Operational States == |
| 269 | Currently, `geni_status` in !SliverStatus can have values `configuring`, `ready`, `failed`, `unknown`. |
| 270 | |
| 271 | This proposal modifies that list, and renames those to use the standard 'geni_' prefix. |
| 272 | |
| 273 | These states would be reported by various AM API methods, specifically !SliverStatus, and would be used in reasoning about valid operations in !PerformOperationalAction |
| 274 | |
| 275 | The AM API defines a few operational states with particular semantics. AMs are not required to support them for a given set of resources, but if they use them, they must follow the given semantics. AMs are however STRONGLY encouraged to support them, to provide maximum utility. There is one state that AMs are required to support, `geni_pending_allocation`, for a sliver which has not been fully allocated and provisioned. |
| 276 | |
| 277 | Similarly, the API defines a few operational actions: these need not be supported. AMs are encouraged to support these if possible, but only if they can be supported following the defined semantics. |
| 278 | |
| 279 | AMs may have their own operational states/state-machine internally. AMs are required to advertise such states and actions that experimenters may see or use, by using Ad RSpec extensions. Operational states which the experimenter never sees, need not be advertised. Operational states and actions are generally by resource type. The standard RSpec extension attaches such definitions to the `sliver_type` element of RSpecs. |
| 280 | |
| 281 | '''TODO''': Jon Duerig will propose this extension, with examples covering PG/Emulab sliver_types. |
| 282 | |
| 283 | Tools must use the operational states and actions advertisement to determine what operational actions to offer to experimenters, and what actions to perform for the experimenter. Tools may choose to offer actions which the tool does not understand, relying on the experimenter to understand the meaning of the new action. |
| 284 | |
| 285 | States should be defined in terms of (a) whether the resource is accessible to the experimenter on the data or control planes, (b) whether an experimenter action is required to change from this state, |
| 286 | and if so, (c) what action or actions are useful. If the resource will change states without explicit experimenter action, the definition should also give the expected next state on success. |
| 287 | |
| 288 | Note that states represent the AM's view of the operational condition of the resource. This state represents what the AM has done or learned about the resource, but experimenter actions may cause failures that the AM does not know about. |
| 289 | |
| 290 | Any operational action may fail. When this happens, the API method should return an error code. The sliver may remain in the original state. In some cases, the sliver may transition to the `geni_failed` state. |
| 291 | |
| 292 | There is no `busy` state. Instead, AMs are encouraged to define a separate transition state for each transition path, allowing experimenters to distinguish the start and end states of each transition. |
| 293 | |
| 294 | `shutdown` is not an operational state for a sliver. The Shutdown() API method applies to an entire slice. |
| 295 | |
| 296 | Operational states are generally only valid for slivers which have been provisioned (`geni_provisioned` allocation state). |
| 297 | |
| 298 | GENI defined operational states: |
| 299 | - `geni_pending_allocation`: A wait state. The sliver is still being allocated and provisioned, and other operational states are not yet valid. !PerformOperationalAction may not yet be called on this sliver. For example, the sliver is in allocation state `geni_provisioned`, but has not been fully provisioned (e.g., the VM has not been fully imaged). Once the sliver has been fully allocated, the AM will transition the sliver to some other valid operational state, as specified by the advertised operational state machine. Common next states are `geni_notready`, `geni_ready`, and `geni_failed`. |
| 300 | - `geni_notready`: A final state. The resource is not usable / accessible by the experimenter, and requires explicit experimenter action before it is usable/accessible by the experimenter. For some resources, `geni_start` will move the resource out of this state and towards `geni_ready`. |
| 301 | - `geni_configuring`: A wait state. The resource is in process of changing to `geni_ready`, and on success will do so without additional experimenter action. For example, the resource may be powering on. |
| 302 | - `geni_stopping`: A wait state. The resource is in process of changing to `geni_notready`, and on success will do so without additional experimenter action. For example, the resource may be powering off. |
| 303 | - `geni_ready`: A final state. The resource is usable/accessible by the experimenter, and ready for slice operations. |
| 304 | - `geni_ready_busy`: A wait state. The resource is performing some operational action, but remains accessible/usable by the experimenter. Upon completion of the action, the resource will return to `geni_ready`. |
| 305 | - `geni_failed`: A final state. Some operational action failed, rendering the resource unusable. An administrator action, undefined by this API, may be required to return the resource to another operational state. |
| 306 | |
| 307 | GENI defined operational actions: |
| 308 | - `geni_start`: This action results in the sliver becoming `geni_ready` eventually. The operation may fail (move to `geni_failed`), or move through some number of transition states. For example, booting a VM. |
| 309 | - `geni_restart`: This action results in the sliver becoming `geni_ready` eventually. The operation may fail (move to `geni_failed`), or move through some number of transition states. During this operation, the resource may or may not remain accessible. Dynamic state associated with this resource may be lost by performing this operation. For example, re-booting a VM. |
| 310 | - `geni_stop`: This action results in the sliver becoming `geni_notready` eventually. The operation may fail (move to `geni_failed`), or move through some number of transition states. For example, powering down a VM. |
| 311 | |
| 312 | Actions are performed using the above proposed !PerformOperationalAction. |
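A rough sketch of how an aggregate might constrain the GENI defined actions by operational state is shown below. It is not the proposal's definition. The function name is hypothetical, the choice of `geni_configuring` as the wait state for `geni_restart` is an assumption, and refusing all actions on a `geni_failed` sliver is also an assumption (the text only says administrator action may be required).

```python
# Sketch: GENI defined operational actions, constrained by current state.
# Each action maps to (wait state entered, final state on success).

ACTIONS = {
    "geni_start": ("geni_configuring", "geni_ready"),
    "geni_restart": ("geni_configuring", "geni_ready"),  # wait state is an assumption
    "geni_stop": ("geni_stopping", "geni_notready"),
}

WAIT_STATES = {
    "geni_pending_allocation",
    "geni_configuring",
    "geni_stopping",
    "geni_ready_busy",
}

UNSUPPORTED = 13  # error code named in Change Set F4


def perform_action(action, state):
    """Return the operational state immediately after `action` is accepted."""
    if action not in ACTIONS:
        raise ValueError(f"UNSUPPORTED ({UNSUPPORTED}): {action}")
    if state in WAIT_STATES or state == "geni_failed":
        raise RuntimeError(f"{action} is not allowed while the sliver is {state}")
    wait_state, final_state = ACTIONS[action]
    # Idempotent for actions that end in a steady state the sliver is already in.
    if state == final_state and action != "geni_restart":
        return state
    return wait_state
```

The returned value corresponds to the `geni_operational_status` an aggregate would report right after the call. The later move from the wait state to the final state happens without further experimenter action.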
774 | | |
775 | | = Change Set L: Change SFA credentials' privileges = |
776 | | Our goal is to simplify and standardize privilege strings used in SFA credentials. Currently there are wildcards, bind, embed, and others. They are confusing. We also want extensibility to use these credentials elsewhere in future. |
777 | | |
778 | | Credentials should support these kinds of operations: |
779 | | - Learn about the slice |
780 | | - Add/Modify/Delete resources in the slice |
781 | | - Read slice details like I&M? |
782 | | - Use the slice |
783 | | - Operator shutdown |
784 | | |
785 | | Proposal - Replace all existing privileges with only the following possible strings (case insensitive): |
786 | | - {{{CanWrite}}} |
787 | | - If present in a valid slice credential, aggregates may permit !CreateSlivers, !RenewSlivers, !DeleteSlivers, Shutdown, plus new methods !ActOnSlivers, !UpdateSlivers, !GetTicket, !RedeemTicket, !UpdateTicket, !ReleaseTicket |
788 | | - Thus it replaces {{{bind}}}, {{{embed}}}, {{{control}}}, {{{instantiate}}}, {{{sa}}}, {{{pi}}}, or {{{*}}} in various places |
789 | | - {{{CanRead}}} |
790 | | - If present in a valid slice credential, aggregates may permit !ListResources with a {{{slice_urn}}}, !SliversStatus |
791 | | - Thus it replaces {{{info}}} or {{{*}}} in various places |
792 | | - {{{CanReadDetails}}} |
793 | | - {{{CanUse}}} |
794 | | |
795 | | Note that those last 2 may never get used, but are there in case I&M or opt-in make those useful. |
796 | | |
797 | | Note also that operators who wish to shut down a slice would need a slice credential with the {{{CanWrite}}} privilege. |
798 | | |
799 | | Privilege and credential semantics are defined as follows: |
800 | | - Aggregates may only grant access using current SFA credentials to a method if at least one such valid credential: |
801 | | - grants the required privilege or privileges (if any) |
802 | | - to the caller of the API method |
803 | | - (identified by their SSL client certificate and the {{{owner_gid}}} in the credential) |
804 | | - over the slice (if any) on which they are operating |
805 | | - ({{{target_gid}}} in the credential). |
806 | | - Other privileges may be present in the same or other credentials, and other non-SFA credentials may be used to authorize actions (per [#ChangeSetG:Credentialsaregeneralauthorizationtokens. Change Set G]). |
807 | | - Local aggregate policy may grant or deny access to a particular method regardless of the presence of a valid credential granting the required privilege. This depends in part on federation policy governing aggregates. |
808 | | - Some operations (e.g. !GetVersion) may either simply require a valid credential with no particular privilege, or have no {{{credentials}}} argument at all. |
809 | | |
810 | | Note also that some current AMs do not require any particular privileges to do !ListResources, even with a {{{slice_urn}}}. This change suggests that aggregates require a valid slice credential with {{{CanRead}}} privileges to authorize this operation using current slice credentials. |
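The privilege semantics above could be checked along these lines. This is a simplified sketch: credentials are represented as plain dictionaries with hypothetical `owner_gid`, `target_gid` and `privileges` keys, credential validation (signatures, expiry) and local aggregate policy are assumed to happen elsewhere, and `ListResources` here means the form that takes a {{{slice_urn}}}.

```python
# Sketch: map each method to the privilege proposed for it, then look for
# at least one credential granting it to this caller over this slice.

WRITE_METHODS = (
    "CreateSlivers", "RenewSlivers", "DeleteSlivers", "Shutdown",
    "ActOnSlivers", "UpdateSlivers", "GetTicket", "RedeemTicket",
    "UpdateTicket", "ReleaseTicket",
)
REQUIRED_PRIVILEGE = {method: "canwrite" for method in WRITE_METHODS}
REQUIRED_PRIVILEGE.update({
    "ListResources": "canread",  # the slice_urn form only
    "SliversStatus": "canread",
    "GetVersion": None,          # no particular privilege required
})


def is_authorized(method, caller_gid, slice_gid, credentials):
    """True if some credential grants the required privilege to the caller."""
    required = REQUIRED_PRIVILEGE[method]
    if required is None:
        return True
    for cred in credentials:
        privileges = {p.lower() for p in cred["privileges"]}  # case insensitive
        if (cred["owner_gid"] == caller_gid
                and cred["target_gid"] == slice_gid
                and required in privileges):
            return True
    return False
```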
| 533 | - User URNs (which contain the authority name and the username) are required to be temporally and globally unique. |
| 534 | |
| 535 | |
| 536 | = Change Set M: New Method Signatures = |
| 537 | If all the other change set proposals are adopted, there will be new method signatures. |
| 538 | |
| 539 | In some cases, the proposals are not clear in terms of the details of the resulting method signatures. This proposal consolidates those separate proposals, to propose a new set of method signatures. |
| 540 | |
| 541 | There are a few other small changes that this change set covers. |
| 542 | |
| 543 | The details of the proposed final method signatures are listed [wiki:AaronHelsinger/GAPI_AM_API_DRAFT/MethodSignatures on the Draft method signatures summary page]. |
| 544 | |
| 545 | == M1: users struct an option == |
| 546 | Previously, the !CreateSliver method took a `users[]` struct to specify information for logging in to resources. Because that struct is not universally applicable, this change moves it into the `options` struct as an option named `geni_users[]`. All other semantics and syntax for this argument remain the same as in AM API version 2.
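As a rough illustration of M1 from the client side, the following Python sketch folds an old-style `users` argument into the `options` struct under `geni_users`. The helper name `move_users_into_options`, the example URN, and the placeholder key string are assumptions made for illustration only; they are not part of the proposal.

```python
def move_users_into_options(users, options=None):
    """Return a new options dict that carries the old `users` argument as `geni_users`.

    Hypothetical helper: the input dict is copied, not modified in place.
    """
    new_options = dict(options or {})
    if users:
        # Same per-user struct as AM API version 2; only its location changes.
        new_options["geni_users"] = list(users)
    return new_options


# Example values below are placeholders, not real identifiers or keys.
users = [{
    "urn": "urn:publicid:IDN+example+user+alice",
    "keys": ["ssh-rsa PLACEHOLDERKEY alice@example"],
}]
options = move_users_into_options(users, {"geni_end_time": "2012-06-01T00:00:00Z"})
```

A caller that has no login information simply omits `geni_users`, which is the point of making it an option rather than a required argument.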
| 547 | |
| 548 | == M2: Split !ListResources == |
| 549 | Currently, !ListResources has two forms: (1) get an advertisement general to the aggregate, and (2) get a manifest specific to a slice. This proposal splits those two modes into two separate methods, !ListResources and Describe.
| 550 | |
| 551 | !ListResources would no longer take a `slice_urn` option, and no longer ever return a manifest RSpec. |
| 552 | |
| 553 | Describe would instead be used to retrieve the manifest RSpec for a slice or for specific slivers.
| 554 | {{{ |
| 555 | struct Describe(string urns[], struct credentials[], struct options[]) |
| 556 | }}} |
| 557 | |
| 558 | Where options include: |
| 559 | {{{ |
| 560 | { |
| 561 | boolean geni_compressed <optional>; |
| 562 | struct geni_rspec_version { |
| 563 | string type; |
| 564 | string version; |
| 565 | }; |
| 566 | } |
| 567 | }}} |
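The argument list for Describe could be assembled roughly as follows. This is a minimal Python sketch under stated assumptions: the helper name `build_describe_args`, the example slice URN, and the RSpec type and version values are hypothetical, and the transport (an XML-RPC call to the aggregate over SSL) is deliberately left out.

```python
def build_describe_args(urns, credentials, rspec_type, rspec_version, compressed=None):
    """Build the positional arguments (urns, credentials, options) for Describe.

    Hypothetical helper; `geni_compressed` is only sent when explicitly
    requested, since the proposal marks it as optional.
    """
    options = {
        "geni_rspec_version": {"type": rspec_type, "version": rspec_version},
    }
    if compressed is not None:
        options["geni_compressed"] = bool(compressed)
    return [list(urns), list(credentials), options]


# Placeholder values; a real call would pass actual credentials.
args = build_describe_args(
    ["urn:publicid:IDN+example+slice+myslice"],
    [],
    rspec_type="GENI",
    rspec_version="3",
)
```

Note that `version` is passed as a string, matching the `string version` field in the options struct above.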
| 568 | |
| 569 | Return struct: |
| 570 | {{{ |
| 571 | { |
| 572 | geni_rspec: <geni.rspec, Manifest> |
| 573 | geni_urn: <string slice urn of the containing slice> |
| 574 | geni_slivers: [ |
| 575 | { |
| 576 | geni_sliver_urn: <string sliver urn> |
| 577 | geni_expires: <dateTime.rfc3339 allocation expiration string, as in geni_expires from SliversStatus>, |
| 578 | geni_allocation_status: <string sliver state - allocated or ?? >, |
| 579 | geni_operational_status: <string sliver operational state> |
| 580 | }, |
| 581 | ... |
| 582 | ] |
| 583 | } |
| 584 | }}} |
| 585 | |
| 586 | Aggregates are expected to combine the manifests of all requested slivers into a single manifest RSpec. Note that a manifest returned here for only some of the slivers in a slice at this aggregate may reference resources that it does not describe, because they belong to other slivers. As a result, such manifests may not be directly usable as a subsequent request.
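A client might walk the Describe return struct along these lines. The Python sketch below assumes the struct arrives as a plain dict, as it typically would after XML-RPC decoding. The function names and the sample status values (`allocated`, `ready`) are illustrative assumptions, since the proposal leaves the set of sliver states open.

```python
from datetime import datetime


def parse_rfc3339(value):
    """Parse an RFC 3339 timestamp; a trailing 'Z' is rewritten as a UTC offset."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_describe(result):
    """Return (sliver urn, allocation status, operational status, expiry) tuples."""
    return [
        (
            sliver["geni_sliver_urn"],
            sliver["geni_allocation_status"],
            sliver["geni_operational_status"],
            parse_rfc3339(sliver["geni_expires"]),
        )
        for sliver in result.get("geni_slivers", [])
    ]


# Hypothetical return value with placeholder URNs and an empty manifest.
result = {
    "geni_rspec": "<rspec type=\"manifest\"/>",
    "geni_urn": "urn:publicid:IDN+example+slice+myslice",
    "geni_slivers": [
        {
            "geni_sliver_urn": "urn:publicid:IDN+example+sliver+1",
            "geni_expires": "2012-06-01T00:00:00Z",
            "geni_allocation_status": "allocated",
            "geni_operational_status": "ready",
        },
    ],
}
summary = summarize_describe(result)
earliest_expiry = min(expires for _, _, _, expires in summary)
```

Tracking the earliest expiration across slivers is one reason to return per-sliver entries alongside the combined manifest.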
846 | | ---- |
847 | | = Change summary - method signatures = |
848 | | If all change sets listed here are adopted, the final method signatures will be as follows: |
849 | | |
850 | | == !GetVersion == |
851 | | {{{ |
852 | | struct GetVersion([optional: struct options]) |
853 | | }}} |
854 | | |
855 | | Return struct: |
856 | | {{{ |
857 | | |
858 | | { |
859 | | int geni_api; |
860 | | struct geni_api_versions { |
861 | | URL <this API version #>; # value is a URL, name is a number |
862 | | [optional: other supported API versions and the URLs where they run] |
863 | | } |
864 | | array geni_request_rspec_versions of { |
865 | | string type; |
866 | | string version; |
867 | | string schema; |
868 | | string namespace; |
869 | | array extensions of string; |
870 | | }; |
871 | | array geni_ad_rspec_versions of { |
872 | | string type; |
873 | | string version; |
874 | | string schema; |
875 | | string namespace; |
876 | | array extensions of string; |
877 | | }; |
878 | | } |
879 | | }}} |
880 | | |
881 | | == !ListResources == |
882 | | {{{ |
883 | | struct ListResources(string credentials[], struct options) |
884 | | }}} |
885 | | |
886 | | Where options include: |
887 | | {{{ |
888 | | { |
889 | | boolean geni_available; |
890 | | boolean geni_compressed; |
891 | | string geni_slice_urn; |
892 | | struct geni_rspec_version { |
893 | | string type; |
894 | | string version; |
895 | | }; |
896 | | string geni_start_time; |
897 | | string geni_end_time; |
898 | | } |
899 | | }}} |
900 | | |
901 | | Return struct: |
902 | | {{{ |
903 | | { |
904 | | rspec (ad or Manifest - may be empty though) |
905 | | tickets[] (required but may be an empty list) |
906 | | } |
907 | | }}} |
908 | | |
909 | | == !GetTicket == |
910 | | {{{ |
911 | | struct GetTicket (string slice_urn, string credentials[], string requestRSpec, |
912 | | struct options) |
913 | | }}} |
914 | | |
915 | | Options include {{{geni_start_time}}} and {{{geni_end_time}}} |
916 | | |
917 | | Return: ticket |
918 | | |
919 | | == !UpdateTicket == |
920 | | {{{ |
921 | | struct UpdateTicket(string slice_urn, string credentials[], string requestRSpec, |
922 | | string ticket, struct options) |
923 | | }}} |
924 | | |
925 | | Options include {{{geni_start_time}}} and {{{geni_end_time}}} |
926 | | |
927 | | Return: ticket |
928 | | |
929 | | == !RedeemTicket == |
930 | | {{{ |
931 | | struct RedeemTicket(string slice_urn, string credentials[], string ticket, |
932 | | struct users[], struct options) |
933 | | }}} |
934 | | |
935 | | Options include {{{geni_auto_start}}} |
936 | | |
937 | | Return struct: |
938 | | {{{ |
939 | | { |
940 | | string rspec=<manifest>, |
941 | | geni_start_time=<optional (may be omitted altogether): now if not specified>, |
942 | | geni_expires=<RFC3339 sliver expiration>, |
943 | | string geni_status=<sliver state - allocated or changing or ready>, |
944 | | string geni_state_guarantee=<promise from AM of what experimenter state will be lost on trying to 'start' this allocation>, |
945 | | <others that are AM specific> |
946 | | } |
947 | | }}} |
948 | | |
949 | | == !UpdateSlivers == |
950 | | {{{ |
951 | | struct UpdateSlivers(string slice_urn, string credentials[], string rspec, |
952 | | struct options) |
953 | | }}} |
954 | | |
955 | | Options include {{{geni_start_time}}} and {{{geni_end_time}}} |
956 | | |
957 | | Return struct: |
958 | | {{{ |
959 | | { |
960 | | string ticket=<ticket> |
961 | | string geni_status=<sliver state - ticketed>, |
962 | | string geni_state_guarantee=<promise from AM of what experimenter state will be lost on trying to 'start' this allocation>, |
963 | | <others that are AM specific> |
964 | | } |
965 | | }}} |
966 | | |
967 | | == !ReleaseTicket == |
968 | | {{{ |
969 | | struct ReleaseTicket(string slice_urn, string credentials[], string ticket, struct options) |
970 | | }}} |
971 | | |
972 | | Return: boolean |
973 | | |
974 | | == !CreateSlivers == |
975 | | {{{ |
976 | | struct CreateSlivers(string slice_urn, |
977 | | string credentials[], |
978 | | string rspec, |
979 | | struct users[], |
980 | | struct options) |
981 | | }}} |
982 | | |
983 | | Options include: |
984 | | {{{ |
985 | | { |
986 | | boolean geni_donotstart (optional), |
987 | | string geni_start_time <datetime> (optional), |
988 | | string geni_end_time <datetime> (optional) |
989 | | } |
990 | | }}} |
991 | | |
992 | | Return struct: |
993 | | {{{ |
994 | | { |
995 | | string rspec=<manifest>, |
996 | | geni_start_time=<optional (may be omitted altogether): now if not specified>, |
997 | | geni_expires=<RFC3339 sliver expiration, as in geni_expires from SliversStatus>, |
998 | | string geni_status=<sliver state - allocated or changing or ready>, |
999 | | <others that are AM specific> |
1000 | | } |
1001 | | }}} |
1002 | | |
1003 | | == !RenewSlivers == |
1004 | | {{{ |
1005 | | struct RenewSlivers(string urn, |
1006 | | string credentials[], |
1007 | | string expiration_time, |
1008 | | struct options) |
1009 | | }}} |
1010 | | Return: boolean |
1011 | | |
1012 | | == !SliversStatus == |
1013 | | {{{ |
1014 | | struct SliversStatus(string slice_urn, string credentials[], struct options) |
1015 | | }}} |
1016 | | |
1017 | | Return: |
1018 | | {{{ |
1019 | | { |
1020 | | string geni_urn: <sliver URN> |
1021 | | string geni_status: ready |
1022 | | geni_expires: <datetime of expiration> |
1023 | | struct geni_resources: [ { geni_urn: <resource URN> |
1024 | | geni_status: ready |
1025 | | geni_expires: <datetime of individual sliver expiration> |
1026 | | geni_error: ''}, |
1027 | | { geni_urn: <resource URN> |
1028 | | geni_status: ready |
1029 | | geni_expires: <datetime of individual sliver expiration> |
1030 | | geni_error: ''} |
1031 | | ] |
1032 | | } |
1033 | | }}} |
1034 | | |
1035 | | Where for individual resources this block may be returned: |
1036 | | {{{ |
1037 | | 'users' => [{'urn' => $user1_urn,
1038 | | 'login' => $login, |
1039 | | 'protocol' => [ssh, or ?], |
1040 | | 'port' => [22 or ?], |
1041 | | 'keys' => [...] }, |
1042 | | {'urn' => $user2_urn,
1043 | | 'login' => $login, |
1044 | | 'protocol' => [ssh, or ?], |
1045 | | 'port' => [22 or ?], |
1046 | | 'keys' => [...] } |
1047 | | ] |
1048 | | }}} |
1049 | | |
1050 | | == !ActOnSlivers == |
1051 | | {{{ |
1052 | | struct ActOnSlivers(string command, string credentials[], string urn, string state, struct options) |
1053 | | }}} |
1054 | | |
1055 | | Return struct: |
1056 | | {{{ |
1057 | | { |
1058 | | string urn=<urn of sliver or slice>, |
1059 | | string geni_status=<new state of the slivers>, |
1060 | | <other entries specific to the AM or resources - specifically am_specific_status> |
1061 | | } |
1062 | | }}} |
1063 | | |
1064 | | == !DeleteSlivers == |
1065 | | {{{ |
1066 | | struct DeleteSlivers(string urn, string credentials[], struct options) |
1067 | | }}} |
1068 | | |
1069 | | Return: boolean |
1070 | | |
1071 | | == Shutdown == |
1072 | | {{{ |
1073 | | struct Shutdown(string slice_urn, string credentials[], struct options) |
1074 | | }}} |
1075 | | |
1076 | | Return: boolean |
| 619 | Update stitching schema per changes here: https://geni.maxgigapop.net/twiki/bin/view/GENI/NetworkStitchingGeniApiAndRspec |