Opened 11 years ago
Last modified 10 years ago
#129 new
Experiments fail during provisioning when the image is missing or obsolete, rather than being rejected at create
Reported by: | lnevers@bbn.com | Owned by: | vjo@cs.duke.edu |
---|---|---|---|
Priority: | minor | Milestone: | |
Component: | Experiment | Version: | SPIRAL5 |
Keywords: | | Cc: | |
Dependencies: |
Description
When a request names an obsolete image, or when the RSpec does not specify an image at all, no check is performed at create time. The sliver eventually fails with an "Error during join for unit" after it has been ticketed, during configuration. Should such requests be rejected as invalid, that is, when the RSpec requests a node without specifying a disk_image or when an obsolete image is used?
-> Example: Sliver lnexo is a 1 vm sliver request with an rspec that does not include an image
$ omni.py sliverstatus -a eg-sm lnexo
INFO:omni:Loading config file /home/lnevers/.gcf/omni_config
INFO:omni:Using control framework pg
INFO:omni:Slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+lnexo expires on 2012-12-11 00:00:00 UTC
INFO:omni:Substituting AM nickname eg-sm with URL https://geni.renci.org:11443/orca/xmlrpc, URN unspecified_AM_URN
INFO:omni:Status of Slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+lnexo:
INFO:omni:Sliver status for Slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+lnexo at AM URL https://geni.renci.org:11443/orca/xmlrpc
INFO:omni:{
  "geni_status": "failed",
  "geni_urn": "urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+lnexo",
  "geni_resources": [
    {
      "orca_expires": "Tue Dec 18 14:12:48 EST 2012",
      "geni_urn": "urn:publicid:IDN+exogeni.net:bbnvmsite+sliver+432e72ce-df8c-4514-bdc0-1d2a43600c46#geni1",
      "geni_error": "Reservation 9eaa8d30-c7d2-4e47-ae3b-696fd7535e1b (Slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+lnexo) is in state [Failed,None], err=resources failed to join: Error during join for unit: 704EC8FB [1]: unable to create instance: exit code 1, \n",
      "geni_status": "Failed"
    }
  ]
}
INFO:omni: ------------------------------------------------------------
INFO:omni: Completed sliverstatus:
  Options as run:
    aggregate: ['eg-sm']
    framework: pg
  Args: sliverstatus lnexo
  Result Summary: Slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+lnexo expires on 2012-12-11 00:00:00 UTC
Returned status of slivers on 1 of 1 possible aggregates.
INFO:omni: ============================================================
-> Example: Sliver lnexo2 is a 1 vm sliver request with an rspec that includes an image that was supported prior to the upgrade.
$ omni.py sliverstatus -a eg-sm lnexo2
INFO:omni:Loading config file /home/lnevers/.gcf/omni_config
INFO:omni:Using control framework pg
INFO:omni:Slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+lnexo2 expires on 2012-12-05 21:10:47 UTC
INFO:omni:Substituting AM nickname eg-sm with URL https://geni.renci.org:11443/orca/xmlrpc, URN unspecified_AM_URN
INFO:omni:Status of Slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+lnexo2:
INFO:omni:Sliver status for Slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+lnexo2 at AM URL https://geni.renci.org:11443/orca/xmlrpc
INFO:omni:{
  "geni_status": "failed",
  "geni_urn": "urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+lnexo2",
  "geni_resources": [
    {
      "orca_expires": "Tue Dec 18 14:16:23 EST 2012",
      "geni_urn": "urn:publicid:IDN+exogeni.net:bbnvmsite+sliver+1f4cd917-db17-492a-af92-1322335c0625#geni1",
      "geni_error": "Reservation 86558ca4-c593-4835-af1f-24c7c2e6d5c9 (Slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+lnexo2) is in state [Failed,None], err=resources failed to join: Error during join for unit: 71EA3B64 [1]: unable to create instance: exit code 1, \n",
      "geni_status": "Failed"
    }
  ]
}
INFO:omni: ------------------------------------------------------------
INFO:omni: Completed sliverstatus:
  Options as run:
    aggregate: ['eg-sm']
    framework: pg
  Args: sliverstatus lnexo2
  Result Summary: Slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+lnexo2 expires on 2012-12-05 21:10:47 UTC
Returned status of slivers on 1 of 1 possible aggregates.
INFO:omni: ============================================================
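For illustration only, the create-time check asked about in the description might look roughly like the sketch below. It is not ORCA or ExoGENI code: the function name `nodes_missing_disk_image` and the bare `<rspec>` wrapper are assumptions made for this example, and whether a missing image should be rejected or given a default is a policy decision this sketch does not settle.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip an XML namespace prefix such as '{http://...}node'."""
    return tag.rsplit("}", 1)[-1]


def nodes_missing_disk_image(rspec_xml):
    """Return client_ids of nodes whose sliver_type has no disk_image child.

    Hypothetical helper: a request RSpec could be rejected at create time
    if this list is non-empty.
    """
    root = ET.fromstring(rspec_xml)
    missing = []
    for elem in root.iter():
        if _local(elem.tag) != "node":
            continue
        has_image = any(
            _local(child.tag) == "disk_image"
            for sliver_type in elem
            if _local(sliver_type.tag) == "sliver_type"
            for child in sliver_type
        )
        if not has_image:
            missing.append(elem.get("client_id", "<unnamed>"))
    return missing


# The <rspec> wrapper here is an assumption; the node mirrors a request without an image.
RSPEC = """<rspec>
  <node client_id="geni1" component_manager_id="urn:publicid:IDN+bbnvmsite+authority+cm">
    <sliver_type name="m1.small"></sliver_type>
  </node>
</rspec>"""

print(nodes_missing_disk_image(RSPEC))  # ['geni1']
```

A check like this only covers the missing-image case. Detecting an obsolete image would also need a list of currently supported images, which this ticket does not describe.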
Change History (5)
comment:1 Changed 10 years ago by
comment:3 Changed 10 years ago by
Verified the two scenarios in this ticket:
Scenario 1: The sliver request for a VM does not define a disk_image. The RSpec contained:
<node client_id="geni1" component_manager_id="urn:publicid:IDN+bbnvmsite+authority+cm">
  <sliver_type name="m1.small">
  </sliver_type>
</node>
Result:
VM is created with image Linux debian 2.6.32-5-amd64 #1 SMP Mon Jan 16 16:22:28 UTC 2012 x86_64 GNU/Linux
Scenario 2: The sliver request for a VM includes a disk_image that does not exist at the specified URL. Tested by using the previous RSpec and adding one line that requests the following image:
<disk_image name="http://geni-images.renci.org/images/standard/debian/does-not-exist.xml" version="42f53b64cfe44dd1607867f04b7b533bb67ade1e" />
Result: The request fails with:
"geni_resources": [
  {
    "orca_expires": "Thu Apr 11 17:10:33 UTC 2013",
    "geni_urn": "urn:publicid:IDN+exogeni.net:bbnvmsite+sliver+32852d34-feb4-442b-9085-1210b0586fae#geni1",
    "geni_error": "Reservation debd626a-9bb6-415c-934c-ba8c69b7526b (Slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+none3) is in state [Failed,None], err=resources failed to join: (no details)\n",
    "geni_status": "Failed"
  }
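When checking several slivers, the failure text can be pulled out of the status structure rather than read by eye. The following is a minimal sketch that assumes the JSON has already been separated from the `INFO:omni:` log prefix; `failed_resources` is a hypothetical helper, not part of omni, and the sample values are shortened placeholders.

```python
import json


def failed_resources(status):
    """Yield (sliver_urn, error_text) for each resource reported as failed."""
    for res in status.get("geni_resources", []):
        # The aggregate reports "Failed" per resource and "failed" overall,
        # so compare case-insensitively.
        if res.get("geni_status", "").lower() == "failed":
            yield res.get("geni_urn"), res.get("geni_error", "").strip()


# Abbreviated stand-in for a sliverstatus result; URN and error are shortened placeholders.
STATUS_TEXT = """{
  "geni_status": "failed",
  "geni_resources": [
    {
      "geni_urn": "urn:example+sliver+1#geni1",
      "geni_error": "err=resources failed to join: (no details)\\n",
      "geni_status": "Failed"
    }
  ]
}"""

for urn, error in failed_resources(json.loads(STATUS_TEXT)):
    print(urn, "->", error)
```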
Additional Scenario
In one of the attempts, "name" was a non-existent URL and "version" was taken from a valid image. Result: The image identified by the "version" field is loaded.
comment:4 Changed 10 years ago by
Priority: | major → minor |
---|
comment:5 Changed 10 years ago by
Owner: | changed from somebody to vjo@cs.duke.edu |
---|
Behavior #3 (additional scenario) should be fixed. Victor will do it in the fullness of time. ImageProxy looks images up by hash, so if the hash is one it has already seen, the URL is ignored. It shouldn't be ignored.
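To make the hash-versus-URL point concrete, here is a toy sketch contrasting the two lookup rules. It is not the ImageProxy API: the class `ImageCache`, its methods, and the URL and hash values are all hypothetical and exist only for this example.

```python
class ImageCache:
    """Hypothetical stand-in for an image cache; not the real ImageProxy."""

    def __init__(self):
        self._by_hash = {}  # hash -> URL the image was first registered under

    def register(self, url, image_hash):
        self._by_hash[image_hash] = url

    def lookup_by_hash_only(self, url, image_hash):
        """Behavior described above: a known hash is a hit, the URL is ignored."""
        return image_hash in self._by_hash

    def lookup_strict(self, url, image_hash):
        """One possible fix: a hit requires a known hash AND a matching URL."""
        return self._by_hash.get(image_hash) == url


cache = ImageCache()
cache.register("http://example.org/images/valid.xml", "abc123")  # placeholder values

bad_url = "http://example.org/images/does-not-exist.xml"
print(cache.lookup_by_hash_only(bad_url, "abc123"))  # True: the bad URL goes unnoticed
print(cache.lookup_strict(bad_url, "abc123"))        # False: the mismatch is caught
```

A strict lookup would treat the same image published at two URLs as two entries. Whether that is acceptable is a design choice the ticket leaves open.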
Checked status for the two error scenarios in this ticket and found the following:
Scenario 1: The sliver request for a VM does not define a disk_image.
The request now fails at create sliver time and provides a helpful error:
An error is returned, but a later attempt to create a sliver with the same name (about 5 minutes later) reported a duplicate sliver URN. No cleanup had occurred.
Scenario 2: The sliver request for a VM includes a disk_image that does not exist at the specified URL.
The request to create a sliver is accepted and the sliver becomes ready, but I am not able to log in to the assigned node.