Opened 11 years ago

Closed 11 years ago

#165 closed (fixed)

gmoc.py relational download hangs sometimes

Reported by: chaos@bbn.com
Owned by: somebody
Priority: major
Milestone: GPO GEC17 priorities
Component: Clients
Version:
Keywords: v1.2.16
Cc:
Dependencies:

Description

Occasionally, the gmoc.py relational download hangs while trying to obtain object data from gmoc-db. Because there is no timeout, it hangs forever and never retries.

  • Each connection to gmoc-db should time out after a certain amount of time
  • Ideally, connections should retry once or twice before giving up on the entire download

Change History (5)

comment:1 Changed 11 years ago by chaos@bbn.com

Keywords: v1.2.16 added

comment:2 Changed 11 years ago by chaos@bbn.com

I just had to kill another hung retrieval process, which had gotten wedged over the weekend.

comment:3 Changed 11 years ago by chaos@bbn.com

Milestone: GPO GEC17 priorities

Listing this ticket as a GPO priority because it is causing occasional monitoring failures which we have to resolve by hand to unwedge our monitoring process.

comment:4 Changed 11 years ago by chaos@bbn.com

gmoc.py-1.2.17 should address both of these:

  • GMOCClient() now has optional timeout and retries options
  • The timeout option is passed along to HTTP connections, and defaults to 30 seconds
  • The retries option is used to retry downloads a few times when they fail, and defaults to 3
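For illustration, the timeout-plus-retry behavior described above might be sketched as follows. This is not the gmoc.py implementation: `download_with_retries` and its `opener` parameter are hypothetical names, and the sketch assumes that `retries` counts total attempts (the comment above does not say whether the first attempt is included).

```python
import socket
import urllib.error
import urllib.request


def download_with_retries(url, timeout=30, retries=3,
                          opener=urllib.request.urlopen):
    """Fetch url, abandoning each attempt after `timeout` seconds.

    Hypothetical helper, not part of gmoc.py. Assumes `retries` is the
    total number of attempts made before giving up.
    """
    last_error = None
    for _ in range(retries):
        try:
            # The timeout applies to connecting and to each blocking read,
            # so a stalled server can no longer hang the caller forever.
            with opener(url, timeout=timeout) as response:
                return response.read()
        except (urllib.error.URLError, socket.timeout, ConnectionError) as exc:
            # Remember the failure and try again on the next iteration.
            last_error = exc
    raise RuntimeError(
        "download failed after %d attempts" % retries) from last_error
```

The `opener` argument exists only so the sketch can be exercised without a network connection; a real client would more likely configure the timeout and retry count once on the client object, as the `GMOCClient()` options do.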

I've installed 1.2.17 on our monitoring server, where we have been seeing noisy failures (which retries may mitigate) about once a day, and hangs (which the timeout may mitigate) about once every few weeks. I'll follow up on whether the changes seem to help.

comment:5 Changed 11 years ago by chaos@bbn.com

Resolution: fixed
Status: new → closed

Since a change was made and this is an intermittent problem, I'll go ahead and close it. I might need to reopen it later if it turns out we're still seeing hangs, but the "hang forever" behavior only arises every few weeks, so it'll take a while to get good data.
