Opened 7 years ago

Last modified 7 years ago

#179 new

Noticeable delay between sliver ready and interfaces being configured on allocated node

Reported by: lnevers@bbn.com
Owned by: somebody
Priority: major
Milestone:
Component: Experiment
Version: SPIRAL5
Keywords: confirmation tests
Cc:
Dependencies:

Description

Over the past few days of testing, I have noticed that some interfaces are not up when I log into the allocated nodes. This morning I ran several tests in which I captured some timing, and found that once the sliver is ready and I log into the node, it can take between 1 1/2 and 2 minutes before the interface is configured. This does not occur on all nodes, but on approximately 40-50% of them.
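For anyone wanting to reproduce the timing, below is a minimal sketch of one way to measure the delay from login until the interface comes up. It is not the method used for the tests above. It assumes a Linux node that exposes /sys/class/net/<iface>/operstate, and it treats the state "up" as "configured", which is an approximation (address assignment is not checked, and some drivers report "unknown"). The interface name "eth1" and the function names are hypothetical.

```python
import time
from pathlib import Path


def read_operstate(iface, sysfs_root="/sys/class/net"):
    """Return the kernel-reported operstate of iface, or None if it is not present yet."""
    try:
        return (Path(sysfs_root) / iface / "operstate").read_text().strip()
    except OSError:
        # The interface directory does not exist (yet) or is unreadable.
        return None


def wait_for_interface(iface, timeout=300.0, poll=1.0, read_state=read_operstate):
    """Poll until iface reports "up".

    Returns the elapsed seconds, or None if the timeout expires first.
    """
    start = time.monotonic()
    while True:
        state = read_state(iface)
        elapsed = time.monotonic() - start
        if state == "up":
            return elapsed
        if elapsed >= timeout:
            return None
        time.sleep(poll)


# Example (hypothetical interface name), run right after logging in:
#   delay = wait_for_interface("eth1")
#   print("interface up after", delay, "seconds")
```

Running this immediately after login on each node of a sliver would give per-node delays that can be compared against the 1 1/2 to 2 minutes observed by hand.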

Also while I was logged into the node waiting for the interface to come up, the following was reported:

Message from syslogd@debian at May  6 13:41:41 ...
 kernel:[  179.751186] ------------[ cut here ]------------

Message from syslogd@debian at May  6 13:41:41 ...
 kernel:[  179.753282] invalid opcode: 0000 [#1] SMP 

Message from syslogd@debian at May  6 13:41:41 ...
 kernel:[  179.753796] last sysfs file: /sys/module/virtio/initstate

Message from syslogd@debian at May  6 13:41:41 ...
 kernel:[  179.771567] Stack:

Message from syslogd@debian at May  6 13:41:41 ...
 kernel:[  179.774696] Call Trace:

Message from syslogd@debian at May  6 13:41:41 ...
 kernel:[  179.786608] Code: 3d 81 31 c0 e8 34 cd 15 00 4d 8b 24 24 49 8b 04 24 4d 39 ec 0f 18 08 75 b0 48 8b 6d 28 e9 95 00 00 00 f6 85 88 06 00 00 04 75 04 <0f> 0b eb fe 48 8b 5d 18 48 85 db 74 7b 48 39 1b 75 32 48 c7 c7

The sliver that saw the above is named "lnxlg" and was reserved via the ExoSM in the BBN rack. The node showing the problem is "VM-1" (192.1.242.5), and it is still running.

Attaching boot messages.

Attachments (1)

boot-messages.txt (31.4 KB) - added by lnevers@bbn.com 7 years ago.


Change History (2)

Changed 7 years ago by lnevers@bbn.com

Attachment: boot-messages.txt added

comment:1 Changed 7 years ago by lnevers@bbn.com

Just had 2 more nodes take 2 minutes to bring up their interfaces after becoming ready. The sliver is "lnxlg2", and both nodes show the following in their boot messages:

[  231.945491] virtio-pci 0000:00:06.0: enabling device (0000 -> 0003)
[  231.946506] ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 11
[  231.947295] virtio-pci 0000:00:06.0: PCI INT A -> Link[LNKB] -> GSI 11 (level, high) -> IRQ 11
[  231.949388] virtio-pci 0000:00:06.0: setting latency timer to 64
[  231.951818]   alloc irq_desc for 26 on node -1
[  231.951820]   alloc kstat_irqs on node -1
[  231.951856] virtio-pci 0000:00:06.0: irq 26 for MSI/MSI-X
[  231.951858]   alloc irq_desc for 27 on node -1
[  231.951860]   alloc kstat_irqs on node -1
[  231.951876] virtio-pci 0000:00:06.0: irq 27 for MSI/MSI-X
[  231.951878]   alloc irq_desc for 28 on node -1
[  231.951879]   alloc kstat_irqs on node -1
[  231.951895] virtio-pci 0000:00:06.0: irq 28 for MSI/MSI-X

Not sure if this is important, but capturing just in case.
