Opened 10 years ago

Closed 10 years ago

#138 closed (fixed)

There is no DNS resolution for the two IBM switches on the BBN head node

Reported by: lnevers@bbn.com Owned by: jonmills@renci.org
Priority: major Milestone:
Component: Experiment Version: SPIRAL5
Keywords: confirmation tests Cc:
Dependencies:

Description

The switch hostnames "bbn-8052.bbn.xo" and "bbn-8264.bbn.xo" no longer resolve on the BBN head node:

[lnevers@bbn-hn opt]$ ssh bbn-8064.bbn.xo
ssh: Could not resolve hostname bbn-8064.bbn.xo: Name or service not known
[lnevers@bbn-hn opt]$ ssh bbn-8052.bbn.xo
ssh: Could not resolve hostname bbn-8052.bbn.xo: Name or service not known

Change History (13)

comment:1 Changed 10 years ago by lnevers@bbn.com

I can now get to the 8052 switch, but not the 8064 switch:

[lnevers@bbn-hn ~]$ ssh bbn-8064.bbn.xo
ssh: Could not resolve hostname bbn-8064.bbn.xo: Name or service not known
[lnevers@bbn-hn ~]$ ssh bbn-8052.bbn.xo
The authenticity of host 'bbn-8052.bbn.xo (192.168.103.2)' can't be established.
DSA key fingerprint is 89:b6:13:30:a5:74:e3:3e:a6:aa:71:7a:91:6e:80:fd.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'bbn-8052.bbn.xo,192.168.103.2' (DSA) to the list of known hosts.
Enter radius password: 

IBM Networking Operating System RackSwitch G8052.

bbn-8052.bbn.xo>

comment:2 Changed 10 years ago by chaos@bbn.com

Luisa: it's 8264, not 8064. That work any better?

comment:3 Changed 10 years ago by lnevers@bbn.com

Thanks Chaos, I can connect to the OpenFlow switch.

With the right switch names, there is still no resolution at the RENCI rack:

[lnevers@rci-hn ~]$ ssh bbn-8264.rci.xo
ssh: Could not resolve hostname bbn-8264.rci.xo: Name or service not known
[lnevers@rci-hn ~]$ ssh bbn-8052.rci.xo
ssh: Could not resolve hostname bbn-8052.rci.xo: Name or service not known
[lnevers@rci-hn ~]$ 

comment:4 Changed 10 years ago by lnevers@bbn.com

On 12/10/12 4:08 PM, Jonathan Mills wrote:

There is no bbn-8052.rci.xo

It is 8052.renci.xo, and it should resolve from rci-hn just fine. It should not resolve from bbn-hn. That is anticipated behavior.

Sorry about messing up half the name. BBN uses the names 'bbn-8052.bbn.xo' and 'bbn-8264.bbn.xo'. RENCI uses the names '8052.renci.xo' and '8264.renci.xo'.

Are there naming conventions for the next site?

Also still does not resolve:

[lnevers@rci-hn ~]$ ssh 8052.renci.xo
ssh: Could not resolve hostname 8052.renci.xo: Name or service not known
[lnevers@rci-hn ~]$ ssh 8264.renci.xo
ssh: Could not resolve hostname 8264.renci.xo: Name or service not known
[lnevers@rci-hn ~]$ 

comment:5 Changed 10 years ago by jonmills@renci.org

#
# Private Slave Config
#
acl "exogeni_mgt_subnets" {
    192.168.10X.X
    127.0.0.1;          # localhost
    192.168.100.0/24;   # ACIS
    192.168.101.0/24;   # reserved
    192.168.102.0/24;   # RCI
    192.168.103.0/24;   # BBN
    192.168.104.0/24;   # NICTA (no SSG5)
    192.168.105.0/24;   # FIU
    192.168.106.0/24;   # UH
};
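The ACL comments above give a subnet-to-site mapping. As a minimal sketch only (the helper name `site_for_address` is hypothetical, and the table simply copies the comments in the ACL), an address can be mapped back to its site like this:

```python
import ipaddress

# Subnet-to-site table copied from the ACL comments above
# (the "(no SSG5)" note on NICTA is dropped from the label).
MGMT_SUBNETS = {
    "192.168.100.0/24": "ACIS",
    "192.168.101.0/24": "reserved",
    "192.168.102.0/24": "RCI",
    "192.168.103.0/24": "BBN",
    "192.168.104.0/24": "NICTA",
    "192.168.105.0/24": "FIU",
    "192.168.106.0/24": "UH",
}


def site_for_address(address):
    """Return the site label for a management address, or None if unknown."""
    ip = ipaddress.ip_address(address)
    for subnet, site in MGMT_SUBNETS.items():
        if ip in ipaddress.ip_network(subnet):
            return site
    return None


if __name__ == "__main__":
    for addr in ("192.168.102.2", "192.168.103.2", "192.168.105.2"):
        print(addr, site_for_address(addr))
```

Under this table, the switch addresses seen elsewhere in this ticket fall in 192.168.102.0/24 for RCI, 192.168.103.0/24 for BBN, and 192.168.105.0/24 for FIU.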

comment:6 in reply to:  5 Changed 10 years ago by lnevers@bbn.com

How do the subnets in the previous update relate to my trying to figure out how to connect to each of the switches in the rack?

Should I be able to connect to a hostname alias? (e.g. site-8052.site.xo or 8052-site.xo)

Should I be connecting to a specific IP address? (e.g. all OpenFlow switches are always allocated a specific host address on a subnet, with the same rule for the management switch)

I have searched the wiki.exogeni.net site trying to figure out the answer, and have found:

  • searching for instances of "8264" returns two pages that show the RENCI switch, but one shows "8264.renci.xo" and a second shows "rci-8264.renci.xo". I tried both; neither works.
  • searched for instances of the RENCI subnet "192.168.103." and did not find any instance of a switch address.

comment:7 Changed 10 years ago by jonmills@renci.org

Luisa, you asked for a naming convention for the next sites. I quoted you the DNS configuration for all private subnets, which shows how the naming convention works.

I'm not certain how to help you. The name of the ticket implies that you merely want to ensure that you can resolve the names of the BBN switches from the BBN head node. That works.

But now you want to try other racks too, and I'm guessing you want to validate all DNS records, not just switches. But you don't know what the names of devices should be, and so you can't rightly validate them. I suppose the only thorough way for you to achieve this would be to read the master zone records and verify that it all looks good from each of the head nodes. If that isn't your plan, please advise.

comment:8 Changed 10 years ago by lnevers@bbn.com

The test goal is very simple: to get administrative access to each device in the rack. I simply want to log in to each device and get access to administrative features (e.g. sudo on worker nodes, enable shell on switches, etc.).

When I asked about naming convention, I meant for the switches in the rack. I was able to figure out from /etc/hosts that the worker nodes are named "site-w#" (where site is the site name and # is 1-10), but I have not been able to determine how the switches are named. I was able to determine the BBN switch names (bbn-8052.bbn.xo and bbn-8264.bbn.xo) only because others at GPO could tell me what they were named.

This step will be repeated at each new site, and it would be really helpful if there were a naming convention that is implemented consistently. This will ease testing and supportability.

comment:9 Changed 10 years ago by jonmills@renci.org

Owner: changed from somebody to jonmills@renci.org
Status: changed from new to assigned

So as far as I know, our naming convention has been completely consistent since the BBN rack. The RENCI rack, however, predates some of our naming conventions. That is why it is "renci.xo" and not "rci.xo".

As to the naming of the switches, the DNS A record is always "site-####", where #### is like 8052 or 8264. However, for the sake of convenience, we also make a DNS CNAME that points "####.site.xo" to "site-####.site.xo". This is all very clear from a couple of DNS queries:

[root@fiu-hn ~]# nslookup fiu-8052.fiu.xo
Server:		127.0.0.1
Address:	127.0.0.1#53

Name:	fiu-8052.fiu.xo
Address: 192.168.105.2

[root@fiu-hn ~]# nslookup 8052.fiu.xo
Server:		127.0.0.1
Address:	127.0.0.1#53

8052.fiu.xo	canonical name = fiu-8052.fiu.xo.
Name:	fiu-8052.fiu.xo
Address: 192.168.105.2
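The convention described in this comment (A record "site-####.site.xo", CNAME "####.site.xo") can be written down as a small helper. This is an illustrative sketch, not part of the rack tooling: the function name `switch_names` and its `domain` parameter are hypothetical, and the `domain` override exists only to cover the RENCI exception mentioned above (prefix "rci", zone "renci.xo").

```python
def switch_names(site, model, domain=None):
    """Return (a_record, cname_alias) for a switch under the stated convention.

    site   -- short site prefix, e.g. "fiu" or "bbn"
    model  -- switch model number, e.g. "8052" or "8264"
    domain -- zone name; defaults to "<site>.xo". Pass "renci.xo" for the
              RENCI rack, which keeps the older zone name.
    """
    zone = domain or f"{site}.xo"
    a_record = f"{site}-{model}.{zone}"   # canonical name (DNS A record)
    cname = f"{model}.{zone}"             # convenience alias (DNS CNAME)
    return a_record, cname


if __name__ == "__main__":
    print(switch_names("fiu", "8052"))
    print(switch_names("rci", "8264", domain="renci.xo"))
```

For a new site, the expected names to test would then follow from the site prefix alone, assuming the convention holds there.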

comment:10 Changed 10 years ago by lnevers@bbn.com

I see, I had not expected the mix of old and new, "rci-8052.renci.xo".

I had only been trying to ssh to the switch hostname, but it turns out that even though the hostname resolves with nslookup, I cannot ssh to or ping it:

[lnevers@rci-hn ~]$ nslookup 8052.renci.xo
Server:		127.0.0.1
Address:	127.0.0.1#53

8052.renci.xo	canonical name = rci-8052.renci.xo.
Name:	rci-8052.renci.xo
Address: 192.168.102.2

[lnevers@rci-hn ~]$ ssh 8052.renci.xo
ssh: Could not resolve hostname 8052.renci.xo: Name or service not known
[lnevers@rci-hn ~]$ ping 8052.renci.xo
ping: unknown host 8052.renci.xo

Of course, I can access the IP address.

comment:11 Changed 10 years ago by jonmills@renci.org

It appears the name service caching daemon (nscd) was running on rci-hn, and was messing you up. I've cleared out the nscd cache, and disabled the daemon, since we don't need it.

Now I get:

[root@rci-hn ~]# ping 8052.renci.xo
PING rci-8052.renci.xo (192.168.102.2) 56(84) bytes of data.
64 bytes from rci-8052.renci.xo (192.168.102.2): icmp_seq=1 ttl=255 time=0.490 ms
64 bytes from rci-8052.renci.xo (192.168.102.2): icmp_seq=2 ttl=255 time=0.514 ms
64 bytes from rci-8052.renci.xo (192.168.102.2): icmp_seq=3 ttl=255 time=0.527 ms
64 bytes from rci-8052.renci.xo (192.168.102.2): icmp_seq=4 ttl=255 time=0.534 ms
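The symptom in comment:10 fits this explanation: nslookup sends its query straight to the DNS server, while ssh and ping go through the system resolver, which consults nscd when that daemon is running. A rough sketch of that comparison follows. The function names `system_lookup` and `diagnose` are hypothetical, and the direct DNS answer is passed in by hand (for example, copied from nslookup output) rather than queried by the script.

```python
import socket


def system_lookup(name):
    """Resolve a name via the system resolver (getaddrinfo); None on failure."""
    try:
        infos = socket.getaddrinfo(name, None, socket.AF_INET)
    except socket.gaierror:
        return None
    return infos[0][4][0]


def diagnose(name, dns_answer, lookup=system_lookup):
    """Compare a direct DNS answer with what the system resolver returns.

    dns_answer -- address reported by a direct DNS query, or None
    lookup     -- callable for the system-resolver path (injectable for tests)
    """
    system_answer = lookup(name)
    if dns_answer is None and system_answer is None:
        return "unresolved by both"
    if dns_answer is not None and system_answer is None:
        return "DNS resolves but system resolver fails: suspect a local cache such as nscd"
    if dns_answer == system_answer:
        return "consistent"
    return "mismatch"


if __name__ == "__main__":
    # Address taken from the nslookup output in comment:10.
    print(diagnose("8052.renci.xo", "192.168.102.2"))
```

Taking the lookup as a parameter lets the check be exercised without network access.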

comment:12 Changed 10 years ago by lnevers@bbn.com

I am now able to login to both rci-8052.renci.xo and rci-8264.renci.xo.

Thank you, closing ticket!

comment:13 Changed 10 years ago by lnevers@bbn.com

Resolution: fixed
Status: changed from assigned to closed