Not resolving host names on a local network? (Hint: I *think* it's IPv6's fault.)

Albert@@

I had a customer with this situation and it took me several visits to stumble upon what I think is the solution. The answer was in front of my face the whole time! I'm posting this because I am uncertain as to why things are now working this way.


Have you had a situation where a user, or yourself, can no longer connect to a local server, or other devices on a local network? You soon discover that it is a name resolution problem. You find that the server is up and running. From the server, it has no problem resolving names of devices on the network. (I should mention that this server is a domain controller that happens to also be the DNS server -- needed for Active Directory. It also happens to be a single server domain -- a subtle, but crucial point when it comes to resolving host names on a local network.) You can ping its IP address just fine from the client PC. It's just that, for some reason, the host name fails to resolve. DNS settings are as they've always been. Internet names resolve okay. But, if you were to do an nslookup of your server from the client PC, you'll get the alarming message: "*** UnKnown can't find OURSERVER-YO: Non-existent domain". "But, I can ping it!", you silently scream inside your head.


Add this to the mystery: When I entered the nslookup utility and specified the DNS server's IP address directly, the name would resolve! I went back and triple checked the IP settings for the client PC. Sure enough, the DNS settings are set correctly. What-the-heck man?
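For anyone who wants to reproduce that "ask one specific DNS server" test outside of nslookup, here is a rough Python sketch. It builds a bare-bones DNS query by hand and sends it to whichever server address you give it, so you can compare the answer from the IPv4 server against the answer from the IPv6 one. The function names and the host name `ourserver-yo.example.local` are made up for illustration, and it only reads the response code rather than parsing the full answer.

```python
import socket
import struct


def build_a_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet asking for the A record of hostname."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label is prefixed by its length, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = Internet.
    return header + qname + struct.pack("!HH", 1, 1)


def response_code(packet: bytes) -> int:
    """Return the RCODE from a DNS response: 0 = no error, 3 = non-existent domain."""
    flags = struct.unpack("!H", packet[2:4])[0]
    return flags & 0x000F


def ask_server(hostname: str, server_ip: str, timeout: float = 3.0) -> int:
    """Send the query to one specific DNS server over UDP and return its RCODE."""
    family = socket.AF_INET6 if ":" in server_ip else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_a_query(hostname), (server_ip, 53))
        data, _ = sock.recvfrom(512)
    return response_code(data)


if __name__ == "__main__":
    # Hypothetical addresses: replace with your own DNS server's IPv4 and IPv6 addresses.
    for server in ("192.168.1.10", "fd00::10"):
        try:
            print(server, "->", ask_server("ourserver-yo.example.local", server))
        except OSError as err:
            print(server, "-> no answer:", err)
```

If one server returns 0 and another returns 3 for the same local name, the client is probably asking the wrong server by default.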


I did many of the usual DNS troubleshooting steps I've used for years including looking for malicious proxies and malware scans. I even went old school and added the host name to the hosts file in the etc directory. I tried just about every reasonable procedure I could find on the Internet. At best, the situation would be solved only temporarily. Typically after a reboot, name resolution would fail again and I would get the call.


On my last visit, I went through the same routine. I did another nslookup and got the same message above. I stared at the screen for a while, trying to figure out what to do next, and then realized that something was missing from the output. More so in recent years, an nslookup would first try an IPv6 address of *some* DNS server. When that failed to connect, it would then switch to the IPv4 address of the DNS server to resolve names. For some reason, these Windows PCs were no longer even *trying* to resolve a local network host name using the DNS server's IPv4 address, even though all the devices on this small network had their IPv4 settings set manually! (I'll get back to DHCP shortly.)


Okay. I decided to focus on that IPv6 address. (I should note that I've always ignored IPv6 -- to my detriment. IPv4 worked perfectly well for small networks -- until now.) Unsurprisingly, the IPv6 DNS server address configured on the client PC did not match the IPv6 address of the local domain's DNS server. In the past, that was fine. Windows would first try the IPv6 DNS setting on the client PC, then switch to IPv4 when it failed to connect. Here's what I think is different: the Windows PCs were now *succeeding* in connecting to that IPv6 address of *some* DNS server! That's why they had no problem resolving Internet names, but could not resolve local domain names. I then went into the IPv6 settings and manually entered the IPv6 address of the local domain's DNS server. Problem finally solved! Nslookup was very happy to report that: "Yes indeed, I have an IP address for you!" After a few test reboots, name resolution continued to work as planned.
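The comparison I did by eye can be sketched in a few lines of Python: take the DNS server addresses a client actually has configured (for example, copied from `ipconfig /all`) and flag any that are not the servers you expect. This is only an illustration. The function name and all addresses below are hypothetical, and you would have to paste in the real values yourself.

```python
import ipaddress


def find_unexpected_dns(configured, expected):
    """Return the configured DNS server addresses that are not in the expected list."""
    expected_set = {ipaddress.ip_address(addr) for addr in expected}
    unexpected = []
    for raw in configured:
        # Drop any zone index such as "%12" that Windows appends to link-local addresses.
        addr = ipaddress.ip_address(raw.split("%")[0])
        if addr not in expected_set:
            unexpected.append(str(addr))
    return unexpected


if __name__ == "__main__":
    # Hypothetical values: the domain controller's addresses and what one client reports.
    domain_dns = ["192.168.1.10", "fd00::10"]
    client_dns = ["fe80::1%12", "192.168.1.10"]
    print(find_unexpected_dns(client_dns, domain_dns))
```

Anything this prints is a DNS server the client picked up from somewhere other than your own configuration, which is exactly what was biting me.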


Here's what I think happened. Let me know about your ideas of why Windows is no longer even trying to resolve names using the IPv4 address of the local domain's DNS server.


The customer has a very small domain network with about a dozen devices total -- including PCs, the server, routers, printers, etc. We decided to set private (of course) IP addresses statically on all devices that are to be members of the local domain. There is a separate router used for guests to access the Internet, with its own separate private IP space. That router is connected to its own (again) separately configured port on the firewall/gateway. The guest network is issued IP addresses through that router's DHCP server. The domain controller/server does not have DHCP enabled -- I now think that's probably at the core of the issue. Finally, all the internal traffic flows through the same switch. When Windows boots up, at some point, the IP stack loads up with all its settings. In the case of IPv4, they are set manually. IPv6 is, by default, set up dynamically by whatever DHCP server it finds. (In this case, it's the guest router.) Unless that DHCP server is configured with the correct IPv6 DNS settings of the local server, the client naturally won't be able to resolve local domain host names. (So, look at *all* of your DHCP server configs, people!)


So, why did this suddenly start happening? Why is Windows seemingly ignoring the IPv4 address of the DNS server? Was there a recent Windows Update to the TCP/IP stack? Was that update not recent at all? It just so happened that, in the past, when Windows failed to connect to an IPv6 address for resolution, it switched to IPv4. Why is IPv6 suddenly connecting successfully to *some* DNS server when it didn't in the past? Maybe the protocol was designed not to even try IPv4 if it can successfully connect through IPv6. Which means we can no longer ignore IPv6. Like it or not, needed or not, we must come to terms with this beast.
