Troubleshoot and Solve Domain-Join Problems
Review the sections in this chapter to resolve domain-join problems.
Top 10 Reasons Domain-Join Fails
Here are the top 10 reasons that an attempt to join a domain fails:
- Root was not used to run the domain-join command (or to run the domain-join graphical user interface).
- The user name or password of the account used to join the domain is incorrect.
- The name of the domain is mistyped.
- The name of the OU is mistyped.
- The local hostname is invalid.
- The domain controller is unreachable from the client because of a firewall or because the NTP service is not running on the domain controller. For more information, see Make Sure Outbound Ports Are Open in Perform Basic Troubleshooting for the AD Bridge Agent, and Diagnose NTP on Port 123.
- The client is running RHEL 2.1 and has an old version of SSH.
- On SUSE, GDM (dbus) must be restarted. This daemon cannot be automatically restarted if the user logged on with the graphical user interface.
- On Solaris, dtlogin must be restarted. This daemon cannot be automatically restarted if the user logged on with the Solaris graphical user interface. To restart dtlogin, run the following command:
- SELinux is set to either enforcing or permissive, likely on Fedora. SELinux must be set to disabled before the computer can be joined to the domain.
To turn off SELinux, please see the SELinux man page.
Solve Domain-Join Problems
To troubleshoot problems with joining a Linux computer to a domain, perform the following series of diagnostic tests sequentially on the Linux computer with a root account.
The tests can also be used to troubleshoot domain-join problems on a Unix computer; however, the syntax of the commands on Unix might be slightly different.
The procedures in this topic assume that you have already checked whether the problem falls under the Top 10 Reasons Domain Join Fails (see above). We also recommend that you generate a domain-join log.
For more information, please see Generate a Domain-Join Log for AD Bridge.
Verify that the Name Server Can Find the Domain
Run the following command as root:
Make Sure the Client Can Reach the Domain Controller
You can verify that your computer can reach the domain controller by pinging it:
Check DNS Connectivity
The computer might be using the wrong DNS server or none at all. Make sure the nameserver entry in /etc/resolv.conf contains the IP address of a DNS server that can resolve the name of the domain you are trying to join. The IP address is likely to be that of one of your domain controllers.
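As a minimal sketch of this check, the following shell function prints the DNS servers that a resolv.conf-style file configures. The function name list_nameservers is hypothetical and is not part of AD Bridge; it only reads the file and does not test whether each server can resolve your domain.

```shell
# List the DNS servers configured in a resolv.conf-style file.
# The path argument is optional and defaults to /etc/resolv.conf.
# list_nameservers is a hypothetical helper, not an AD Bridge command.
list_nameservers() {
    # Print the second field of every line whose first field is "nameserver".
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}
```

Each address that the function prints can then be tested by hand, for example with nslookup ADdomainToJoin.com followed by the address of the server to query.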
Make Sure nsswitch.conf Is Configured to Check DNS for Host Names
The /etc/nsswitch.conf file must contain the following line. (On AIX, the file is /etc/netsvc.conf.)
hosts: files dns
Computers running Solaris, in particular, may not contain this line in nsswitch.conf until you add it.
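The check can be scripted roughly as follows. This is a sketch under assumptions: hosts_uses_dns is a hypothetical helper name, and it understands only the nsswitch.conf format, not the different syntax of /etc/netsvc.conf on AIX.

```shell
# Succeed if the hosts line of an nsswitch.conf-style file lists "dns".
# The path argument is optional and defaults to /etc/nsswitch.conf.
# hosts_uses_dns is a hypothetical helper, not an AD Bridge command.
hosts_uses_dns() {
    awk '$1 == "hosts:" { for (i = 2; i <= NF; i++) if ($i == "dns") found = 1 }
         END { exit !found }' "${1:-/etc/nsswitch.conf}"
}
```

A call such as hosts_uses_dns || echo "hosts line does not include dns" reports a missing entry without changing the file.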
Ensure that DNS Queries Use the Correct Network Interface Card
If the computer is multi-homed, the DNS queries might be going out the wrong network interface card.
Temporarily disable all the NICs except for the card on the same subnet as your domain controller or DNS server and then test DNS lookups to the AD domain.
If this works, re-enable all the NICs and edit the local or network routing tables so that the AD domain controllers are accessible from the host.
Determine If DNS Server Is Configured to Return SRV Records
Your DNS server must be set to return SRV records so the domain controller can be located. Non-Windows (BIND) DNS servers are often not configured to return SRV records.
Diagnose it by executing the following command:
nslookup -q=srv _ldap._tcp.ADdomainToJoin.com
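Because the record name must be written without spaces, it can help to build it from the domain name. The sketch below assumes nslookup is installed; ldap_srv_name and check_ldap_srv are hypothetical helper names, not AD Bridge commands.

```shell
# Build the SRV record name that locates LDAP services for a domain.
# ldap_srv_name is a hypothetical helper, not an AD Bridge command.
ldap_srv_name() {
    printf '_ldap._tcp.%s\n' "$1"
}

# Run the SRV lookup described above for the given domain.
# The output is left for the administrator to interpret.
check_ldap_srv() {
    nslookup -q=srv "$(ldap_srv_name "$1")"
}
```

For example, check_ldap_srv ADdomainToJoin.com runs the same query as the command above.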
Make Sure that the Global Catalog Is Accessible
The global catalog for Active Directory must be accessible. A global catalog in a different zone might not show up in DNS. Diagnose it by executing the following command:
nslookup -q=srv _ldap._tcp.gc._msdcs.ADrootDomain.com
From the list of IP addresses in the results, choose one or more addresses and test whether they are accessible on Port 3268 using telnet.
telnet 192.168.100.20 3268
Trying 192.168.100.20...
Connected to sales-dc.example.com (192.168.100.20).
Escape character is '^]'.
Press the Enter key to close the connection:
Connection closed by foreign host.
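If telnet is not installed on the client, a similar test can be sketched with bash. This example assumes that bash (with its /dev/tcp redirection) and the coreutils timeout command are available; tcp_port_open is a hypothetical helper name.

```shell
# Succeed if a TCP connection to host ($1) on port ($2) opens within
# 5 seconds. Relies on bash's /dev/tcp redirection and on the coreutils
# timeout command; both are assumptions about the client.
# tcp_port_open is a hypothetical helper, not an AD Bridge command.
tcp_port_open() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}
```

For example, tcp_port_open 192.168.100.20 3268 && echo "global catalog port reachable" tests the same address and port as the telnet session above.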
Verify that the Client Can Connect to the Domain on Port 123
The following test checks whether the client can connect to the domain controller on Port 123 and whether the Network Time Protocol (NTP) service is running on the domain controller. For the client to join the domain, NTP, the Windows time service, must be running on the domain controller.
On a Linux computer, run the following command as root:
ntpdate -d -u DC_hostname
For example:
ntpdate -d -u sales-dc
For more information, please see Diagnose NTP on Port 123.
In addition, check the logs on the domain controller for errors from the source named w32tm, which is the Windows time service.
FreeBSD: Run ldconfig If You Cannot Restart Computer
When installing AD Bridge on a new FreeBSD computer with nothing in /usr/local, run /etc/rc.d/ldconfig start after the installation if you cannot restart the computer. Otherwise, /usr/local/lib will not be in the library search path.
Ignore Inaccessible Trusts
An inaccessible trust can block you from successfully joining a domain. If you know that there are inaccessible trusts in your Active Directory network, you can set AD Bridge to ignore all the trusts before you try to join a domain. To do so, use the config tool to modify the values of the DomainManagerIgnoreAllTrusts setting.
- List the available trust settings:
/opt/pbis/bin/config --list | grep -i trust
The results will look something like this. The setting at issue is DomainManagerIgnoreAllTrusts.
DomainManagerIgnoreAllTrusts
DomainManagerIncludeTrustsList
DomainManagerExcludeTrustsList
- List the details of the DomainManagerIgnoreAllTrusts setting to see the values it accepts:
[root@rhel5d bin]# ./config --details DomainManagerIgnoreAllTrusts
Name: DomainManagerIgnoreAllTrusts
Description: When true, ignore all trusts during domain enumeration.
Type: boolean
Current Value: false
Accepted Values: true, false
Current Value is determined by local policy.
- Change the setting to true so that AD Bridge will ignore trusts when you try to join a domain.
[root@rhel5d bin]# ./config DomainManagerIgnoreAllTrusts true
- Check to make sure the change took effect:
[root@rhel5d bin]# ./config --show DomainManagerIgnoreAllTrusts
boolean
true
local policy
Now try to join the domain again. If the join succeeds, keep in mind that only users and groups in the local domain will be able to log on to the computer.
In the example output above that shows the setting's current values, local policy is listed, meaning that the setting is managed locally through config because an AD Bridge Group Policy setting is not managing the setting. Typically, with AD Bridge, you would manage the DomainManagerIgnoreAllTrusts setting by using the corresponding Group Policy setting, but you cannot apply Group Policy Objects (GPOs) to the computer until after it is added to the domain. The corresponding AD Bridge policy setting is named Lsass: Ignore all trusts during domain enumeration.
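The change and the check from the steps above can be combined in one function, sketched here under assumptions. PBIS_CONFIG is a hypothetical override variable added for this example, ignore_all_trusts is a hypothetical helper name, and the check assumes that the output of config --show contains the word true when the setting is enabled, as in the example output.

```shell
# Path to the AD Bridge config tool, as used in the steps above.
# PBIS_CONFIG is a hypothetical override added for this sketch.
PBIS_CONFIG="${PBIS_CONFIG:-/opt/pbis/bin/config}"

# Set DomainManagerIgnoreAllTrusts to true, then confirm that --show
# reports the word "true". Returns non-zero if either step fails.
# Assumes a grep that supports -w (whole-word match).
ignore_all_trusts() {
    "$PBIS_CONFIG" DomainManagerIgnoreAllTrusts true || return 1
    "$PBIS_CONFIG" --show DomainManagerIgnoreAllTrusts | grep -qw 'true'
}
```

Run the function as root, for example with ignore_all_trusts && echo "trusts will be ignored", before you try to join the domain again.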
For information on the arguments of config, run the following command:
Resolve Common Error Messages
This section lists solutions to common errors that can occur when you try to join a domain.
Configuration of krb5
Warning: A resumable error occurred while processing a module. Even though the configuration of 'krb5' was executed, the configuration did not fully complete. Please contact BeyondTrust support.
Delete /etc/krb5.conf and try to join the domain again.
This error can occur when you try to join a domain or you try to execute the domain-join command with an option but the netlogond daemon is not already running.
Error: chkconfig failed [code 0x00080019]
Description: An error occurred while using chkconfig to process the netlogond daemon, which must be added to the list of processes to start when the computer is rebooted. The problem may be caused by startup scripts in the /etc/rc.d/ tree that are not LSB-compliant.
Verification: Running the following command as root can provide information about the error:
chkconfig --add netlogond
Remove startup scripts that are not LSB-compliant from the /etc/rc.d/ tree.
The following error might occur if there are replication delays in your environment. A replication delay might occur when the client is in the same site as an RODC.
Error: LW_ERROR_KRB5KDC_ERR_C_PRINCIPAL_UNKNOWN [code 0x0000a309]
Client not found in Kerberos database
[root@rhel6-1 ~]# echo $?
1
[root@rhel6-1 ~]# /opt/pbis/bin/domainjoin-cli query
Error: LW_ERROR_KRB5KDC_ERR_C_PRINCIPAL_UNKNOWN [code 0x0000a309]
Client not found in Kerberos database
After the error occurs, wait 15 minutes, and then run the following command to restart AD Bridge:
/opt/pbis/bin/lwsm restart lwreg
When you use the AD Bridge domain-join utility to join a Linux or Unix client to a domain, the utility might be unable to contact the domain controller on Port 123 with UDP. The AD Bridge agent requires that Port 123 be open on the client so that it can receive NTP data from the domain controller. In addition, the time service must be running on the domain controller.
You can diagnose NTP connectivity by executing the following command as root at the shell prompt of your Linux computer:
ntpdate -d -u DC_hostname
For example:
ntpdate -d -u sales-dc
If all is well, the result should look like this:
[root@rhel44id ~]# ntpdate -d -u sales-dc
2 May 14:19:20 ntpdate: ntpdate email@example.com Thu Apr 20 11:28:37 EDT 2006 (1)
Looking for host sales-dc and service ntp
host found : sales-dc.example.com
transmit(192.168.100.20)
receive(192.168.100.20)
transmit(192.168.100.20)
receive(192.168.100.20)
transmit(192.168.100.20)
receive(192.168.100.20)
transmit(192.168.100.20)
receive(192.168.100.20)
transmit(192.168.100.20)
server 192.168.100.20, port 123
stratum 1, precision -6, leap 00, trust 000
refid [LOCL], delay 0.04173, dispersion 0.00182
transmitted 4, in filter 4
reference time: cbc5d3b8.b7439581 Fri, May 2 2008 10:54:00.715
originate timestamp: cbc603d8.df333333 Fri, May 2 2008 14:19:20.871
transmit timestamp: cbc603d8.dda43782 Fri, May 2 2008 14:19:20.865
filter delay: 0.04207 0.04173 0.04335 0.04178 0.00000 0.00000 0.00000 0.00000
filter offset: 0.009522 0.008734 0.007347 0.005818 0.000000 0.000000 0.000000 0.000000
delay 0.04173, dispersion 0.00182
offset 0.008734
2 May 14:19:20 ntpdate: adjust time server 192.168.100.20 offset 0.008734 sec
Output When There is No NTP Service
If the domain controller is not running NTP on Port 123, the command returns a response such as no server suitable for synchronization found, as in the following output:
5 May 16:00:41 ntpdate: ntpdate firstname.lastname@example.org Thu Apr 20 11:28:37 EDT 2006 (1)
Looking for host RHEL44ID and service ntp
host found : rhel44id.example.com
transmit(127.0.0.1)
transmit(127.0.0.1)
transmit(127.0.0.1)
transmit(127.0.0.1)
transmit(127.0.0.1)
127.0.0.1: Server dropped: no data
server 127.0.0.1, port 123
stratum 0, precision 0, leap 00, trust 000
refid [127.0.0.1], delay 0.00000, dispersion 64.00000
transmitted 4, in filter 4
reference time: 00000000.00000000 Wed, Feb 6 2036 22:28:16.000
originate timestamp: 00000000.00000000 Wed, Feb 6 2036 22:28:16.000
transmit timestamp: cbca101c.914a2b9d Mon, May 5 2008 16:00:44.567
filter delay: 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
filter offset: 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
delay 0.00000, dispersion 64.00000
offset 0.000000
5 May 16:00:45 ntpdate: no server suitable for synchronization found
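The two outcomes shown above can be told apart by looking for their final lines. The following function is a rough sketch, not part of AD Bridge: ntp_diagnose is a hypothetical name, and the patterns it matches are taken from the sample output above, plus the step time server message that ntpdate prints for larger offsets.

```shell
# Read the output of "ntpdate -d -u <DC>" on stdin and summarize it.
# Returns 0 if a server answered, 1 if none did, 2 if the result is unclear.
# ntp_diagnose is a hypothetical helper, not an AD Bridge command.
ntp_diagnose() {
    ntp_diag_output=$(cat)
    case "$ntp_diag_output" in
        *"no server suitable for synchronization found"*)
            echo "NTP unreachable"
            return 1 ;;
        *"adjust time server"*|*"step time server"*)
            echo "NTP reachable"
            return 0 ;;
        *)
            echo "NTP result unclear"
            return 2 ;;
    esac
}
```

For example, ntpdate -d -u sales-dc 2>&1 | ntp_diagnose prints a one-line summary in place of the full debug output.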
Turn off Apache to Join a Domain
The Apache web server locks the keytab file, which can block an attempt to join a domain. If the computer is running Apache, stop Apache, join the domain, and then restart Apache.