Lost connection to Active Directory

jleomcdo
Contributor

I'm starting to see an issue where our Macs (bound to AD) lose their connection to AD. Here are the symptoms I notice when I start having odd issues:
My wireless will not connect. (We use computer authentication, which requires the Mac to be bound to our AD.) My domain admin account is no longer able to "unlock" preferences or do any admin task.
If I try to use dscl to browse AD, I can run "ls" at the top level and see "/Active Directory", and then cd into /Active Directory. Next I run "ls" again and see our domain LPCDOMAIN1, but I can't cd into it; it gives me an error message. (Sorry, I don't have that written down.)

Troubleshooting steps:
When I check "Login Options" under Users & Groups, it shows that I'm joined to AD and lists my domain name with the green light.
I'm able to find my computer name in AD when searching with the "MS Active Directory Users and Computers" tool.
My Search Path shows /Local/Default and /Active Directory.
I'm able to ping my DC by IP and by name.
It acts like the Mac is bound to AD, but can't talk to it.

Work around:
Unbind from AD
Rebind to AD
Reboot

I'm wondering if anyone has seen something like this. It has only happened on a few Macs, and all of them were running 10.10.2.
Most of our Macs are still on 10.9.5 and have never experienced this issue.

bpavlov
Honored Contributor

Yes, it's a common issue if a computer stops communicating with the domain controller (particularly on laptops where the user may rely on wireless for the most part). It's on my to do list to have an extension attribute that checks the status of the computer's binding and if it can't communicate then attempt to rebind. Perhaps someone may have something like that already and would be willing to share, but you'd definitely have to tweak it to your environment.

mm2270
Legendary Contributor III

Curious, but is this happening on Macs you use regularly and are connected to your internal network? They aren't Macs that are sitting in a drawer or in a storage shelf somewhere for awhile?

jleomcdo
Contributor

One of the Macs that had the issue was my MacBook Pro, which I use every day. This is what stumped me. I could understand it if the machine had been offline for a while.

scottb
Honored Contributor

Two things that are what we check first with this:

1) Clock. Make sure it's not >5 mins off from AD.
2) Check Active Roles to see if the Mac has moved to disabled or another group that would kill functionality.

98% of the issues like that are fixed with those two items.
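
For the clock check, a minimal sketch of the comparison itself, assuming the usual 5-minute Kerberos tolerance mentioned above. The helper name skew_ok and the MAX_SKEW variable are hypothetical, and how you obtain the domain controller's time is left open because it depends on your environment.

```shell
#!/bin/bash
# Hypothetical helper: succeed when two clocks are within MAX_SKEW seconds.
# Both arguments are Unix epoch seconds.
MAX_SKEW=300   # assumption: 5 minutes, the tolerance mentioned above

skew_ok() {
    diff=$(( $1 - $2 ))
    # Take the absolute value so it does not matter which clock is ahead.
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    [ "$diff" -le "$MAX_SKEW" ]
}

# Example use (dc_epoch is an assumption: fetch it from your DC or NTP source):
# if skew_ok "$(date +%s)" "$dc_epoch"; then echo "Clock OK"; else echo "Clock skewed"; fi
```

The function only does the arithmetic, so it can be exercised with fixed timestamps before it is wired into an extension attribute.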

nessts
Valued Contributor II

We run a tool that verifies the binding to AD every time the computer boots as well; if it thinks the Mac is not bound, it re-binds to AD. Sometimes the computer password does not get updated in AD, and the computer loses authentication.

CGundersen
Contributor III

dsconfigad -passinterval? Other patterns (e.g. number of days before connectivity problem)?

As was mentioned time skew and disabled/tombstoned computer accounts perhaps?

802.1x with Yosemite has not been fruitful for us.

alexjdale
Valued Contributor III

We have an extension attribute for AD checks that does two things: runs an "id" on a test user account we have (to see if the LDAP query succeeds) and also checks the System keychain for the Active Directory password entry for the computer account.

One of the bugs we see relatively commonly when there is an AD bind issue is that the AD password disappears from the System keychain for some reason. This also happens sometimes during the bind, and the password entry is simply not added at all.

Something like this:

#!/bin/bash

# AD Bind Check Extension Attribute

# Upper-case the computer account name and strip the trailing "$"
adName=$(dsconfigad -show | awk '/Computer Account/ { print toupper($NF) }' | sed 's/\$$//')

if [ ! "$adName" ]; then
    echo "<result>Not Bound</result>"
    exit 0
else
    result1="Bound as $adName"
fi

ldapTest=`id TESTACCOUNTNAME | grep UIDOFTESTACCOUNT`
if [ ! "$ldapTest" ]; then
    result2="LDAP Query Failed"
else
    result2="LDAP Query OK"
fi

# Look for the computer account's password entry in the System keychain
keychainTest=$(security find-generic-password -l "/Active Directory/YOURDOMAINNAME" /Library/Keychains/System.keychain 2>/dev/null)
if [ ! "$keychainTest" ]; then
    result3="AD Password Missing"
else
    result3="AD Password OK"
fi

echo "<result>$result1 - $result2 - $result3</result>"

mm2270
Legendary Contributor III

We have a similar EA that does an Active Directory join verification. It also looks for the AD system keychain entry and does a look up against its own Computer record in AD.

jleomcdo
Contributor

Thanks for all the information. I never thought about checking the keychain for the AD password. I will make a note to check this, the next time the problem comes up.

I did test the "id" command against my domain account and that did work. Weird...

nessts
Valued Contributor II

It is not a password stored in the keychain; it's part of the AD record. It's not a real password at all, and you cannot check for it. You have to know whether the computer password needs to change weekly, and use passinterval to set your binding up properly if it needs to change more often than the default (15 days, I think). Also, some AD environments do not require it to change, and work worse if you do have it set to change.

And, as has been noted, sometimes the AD plugin just stops talking and you need to rebind, so coming up with a tool like the one above is helpful for resolving those situations.

alexjdale
Valued Contributor III

The AD password for the computer is most certainly stored in the System keychain, as an application password. Computers have passwords just like users do.

You can reveal that password in Keychain Access and use it to get a Kerberos ticket for your computer's AD account if you wanted to.

bentoms
Release Candidate Programs Tester

@jleomcdo FWIW we set "passinterval" to 0 so our Mac clients never update/change their password.

jhalvorson
Valued Contributor

@bentoms Is there a requirement to set the passinterval before the computer is bound to AD, or can it be done after it's bound?

I currently use the JSS built-in directory binding with Casper Imaging. It's my observation with 9.65 that the binding can take place before any "install on boot drive after imaging" packages or "at reboot" scripts take place. Would I need to go back to scripting the bind process with a custom trigger to control the order: set the passinterval and then bind?

bentoms
Release Candidate Programs Tester

@jhalvorson Change it post-binding: add a script to the build and set it to run "AFTER" and "AT REBOOT"; it should then run after the binding.

jhalvorson
Valued Contributor

@bentoms I located the Apple KB that gave me the impression the passinterval should be set prior to the time of binding: "Changing the password expiration time for an Active Directory client". It's possible that Apple wrote the directions this way to cover a broken bound device, the solution, and rebinding all in one step.

I am on your side and based on experience, the value is honored if it is set after binding. I could test by setting it to 1 day and leaving a device in a drawer over the weekend.

jellingson
New Contributor II

We have around 70 Macs in our environment and in the past 3 or 4 months have seen this happen 3 or 4 times, all on different machines. The strange part is that from almost every angle it looks as though the Mac and the server are still communicating and connected properly: in the Users & Groups preference pane the domain is shown with a green light, the Active Directory entry is still shown in the keychain, running dsconfigad shows the proper name and domain, the server-side listing shows a recent last-logon entry, and we are able to ping the domain controller from the affected machine. But running the "id ACCOUNT" command with a known working account comes back with "no such user", and if we try to unbind and rebind, it gives the "Unable to access domain controller" error and the option to force unbind. Doing a force unbind, deleting the computer entry from the server, and rebinding fixes the problem, but we would like to find a way to prevent the issue. If we can't, we will attempt to set up an extension attribute to do a rebind if this happens. Any suggestions would be greatly appreciated.

clarkep
New Contributor III

We are experiencing this EXACT thing in 2022. Have you found a solution to this (7 years after posting....?)

When you need IT...get PJ. C. Working as a tech in a private school for over 15 years.

davidacland
Honored Contributor II

Unfortunately, most of the indicators (dsconfigad -show, System Preferences, etc.) don't show the actual state of the connection.

If you haven't set it already, I would try setting the computer password interval to 0 (dsconfigad -passinterval 0) and running the free Centrify AD Check tool to see if it highlights any issues.

jellingson
New Contributor II

@davidacland Do you have a link to the AD Check tool? I can't seem to find it on the Centrify website or anywhere on Google.

davidacland
Honored Contributor II

@jellingson You can get it as part of Centrify Express here: http://www.centrify.com/express/identity-service/mac-download/

jellingson
New Contributor II

Running the AD Check tool returns a pass on all tests

bbot
Contributor

We see the same thing here. I use a script that checks to see if the keychain exists, and that it can use dscl to view the computer object. If any of those returns false, it force unbinds, then rebinds to AD.

ooshnoo
Valued Contributor

@bbot

Can you post that script? Please?

bbot
Contributor

We use script parameters so that passwords aren't in plain text. Also I've found that force unbinding twice seemed to have better results. When we did one unbind, the script would get stuck and exit out.

#!/bin/bash
# (bash rather than sh: the script uses [[ ]] and &> below)
# HARDCODED VALUES ARE SET HERE
Pass=""

# CHECK TO SEE IF A VALUE WAS PASSED FOR $4, AND IF SO, ASSIGN IT
if [ "$4" != "" ] && [ "$Pass" = "" ]; then
    Pass="$4"
fi

# Check to make sure the Pass variable was passed down from Casper
if [ "$Pass" = "" ]; then
    echo "Error: The parameter 'Pass' is blank. Please specify a value."
    exit 1
fi

## Check if Mac is on the network
if ping -c 2 -o domaincontroller.company.com; then

    if [[ $(dsconfigad -show | awk '/Active Directory Domain/{ print $NF }') == "domaincontroller.company.com" ]]; then
        ## Mac has correct dsconfigad info
        ADCompName=$(dsconfigad -show | awk '/Computer Account/{ print $NF }')

        if security find-generic-password -l "/Active Directory/Domain" | grep "Active Directory"; then
            ## AD keychain entry exists

            if dscl "/Active Directory/Domain/All Domains" read /Computers/"$ADCompName" | grep -i "$ADCompName"; then
                ## Found AD entry. Binding is good
                res="Mac is already bound"
            else
                res="Not bound"
            fi
        else
            res="Not bound"
        fi
    else
        res="Not bound"
    fi
else
    ## Mac is not on the network
    res="Not bound"
fi

echo "<result>$res</result>"

# reset the time from the domain, then force unbind if bound

if [[ $res == "Not bound" ]]; then

    /usr/sbin/systemsetup -setusingnetworktime off
    /usr/sbin/systemsetup -setnetworktimeserver "domaincontroller.company.com"
    /usr/sbin/systemsetup -setusingnetworktime on
    sleep 10

    # Use the validated, quoted $Pass rather than the raw $4
    /usr/sbin/dsconfigad -remove -force -username macimaging -password "$Pass"
    sleep 10
    echo "Unbinding"
    killall opendirectoryd
    sleep 5

    ## Testing has shown that unbinding twice may be necessary.
    /usr/sbin/dsconfigad -remove -force -username macimaging -password "$Pass" &> /dev/null
    sleep 10
    echo "Unbinding twice just in case"

## Begin rebinding process

    #Basic variables
    computerid=`scutil --get LocalHostName`
    domain=domaincontroller.company.com
    udn=account
    ou="CN=Computers,DC=DOMAIN,DC=DOMAIN,DC=us"

    #Advanced variables
    alldomains="disable"
    localhome="enable"
    protocol="smb"
    mobile="enable"
    mobileconfirm="disable"
    user_shell="/bin/bash"
    admingroups="CorpHelpdesk"
    namespace="domain"
    packetsign="allow"
    packetencrypt="allow"
    useuncpath="disable"
    passinterval="90"

    # Bind to AD
    /usr/sbin/dsconfigad -add "$domain" -alldomains "$alldomains" -username "$udn" -password "$Pass" -computer "$computerid" -ou "$ou" -force -packetencrypt "$packetencrypt"
    sleep 1
    echo "Rebinding to AD and setting advanced options"

    #set advanced options
    /usr/sbin/dsconfigad -localhome $localhome
    sleep 1
    /usr/sbin/dsconfigad -groups "$admingroups"
    sleep 1
    /usr/sbin/dsconfigad -mobile $mobile
    sleep 1
    /usr/sbin/dsconfigad -mobileconfirm $mobileconfirm
    sleep 1
    /usr/sbin/dsconfigad -alldomains $alldomains
    sleep 1
    /usr/sbin/dsconfigad -useuncpath "$useuncpath"
    sleep 1
    /usr/sbin/dsconfigad -protocol $protocol
    sleep 1
    /usr/sbin/dsconfigad -shell $user_shell
    sleep 1
    /usr/sbin/dsconfigad -passinterval $passinterval
    sleep 1

    #dsconfigad adds "All Domains"
    # Set the search paths to "custom"
    dscl /Search -create / SearchPolicy CSPSearchPath
    dscl /Search/Contacts -create / SearchPolicy CSPSearchPath

    sleep 1

    # Add the "XXXXX.us" search paths
    dscl /Search -append / CSPSearchPath "/Active Directory/CORP/domaincontroller.company.com"
    dscl /Search/Contacts -append / CSPSearchPath "/Active Directory/CORP/domaincontroller.company.com"

    sleep 1

    # Delete the "All Domains" search paths
    dscl /Search -delete / CSPSearchPath "/Active Directory/CORP/All Domains"
    dscl /Search/Contacts -delete / CSPSearchPath "/Active Directory/CORP/All Domains"

    sleep 1

    # Restart opendirectoryd
    killall opendirectoryd
    sleep 5
else
    echo "Mac is already bound. Exiting."
fi

exit 0

@ooshnoo

ooshnoo
Valued Contributor

@bbot

Thanks! Much appreciated!

Aziz
Valued Contributor

I'll chime in alongside @bbot

@ooshnoo

We use an Extension Attribute and we call it "Check Active Directory Health". It just checks to see if AD is reachable. If not, the Mac falls into a Smart Group.

The Smart Group has a policy scoped to it that updates the Mac's time to match NTP, then unbinds and rejoins it to AD.

Here's the EA:

#!/bin/bash

domain=""
user="insert any AD user here"

# Can we query a UPN?
# The node path contains spaces, so it must be quoted; 2>&1 captures the error text too
domainAns=$(dscl "/Active Directory/${domain}/All Domains" -read "/Users/${user}" dsAttrTypeNative:userPrincipalName 2>&1)
if [[ $domainAns =~ "is not valid" ]]; then
    result="Invalid"
else
    result="Valid"
fi

echo "<result>$result</result>"

mcmaddog
New Contributor

I know this is an old thread, but I saw that behavior on machines that were upgraded to 10.10.x. Computers with fresh installs of 10.10.x would stay bound, but any machine upgraded from a previous OS would keep unbinding itself.

I haven't seen this happen now that we are upgrading machines to 10.11.x

msnowdon
Contributor

@bentoms @jhalvorson I know this is old, but ever since we moved to 802.1X authentication, this problem has become more common on our El Capitan machines. I was wondering whether the command to disable the password change interval (dsconfigad -passinterval X) needs to be run prior to or after the domain binding. @jhalvorson, the Apple article you mentioned instructs you to do it prior to binding, but @bentoms said it works after binding.

Thanks

jhalvorson
Valued Contributor

As best I can tell, when the computer is not bound, there aren't any configs to adjust.
When you attempt to set it on a computer that is not bound, the response is:

dsconfigad: No operation specified nor update requested

I have been issuing the command after the computer has been bound to AD. Then the command will result in:

Settings changed successfully.

You can see the status of the dsconfigad by using the

dsconfigad -show

command. Here's an example:

Active Directory Forest          = mydomain.org
Active Directory Domain          = mysomething.mydomain.org
Computer Account                 = ComputerID$

Advanced Options - User Experience
  Create mobile account at login = Enabled
     Require confirmation        = Disabled
  Force home to startup disk     = Enabled
     Mount home as sharepoint    = Enabled
  Use Windows UNC path for home  = Disabled
     Network protocol to be used = smb
  Default user Shell             = /bin/bash

Advanced Options - Mappings
  Mapping UID to attribute       = not set
  Mapping user GID to attribute  = not set
  Mapping group GID to attribute = not set
  Generate Kerberos authority    = Enabled

Advanced Options - Administrative
  Preferred Domain controller    = not set
  Allowed admin groups           = not set
  Authentication from any domain = Enabled
  Packet signing                 = allow
  Packet encryption              = allow
  Password change interval       = 30
  Restrict Dynamic DNS updates   = not set
  Namespace mode                 = domain
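
To confirm the interval from a script rather than by eye, one option is to pull the "Password change interval" line out of the dsconfigad -show output. This is a minimal sketch; the function name pass_interval is hypothetical, and it assumes the "label = value" layout shown in the example output.

```shell
#!/bin/bash
# Hypothetical helper: print the "Password change interval" value from
# `dsconfigad -show` output supplied on stdin. Prints nothing if the line
# is absent (for example, when the Mac is not bound).
pass_interval() {
    awk -F' = ' '/Password change interval/ { print $2; exit }'
}

# On a bound Mac the call would look like this (not run here):
# dsconfigad -show | pass_interval
```

Reading from stdin keeps the parsing separate from the dsconfigad call, so the function can be checked against saved output.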

msnowdon
Contributor

I was working on a script to unbind and rebind a Mac to our domain. When I run dsconfigad -show on some existing computers that are already bound to AD, some have Packet signing and Packet encryption set to "allow" and some have them set to "disable." Now I'm not sure which option to use in the script, and I'm not exactly sure what these settings do.

Also, when I add groups to Allowed admin groups in the script, I try to add 3 groups using admingroups="domain admins, enterprise admins, tier2-support" as the variable and /usr/sbin/dsconfigad -groups $admingroups as the command. It doesn't seem to like the space in the group name, because it ends up adding just "domain" to the admin groups. Do I need another set of parentheses or brackets?

Thanks
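
On the admin groups question, the behavior described matches ordinary shell word splitting: an unquoted $admingroups is broken apart at each space, so the command sees "domain" as the whole value. A minimal sketch of the difference; count_args is a hypothetical helper used only to show how many arguments a command would receive, and the comma-separated list without spaces after the commas is an assumption about the format dsconfigad expects.

```shell
#!/bin/bash
# Hypothetical helper: report how many arguments it was given.
count_args() { echo "$#"; }

# Assumption: comma-separated group names, no spaces after the commas.
admingroups="domain admins,enterprise admins,tier2-support"

count_args $admingroups      # unquoted: split on spaces, prints 3
count_args "$admingroups"    # quoted: one argument, prints 1

# So the bind script would pass the variable in double quotes (not run here):
# /usr/sbin/dsconfigad -groups "$admingroups"
```

Double quotes around the variable are what keep the list together; extra parentheses or brackets are not needed.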

agerson
New Contributor III

This issue has plagued us for years and still does on 10.13.5. Thanks for these helpful scripts. Hopefully, they will work as a band-aid.

clarkep
New Contributor III

Hey Adam, looks like I found you on this ancient thread! We are still suffering from this issue, worse than ever. Did you find a solution or move to Jamf Connect? What's interesting is that our machines are becoming "unbound" while still appearing to be bound, just unable to communicate with the domain controller. We're still scratching our heads, and Apple has no idea.

agerson
New Contributor III

It still happens periodically, but it's not at epidemic proportions, so we just live with it. What macOS version are you on? We are on 12.5.1 for our entire fleet. I have a theory that it may have to do with a brief loss of internet at the wrong time. Also, we learned the hard way that AD truncates computer names after a certain number of characters (I don't remember how many). So if you have a naming scheme like Building36-Lab3-Computer-1, it will truncate the name, and when you add Building36-Lab3-Computer-2 it will overwrite the AD record for Building36-Lab3-Computer-1 (which was probably stored as Building36-Lab3-Com) and break the AD connection for the first machine.

We tried Jamf Connect, but we are a Google school and Jamf Connect does not react well to password changes when using Google as the auth source, so that was a deal breaker for us.
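
The truncation problem described above can be checked before binding. A minimal sketch, assuming the limit is the 15-character NetBIOS-style computer name (the post does not state the number); the LIMIT variable and both function names are hypothetical.

```shell
#!/bin/bash
# Assumption: only the first 15 characters of the computer name are kept.
# Adjust LIMIT if your environment differs.
LIMIT=15

# Print the name as it would look after truncation.
truncated_name() {
    printf '%s\n' "$1" | cut -c1-"$LIMIT"
}

# Succeed (exit 0) when two names would collide after truncation.
names_collide() {
    [ "$(truncated_name "$1")" = "$(truncated_name "$2")" ]
}

if names_collide "Building36-Lab3-Computer-1" "Building36-Lab3-Computer-2"; then
    echo "collision: both become $(truncated_name "Building36-Lab3-Computer-1")"
fi
```

With a 15-character limit both example names reduce to Building36-Lab3, so the second bind would land on the first machine's record. The comparison here is case-sensitive, which is a simplification.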

clarkep
New Contributor III

That's interesting about the network blip possibly causing that. We manually rebound a bunch of laptops before deployment and found that after they were shut down for an hour and started up again, they weren't communicating with AD again. We use an AD name that is less than 15 characters, so we don't run into the truncated name scenario. We are really feeling the pain with the AD stuff now because we rely on it for authenticated printing, Lightspeed, and of course getting Wi-Fi access.

That is not great to hear about Jamf Connect, because Google would be the next logical step for authentication since we use it for almost everything else here at school. 
