Posted on 09-11-2015 07:36 AM
Hello,
I manage about 200 AD-bound Macs (stock binding, no Centrify or other tools) in various locations and first noticed the issue on stations closest to a cluster of printers (I believe they experience the most logins). Seemingly at random, machines will lock up after a user has entered their username and password and attempted to log in. With the two fields still on the screen, a grey spinning wheel appears and remains indefinitely.
Interestingly, while keeping an eye on them through our management tool, I have also noticed that when they appear offline they will sometimes be at the login screen with a moving cursor and working keyboard entry. However, if I attempt to log in, it will lock up at the screen I usually see. Furthermore, if I attempt to restart those machines instead of logging in, the screen goes black and the mouse cursor remains indefinitely. I'm led to believe that whatever the issue is, it occurs while a machine idles at the login window.
I've attempted SMC resets, reimaged when I first saw the problem, and collected some logs, but I couldn't determine anything definitive from them.
Has anyone seen similar behavior or have some thoughts on my issue?
Posted on 09-11-2015 09:15 AM
So I probably wouldn't characterize our issue as "frozen" but we have seen noticeable delays on the login screen on 10.10.5 Macs, particularly the 2015 Retina MBP build machines I've been deploying. It'll sit on the login screen for 2+ minutes before proceeding. In most cases, running
sudo defaults write /Library/Preferences/com.apple.loginwindow DSBindTimeout -int 2
seems to fix it, but it's not consistent.
Posted on 09-12-2015 07:05 PM
I have had this issue in the last week on 2 separate Macs. One is a brand new Mac mini and the other is a newer (less than 2 years old) 27" iMac. One was bound to AD, the other was not. I have a ticket open with Apple EDU support and engineering has it. They asked me to use Target Disk Mode to pull the system logs from /private/var and send them over. The only option I had was a Time Machine restore on one and a full erase and reinstall of the OS on the other.
Posted on 09-12-2015 07:09 PM
I also used to have this problem, but only on machines that were running 10.10.3; once I upgraded, the problem was resolved. Originally I had entire labs that would not get past the white loading screen. The problem was tied to the OS being bound to AD: machines that were not bound to AD didn't have it. Usually putting them in safe mode, then rebooting, would solve the problem.
Posted on 09-14-2015 07:29 AM
Can you try unbinding one and rebinding one as a test?
Posted on 09-14-2015 08:28 AM
I ran
sudo defaults write /Library/Preferences/com.apple.loginwindow DSBindTimeout -int 2
on 10 machines, but the problem has recurred on some of them.
Josh, I've been reimaging some of them here and there since it can buy me a week or two before it starts happening again. I suppose I'll reach out to their support as well to see if they can help.
I've looked at the standard logs from 5 different machines and the common error between them is this: kernel[0] kauth external resolver timed out (1 timeout(s) of 60 seconds). I haven't been able to figure out exactly what that means yet.
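For anyone who wants to check how often that message shows up on a given machine, here is a minimal sketch. It assumes the message lands in the stock /var/log/system.log; the function name count_kauth_timeouts is just something made up for illustration, and you can pass a different log file (for example one copied off a machine) as the first argument.

```shell
#!/bin/sh
# Count lines containing the kauth resolver timeout message.
# Assumption: the message is written to /var/log/system.log unless
# another file is given as the first argument.
count_kauth_timeouts() {
    log_file="${1:-/var/log/system.log}"
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'kauth external resolver timed out' "$log_file"
}
```

Calling count_kauth_timeouts with no argument on an affected Mac, or with the path to a collected log, prints the number of matching lines, which makes it easier to compare machines.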
Posted on 09-14-2015 09:03 AM
Can you log in with a local account just to rule out the possibility of it being the machine?
Is this happening at initial login or when the user is trying to log back into the machine when waking from sleep?
How many entries do you have listed under "Search Policy" in the Directory Utility?
Posted on 09-14-2015 09:32 AM
When I have seen this error, it is usually due to the machine caching an AD server that was not optimal. Try binding a machine to the DC you know is most accessible and see whether the issue recurs.
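One way to try this without a full unbind and rebind might be the preferred-server option of dsconfigad. This is only a sketch: the wrapper name prefer_dc and the hostname dc01.example.com are hypothetical, and it has to be run as root on a Mac that is already bound.

```shell
#!/bin/sh
# Ask the AD plugin to prefer one domain controller for lookups and
# authentication. Run as root on a bound Mac.
# Assumption: the DC hostname passed in is one you know to be reachable.
prefer_dc() {
    dsconfigad -preferred "$1"
}

# Example (hypothetical hostname, not executed here):
#   prefer_dc dc01.example.com
#   dsconfigad -show    # review the resulting AD configuration
```

If it makes no difference, dsconfigad -nopreferred should put the machine back to choosing a DC on its own.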
Posted on 03-11-2016 02:37 PM
I've been having the same problem since upgrading our systems to 10.10.5 in January. Your post is the first online corroboration I've found, and your description is spot on.
In our case, we have a mix of Open Directory and Active Directory bound systems, but all of them are on 10.10.5 and I have seen the issue appear, seemingly at random, with a handful of systems (out of roughly 70) every single day. In every instance, the system can be pinged, but is unresponsive to ARD or SSH, and the only way to recover is to press and hold the power button to shut it down and then start it back up.
The symptoms I've observed match yours:
- frozen system with "spinning wheel" superimposed over the login window
- frozen system with immovable mouse pointer at the login window
- keyboard and mouse are usable at the login window, but if I try to log in, even with a local admin account, I get the spinning beach ball
In every case, I also see this log entry: kauth external resolver timed out (1 timeout(s) of 60 seconds)
It occurred to me that one change in OS X 10.10.x is that the system will no longer shut down at a scheduled time if it's at the login window, i.e. no one is logged in. (This is documented by Apple and discussed in at least a couple of threads in these forums. The logic of this change still escapes me.) Through 10.9.x we always scheduled the systems to shut down and then start up on the same schedule every day, but you can't do that any more in 10.10+, so they're basically on or asleep 24/7.
So I wondered if the problem might be related to a system potentially sitting idle without a user logging into it for long periods of time without a restart or power cycle. These are all lab/classroom computers, and it's possible that a given system could go unused for a matter of days.
So I changed the power schedule to sleep and wake the computers (instead of shut down and power on, since that was now useless anyway) and then pushed out a LaunchDaemon to restart the systems shortly after their scheduled wake up -- and today none of them are showing the problem so far. It's too soon to tell if this is a fix, but I'll monitor it and report back what I find.
Posted on 03-22-2016 09:41 AM
Following up: It's been 11 days since I changed our daily shutdown and startup schedule to a daily sleep and wake schedule (via pmset) and also added a daily system restart (via LaunchDaemon) shortly after the wake time, and I have not seen this issue recur on a single system.
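For anyone wanting to try the same approach, here is a rough sketch of what the pmset schedule and restart LaunchDaemon could look like. It is not the exact configuration used above: the label com.example.dailyrestart, the function name restart_daemon_plist, and the times (sleep at 22:00, wake at 06:00, restart at 06:10) are all made up for illustration.

```shell
#!/bin/sh
# Print a LaunchDaemon plist that restarts the Mac every day at HOUR:MINUTE.
# Assumptions: the label and the times used in the example are hypothetical.
restart_daemon_plist() {
    hour="$1"
    minute="$2"
    cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.example.dailyrestart</string>
    <key>ProgramArguments</key>
    <array>
        <string>/sbin/shutdown</string>
        <string>-r</string>
        <string>now</string>
    </array>
    <key>StartCalendarInterval</key>
    <dict>
        <key>Hour</key>
        <integer>${hour}</integer>
        <key>Minute</key>
        <integer>${minute}</integer>
    </dict>
</dict>
</plist>
EOF
}

# Example (run as root on the Mac; not executed here):
#   pmset repeat sleep MTWRFSU 22:00:00 wakeorpoweron MTWRFSU 06:00:00
#   restart_daemon_plist 6 10 > /Library/LaunchDaemons/com.example.dailyrestart.plist
#   chown root:wheel /Library/LaunchDaemons/com.example.dailyrestart.plist
#   chmod 644 /Library/LaunchDaemons/com.example.dailyrestart.plist
#   launchctl load /Library/LaunchDaemons/com.example.dailyrestart.plist
```

The restart time is set a few minutes after the wake time so the machine is awake when launchd fires the job.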