@emilykausalik The AppleCare Support Engineer I have been working with indicated the issue appears to be more prevalent with FV2-encrypted Macs, but I have had no trouble replicating it on non-FV2 systems. Most of our reports in the wild have been from unencrypted desktops (most of our laptops are encrypted). I suspect this is because desktops are more likely to experience a power failure than laptops.
Hi All,
Just an idea, but when writing AutoCasperNBI, @neil.martin83 and I kept seeing a hang on restart until I deleted the below:
/usr/standalone/bootcaches.plist
If you have Macs that you can test & recreate this on, can someone try deleting that file & then breaking the Mac again.
This is just a theory, so please don't do it in prod!
Okay, so it worked once on one computer but now that one isn't working again. And another user is having the same issue and that fix didn't help. I tried @bentoms suggestion but didn't notice any change. I can boot into Safe Mode but can't get the OS to load otherwise. Super frustrating.
With the steps I outlined, after hitting reboot in SU the initial boot will take 3-5 minutes in my experience, but it will boot. Tech Ops also reported one system to me yesterday that the outlined procedure would not resolve. However, they formatted the system before I was back in the office today. I would have liked to have access to the "broken" system to work with our support contact on it, but it is what it is. I have not been able to replicate that failure; following the procedure to delete in SU mode works every time on my test systems.
Interestingly I just tried the "turn off Wifi in Recovery Mode" and that did work, so I'm starting to wonder if there's some kind of issue with FV2-enabled Yosemite Macs talking to the directory service while loading the OS/mobile profile. Since it bypasses (well, more hands-off than bypasses) the regular login screen I'm wondering if the sequence of events to check with the directory and load the mobile profile is busted somehow.
@emilykausalik Yes, this has already been confirmed further up this thread, see my post.
And this thread:
https://jamfnation.jamfsoftware.com/discussion.html?id=12188
I'm not alone!!!
Sporadically, across our 1500+ systems on 10.10.1, losing power at one site or another would leave ~25% of machines unable to boot, stuck at 50%. Annoying - for me. Costly for our business. Yikes!
I've isolated down to the Apple AD plug-in being enabled. Solid troubleshooting over the last couple days - using 3 iMacs on a power bar and hundreds of reboots later and many many many clean installs later, I know the following.
Clean install of 10.10.1 with nothing else.
Enable active directory and bind it.
Boots to 50% and hangs following a forced power failure.
Has there been any response from Apple on this? Does anyone have access to the betas of 10.10.2 to see if this is still an issue? I am aware that NDA prevents discussion of specifics, I am just wondering if anyone is testing this. We have not seen a single occurrence in our testing (AD 2008). Granted we have mostly portables...
Same thing happening in our environment with ~600 Macs joined to Active Directory.
Can reproduce the 50% hang with a forced power off consistently.
99% of our Apple devices are laptops with AD auth and FV2 enabled. We've recently gone through a domain migration and of the few machines left over to migrate we haven't seen this issue with the Macs that aren't using the AD plugin to talk to the domain.
Tested following fixes to no avail:
- Changing AD authentication timeout values to low numbers (even 0) - as in this terminal line: sudo defaults write /Library/Preferences/com.apple.loginwindow DSBindTimeout -int <seconds>
- Unchecking the "Use UNC path from Active Directory to derive network home location"
- Installing 10.10.2 beta release on a Mid 2012 MBP made no noticeable difference unfortunately. A hard power off resulted in the same stuck loading bar.
Kudos to @Kaltsas, I tested your workaround to get a mac up and running again by booting into single user mode and it booted perfectly! Took a good couple of minutes but it came right. Good to have a consistent workaround until Apple get this sorted!
I'll keep searching the interwebs like we all are for possible workarounds. Hopefully something will come soon!
Hang in there people... we're all feeling the pain!
Just wanted to add my thanks for @Kaltsas workaround.
I raised a bug report with Apple with both the startup hang and the hang after login on a FV2 Mac
The bug for the startup hang was closed as a duplicate around a week ago but so far nothing for the FV2 bug.
Can the boot scripts be edited to include clearing the boot cache playlist? It would slow the boot process down I suppose, but it would possibly kludge this as a fix.
I've tried all of the above suggestions which did not provide a consistent fix. But this does.
On the client only, you may hijack the unused /etc/rc.server bash hook, e.g. from a single-user boot:
bash-3.2# mount -uw /
bash-3.2# /usr/bin/nano /etc/rc.server
#!/bin/sh
/bin/echo BootCacheKludge Beta 1.0 - Chris Hotte 2015 - No rights/blame reserved.
/usr/sbin/BootCacheControl jettison
Boots are now completing 100% of the time.
Edit: We are now beta testing this workaround on ~50 machines.
Enjoy!
@chris.hotte we're trialling your fix on a few machines and it's looking very positive. We've had 3 so far that were refusing to boot and after going into single user mode and editing rc.server they've booted first go. I've got one test MacBook Pro by my side and can now consistently hard power it off and boot back up without it getting stuck at 50%!
Great work! Will continue to trial and post results.
@Kaltsas booting into single user mode and using fsck fixed it for me. Is there anything in particular that leads you to believe it is an issue with being bound to AD after forced shutdown?
Hi,
So I guess the question to Chris Hotte and mkremic is whether you ran fsck when you were in single user mode to do the rc.server hack. If so can you see if it was simply the fsck that fixed the issue as it was for Tim?
Regards,
David
@dlondon, nope there was no need to run fsck to do the rc.server hack.
One of our users had told us he had hard powered off before seeing the loading bar issue initially as well which is what we had suspected. The handful of Macs we've tested on today have booted first go after applying this hack.
Cheers
Also I can safely delete the rc.server file when logged in the OS, and if I hard power off the Mac again it freezes at 50%. Just FYI in case anyone else is keen on testing this in their environments.
Wow, top marks to Chris Hotte: I can confirm that fix is working perfectly in my environment.
I'm going to be deploying this to 50+ macs also over the next few days.
@chris.hotte so what would the script be to get this fixed? I am running into issues at our place. Also would this be a login script?
Thanks
I'm trying to write up a doc for our support guys around the country in case they run into this, but there still seems to be a fairly large discrepancy as to the actual cause - some say it's AD, some say FV2, I thought I saw somewhere that it has to do with the machines having a power failure, others mention Sophos...
I've seen a handful of these, but we don't use FV2 or Sophos; a power failure is possible, but unlikely on laptops. We do use AD, but I've got about 150 10.10.x machines on AD and I've only seen this 3 or 4 times.
Can someone summarize the issue and workaround?
@dlondon Consistently in house I can replicate with the tried and true rtrouton method
Setup 10.10
Bind to AD
Login and pull power
Boot and it's hung.
No Casper agent, no AV, no extras. It's an OS issue, confirmed and replicated by AppleCare Enterprise Support. Sometimes rebooting a few times, zapping PRAM, fsck, and other diagnostic processes will cajole a system back to life. But that resolution is inconsistent.
Consistently, the following procedure will fix an affected machine. This has also been confirmed and replicated by AppleCare:
/sbin/mount -uw /
rm -rf /System/Library/Caches/*
rm /private/var/db/BootCache.playlist
reboot
I have had one report from tech ops that this procedure did not work on a system but they formatted it before I was able to look at it. I suspect one of the techs did not follow the procedure exactly and it was not mounted as read/write.
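Since a read-only root is the suspected reason the procedure failed on that one system, a small guard can make the steps refuse to run until the volume is actually writable. This is only a sketch: the function name `clear_boot_caches` and its optional root argument are hypothetical (the argument is there so the logic can be tried against a scratch directory first); the paths are the ones from the procedure above.

```shell
#!/bin/sh
# Sketch only: clear_boot_caches and its ROOT argument are illustrative names,
# not part of any Apple tool. With no argument the paths resolve against the real /.
clear_boot_caches() {
    root="${1:-}"
    probe="$root/private/var/db/.rw_probe.$$"

    # Refuse to continue if the volume is still read-only (mount -uw / not run).
    if ! touch "$probe" 2>/dev/null; then
        echo "volume is not writable; run /sbin/mount -uw / first" >&2
        return 1
    fi
    rm -f "$probe"

    # Same deletions as the manual procedure.
    rm -rf "$root"/System/Library/Caches/*
    rm -f "$root/private/var/db/BootCache.playlist"
}
```

In single-user mode this would be called with no argument, followed by `reboot`.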
I am going to be testing the @chris.hotte process and inform our Apple support contact of the effort.
I'm not running fsck during my tests. It will run on its own when the disk is marked dirty, or rather - not clean. Given that it's run automatically after a forced power down when the file system is not clean, we can safely rule out fsck as a fix. You can confirm fsck runs when the file system is dirty with verbose boot.
Are you asking how to distribute a copy of /etc/rc.server?
We don't yet have Casper licenses for our workstations, so we don't use it to roll out fixes. Currently I use a daemonized rsync server configured with anonymous modules. See the rsyncd.conf man page. This distribution tool has worked for us for years without a hiccup. So, treating the rsync daemon as a repository, we just sync what's needed, for example in a login hook script as you suggested.
Ultimately, it doesn't matter how you distribute this command into /etc/rc.server, so long as you don't overwrite the file on 10.10 Server.
Note that you don't even have to mark it executable since it's called by bash. I only noticed it yesterday because bash was throwing an error in verbose mode. Same goes for BootCacheControl. I guess that's what happens when you stare at an issue for a few days in a row.
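One way to get the command into /etc/rc.server without overwriting an existing file is to append it only when it is missing. A minimal sketch, assuming a hypothetical helper name `add_jettison_hook` and an optional path argument that is there purely so it can be tried on a scratch file first:

```shell
#!/bin/sh
# Sketch only: add_jettison_hook is an illustrative name. The target defaults to
# /etc/rc.server; pass another path to try it on a scratch file.
add_jettison_hook() {
    target="${1:-/etc/rc.server}"
    line="/usr/sbin/BootCacheControl jettison"

    # Create the file with a shebang if it does not exist yet.
    if [ ! -e "$target" ]; then
        printf '#!/bin/sh\n' > "$target" || return 1
    fi

    # If an existing file does not end in a newline, add one before appending.
    if [ -n "$(tail -c1 "$target")" ]; then
        printf '\n' >> "$target" || return 1
    fi

    # Append the command only if it is not already present (safe to re-run).
    if ! grep -qxF "$line" "$target"; then
        printf '%s\n' "$line" >> "$target" || return 1
    fi
}
```

Run as root with no argument (from a login hook, a package postflight, or whatever you already use to distribute files), it leaves any existing rc.server content in place and takes effect on the next boot.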
Adding my two cents and what worked for me:
Even after following all the suggested fixes above, the issue still persisted for me. I looked at the syslog once again but this time noticed a whole bunch of errors stating that it could not create var/folders. So I took the advice posted on an older Apple Support forum and created /var/folders manually. After reboot, system launched successfully!
Under single-user mode:
/sbin/mount -uw /
cd "/Volumes/Macintosh HD/"
mkdir var/folders
mkdir var/folders/zz
reboot
Link to forum post: https://discussions.apple.com/thread/4960066