@bmarks I'd take one lab & isolate its issues before looking at the whole.
I tried. But I just can't seem to notice any pattern. I used the exact same settings in AutoCasperNBI, the only differences being my base OS version and Casper imaging version. I've been focusing my testing on our London lab. At first, it booted nothing. Then, I recopied it and it started working. However, it only worked for about 20 out of 30 NetBoot attempts across both current and previous-generation hardware. These intermittent results seem to be the same at our other labs. It's so confusing. I used ARD to copy it to the Desktop and then manually moved it to the appropriate folder. Nothing I've done has been different from my previous NetBoot image updates over the past couple of years.
@bmarks Well, NetBoot has a heap of variables.
What OS NBI's were working before?
What's not working now?
What server is hosting the NBI's?
I have noticed a similar problem. The trick was to NOT reduce the image size when creating the netboot image.
Unticking reduce image size seems to have resolved it for me.
I have had this issue too. The only change in my workflow was using a 10.12.x based image.
@a.stonham I never check the reduce image size box and I still have had this issue. I will build a new one to test and 100% verify and will report back.
@a.stonham & @powellbc What server is hosting the NBI's? NetSUS?
I am using macOS Server 10.12. I had been using an earlier version (I think 10.10 or 10.9) when the problem appeared. I hoped upgrading would solve the issue but no dice.
@powellbc Odd.. I have a number of our customers using AutoCasperNBI &/or AutoImagrNBI 10.12.x NBI's on macOS Server 10.9+, & it's happily working.
Once or twice, I have had to change from NFS to HTTP. Might be worth a shot?
These are always reduced size NBI's too.
One other thing, is "Install modified rc.netboot" checked when creating the NBI? I normally leave this checked.
@bentoms
I have tried a bunch of things to resolve the issue, including serving over HTTP to no avail. I do check the "Install modified rc.netboot" and as mentioned I do not check reduce image size. Do you recommend I try the settings you described (reduce size, and check "Install modified rc.netboot")?
@powellbc Can you verbose boot a Mac & see if that helps track down where this issue is occurring?
For me, I have had the issue on Mac imaging servers running OS X 10.10.5, 10.11.6, macOS 10.12.3, NetSUS version 3 and NetSUS version 4 (I have at least one of each from past testing environments.) I do check off the "Install modified rc.netboot" checkbox. I created one yesterday that isn't reduced in size but I haven't tested it yet. Just to repeat, not counting these test NBI's, the only change I initially made was the base OS from El Capitan to Sierra. I can try and get a verbose login pic too.
However, just to repeat as well, it sometimes works on all of the above as well. That's what's so challenging.
Have you been able to rule out any funny business with the network? We had an issue that was very difficult to diagnose with our Palo Alto firewall. It seemed to think NetBoot traffic was a packet-based attack and would intermittently prevent Macs from booting.
In our case every other boot disk we have available works 100% of the time. Only the 10.12.x ones have issues.
We've had some of those types of issues in the past, but they've always been isolated to one imaging lab (I manage 40 imaging labs.) In this case, I'm pretty sure I've ruled out those types of issues, especially since our previous non-Sierra images seem to still work fine.
I've been using AutoCasperNBI basically since it launched and it's a great app. I've never had an issue until now. I guess I shouldn't say that though since this may not be an issue with AutoCasperNBI.
I'm not sure what these log entries mean, but my coworker testing this in another lab sent me these log entries that he says are only occurring when a Mac doesn't boot from the NBI:
Mar 22 17:47:36 caspershare-lon2.internal.pretendoco.com servermgr_netboot[7036]: updateHTTPSharepoint: default site was nil using default path
Mar 22 17:47:36 caspershare-lon2.internal.pretendco.com servermgr_netboot[7036]: updateHTTPSharepoint: received error from servermgr_web Error Domain=com.apple.servermgrd Code=3 "The operation couldn't be completed. (com.apple.servermgrd error 3.)
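If it helps anyone line these entries up against failed boot attempts, here is a minimal Python sketch (my own illustration, not anything from Apple or AutoCasperNBI) that pulls only the servermgr_netboot lines out of a saved log. It assumes the classic syslog layout shown above; the function name, the regex, and the second sample line are made up for the example.

```python
import re

# Assumed layout: "Mar 22 17:47:36 host process[pid]: message"
SYSLOG_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<proc>[\w.\-]+)\[(?P<pid>\d+)\]: (?P<msg>.*)$"
)


def netboot_entries(lines, process="servermgr_netboot"):
    """Return (timestamp, host, message) tuples for lines logged by `process`."""
    entries = []
    for line in lines:
        match = SYSLOG_LINE.match(line.strip())
        if match and match.group("proc") == process:
            entries.append(
                (match.group("stamp"), match.group("host"), match.group("msg"))
            )
    return entries


sample = [
    # First line is copied from the post above.
    "Mar 22 17:47:36 caspershare-lon2.internal.pretendoco.com "
    "servermgr_netboot[7036]: updateHTTPSharepoint: default site was nil "
    "using default path",
    # Hypothetical filler line, invented purely to show the filtering.
    "Mar 22 17:47:40 example-host otherproc[1]: unrelated line",
]

for stamp, host, msg in netboot_entries(sample):
    print(stamp, host, msg)
```

Comparing the printed timestamps with the times of the failed NetBoot attempts should show whether the entries really only appear on failures.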
As a side note, not shrinking the image doesn't seem to help.
And, I don't know if these things are even related, but since I see "HTTP" in the above logs, I'll mention that we use NFS for our NBI's.
Verbose boot showed tons of errors saying 'Caller not allowed to perform action: smd:209 action = service removal, code =150: Operation not permitted while System Integrity Protection is engaged'
Above that, I saw:
'error reading http code, returning kIOReturnInteral Error'
...
_peerManager is missing
At this point it seems to be repeating the top error over and over, and boot never completes.
I was just sent similar logs. Why would SIP be triggered though? It's not relevant when you're NetBooting from the Boot Picker, right?
@powellbc & @bmarks NetBoot images have SIP enabled (Apple keeps it enabled so ACNBI does too).. NetInstall images do not.
What are the permissions on the NBI's? The folder & contents
Here are the permissions for the NBI and the contents of the NBI. FYI, just to be clear, "macinstaller" is our local admin user on these Mac imaging servers. Do you want me to go deeper?
drwxrwxr-x 5 root admin 170B Mar 22 12:21 macOS_Imaging_All_Macs_V3.nbi
-rw-r--r-- 1 macinstaller staff 2.8K Mar 21 13:32 NBImageInfo.plist
-rw-rw-r-- 1 root admin 8.5G Mar 21 13:33 NetBoot.dmg
drwxr-xr-x 5 macinstaller staff 170B Mar 22 12:21 i386
@bmarks Can you try root:admin throughout the NBI, please
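To confirm the ownership change actually took effect on every item, a rough Python sketch like the one below could list anything inside the NBI that is not owned by the expected user and group. It only reads ownership and changes nothing; the function name is hypothetical, and the uid and gid to compare against are whatever you pass in.

```python
from pathlib import Path


def ownership_mismatches(nbi_path, uid, gid):
    """List (path, uid, gid) for items at or under nbi_path not owned by uid:gid."""
    root = Path(nbi_path)
    mismatches = []
    # Check the .nbi folder itself, then everything inside it.
    for path in [root, *sorted(root.rglob("*"))]:
        info = path.lstat()  # lstat so symlinks are inspected, not followed
        if info.st_uid != uid or info.st_gid != gid:
            mismatches.append((str(path), info.st_uid, info.st_gid))
    return mismatches
```

On macOS, root is uid 0 and the admin group is gid 80, so calling `ownership_mismatches(path_to_nbi, 0, 80)` on the server should return an empty list once everything is root:admin. In the listing above, NBImageInfo.plist and i386 (both macinstaller:staff) would be reported.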
Testing now. This may take a little while.
This doesn't appear to make any difference.
@bmarks ok.. does server.app show HTTP being used or NFS? I know you mentioned you selected NFS, but the logs seem to show HTTP..
Whatever it is, try the other.
I just checked to be certain and it is definitely set to NFS.
I'll try the other option now.