08-11-2022 03:28 PM - edited 09-05-2022 03:34 PM
Howdy... I’ve detected an issue where the Jamf binary stops checking in. This can last for hours, days, weeks or even months, and shows up as Macs that have not run their Inventory Update for "x days". It appears that the Jamf binary begins its check-in process but never completes it. This blocks any further check-in attempts, because the process is still running and the binary won't attempt another check-in until the original process has completed. If you attempt to check in manually, you will get an error similar to:
This policy trigger is already being run: root 88591 0.0 0.0 34245156 1048 ?? Ss 21Jul22 0:03.85 /usr/local/jamf/bin/jamf policy -stopConsoleLogs -randomDelaySeconds 300
I suspect this issue is caused by a network interruption while a policy is running, or by a script within a policy that cannot complete (the softwareupdated process has been hanging on some versions of macOS Big Sur). There does not appear to be a time-out for the Jamf binary.
Neither CasperCheck nor the new Jamf-Management-Framework-Redeploy API function resolves this issue. Killing the Jamf binary does resolve it, and the Mac can then check in with the Jamf server again. Since getting access to a Mac that isn't checking in can be somewhat difficult, I have made something I am calling "Jamf Restart", which lifts ideas and code from CasperCheck and AppProcessKiller.
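Jamf Restart consists of:
A LaunchDaemon (/Library/LaunchDaemons/com.example.jamfRestart.plist) that runs the script once a day.
A script (/Library/Scripts/jamfRestart.sh) that kills the Jamf binary if it has been running for more than 1 day.
A log file (/var/log/jamfRestart.log) that captures the output of the script.
An Extension Attribute that reports the relevant log entries to Jamf.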
The LaunchDaemon and script can be packaged and deployed via Jamf (a lot easier to do while all Macs are still checking in). You can load the LaunchDaemon with the command:
/bin/launchctl load /Library/LaunchDaemons/com.example.jamfRestart.plist
Once Jamf Restart has been deployed to a Mac, it checks whether the Jamf binary process has been running for more than 1 day and, if so, kills it. After the Jamf binary has been killed, the next scheduled check-in will run correctly. Any policy run from Jamf can reasonably be expected to complete within 1 day, so killing a process that has been running that long will not stop any policy that had a chance of completing successfully.
Jamf Restart will not fix the "Device Signature Error", which stops the Jamf binary from running. I have been testing the Jamf-Management-Framework-Redeploy API function for that.
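To illustrate the "running for more than 1 day" check: ps reports elapsed time (etime) as [[dd-]hh:]mm:ss, so a days component only appears once a process is at least a day old. The following is a minimal sketch, not part of Jamf Restart itself, that converts an etime value to seconds so that any threshold can be compared. The function names etime_to_seconds and has_run_at_least are hypothetical.

```shell
#!/bin/bash
# Convert a `ps -o etime` value ([[dd-]hh:]mm:ss) to a number of seconds.
# Hypothetical helper, written only to illustrate the check.
etime_to_seconds() {
    echo "$1" | awk '{
        days = 0
        rest = $1
        # A "-" separates the optional days component from the clock part.
        if (split($1, d, "-") == 2) { days = d[1]; rest = d[2] }
        # The clock part is mm:ss or hh:mm:ss; fold it into seconds.
        n = split(rest, t, ":")
        secs = 0
        for (i = 1; i <= n; i++) secs = secs * 60 + t[i]
        print days * 86400 + secs
    }'
}

# Succeeds when the etime in $1 is at least $2 seconds.
has_run_at_least() {
    [ "$(etime_to_seconds "$1")" -ge "$2" ]
}

# Example: etime_to_seconds "47-12:03:09" prints 4104189
```

With this, has_run_at_least "$etime" 86400 corresponds to the one-day rule described above, and a shorter limit (say 21600 for six hours) could be tried without changing the parsing.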
Jamf Restart should ensure Macs keep checking into Jamf, and will allow you to identify which Macs have had issues checking in, so they can be investigated further.
The LaunchDaemon:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.example.jamfrestart</string>
    <key>ProgramArguments</key>
    <array>
        <string>sh</string>
        <string>/Library/Scripts/jamfRestart.sh</string>
    </array>
    <key>RunAtLoad</key>
    <false/>
    <key>StartInterval</key>
    <integer>86400</integer>
</dict>
</plist>
The Script:
#!/bin/bash
# Days component of the elapsed time of the scheduled check-in process.
# ps prints etime as [[dd-]hh:]mm:ss, so this is empty until the process is at least a day old.
processRuntime=$(ps -ax -o user,pid,etime,args | grep "/usr/local/[j]amf/bin/jamf policy -stopConsoleLogs -randomDelaySeconds 300" | awk '{ print $3; }' | grep -o '.*[-]' | awk -F\- '{print $1}')
# PID of the same process.
processCheck=$(ps -ax -o user,pid,etime,args | grep "/usr/local/[j]amf/bin/jamf policy -stopConsoleLogs -randomDelaySeconds 300" | awk '{ print $2; }')
logLocation="/var/log/jamfRestart.log"

# Append a timestamped message to the log file.
scriptLogging(){
    DATE=`date +%Y-%m-%d\ %H:%M:%S`
    LOG="$logLocation"
    echo "$DATE" " $1" >> $LOG
}

if [ "${processRuntime}" = "" ]; then
    # No days component: the process is not running, or is less than a day old.
    scriptLogging "JamfBinary has not run for more than 1 day"
else
    scriptLogging "JamfBinary has run for ${processRuntime} days"
    scriptLogging "JamfBinary Process ID: ${processCheck}"
    scriptLogging "Quitting JamfBinary..."
    sudo kill -9 ${processCheck}
fi
exit 0
The Extension Attribute:
#!/bin/bash
# Look for a "JamfBinary has run for" entry in the last 10 lines of the log.
jamfRestart=`/usr/bin/tail -10 /var/log/jamfRestart.log | grep "JamfBinary has run for"`
if [ "$jamfRestart" != "" ]; then
    echo "<result>$jamfRestart</result>"
else
    echo "<result>No Restarts</result>"
fi
Hopefully this is of use to someone.
Posted on 08-12-2022 04:15 AM
I've been experiencing this for at least 5 months. I can't wait to give this a try!
Posted on 08-12-2022 04:34 AM
I may be totally glossing over something, but why are we not just rebooting the devices regularly? Set a FileVault authenticated reboot policy to run daily or weekly if the devices are unattended. Or make an EA to read uptime and reboot devices that have not rebooted in XYZ days. It just seems like a lot of hoops to jump through just to bounce daemons and agents, which restart anyway when you reboot a Mac.
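For reference, an uptime Extension Attribute along those lines might look like the sketch below. It assumes the macOS format of sysctl -n kern.boottime, which starts with { sec = <epoch seconds>, usec = ... }. The function names parse_boot_epoch and uptime_days are hypothetical, and the sketch only reports uptime in whole days; it does not reboot anything.

```shell
#!/bin/bash
# Pull the boot time (epoch seconds) out of `sysctl -n kern.boottime` output,
# assumed to look like: { sec = 1660225200, usec = 0 } Thu Aug 11 ...
parse_boot_epoch() {
    sed -n 's/^{ sec = \([0-9][0-9]*\),.*/\1/p'
}

# $1 = boot time, $2 = current time, both in epoch seconds.
uptime_days() {
    echo $(( ($2 - $1) / 86400 ))
}

# Report a result only when the boot time could be read.
boot=$(sysctl -n kern.boottime 2>/dev/null | parse_boot_epoch)
if [ -n "$boot" ]; then
    echo "<result>$(uptime_days "$boot" "$(date +%s)")</result>"
fi
```

A Smart Group on this value would still depend on the Mac submitting inventory, which is the limitation raised in the replies that follow.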
Posted on 08-12-2022 05:07 AM
The problem is that if the jamf process is stuck, then it won't ever check in to run any policies. Currently I have been reaching out to affected users (sometimes more than 100) and politely asking them to restart their computers. I only get about a 20% response rate with that. I like what @jamesandre has done here because it makes the computer fix itself. If we see repeated instances of it happening on a computer, then we can focus our support efforts only on those.
Posted on 08-12-2022 05:20 AM
Seems like a long way around to avoid forcing Macs to reboot on a schedule, preemptively, before any problems occur. I absolutely see the usefulness of something like this as a fallback, but the main solution should simply be: don't let the Macs get an uptime of more than a week or so.
On a side note, if SSH is enabled and you have a local admin account on the Macs that you have access to, you can just SSH into the devices and do whatever you need to. This may be helpful for the users who don't respond: just run sudo shutdown -r now on their devices. Not the nicest of solutions, but it's also not nice to ignore emails.
Posted on 08-12-2022 05:56 AM
The SSH solution is blocked by the fact that many users aren't on the local network. They are at home behind their NAT'd home routers. And if they are on the company LAN, they aren't checking in with Jamf to update their IP address. Getting users to restart their Macs is a never-ending battle. It doesn't help that the jamf process gets stuck from time to time. Implementing this will not be as simple as pushing the policy out to all computers. We will still need to reach out to users who are already in this state of non-check-in and get them to restart. But once that's done and they get the jamfRestart daemon and script, then my Forced Reboot policy can play a more active role in keeping all the systems functioning properly.
Posted on 08-12-2022 05:39 AM
I can say that most of the computers I am discovering this problem on have uptimes in excess of 30 days - some as high as 100 or more. I think the lowest uptime I've seen this affect was about a week. I am proposing to my management that I implement a forced reboot policy that will tell (not ask) the user that the computer will restart in 2 hours when it is detected that their uptime is greater than [TBD] days. The weak link in that is if the jamf process is stuck, then that policy will never run. Automating the restart of the jamf process is a great idea that I wish Jamf would implement on their own.
Posted on 08-12-2022 09:01 AM
Why not just create an uptime script that runs locally? If the device has been up for x number of days, then force a restart (which should fix your issue).
Posted on 08-12-2022 10:22 AM
I've considered that as well, but the threshold may not be etched in stone. This month, management wants to limit it to 30 days, next month they may change their minds and want to limit it to 14 days. It's a lot easier to change the script variable than it is to push out a new script. And there's no easy way to know which version of the script a Mac may have on it at any given moment.
Posted on 08-12-2022 12:59 PM
This is a bigger problem with "clients" not enforcing some sort of reboot/update schedule with hard dates. I have Macs that have not rebooted for MONTHS! Can we get client reps to let us do something? No. So then I just say "it's broken until you let us manage these Macs with common sense or you learn your peeps on proper computer use."
Nobody has the sack to say OK to any of it.
Posted on 08-12-2022 10:33 PM
Most of the Macs won't be reachable via SSH. I'm not going to regularly check a Smart Group for Macs that haven't checked in, then try and connect to them via SSH, then kill the Jamf binary. That sounds like too much work, I'm going to automate it.
Forcing everyone to restart after an arbitrary number of days is going to increase the number of calls to the Helpdesk. I do not want people contacting the Helpdesk because they have a deadline or presentation and want to cancel a forced restart. That's not fair on the Helpdesk or on the person using the Mac, and I want to ensure a good relationship between the two.
The Mac is working fine, the binary is the part that is causing the issue for me (and not the person using the Mac). I'm also not going to force everyone to restart when it may only be 10% of Macs affected, I want to keep people happy.
By implementing this method rather than waiting to restart every 7 or 30 days, I can ensure the Jamf binary checks in at least every day. If I have a patch for a zero-day vulnerability, then I want the Macs to be checking in at least every day. As long as security updates are getting applied, I do not have an issue with long uptimes. It also gives me visibility over which Macs are experiencing an issue with the Jamf binary.
Posted on 08-12-2022 10:41 PM
And if the Jamf binary would just time out when a process runs too long, that would be great. Maybe it could self-heal when it gets a Device Signature Error too. Then I could get on with managing Macs, rather than managing the thing that manages the Macs. 🤠
Posted on 08-16-2022 04:42 AM
@jamesandre I was going over the Extension Attribute code. Shouldn't the tail command look at the very last line instead of the last 10 lines?
tail -1
instead of
tail -10
Posted on 08-16-2022 02:21 PM
I'm looking for the log entry "JamfBinary has run for X days" which won't be the last line in the log file. It should be in the last 10 lines of the log file depending on how often you run an Inventory Update. You could increase it if you want a better idea of history.
I also have a Smart Group that looks for Macs where the Jamf Restart Extension Attribute is like "JamfBinary has run for".
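If the position of the entry in the log is a concern, one alternative is to search the whole log and keep only the newest match. This is a sketch of that variation, not the Extension Attribute from the original post. The function names last_restart_entry and report_result are hypothetical.

```shell
#!/bin/bash
logLocation="/var/log/jamfRestart.log"

# Print the newest "has run for" entry, wherever it sits in the log.
# Prints nothing if the log is missing or has no such entry.
last_restart_entry() {
    grep "JamfBinary has run for" "$1" 2>/dev/null | tail -1
}

# Wrap the entry (or a fallback) in the <result> tags Jamf expects from an EA.
report_result() {
    entry=$(last_restart_entry "$1")
    if [ -n "$entry" ]; then
        echo "<result>$entry</result>"
    else
        echo "<result>No Restarts</result>"
    fi
}

report_result "$logLocation"
```

The trade-off is that an old entry never ages out: a Mac that was restarted once months ago keeps reporting it, whereas the tail -10 version eventually falls back to "No Restarts".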
Posted on 09-05-2022 07:30 AM
Thanks for sharing, @jamesandre !
I'm just testing it in our environment, and will deploy it wider soon.
Just a heads up that there's a typo in your post;
This line:
A log file (/Library/Scripts/jamfRestart.sh) that captures that output of the script.
should be:
A log file (/var/log/jamfRestart.log) that captures that output of the script.
thanks again!
Posted on 09-05-2022 03:35 PM
Thank you! I read it over a million times and never spotted it. 😿
Posted on 09-13-2022 06:06 AM
This seems like a great idea
@jamesandre
I'm trying to wrap my head around how this will all be doable in my environment.
From the looks of it, we'll have to use Jamf Policy to push the launchdaemon using:
/bin/launchctl load /Library/LaunchDaemons/com.example.jamfRestart.plist
But, the problem is that anyone who's currently affected by a stuck jamf policy won't get this new policy.
In terms of deployment, is there something simpler I'm not seeing here?
Posted on 09-13-2022 08:52 AM
You are correct. That's the Catch-22 situation. To work around this, I've been reaching out to Macs that I can identify as not having checked in for a while. I let the users know there's a technical glitch on their Mac that needs to be addressed by a restart. Once they restart, then the policy runs to install the LaunchDaemon. So far it seems to work ok.
Posted on 09-18-2022 06:29 PM
Yeah, not the easiest. But generally a restart will allow the policy to install. I also set the policy to install on "Network State Change", as this seems to work as a separate process.
Posted on 01-10-2023 09:38 AM
hi @jamesandre,
"Jamf Restart will not fix the "Device Signature Error” which stops the Jamf binary from running, I have been testing the Jamf-Management-Framework-Redeploy API function for that."
Did the API function fix the error?
Posted on 03-01-2023 04:51 PM
As long as MDM commands are still working on the Mac, then yes, it will repair the Jamf binary and the Device Signature Error. I'm using the Jamf Heal Script for this.
Posted on 12-19-2022 02:58 AM
I added a signed configuration profile, created in iMazing Profile Editor, so that users are not able to disable the LaunchDaemon. Thought I'd share.
Open iMazing: Search for "Service Management - Managed Login Items"
Add label: com.example.jamfrestart
Save profile and sign it.
Upload to jamf and deploy.
This makes sure users can't disable it.
01-17-2023 06:22 AM - edited 01-17-2023 06:28 AM
@jamesandre
is there a way to test this out on a Mac that is working fine?
Posted on 03-01-2023 05:49 PM
In the script you could substitute the Jamf binary for an App that you've had open for more than a day, say
"/usr/local/[j]amf/bin/jamf policy -stopConsoleLogs -randomDelaySeconds 300"
for
"/Applications/Microsoft Outlook.app/Contents/MacOS/Microsoft Outlook"
Then run the script.
Posted on 01-27-2023 06:05 AM
Did anyone test this solution somehow?
@jamesandre, also, in your LaunchDaemon, the RunAtLoad key is set to false... wasn't it supposed to be true?
<key>RunAtLoad</key> <false/> <key>StartInterval</key>
And the EA, what is it for? Is it just a sort of sanity check, to see if the whole solution worked?
03-01-2023 05:06 PM - edited 03-01-2023 05:07 PM
You can RunAtLoad if you want.
The EA can be used to identify which Macs have had the Jamf binary running for more than 1 day. Use a Smart Group with the criteria JamfRestart | like | JamfBinary has run for. Then you can investigate why it is getting stuck... probably softwareupdated.
02-09-2023 06:51 AM - edited 02-09-2023 06:52 AM
I have a question, if you please. Do we upload the plist via a Profile, then create a policy with
/bin/launchctl load /Library/LaunchDaemons/com.example.jamfRestart.plist
But where do we put the script? Also in this policy?
Posted on 03-01-2023 05:09 PM
You install the script (say via a package) to the Mac in this location:
/Library/Scripts/jamfRestart.sh
It has to run on the Mac, as the JamfBinary may not be working and not checking in.
Posted on 04-25-2023 09:22 AM
I mashed together a single script that should create the files in the proper locations and start the LaunchDaemon. This makes it easier to install via a Jamf policy for newbies like me. You still need to set up the computer Extension Attribute for the reporting.
#!/bin/bash
cat << 'EOF' > /Library/Scripts/jamfRestart.sh
#!/bin/bash
processRuntime=$(ps -ax -o user,pid,etime,args | grep "/usr/local/[j]amf/bin/jamf policy -stopConsoleLogs -randomDelaySeconds 300" | awk '{ print $3; }' | grep -o '.*[-]' | awk -F\- '{print $1}')
processCheck=$(ps -ax -o user,pid,etime,args | grep "/usr/local/[j]amf/bin/jamf policy -stopConsoleLogs -randomDelaySeconds 300" | awk '{ print $2; }')
logLocation="/var/log/jamfRestart.log"
scriptLogging(){
    DATE=`date +%Y-%m-%d\ %H:%M:%S`
    LOG="$logLocation"
    echo "$DATE" " $1" >> $LOG
}
if [ "${processRuntime}" = "" ]; then
    scriptLogging "JamfBinary has not run for more than 1 day"
else
    scriptLogging "JamfBinary has run for ${processRuntime} days"
    scriptLogging "JamfBinary Process ID: ${processCheck}"
    scriptLogging "Quitting JamfBinary..."
    sudo kill -9 ${processCheck}
fi
exit 0
EOF
chmod 644 /Library/Scripts/jamfRestart.sh
chown root:wheel /Library/Scripts/jamfRestart.sh
cat << EOF > /Library/LaunchDaemons/com.example.jamfRestart.plist
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.example.jamfrestart</string>
    <key>ProgramArguments</key>
    <array>
        <string>sh</string>
        <string>/Library/Scripts/jamfRestart.sh</string>
    </array>
    <key>RunAtLoad</key>
    <false/>
    <key>StartInterval</key>
    <integer>86400</integer>
</dict>
</plist>
EOF
chmod 644 /Library/LaunchDaemons/com.example.jamfRestart.plist
chown root:wheel /Library/LaunchDaemons/com.example.jamfRestart.plist
/bin/launchctl load /Library/LaunchDaemons/com.example.jamfRestart.plist
Posted on 04-25-2023 10:32 AM
@MCfreiz Nice...just out of interest how can this be tested?
Posted on 07-26-2023 07:19 PM
Hi @jamesandre, do you think it's necessary at all to call a jamf recon and/or jamf policy once the binary has been killed?
I'm thinking of implementing this into my organization but just want to be sure there isn't anything additional I should throw in to get the devices talking back immediately.
Posted on 08-14-2023 07:17 PM
You can add that if you want, should not cause any harm.
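A minimal sketch of that addition, assuming the jamf binary lives at the path used throughout this thread. The JAMF_BIN variable and the post_kill_checkin function are hypothetical names; policy and recon are the standard jamf verbs for a check-in and an inventory update.

```shell
#!/bin/bash
# Path to the jamf binary; a variable so it can be overridden for testing.
JAMF_BIN="${JAMF_BIN:-/usr/local/jamf/bin/jamf}"

# Trigger a check-in and an inventory update right after the stuck process
# has been killed, instead of waiting for the next scheduled check-in.
post_kill_checkin() {
    if [ ! -x "$JAMF_BIN" ]; then
        echo "jamf binary not found at $JAMF_BIN" >&2
        return 1
    fi
    "$JAMF_BIN" policy
    "$JAMF_BIN" recon
}
```

In jamfRestart.sh this would be called on the line after the kill. Note that a check-in started this way could itself hang for the same underlying reason, in which case the next daily run would have to deal with it.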
Posted on 09-05-2023 11:30 PM
@jamesandre Just curious why you are choosing to look for:
/usr/local/jamf/bin/jamf policy -stopConsoleLogs -randomDelaySeconds 300
instead of just
/usr/local/jamf/bin/jamf policy
I get this message if I try to interrupt:
/usr/local/jamf/bin/jamf policy -event CLIENT_CHECKIN -stopConsoleLogs
Would it be better to look for just "jamf policy" so you catch more? Is there a downside? What if the binary gets hung during an inventory cycle? Should we also be looking for other actions like recon?
Posted on 09-11-2023 06:57 PM
Testing my memory here... I think "/usr/local/jamf/bin/jamf policy -stopConsoleLogs -randomDelaySeconds 300" was what I was seeing every time we had an issue.
Looks like it has changed to "/usr/local/jamf/bin/jamf policy -stopConsoleLogs -runOnQueue -randomDelaySeconds 300", so looking for "/usr/local/jamf/bin/jamf policy" might be a better option now.
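A sketch of that broader match, written as a filter so it can be tried against sample ps output before it is allowed to kill anything. It reads ps -axo pid,etime,args lines and prints the PID of every "jamf policy" process whose elapsed time has a days component of at least the given value. The function name stale_jamf_pids is hypothetical.

```shell
#!/bin/bash
# Reads `ps -axo pid,etime,args` output on stdin.
# $1 = minimum number of whole days the process must have been running.
stale_jamf_pids() {
    awk -v min_days="$1" '
        # Match any "jamf policy" invocation, whatever flags follow it.
        index($0, "/usr/local/jamf/bin/jamf policy") &&
        # etime is [[dd-]hh:]mm:ss, so a "-" means the process is at least a day old.
        split($2, parts, "-") == 2 &&
        parts[1] + 0 >= min_days { print $1 }
    '
}

# Intended use on a Mac:
#   ps -axo pid,etime,args | stale_jamf_pids 1
```

The awk process contains the search string in its own arguments, but it is only seconds old, so the days test excludes it. The downside of the broader match is that it also catches manually triggered or event-triggered policy runs; the one-day threshold is what keeps that safe.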
It doesn't seem to be such a big issue now, I'm not seeing issues with softwareupdated anymore.
Posted on 09-27-2023 06:21 PM
@jamesandre newbie here. Thank you for this, I have been searching for a solution to this problem for a while now. I am trying to implement your solution, as we have been running into this issue with about 30 or so Macs not checking in, and with the number of developers we have, scheduled restarts are not a viable option.
During testing I am failing to get the LaunchDaemon to load. When using the command sudo /bin/launchctl load /Library/LaunchDaemons/com.example.jamfRestart.plist I am getting the error: Load failed: 5: Input/output error.
I have verified that the .plist file and script are in the right places. Any ideas on what could be causing this?
Posted on 10-16-2023 04:23 PM
@jamesandre I'm seeing the same error as @bsmithAP : Load failed: 5: Input/output error
Any solutions for this?
10-25-2023 07:01 AM - edited 10-25-2023 07:10 AM
I'm just coming across this, because I'm also seeing a fair number of devices not checking in, even though the Macs are confirmed to be online.
I noticed a typo in your script. You define the variable for the log location as logLocation, but then in the function you are echoing out to a variable labeled $LOG
echo "$DATE" " $1" >> $LOG
Other than that, good work on this. I plan on doing some testing with it to see if it improves our situation.
Never mind! I see now that you define LOG by assigning it to $logLocation. All good!
As for those saying just reboot your Macs, well, yeah, that is ideal, but it's much harder to enforce in some environments than you might think. Things just aren't so cut and dry as that, so for now, this might help us out. Thanks again for posting it @jamesandre
Posted on 11-05-2023 02:52 PM
I hope it helps. I'm not able to edit the original post to update the script. I should have put it on GitHub... maybe one day.
Posted on 12-11-2023 05:12 AM
I don't know why, but I am also getting the same error as @bsmithAP:
Load failed: 5: Input/output error
Posted on 12-11-2023 05:19 AM
Expecting a LaunchAgents path since the command was ran as user. Got LaunchDaemons instead.
`launchctl bootstrap` is a recommended alternative.
Load failed: 5: Input/output error
Try running `launchctl bootstrap` as root for richer errors.
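The output above indicates the command was run as a regular user rather than as root, and it points to launchctl bootstrap. The sketch below shows that form for a LaunchDaemon in the system domain. It is an assumption, not something confirmed in this thread, that the same error under sudo comes from the job already being loaded, which is why the sketch unloads it first. The run and reload_daemon names and the DRY_RUN variable are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: with DRY_RUN=1 the command is only printed,
# otherwise it is executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = path to the plist, $2 = the Label inside that plist.
reload_daemon() {
    # Unload any copy that is already loaded; ignore the error if there is none.
    run /bin/launchctl bootout "system/$2" 2>/dev/null || true
    # Load the job into the system domain. This must be run as root.
    run /bin/launchctl bootstrap system "$1"
}
```

Run as root, the call would be reload_daemon /Library/LaunchDaemons/com.example.jamfRestart.plist com.example.jamfrestart. The label is all lowercase in the plist posted above, while the file name uses a capital R. It is also worth checking that the plist is owned by root:wheel with mode 644, as set in the combined install script earlier in the thread.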