launchd StartInterval problem

AVmcclint
Honored Contributor

OS X 10.9 and 10.10 Macs in a corporate environment. I finally have a mechanism in place that checks the uptime of the computer it runs on and, if the uptime is greater than 14 days, displays an AppleScript "nag screen" asking the user to shut down, restart, or cancel. If the user clicks Cancel, a StartInterval defined in the .plist runs the check again every 2 hours (7200 seconds) to prompt them again. It works wonderfully, except that when the uptime exceeds the prescribed number of days and the user keeps clicking Cancel, the StartInterval fires only 2 or 3 more times and then the job stops running automatically. If I set the StartInterval to something extremely short, like 60 seconds, it keeps running no matter how many times the user clicks Cancel. Once I set it longer than about 5 or 10 minutes, it runs only a couple of times before stopping.

Here is my LaunchDaemon plist, located in /Library/LaunchDaemons/. As you can see, all it does is run the up.sh script every 2 hours:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Disabled</key>
  <false/>
  <key>EnvironmentVariables</key>
  <dict>
  <key>PATH</key>
  <string>/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/opt/X11/bin:/usr/local/git/bin</string>
  </dict>
  <key>Label</key>
  <string>com.company.uptime</string>
  <key>Program</key>
  <string>/Library/Application Support/company/up.sh</string>
  <key>RunAtLoad</key>
  <true/>
  <key>StartInterval</key>
  <integer>7200</integer>
</dict>
</plist>

Here is the content of the up.sh script, which checks the uptime and, if it is greater than 14 days, runs the time.sh script:

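#!/bin/bash

# Extract the day count from the uptime output; empty if uptime is under one day.
up="$(uptime | sed -n -e 's/.* up \([0-9][0-9]*\) day.*/\1/p')"
if [ -n "$up" ] && [ "$up" -gt 14 ]; then
  # Quote the path because it contains a space.
  "/Library/Application Support/company/time.sh"
fi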
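A minimal sketch of the same check, split into small functions so the parsing can be tried against sample text without waiting 14 days. The function names `days_from_uptime` and `needs_restart` are hypothetical, and the sample line format is an assumption based on typical OS X `uptime` output.

```shell
#!/bin/bash

# Hypothetical helper: print the whole-day count found in one line of
# `uptime` output, or 0 when the line has no "N day(s)" field.
days_from_uptime() {
  local days
  days=$(printf '%s\n' "$1" | sed -n -E 's/.* up +([0-9]+) days?,.*/\1/p')
  printf '%s\n' "${days:-0}"
}

# Hypothetical helper: succeed when the uptime line shows more than the
# given number of days (default 14, the threshold used in up.sh).
needs_restart() {
  [ "$(days_from_uptime "$1")" -gt "${2:-14}" ]
}

# Intended use on the Mac itself (left commented out in this sketch):
#   if needs_restart "$(uptime)"; then
#     "/Library/Application Support/company/time.sh"
#   fi
```

Passing the uptime text in as an argument, rather than calling `uptime` inside the function, is what makes it possible to feed in a made-up line such as "up 15 days" while testing.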

And here is the content of the time.sh script, which displays a prompt with 3 buttons: Shut Down, Restart, and Cancel. Shut Down and Restart do exactly that. Cancel just makes the window go away ... until the next 2-hour interval, when the LaunchDaemon runs again:

#!/bin/bash
VARIABLE=$(/usr/bin/osascript <<-EOF
tell application "System Events"
    activate
    set question to display dialog "This computer has not been shut down or restarted for more than 14 days. Please restart as soon as possible to ensure that all security patches deployed to the computer are applied. This also helps the computer run at optimum performance. This message will be repeated every 2 hours until you restart." with title "company notice: RESTART YOUR COMPUTER" buttons {"Shut Down", "Restart", "Cancel"} with icon file "Blue-AV-Mark.icns"
    set answer to button returned of question
    if answer is equal to "Shut Down" then
        tell application "System Events"
            shut down
        end tell
    end if
    if answer is equal to "Restart" then
        tell application "System Events"
            restart
        end tell
    end if
    if answer is equal to "Cancel" then
        return
    end if
end tell
EOF
)

# VARIABLE holds osascript's output; do not execute it as a command.
exit 0

Like I said, for very short intervals where the user clicks Cancel, it seems to work just fine, but when I try to set the interval at 7200 seconds it only works a couple times when the user clicks Cancel. If I were to unload and then load the LaunchDaemon again, it will work again for a few times, but obviously that defeats the purpose of this mechanism. Does anyone have any ideas for why the longer StartInterval doesn't seem to work?

10 REPLIES

GaToRAiD
Contributor II

The first thing I would do when it stops prompting is check launchctl to see whether the daemon is even loaded. If it is still loaded, see whether launchd is having an issue calling it.

Also, just as a failsafe, if they hit Cancel, instead of just returning null I would reload the LaunchDaemon. That way it gets reloaded for sure.

AVmcclint
Honored Contributor

I've done that. It does still appear to be loaded. I've watched the system log in Console while I'm waiting for it to run, but I never see any signs of failure logged. And I would figure that if it was going to have a problem running, it would be an all-or-nothing proposition... not "run 3 times then fail"

mm2270
Legendary Contributor III

Similar to what @GaToRAiD mentioned, run
sudo launchctl list | grep "name_of_launchd"
once you see that it has stopped prompting. If there are errors with the LaunchDaemon, it should show an error code of some kind in the status column (second column) instead of a 0.
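As a rough sketch of reading that status column from a script rather than by eye: the `job_status` helper below is hypothetical and assumes the three-column PID, Status, Label layout that `launchctl list` prints.

```shell
#!/bin/bash

# Hypothetical helper: read `launchctl list` output on stdin and print the
# Status column (second field) for the job whose label matches exactly.
# Returns non-zero when no such job is listed, i.e. it is not loaded.
job_status() {
  awk -v label="$1" '$3 == label { print $2; found = 1 } END { exit !found }'
}

# Intended use on the Mac itself (left commented out in this sketch):
#   sudo launchctl list | job_status com.company.uptime
```

Matching the label exactly avoids the partial matches that a plain grep can produce when several jobs share a prefix.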

Just curious, but what is the reason for setting this in the launchd plist?

  <key>PATH</key>
  <string>/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/opt/X11/bin:/usr/local/git/bin</string>

Did you find it necessary to explicitly set a PATH setting for it to work?

AVmcclint
Honored Contributor

The PATH was inserted when I used Lingon to build it. I subsequently switched to LaunchControl for a much easier interface. I just left the PATH in there and didn't know if I should remove it or not.
Here's what launchctl list tells me:

sudo launchctl list | grep com.company
-   0   com.company.uptime

LaunchControl also indicates that it is loaded. I'm stumped. I figure it should either fail always or work always.

mm2270
Legendary Contributor III

Well, the next thing to do would be to use LaunchControl to add both StandardOutPath and StandardErrorPath keys to the launchd job. Specify the locations where you want the resulting logs to be written (or leave them at the default values). Unload and reload the job and let it run. It should send output to those files when it runs. If there are any errors, something should show up in the StandardErrorPath log file that may clue you in to what's going on.
It may turn out to be an issue with the fact that it's calling some AppleScript code. Maybe there is some limit there that we're just not aware of.
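A related approach, sketched here as an assumption rather than as what LaunchControl does, is to capture output from inside the script itself. The `run_logged` function and the default log path are hypothetical; this complements the StandardOutPath and StandardErrorPath keys and does not replace them.

```shell
#!/bin/bash

# Assumed log location; any path the job's user can write to would do.
LOG_FILE="${LOG_FILE:-/tmp/com.company.uptime.log}"

# Hypothetical helper: run a command, append its stdout and stderr to
# LOG_FILE with a timestamp, record its exit status, and return that status.
run_logged() {
  local rc
  {
    printf '== %s: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*"
    "$@"
    rc=$?
    printf '== exit status: %s\n' "$rc"
  } >>"$LOG_FILE" 2>&1
  return "$rc"
}

# Intended use inside up.sh (left commented out in this sketch):
#   run_logged "/Library/Application Support/company/time.sh"
```

Because each run is timestamped, the log also shows whether the job fired at the expected interval, which is the open question in this thread.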

AVmcclint
Honored Contributor

Hmm interesting. I set the StandardErrorPath and checked it after a few fails and discovered an error:

execution error: An error of type -10810 has occurred.

Googling indicated that this is an AppleScript error with a few causes, one of which is "This error sometimes appears when running a script that requires a GUI, but the script is being run as root." I took that as an indication that it might be better to run this as a Global Agent instead of a Global Daemon. I unloaded it, moved it to /Library/LaunchAgents/, and loaded it again with a StartInterval of 300 seconds (5 minutes). For the last hour it has been running consistently every 5 minutes after I click Cancel. I've changed the interval to 1 hour to test the longer time span. If this works, I'll post the full details in a Tips & Tricks article because I know something like this is greatly needed out there.
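One way to make that failure mode obvious, sketched under assumptions: a guard that refuses to show the dialog when the script runs as root, which is the LaunchDaemon case. The function name `running_as_root` is hypothetical and is not part of the original scripts.

```shell
#!/bin/bash

# Hypothetical guard: succeed when the given numeric user ID (default: the
# current effective user ID) is 0, i.e. the script is running as root.
running_as_root() {
  [ "${1:-$(id -u)}" -eq 0 ]
}

# Intended use at the top of time.sh (left commented out in this sketch):
#   if running_as_root; then
#     echo "time.sh needs a logged-in user's GUI session; not running as root" >&2
#     exit 1
#   fi
```

Exiting with a non-zero status should also make a misplaced job visible in the status column of launchctl list instead of leaving it at 0.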

mm2270
Legendary Contributor III

Yep, as I said (Edit: Actually I never did say this, but it was on my mind, so I was thinking it could be the culprit :), AppleScript being run in a root context, as you are doing with a LaunchDaemon, does not always work so well. It's hit or miss, as you have seen. It's part of the reason I've stopped trying to use AppleScript for asking for input or general user dialogs in many scripts. It's too iffy.

donmontalvo
Esteemed Contributor III

I was looking for something related to StartInterval and found this thread.

I believe a requirement for launchd is <key>ProgramArguments</key> must exist.

I see <key>Program</key> but not <key>ProgramArguments</key>, typo?

See Creating a launchd Property List File section on this page.

--
https://donmontalvo.com

benshawuk
New Contributor III

Have a look at the "SuccessfulExit" key:

<key>SuccessfulExit</key>
<false/>

If an app exits "successfully" (technically, with an exit code = 0) then the app will not be automatically restarted (kept alive).

mww
New Contributor

Hi AVmcclint,

Do you remember whether you ever resolved this problem? I'm having the same issue. I'd like my script to run every 4 hours (7200 seconds), but it seems pretty hit or miss as to whether/when exactly it will run.

I'd appreciate any tips or suggestions!