9.8 Upgrade Self Service Down

npynenberg
Contributor

Did an upgrade to 9.8 this morning, seemed to go smoothly.

Got a ticket about Self Service throwing up an error. I checked, and sure enough Self Service is erring out on every computer I tried.
Checked the Console log on the client:

9/18/15 9:09:19.597 AM Self Service[276]: [ERROR] -[JAMFBinaryCommunication notifyThatDaemonIsAbsent] (line:299) --> The daemon is not present

Any ideas? I finally gave up and tried restarting the JSS to no avail. Tried numerous policies in SS, nothing worked.

65 REPLIES

ryan_dean
New Contributor III

I am not seeing the same issue, just upgraded and Self Service is working for me. Have you tried restarting tomcat?

npynenberg
Contributor

Have tried restarting MySQL and Tomcat, and restarting the Mac OS-based server.

ryan_dean
New Contributor III

Did you shut Tomcat down manually before you did the upgrade? I have experienced issues when I did an upgrade and did not manually shut Tomcat down.

scottb
Honored Contributor

Saw that on both betas. Going to be testing 9.81 beta today looking for that in particular...

npynenberg
Contributor

No, I used to do that, but haven't done that in some time since I haven't had any issues. Good reminder though. Considering rolling back...

ryan_dean
New Contributor III

Is it only the OS X side of Self Service? I am only testing iOS and it seems to be working fine on my end.

infolasalle
New Contributor II

I updated the JSS from 9.73 to 9.8. I hope this update will correct my enrollment problem!

scottb
Honored Contributor

@ryan.dean - I only tested OS X in the betas - 10.10.5 and 10.11. Will try to look at iOS as well, but we don't support that yet so it's less of a priority...

JSS is a Windows 2k server, FWIW.

npynenberg
Contributor

We don't use iOS and Casper together, so couldn't tell you.

infolasalle
New Contributor II

OK, Self Service installed correctly and automatically, like a normal deployment. For iOS 9 you must install JSS 9.8 or newer.

Many thanks to all

npynenberg
Contributor

I decided to downgrade and moved back to 9.73.

emily
Valued Contributor III

You could try clearing the System and User cache on the client machine you're testing with.

Aziz
Valued Contributor

We're having the same issue. We manually stopped Tomcat before the upgrade and restarted the server, and everything went smoothly. Downloading stuff from Self Service fails, though, with no info in the client logs. Currently investigating.

JSS 9.8, Windows Server 2012, Java 8 Update 60.

So far it's pretty random. Works for some, not for others.

Aziz
Valued Contributor

No luck here, anyone else?

Edit: Support case opened, reverting back to 9.73 is also an option.

scottb
Honored Contributor

The policies I was having issues with (went through two betas for 9.8) I deleted today and re-created. They now work in the new beta. Don't know if that will help anyone, but it sorted things out for my issues so far with SS.

bollman
Contributor II

Ouch. We're seeing the same, and 25% of our computers are no longer contacting the JSS either. Double ouch.

tinsun
New Contributor II

I saw the exact same error as the OP, but all of a sudden it started working again. I also noted that no text was shown in Self Service's progress window. It's as if the JSS answered properly. Do we need to change thread count or something to prevent this from happening?

Aziz
Valued Contributor

@bollman None of my OS X 10.9.5 machines are contacting my JSS. Even after installing a quick add package via ARD.

@tinsun Looks like it's randomly starting to work again for us as well.

jordanfleuriet
New Contributor

We are on hosted JSS...new installs via Imaging are not running all of our policies correctly, though are receiving MDM profiles. Self Service errors out when trying to run anything.

bollman
Contributor II

Right, found a solution!
While looking at the LaunchDaemons created by jamf, we found one "extra", that should not be there:
bash-3.2$ ls -al
-rw-r--r-- 1 root wheel 757 22 Sep 11:10 com.jamfsoftware.jamf.daemon.plist
-rwxr--r-- 1 root wheel 474 22 Sep 11:10 com.jamfsoftware.startupItem.plist
-rw-r--r-- 1 root admin 537 22 Sep 11:10 com.jamfsoftware.task.1.plist
-rw-r--r-- 1 root admin 565 18 Sep 11:49 com.jamfsoftware.task.checkForTasks.plist

The one called "checkForTasks" was still running, but contained a reference to the old location of the jamf binary. The LaunchDaemon had in it: manage, and -removeLaunchDaemon.
My analysis tells me that this is a thing from the update to 9.8 that got left behind. The jamf binary had to remove itself and add itself again, and this is the part that did that, but since it got left behind, it was blocking stuff. So, I unloaded the daemon, but that did not help. Deleted the daemon and restarted, no go. But, then, after running jamf manage again, Self Service was working perfectly again. The computer is also connecting to the JSS at scheduled intervals so everything seems to be working fine!

Now, how did this happen? We suspect it has something to do with cases where we see the jamf daemon as "hung". A running, but hung, jamf process may have made the "self erase" fail and left this behind.

Anyways, we have a solution, but it involves having access to the computer, either remotely or physically.
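
For anyone who wants to check a machine for this without reading each plist by hand, here is a minimal sketch that lists jamf LaunchDaemons still mentioning the old binary path. It assumes the pre-9.8 location was /usr/sbin/jamf (the post above does not state the path), and the function name find_stale_jamf_daemons is made up for illustration.

```shell
#!/bin/bash

# ASSUMPTION: pre-9.8 location of the jamf binary; change it if yours differed.
old_jamf_path="/usr/sbin/jamf"

# Hypothetical helper: print the file name of each com.jamfsoftware.*.plist
# in the given directory (default /Library/LaunchDaemons) that still
# references the old binary path.
find_stale_jamf_daemons() {
    local daemon_dir="${1:-/Library/LaunchDaemons}"
    local plist

    for plist in "$daemon_dir"/com.jamfsoftware.*.plist; do
        # Skip the unexpanded glob when nothing matches.
        [[ -f "$plist" ]] || continue
        if grep -qF -- "$old_jamf_path" "$plist"; then
            basename "$plist"
        fi
    done
    return 0
}
```

Calling find_stale_jamf_daemons with no argument checks /Library/LaunchDaemons; empty output means no plist there mentions the old path.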

jordanfleuriet
New Contributor

@bollman good call!

Again, I'm on a hosted JSS and experiencing the same thing.

So far I've seen this exact com.jamfsoftware.task.checkForTasks.plist path issue on any machine that was imaged using pre 9.8 Casper Imaging. A machine I just imaged using the updated 9.8 Casper Imaging did not have the erroneous LaunchDaemon.

EDIT: Tested order of operations to fix
1. Delete com.jamfsoftware.task.checkForTasks.plist manually
2. Run: sudo jamf manage
3. Restart
Result: Self Service works, policies pushed from JSS now work

Any thoughts on how to automate this? Not sure how it could be done from the JSS given all old machines are not pulling policies.
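
One way to script the steps above, as a rough sketch and not a tested fix: the function below deletes the leftover plist and then runs jamf manage. The function name, its two optional parameters, and the default binary path /usr/local/bin/jamf are assumptions for illustration. It has to run as root, and since affected machines are not pulling policies it would need to be delivered some other way.

```shell
#!/bin/bash

# File name as reported in this thread.
stale_plist_name="com.jamfsoftware.task.checkForTasks.plist"

# Hypothetical helper.
# Usage: repair_jamf_management [daemon_dir] [jamf_binary]
# ASSUMPTION: defaults below match a 9.8 client; pass your own if they differ.
repair_jamf_management() {
    local daemon_dir="${1:-/Library/LaunchDaemons}"
    local jamf_binary="${2:-/usr/local/bin/jamf}"
    local stale_plist="$daemon_dir/$stale_plist_name"

    if [[ ! -f "$stale_plist" ]]; then
        echo "No stale LaunchDaemon found; nothing to do."
        return 0
    fi

    # Step 1: delete the leftover LaunchDaemon.
    rm -f "$stale_plist" || return 1

    # Step 2: re-run management so jamf sets up its LaunchDaemons again.
    "$jamf_binary" manage || return 1

    # Step 3: the restart is left to the admin or the user.
    echo "Removed $stale_plist_name and ran manage; restart to finish."
    return 0
}
```

The restart is deliberately not triggered by the function, so whoever runs it can decide when rebooting is acceptable.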

eagleone
New Contributor

@jordanfleuriet Where is the location of com.jamfsoftware.task.checkForTasks.plist?

scottb
Honored Contributor

@eagleone -

/Library/LaunchDaemons/

eagleone
New Contributor

OK...I'm still getting this issue on my upgraded machine (10.11 Beta). I also did not have the com.jamfsoftware.task.checkForTasks.plist launch daemon.

scottb
Honored Contributor

I think that really, you folks are going to have to wait for JAMF to release 9.81/9.82 to get where you want to be with 10.11. Remember, it's a beta - and things can/will change and JAMF is chasing their tails until it's finalized. Without saying a lot, the 9.81 beta is working here for most of the stuff I had trouble with on 9.8 betas...

boberito
Valued Contributor

Running sudo jamf manage seemed to fix 1 computer for me so far. No idea about any others yet.

jordanfleuriet
New Contributor

@scottb - this error is occurring for me on 10.10...

scottb
Honored Contributor

@jordanfleuriet - FWIW, I saw that too on 10.10.5 with both 9.8 betas. Not running 9.8 release.

The 9.81 beta works on both 10.10 and 10.11 so far.

powellbc
Contributor II

This is happening to me as well on a 10.10 client. Somehow I missed this thread before the upgrade last night. Running "sudo jamf manage" did not fix the issue for me; I had to follow the process @jordanfleuriet posted above.

My client seemed to be pulling policies, but if it cannot download the script, it would be hard to automate. Ugh.

rtrouton
Release Candidate Programs Tester

I upgraded to 9.8 today and set up a policy to run the following script on Macs that now had the 9.8 agent. The LaunchDaemon and accompanying script created by running this script verify that the Mac can communicate with the Casper server. Once communication is verified, the script takes the following actions:

  1. Runs jamf manage to enforce Casper management (which reportedly fixes this issue and isn't otherwise harmful.)
  2. Runs recon to send an updated inventory to the JSS, to report that the fix has happened.

In the event that the machine isn't checking in, this script could also be run via ARD or installed with a payload-free package.

#!/bin/bash

# If any previous instances of the postcasper98upgrade LaunchDaemon and script exist,
# unload the LaunchDaemon and remove the LaunchDaemon and script files

if [[ -f "/Library/LaunchDaemons/org.github.postcasper98upgrade.plist" ]]; then
   /bin/launchctl unload "/Library/LaunchDaemons/org.github.postcasper98upgrade.plist"
   /bin/rm "/Library/LaunchDaemons/org.github.postcasper98upgrade.plist"
fi

if [[ -f "/var/root/postcasper98upgradefix.sh" ]]; then
   /bin/rm "/var/root/postcasper98upgradefix.sh"
fi

# Create the postcasper98upgrade LaunchDaemon by using cat input redirection
# to write the XML contained below to a new file.
#
# The LaunchDaemon will run at load and every ten minutes thereafter.

/bin/cat > "/tmp/org.github.postcasper98upgrade.plist" << 'CASPER_POST_UPGRADE_LAUNCHDAEMON'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>org.github.postcasper98upgrade</string>
    <key>ProgramArguments</key>
    <array>
        <string>/bin/bash</string>
        <string>/var/root/postcasper98upgradefix.sh</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>StartInterval</key>
    <integer>600</integer>
</dict>
</plist>
CASPER_POST_UPGRADE_LAUNCHDAEMON

# Create the postcasper98upgrade script by using cat input redirection
# to write the shell script contained below to a new file.
#
# You will need to change the "jss_server_address" variable in the
# script below. Please put the complete fully qualified domain name 
# address of your Casper server.
#
# You may need to change the "jss_server_port" variable in the
# script below. Please put the port number of your Casper server
# if it is different than 8443.

/bin/cat > "/tmp/postcasper98upgradefix.sh" << 'CASPER_POST_UPGRADE_SCRIPT'
#!/bin/bash

#
# User-editable variables
#

# For the jss_server_address variable, put the complete 
# fully qualified domain name address of your Casper server

jss_server_address="casper.server.goes.here"

# For the jss_server_port variable, put the port number 
# of your Casper server. This is usually 8443; change as
# appropriate.

jss_server_port="8443"

CheckBinary (){

# Identify location of jamf binary.

jamf_binary=`/usr/bin/which jamf`

 if [[ "$jamf_binary" == "" ]] && [[ -e "/usr/sbin/jamf" ]] && [[ ! -e "/usr/local/bin/jamf" ]]; then
    jamf_binary="/usr/sbin/jamf"
 elif [[ "$jamf_binary" == "" ]] && [[ ! -e "/usr/sbin/jamf" ]] && [[ -e "/usr/local/bin/jamf" ]]; then
    jamf_binary="/usr/local/bin/jamf"
 elif [[ "$jamf_binary" == "" ]] && [[ -e "/usr/sbin/jamf" ]] && [[ -e "/usr/local/bin/jamf" ]]; then
    jamf_binary="/usr/local/bin/jamf"
 fi
}

CheckSiteNetwork (){

  #  CheckSiteNetwork function adapted from Facebook's check_corp function script.
  #  check_corp script available on Facebook's IT-CPE Github repo:
  #
  # check_corp:
  #   This script verifies a system is on the corporate network.
  #   Input: CORP_URL= set this to a hostname on your corp network
  #   Optional ($1) contains a parameter that is used for testing.
  #   Output: Returns a check_corp variable that will return "True" if on 
  #   corp network, "False" otherwise.
  #   If a parameter is passed ($1), the check_corp variable will return it
  #   This is useful for testing scripts where you want to force check_corp
  #   to be either "True" or "False"
  # USAGE: 
  #   check_corp        # No parameter passed
  #   check_corp "True"  # Parameter of "True" is passed and returned


  site_network="False"
  ping=`host -W .5 $jss_server_address`

  # If the ping fails - site_network="False"
  [[ $? -eq 0 ]] && site_network="True"

  # Check if we are using a test
  [[ -n "$1" ]] && site_network="$1"
}

CheckTomcat (){

# Verifies that the JSS's Tomcat service is responding via its assigned port.


tomcat_chk=`nc -z -w 5 $jss_server_address $jss_server_port > /dev/null; echo $?`

if [ "$tomcat_chk" -eq 0 ]; then
       /usr/bin/logger "Machine can connect to $jss_server_address over port $jss_server_port. Proceeding."
else
       /usr/bin/logger "Machine cannot connect to $jss_server_address over port $jss_server_port. Exiting."
       exit 0
fi

}

CheckLogAge (){

# Verifies that the /var/log/jamf.log hasn't been written to for at least five minutes.
# This should help ensure that jamf manage can run and not have to wait for a policy to
# finish running.

jamf_log="/var/log/jamf.log"
current_time=`date +%s`
last_modified=`stat -f %m "$jamf_log"`

if [[ $(($current_time-$last_modified)) -gt 300 ]]; then 
     /usr/bin/logger "Log has not been modified in the past five minutes. Proceeding." 
else 
     /usr/bin/logger "Log has been modified in the past five minutes. Exiting."
     exit 0
fi

}

UpdateManagementAndInventory (){

# Verifies that the Mac can communicate with the Casper server.
# Once communication is verified, it takes the following actions:
#
# 1. Runs jamf manage to enforce Casper management 
# 2. Runs a recon to send an updated inventory to the JSS to report
#    that the fix has happened.
#

CheckBinary

jss_comm_chk=`$jamf_binary checkJSSConnection > /dev/null; echo $?`

if [[ "$jss_comm_chk" -gt 0 ]]; then
       /usr/bin/logger "Machine cannot connect to the JSS. Exiting."
       exit 0
elif [[ "$jss_comm_chk" -eq 0 ]]; then
       /usr/bin/logger "Machine can connect to the JSS. Enforcing management and updating inventory."
       $jamf_binary manage -verbose
       $jamf_binary recon
fi
}

SelfDestruct (){

# Removes script and associated LaunchDaemon

if [[ -f "/Library/LaunchDaemons/org.github.postcasper98upgrade.plist" ]]; then
   /bin/rm "/Library/LaunchDaemons/org.github.postcasper98upgrade.plist"
fi
srm "$0"
}

CheckSiteNetwork

if [[ "$site_network" == "False" ]]; then
    /usr/bin/logger "Unable to verify access to site network. Exiting."
fi 


if [[ "$site_network" == "True" ]]; then
    /usr/bin/logger "Access to site network verified"
    CheckTomcat
    CheckLogAge
    CheckBinary
    UpdateManagementAndInventory
    SelfDestruct
fi
exit 0
CASPER_POST_UPGRADE_SCRIPT

# Once the LaunchDaemon file has been created, fix the permissions
# so that the file is owned by root:wheel and set to not be executable
# After the permissions have been updated, move the LaunchDaemon into 
# place in /Library/LaunchDaemons.

/usr/sbin/chown root:wheel "/tmp/org.github.postcasper98upgrade.plist"
/bin/chmod 644 "/tmp/org.github.postcasper98upgrade.plist"
/bin/mv "/tmp/org.github.postcasper98upgrade.plist" "/Library/LaunchDaemons/org.github.postcasper98upgrade.plist"

# Once the script file has been created, fix the permissions
# so that the file is owned by root:wheel and set to be executable
# After the permissions have been updated, move the script into the
# place that it will be executed from.

/usr/sbin/chown root:wheel "/tmp/postcasper98upgradefix.sh"
/bin/chmod 755 "/tmp/postcasper98upgradefix.sh"
/bin/mv "/tmp/postcasper98upgradefix.sh" "/var/root/postcasper98upgradefix.sh"

# After the LaunchDaemon and script are in place with proper permissions,
# load the LaunchDaemon to begin the script's execution.

if [[ -f "/Library/LaunchDaemons/org.github.postcasper98upgrade.plist" ]]; then 
   /bin/launchctl load -w "/Library/LaunchDaemons/org.github.postcasper98upgrade.plist" 
fi 

exit 0

johnklimeck
Contributor II

I am not seeing this issue with 9.8 in our DEV JSS environment (Red Hat Linux)

I am able to do Casper Imaging, run policies via triggers, and use Self Service 9.8, logged in via AD user name, and thus getting AD security groups (this has broken on upgrades in the past).

I am getting apps installed via Self Service, and I have an 8-10 GB Adobe CC 2015 install coming down and it is actually installing. Oftentimes when Self Service is funky, that is where I see issues.

Is there a specific thing within / related to Self Service?

I was going to update our PROD JSS tonight. Should we wait for 9.81?

Thx,

johnk

bpavlov
Honored Contributor

@johnklimeck If history indicates anything, JAMF will probably release 9.81 when OS X 10.11 releases which is currently slated for Sept 30th. That's 6 days from today.

johnklimeck
Contributor II

good point

johnklimeck
Contributor II

So, it looks like I was premature, now for whatever reason, I cannot install / run anything from Self Service, in 9.8

sudo jamf manage
Error installing the computer level mdm profile: profiles install for file:'/Library/Application Support/JAMF/tmp/mdm.mobileconfig' and user:'root' returned -915 (Unable to contact the SCEP server at “https://server.domain.com:8443/CA/SCEP”.)
Problem installing MDM profile.
Problem detecting MDM profile after installation.

johnklimeck
Contributor II

This is on a test 10.11 Mac, that's what 9.81 is probably for.

I may just wait for 9.81

Ghanbarzadeh
New Contributor

This is a known issue with the 9.8 binary which causes a partial migration from the old location to the new. When this issue is encountered, it causes the launchd services that we use to manage a machine to point to the old location after the executables have been migrated. We depend on these services for various things to work, such as recurring check-in and installing policies through Self Service. We have been able to reproduce the issue at JAMF and are working on a fix. In the meantime, if your affected machines have SSH enabled, the most efficient thing to do would be to use Casper Remote or a script to remote into each machine, one by one, and run the following commands:

rm -rf /Library/LaunchDaemons/com.jamfsoftware.checkForTasks.plist
jamf manage -rebootIfNeeded -deleteLaunchdTask

and then reboot the machine. This will ensure that the launchd services are pointing again to the correct location, and that they are reloaded. I apologize for the inconvenience, and thank you for your patience while we work to resolve the issue.

tinsun
New Contributor II

Hi again!
We can't see a common denominator for what has caused this: different OSs, different networks, different uptimes. Of over 900 machines, a minority have failed, but we don't have any exact numbers yet. I'm currently doing recon on all machines via Casper Remote, @bollman having added a new EA which checks for com.jamfsoftware.task.checkForTasks.plist.
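
For reference, an extension attribute of that kind could look roughly like the sketch below. This is not @bollman's actual EA; the function name and the Present/Absent values are assumptions. The only fixed convention is that the JSS reads the value printed between the <result> tags.

```shell
#!/bin/bash

# Hypothetical helper: report whether the leftover LaunchDaemon exists.
# The optional argument overrides the path, which is handy for testing.
report_stale_daemon() {
    local plist="${1:-/Library/LaunchDaemons/com.jamfsoftware.task.checkForTasks.plist}"

    if [[ -e "$plist" ]]; then
        echo "<result>Present</result>"
    else
        echo "<result>Absent</result>"
    fi
}

# As an extension attribute, the script just reports on the default path.
report_stale_daemon
```

A smart group on the value "Present" would then collect the affected machines, though only those that can still submit inventory.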

There are several problems with this method of checking, of course, since many computers last reported in from a WiFi or home network. We need to replace all IPs in the database with their registered fixed IPs to contact all of them when at work.

Also, the fix posted by @Ghanbarzadeh works, but we will need to ask all users to reboot their computers, since the reboot script will not run. We can't do a shutdown -r +1 on all computers without warning the users. At the same time, we can't really call 100-200 users (guesstimate) and ask them to reboot. Sure, we can throw up a Management Action that asks them to do it, but you know how users tend to ignore those messages.

Thing is, the problem is huge in itself, but it becomes an even bigger issue for us, since we sync the computers' Home Folders on a jamf trigger. This means that a big number of users no longer will receive file sync and backup. This has the potential to really hurt our reputation with the userbase.

bollman
Contributor II

Note!
The file is called: com.jamfsoftware.task.checkForTasks.plist
not com.jamfsoftware.checkForTasks.plist as Ghanbarzadeh stated.

ImAMacGuy
Valued Contributor II

I did a fresh reimage this morning (wipe and reload) of 10.11 and 9.8, and I had the SS issue. I tried @Ghanbarzadeh's commands and @bollman's version, rebooted, and neither way solved the issue for me.