Error running recon: Connection Failure

johnnasset
Contributor

With every policy that installs a package, we always have a handful of machines that will successfully install the package but fail to run a Recon post-install. These are not the same machines each time. Here is a sample log:

Executing Policy Adobe Flash 11.9.900.170...
Downloading Adobe Flash Player_11.9.900.170.pkg...
This package is a PKG or an MPKG, and the index.bom file is not found. Attempting to open the package as a flat package...
Downloading http://xxx.xxx.org/CasperShare/Packages/Adobe%20Flash%20Player_11.9.900.170.pkg...
Installing Adobe Flash Player_11.9.900.170.pkg...
Successfully installed Adobe Flash Player_11.9.900.170.pkg.
Running Recon...
Retrieving inventory preferences from https://xxx.xxx.org:8443/...
Locating accounts...
Searching path: /Applications
Locating package receipts...
Gathering application usage information...
Locating printers...
Locating software updates...
Locating plugins...
Error running recon: Connection failure: "The host xxx.xxx.org is not accessible."

If it was a network issue I would assume that the package would fail to install as well. Anybody else seeing this? We are on version 9.22.

92 REPLIES

corbinmharris
Contributor

I'm seeing this as well, more so with 9.22

Or I get: Executing Policy Update Inventory... Running Recon... Error running recon: Connection failure: "The request timed out."

andrew_stenehje
Contributor

FWIW, we often see these errors on machines that do not have proxy settings turned on, since we have a proxy in place.

ClassicII
Contributor III

What do you get when you try to check the connection to the JSS on that machine?

Or if you try to run a manual recon?

Or load https://xxx.xxx.org:8443 in a web browser?

sudo jamf checkJSSConnection
sudo jamf recon

yan1212
Contributor

I keep seeing this quite a lot since going to 9.21, and also in 9.22.

I have also started seeing quite a lot of errors with clients failing to connect to HTTP distribution points and failing over to AFP, which works OK. I don't yet know if this is related, though.

peo
New Contributor

Related to https://jamfnation.jamfsoftware.com/discussion.html?id=9041 ?
/Peo

bmak
Contributor

Error message is: Error running recon: Connection failure: "The request timed out."

I also tested a manual recon by ssh-ing in and running "sudo jamf recon -verbose" on a workstation that had failed to run a recon after a policy, and it ran recon perfectly fine.

I wonder if this is a bug...

Are people still experiencing this issue with the updated versions of the JSS, namely 9.3 and 9.31?

tnielsen
Valued Contributor

I too am seeing this very often. Almost daily on what seems to be random computers.

perkins
New Contributor

I am seeing the "Error running recon: Connection failure" message on random clients.

We are on JSS 9.3.

Manually running a recon works fine.

aamjohns
Contributor II

I would like to add I am seeing this more and more (currently I am on 9.31).

Connection failure: "The host <our host name>.xxx.xx.xxx is not accessible."

It just happened this morning to me with my computer. Right after I saw it in the log I ran

jamf checkJSSConnection

and that came back ok. I manually ran recon and that worked.

Also, I have a lot of systems that will not run recon as specified in policies. For example, a policy runs a software update package with 'update inventory' checked. The software installs, but no recon runs. It has become so prevalent that I have un-checked that option in the policy and I use an 'after' script to run recon.

Finally, I am getting this a lot now:

Installation failed. The installer reported: installer: Package name is <package name> installer: Upgrading at base path / installer: The upgrade failed (The Installer encountered an error that caused the installation to fail. Contact the software manufacturer for assistance.)

It may work on one computer and not another.

I wish I had a way to diagnose what is going on with all of these issues.
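
One rough way to get a picture of how often each failure occurs is to tally the recon errors in the client log. This is only a sketch: it assumes the client log lives at /var/log/jamf.log and contains the same "Error running recon" text that shows up in the policy logs, and the function name summarize_recon_errors is made up for illustration.

```shell
#!/bin/bash
# Sketch: count each distinct "Error running recon" message in a log file.

# Assumption: the client log is at /var/log/jamf.log. Pass another path as
# the first argument to read a different file (for example a copied log).
summarize_recon_errors() {
    local log_file="${1:-/var/log/jamf.log}"
    # Keep only the error text, group identical messages, most frequent first.
    grep -o 'Error running recon: .*' "$log_file" | sort | uniq -c | sort -rn
}
```

Calling summarize_recon_errors with no argument reads the assumed default path. It only counts messages, so it will not explain why a connection failed, but it can show whether timeouts or "not accessible" errors dominate on a given machine.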

tnielsen
Valued Contributor

Just to chime in. I am also seeing this daily on "random" computers. I'm wondering if the network connection is dropping during recon or maybe the user is unplugging their network cable. I'm not sure, but it's still happening often.

Using version 9.3.

mpermann
Valued Contributor II

I'm seeing the same thing here. On April 10th we upgraded from 8.73 to 9.3 and I started seeing this problem. We've since moved to 9.31 and the problem persists. I've seen this issue with kiosk computers that are connected with ethernet to our network and set to never sleep. I've had other, bigger problems with the upgrade to 9.3, so I haven't spent much time trying to figure out what is causing this one.

bentoms
Release Candidate Programs Tester

Add me to the list of those seeing this more since 9.3.

Haven't moved to 9.3.1 yet nor have I logged it or looked into it.

Anyone opened a ticket with support?

aamjohns
Contributor II

I have a number of test machines, some of which are exhibiting these symptoms, and they are awake and connected via Ethernet. Only some do this, and not necessarily all the time; I may get failures and then later the policy works.

tep
Contributor II

I am also seeing this, post 9.3 upgrade. I opened a ticket with support, but the problem is random enough that they couldn't isolate a source. I'm hoping that 9.31 will help.

bmak
Contributor

I have also logged a support ticket with JAMF, but because the problem is so random, we were not able to replicate it.

We did look at the system log file at the time of the occurrence and noted the following error messages:

May 12 02:56:58 XXXXXX kernel[0]: AFP_VFS afpfs_unmount: /Volumes/CasperShare, flags 0, pid 7486
May 12 02:56:58 XXXXXX kernel[0]: ASP_TCP Disconnect: triggering reconnect by bumping reconnTrigger from curr value 0 on so 0xffffff8013d8eb80
May 12 02:56:58 XXXXXX kernel[0]: ASP_TCP Detach: Reply queue not empty?

After googling, it seems that this problem is not isolated to OS X 10.9 (which we're using here) but also occurs in previous versions of OS X. I also recall reading that it could be due to Apple phasing out AFP in favour of another protocol.

bentoms
Release Candidate Programs Tester

Ours are http distribution points...

Are we all seeing this when deploying non-flat pkgs?

analog_kid
Contributor

I'm seeing this issue as well (policy executes fine except for recon submission). Running JSS 9.3.

wyip
Contributor

I'm on 9.31 and am seeing this too, both when a policy installs a package and with the policy that just runs the daily recon. I've also seen this when installing a flat pkg built with Composer.

I opened a ticket with support and we looked at my load balancer first (we're using pound). We tweaked some settings there, but I'm still seeing about a dozen timeouts a day (out of 2k+ managed systems). If I ssh to a system and manually run a policy or a recon, it works fine.

khoppenworth
New Contributor

Goodness, so glad I found this post. I thought something was going wrong with my server. I'm having the same issues. Something else I'm curious about is what server everyone is using? I saw some mentions of 10.9 server having connectivity problems, but I upgraded to 10.9(.3) at the same time I upgraded to Casper 9.31, so not sure if that's related...

mpermann
Valued Contributor II

@khoppenworth, we also upgraded our OS to Mac OS 10.9.x when we upgraded our JSS to 9.31. We were originally on Mac OS 10.7.5 and JSS 8.73 and didn't experience these issues. Is everyone that posted here using a version of Mac OS to host their JSS?

johnnasset
Contributor

10.9.3 server and 9.31 for the JSS for me. Still seeing this daily.

tjwolfui
New Contributor

We are also seeing this issue on a 9.3 JSS running Windows Server 2008 R2 with an HTTPS File Distribution Point, utilizing separate 10 Gb NICs for the JSS traffic and the HTTPS Distribution traffic.

This is a very random issue as well for us but it seems to rear its head more when we do a mass policy deployment of a piece of software.

Clients using our OS X 10.9.x AFP Distribution Points appear to have better results when sending their recon results back to the JSS. We believe this could be down to a difference in network capacity: our Primary HTTPS Distribution Point has a 10Gb NIC, but our AFP Distribution Points have only 1Gb NICs. In theory, the 1Gb Distribution Points distribute software more slowly than the 10Gb Distribution Point. Since each client has to finish downloading the software before it runs its recon, the slower downloads would space out the recon data being sent to the JSS, causing fewer errors when talking to the JSS.

Two thoughts my team has on this issue:
1. We are reaching a thread count limit when systems are reporting recon data to the JSS during mass deployments when we are utilizing our 10Gb Distribution Point
2. There is a bug in the JAMF binary on the client where it drops the connection for some reason while trying to send recon data to the JSS during a policy deployment

wyip
Contributor

I'm seeing this on Windows Server 2008 R2, but was thinking about migrating to Linux since our data center team just announced that they now support CentOS/RHEL/Ubuntu. I figured it was an idiosyncrasy of trying to run Tomcat & MySQL on Windows.

ImAMacGuy
Valued Contributor II

I see this too, a lot. Even on 9.32.

aamjohns
Contributor II

I'm on Windows Server 2008 R2 now running 9.32. Distribution is over HTTP. Now that I have quit using the 'update inventory' checkbox for recon and use an after script that runs recon, I am getting my inventories fine.

I still have to deal with this issue (no workaround for this one):

Installation failed. The installer reported: installer: Package name is <package name> installer: Upgrading at base path / installer: The upgrade failed (The Installer encountered an error that caused the installation to fail. Contact the software manufacturer for assistance.)

scottb
Honored Contributor

@aamjohns: I see that one too using Win 2008 R2 and SMB DP's. I always figured that this error was caused by an install of a newer version of an existing app that was "open". So if I have Word 14.3.9 open on a Mac and try to install 14.4.2, it will fail with that error. At least that's what I thought was the cause - could be totally off on that...
A lot of our install pkgs are served up only for techs to use during imaging, and normally they don't encounter open iterations. Might have to start adding scripts to kill the apps to see if that helps.

wyip
Contributor

@aamjohns Interesting, I'll try using a script to run recon instead of the checkbox to see if that helps.

I also see the "Installation failed" when installing a flat pkg over HTTP (actually a JDS), but MUCH less frequently than I see the timeouts. The install failed error comes up maybe once a month, whereas the timeouts happen at a rate of about a dozen or so a week.

aamjohns
Contributor II

@boettchs,
Good point. And I have started to investigate that. I have not yet determined that is the issue for certain, but it very well may be the case. I will be sure to keep that in mind and continue my investigation to see if that is indeed the issue.

I recently implemented a script that will not try to install certain updates if the application is open. In this case Adobe Acrobat Pro and Office 2011. I could force close the applications first, but I do not want to do that to our users. So the policy will try to run and if the apps are open, it reports that back and does not attempt the install. I also have these updates set to run at login, which should help with the issue, but so many of these people go long stretches without logging off and back on.
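
A check like the one described above could be sketched as follows. This is not the script from the post; it assumes `pgrep -x` matches the application's process name exactly, and the function names app_is_running and blocking_apps are hypothetical.

```shell
#!/bin/bash
# Sketch: refuse to install an update while any listed application is open.

# Assumption: pgrep -x matches the exact process name. The names you pass in
# should be checked against `ps` output on a real machine first.
app_is_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

# Prints a comma-separated list of the given apps that are currently running.
# Returns 0 when none are running (safe to install), non-zero otherwise.
blocking_apps() {
    local blocked=""
    local app
    for app in "$@"; do
        if app_is_running "$app"; then
            # Append the name, adding ", " only when the list is not empty.
            blocked="${blocked}${blocked:+, }${app}"
        fi
    done
    printf '%s' "$blocked"
    [ -z "$blocked" ]
}
```

In a policy script, the result of blocking_apps "Microsoft Word" "Adobe Acrobat Pro" could decide whether to go on to the install or to echo the open applications and exit, so the names show up in the policy log.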

@wyip,
I hope it works for you. It works very well for me.

scottb
Honored Contributor

@aamjohns: that's the worst part about "pushing" apps as opposed to using Self Service. Some people have 20 apps in the "login items"; some never logout/reboot, so certain hooks are useless too. You can't just nuke an app while they're using it so it's a balance of force and skills on our end. I don't have the advanced skills yet to do some things that would be nice - like telling a user that we need to upgrade X, Y, and Z so please quit them. Working on it, but there are sometimes no simple answers to what seem like simple problems. If we could rely on users to go into Self Service and download items, it would make a lot of this easier.

aamjohns
Contributor II

@boettchs,
Yes, I have all of this setup in self service, and if they went there and clicked on the item, it would tell them to close the applications first. And this method would probably work fine. But they are not doing it like I had hoped.

I agree about notifying people that there are things they need to do because we have updates that need to be installed. And I have been exploring the idea, like you mentioned, of coming up with a way to let people know this through some sort of dialog coming up. I intend to pursue that. Here lately I've been bogged down with other things I have to do but yes, I agree with you, notifying people that there are things that need to be installed, and how to do it.

tnielsen
Valued Contributor

Strangely, while I get this error still, it doesn't seem to be causing any problems. Policies work, self service works. Does anyone have something that ISN'T working because of this error?

jstandre
New Contributor III

We don't seem to be having any issues with it either. It seems like it is just a minor annoyance.

ShaunRMiller83
Contributor III

Not to revive a somewhat old thread but does anyone know if this has an actual defect number?

I am seeing the same thing under a Windows 2008 R2 JSS running 9.31 and was going to open a ticket under the defect number.

It's more of an annoyance for the MacAdmins who get the email alerts than anything else. As far as I can tell, everything is still working as expected.

kyle_jackson
New Contributor

Has anyone been able to solve this? It's been driving me crazy lately.

aamjohns
Contributor II

My solution is to run a shell script that runs recon and not use 'update inventory'.
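
A minimal sketch of that kind of script, for anyone who wants a starting point. It is not aamjohns's actual script. It assumes the jamf binary is on the PATH and exits non-zero when recon fails (not verified here), and the retry count and delay are arbitrary values added for illustration.

```shell
#!/bin/bash
# Sketch of an 'after' script: run recon directly instead of relying on the
# policy's 'update inventory' option, retrying if the run fails.

# Assumption: the jamf binary is on the PATH. Set JAMF_BIN to override.
JAMF_BIN="${JAMF_BIN:-jamf}"
# Assumptions: three attempts, 60 seconds apart. Adjust for your environment.
MAX_ATTEMPTS="${MAX_ATTEMPTS:-3}"
RETRY_DELAY="${RETRY_DELAY:-60}"

run_recon_with_retry() {
    local attempt=1
    while [ "$attempt" -le "$MAX_ATTEMPTS" ]; do
        echo "Recon attempt ${attempt} of ${MAX_ATTEMPTS}..."
        # Assumes a non-zero exit status means recon did not complete.
        if "$JAMF_BIN" recon; then
            echo "Recon succeeded."
            return 0
        fi
        # Wait before the next attempt, but not after the last one.
        if [ "$attempt" -lt "$MAX_ATTEMPTS" ]; then
            sleep "$RETRY_DELAY"
        fi
        attempt=$((attempt + 1))
    done
    echo "Recon failed after ${MAX_ATTEMPTS} attempts." >&2
    return 1
}
```

Add a call to run_recon_with_retry as the last line and set the script to run 'After' in the policy, with the 'update inventory' option left unchecked.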

chriscollins
Valued Contributor

We see this multiple times a day. Once I started ssh'ing in and running recon manually on machines that were hitting this a lot, it was always taking forever at gathering the application usage data, or not getting past it at all. We don't really use this data anyway, so we don't need it.

I turned off collecting it and I haven't seen the issue once since.

seabash
Contributor

From a purely manual recon-perspective (not re: policy, etc)...
I see all OS X clients throwing errors every time, though the JSS appears to get updated info from clients, so this may just be a binary status bug, not a functional bug.

The errors occur at the end of recon, when "Submitting data to http://myjssfqdn:8443/..."

The error is slightly different over Wi-Fi-only (802.1X, fyi)...
There was an error. Connection failure: "The network connection was lost."

Compare to error over LAN (w/ or w/out Wi-Fi)...
"There was an error. Message has no content."

JSS v9.5.2 (Test) running clustered on OS X 10.8.5 and behind HAProxy (fairly "stock" config).
FYI: checkJSSConnection always reports "The JSS is available."

SeanA
Contributor III

JSS 9.52
Windows 2008 Server R2

I am seeing this problem as well, mostly with recon. The problem occurs when "Gathering application usage information." While I am thinking that the "shell script (jamf recon) rather than update inventory" solution is best, I would also recommend that you look at the size of the application table (JSS > JSS Information > JSS Summary > check all the buttons > Create, and look for table sizes) just in case the table is too large.

jconte
Contributor II

I am seeing this on JSS 9.63 with AFP distribution points. One thing I am noticing is that it appears to be happening on machines connected via Remote Access. I can tell because the IP addresses being reported are not on the corporate network; they belong to the ISP.

Thanks