JSS 9.3 on Mac 10.8 = Java _appserver killing the system.

Chris_Hafner
Valued Contributor II

Alright. Since updating to JSS 9.3 (we've made each and every small jump since 9RC1) I've noticed (alright, I've been bashed over the head) Java's _appserver eating over 1000% CPU and running up inactive memory until the whole JSS locks up. Users utilizing "Self-Service" now cause a noticeable jump in CPU usage and enough RAM(inactive) so that if I don't watch it (think purge command) it will eventually lock up.

Running Casper Imaging will lock up the JSS within a few minutes. Imaging a dozen computers will lock up the machine (showing full inactive memory and _appserver CPU utilization) within seconds.

Anyone else seeing this type of behavior?

P.S. here's my setup
JSS 9.3 and primary DP on
Mid 2010 MacPro w/8GB RAM running 10.8.5 and server 2.2.2
The boot device happens to run the JSS's MySQL DB, but that's a nice OWC SSD.
The primary DP is running on an Areca internal RAID.

NetBoot and SUS on
Mid 2010 MacPro w/8GB RAM running 10.9.2 and server 3.1.1
Also running a very fast Areca internal RAID for NetBoot and SUS/Caching Server

In any event, I've tried most quick things off the top of my head and am probably about to call my friends over at JAMF but, I figured that I'd throw this out there as well!

1 ACCEPTED SOLUTION

Chris_Hafner
Valued Contributor II

And, an update. Last night I upgraded the JSS's OS from 10.8.5 to 10.9.2 AND Server from 2.2.2 to 3.1.1 (I'm not looking at it, but the latest version) and all seems well. Folks mentioning AFP problems were quite correct. The old OS, for some reason, wasn't letting go of any AFP connections, and so fun ensued. In any case, all is now well. I just wish I knew what triggered the issues on 10.8.5 in the first place. I had been running that server in that configuration for some time (several months). Oh well. In any event I wanted to let everyone know that this is solved for me!

33 REPLIES 33

galionschools
Contributor

Seeing it here as well. Installing iOS apps through Self Service shoots the usage up from 50-60 MB idle to the max, which I had to bump up from 1GB to 4GB. As soon as the CPU and memory usage maxed out, anything Casper related was inoperable. Had to force my field tech, who happened to be installing PARCC TestNav and AIRSecure Test, down to two iPads at a time, and even then it maxed out several times.

Running Casper on a 2009 Intel Xserve with a single CPU and 12GB of RAM. Nothing special, just a plain-jane Xserve.

Chris_Hafner
Valued Contributor II

That confirms a few suspicions of mine, thanks! I'm going to work with JAMF this morning and I'll post back what I find.

GabeShack
Valued Contributor III

We were seeing this on a mac mini running 10.7 with 16 gigs of ram. We then moved to a 6 core mac pro that is connected to two gigabit ports with 64 gigs of ram running 9.3 and 10.9.2 and the latest version of server 3.1.1. We initially had some trouble with the thunderbolt 2 promise array that we were using as a DP directly connected to the Mac Pro but then reverted to my older thunderbolt 1 promise array after seeing tons of errors. The java issue has since gone away and has not returned. I used to have to use the purge command in terminal to clear the runaway java process on the mac mini making it unresponsive like you stated (or just rebooting).
Not sure if this helps, just letting you know you're not alone...
Gabe Shackney
Princeton Public Schools

Chris_Hafner
Valued Contributor II

Interesting and thanks for the input. Any idea what was cleared up in the move?

Chris_Hafner
Valued Contributor II

P.S. I have this in with JAMF. Unfortunately I was only able to send them logs as I'm presently unavailable for remote support. I'm hoping to post their findings soon.

Chris_Hafner
Valued Contributor II

FYI Running the following command in MySQL

select device_id, count(*) from mobile_device_management_commands where apns_result_status != 'Acknowledged' group by device_id order by count(*) asc;

revealed the following.

+-----------+----------+
| device_id | count(*) |
+-----------+----------+
|       502 |        1 |
|       501 |        1 |
|       549 |        1 |
|       837 |        1 |
|       841 |        1 |
|       849 |        1 |
|       504 |        1 |
|       774 |        1 |
|       412 |        1 |
|       867 |        1 |
|       847 |        1 |
|       690 |        1 |
|       686 |        2 |
|       846 |        2 |
|       848 |        2 |
|       852 |        2 |
|       500 |        2 |
|        -1 |     7163 |
+-----------+----------+
18 rows in set (0.03 sec)

so I've got at least 7163 notifications waiting to be sent. We've cleared that out but that was just one issue amongst many. A lot of logging has been done and the fine folks at JAMF support are scouring the logs.
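For anyone who wants to try the shape of that query without touching a live JSS database, here is a rough sketch. It uses Python's built-in sqlite3 as a stand-in for MySQL, and the rows (including the 'Pending' status value) are made-up sample data, not anything from a real JSS. Only the table and column names come from the query above.

```python
import sqlite3

# In-memory SQLite stands in for the JSS MySQL database (an assumption for
# illustration only); the rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mobile_device_management_commands ("
    "device_id INTEGER, apns_result_status TEXT)"
)
sample_rows = (
    [(-1, "Pending")] * 5
    + [(502, "Pending"), (686, "Pending"), (686, "Pending"), (500, "Acknowledged")]
)
conn.executemany(
    "INSERT INTO mobile_device_management_commands VALUES (?, ?)", sample_rows
)

# Same shape as the query in the post: count unacknowledged commands per device.
pending = conn.execute(
    "SELECT device_id, COUNT(*) FROM mobile_device_management_commands "
    "WHERE apns_result_status != 'Acknowledged' "
    "GROUP BY device_id ORDER BY COUNT(*) ASC"
).fetchall()

total_pending = sum(count for _, count in pending)
in_minus_one_bucket = dict(pending).get(-1, 0)
print(pending)
print(f"{in_minus_one_bucket} of {total_pending} pending commands have device_id -1")
```

Summing the counts this way makes it easy to see how much of the backlog sits under device_id -1 versus real devices.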

Chris_Hafner
Valued Contributor II

A bit of an update. It looks like AFP itself may be the culprit. Why that was triggered by the upgrade to 9.3 I don't know. I've switched my secondary DP (running via SMB) to the primary spot until I can upgrade the JSS (which should happen tonight). I will report back soon on this. FYI

mpermann
Valued Contributor II

@Chris_Hafner, so the hope is that upgrading the OS on your JSS to 10.9.2 will correct the JAVA issue you're having? I'll be interested to hear if that works. We currently are planning to upgrade to version 9.3 on our production system which has Mac OS 10.7.5 tomorrow. So I'll be very interested to hear how this works out for you. It sounds like @gshackney was running a similar setup to ours prior to his switch to a Mac Pro with 10.9.2. I'm wondering if an OS upgrade would have solved his problem.

Chris_Hafner
Valued Contributor II

Ahhh yes. So let me help clean up my explanations. I've come around to the following diagnosis.

• AFP connections to the JSS are not being released. This causes many bad things, of course. The first is that the computer runs up "inactive memory" but cannot release it, as the AFP connections are still open. Once memory fills, the Java _appserver eats up 1000%+ CPU and the server effectively locks up. The box still runs, but the services are toast.
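A quick way to check whether AFP sessions are piling up is to count established connections on TCP port 548 (the AFP port) in the output of `netstat -an`. The sketch below is only an illustration: the function name and the sample text are made up, and it assumes the OS X netstat layout where the port is appended to the address after a dot.

```python
def count_afp_connections(netstat_output: str) -> int:
    """Count ESTABLISHED TCP sessions whose local port is 548 (AFP)."""
    count = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local address,
        # foreign address, state. Skip headers and anything shorter.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local_addr, state = fields[3], fields[5]
        if local_addr.endswith(".548") and state == "ESTABLISHED":
            count += 1
    return count


# Made-up sample in the style of `netstat -an` on OS X.
SAMPLE = """\
Active Internet connections (including servers)
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
tcp4       0      0  10.0.0.5.548           10.0.0.21.49822        ESTABLISHED
tcp4       0      0  10.0.0.5.548           10.0.0.22.50110        ESTABLISHED
tcp4       0      0  10.0.0.5.8443          10.0.0.21.49901        ESTABLISHED
tcp4       0      0  *.548                  *.*                    LISTEN
"""

print(count_afp_connections(SAMPLE))
```

If that number keeps climbing after clients have finished imaging or installing, it would be consistent with connections not being released.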

I've been doing some database cleanup and the like, helping clear out any built-up APNs notifications as I've mentioned before. I had over 7k sitting in there the other day, and now we're floating around the 80ish range after about 24 hours of watching.

That said, There are folks running 9.3 on 10.9.2 and NOT having this issue. Regardless, I should be able to verify this as a solution by tomorrow morning.

OH, 10.7.x is apparently known for this type of AFP issue. I would contact JAMF support BEFORE upgrading to any version of 9 on a 10.7 box. From what I'm reading, 10.9 seems to have the fewest issues when running 9.x. Personally I was waiting to do the upgrade until this summer, but now my hand seems forced. I'll let you know.

GabeShack
Valued Contributor III

I think a lot of the AFP problems are more inherent to 10.9 than to Casper 9.3. I had a bunch of issues to overcome with the file share settings on our DPs and making AFP the ONLY default, since Server wants to use both SMB and AFP by default.

I can't remember if I had Java issues when we migrated our JSS from the Mac mini running 10.7.5 to the MacPro running 10.9.2 (before we upgraded to Casper 9). I would do this step first and fix any issues before moving to Casper 9.

We definitely experienced many problems with the upgrade (especially with imaging) and I would thoroughly test your settings in a closed environment for a few weeks (or months lol) before going live with Casper 9.

Again I'm not sure if the OS update solved the java issue or the added hardware of the MacPro with extra ram solved this. But I do know that Casper 9 handles its data differently and may not use the same amount of ram for these processes.

Gabe Shackney
Princeton Public Schools

GabeShack
Valued Contributor III

PS once you upgrade the JSS to 9.x you can't revert back without a lot of headaches and I mean a loooottt of headaches.

Found this out the hard way....

Gabe Shackney
Princeton Public Schools

mpermann
Valued Contributor II

Thanks for the response @Chris_Hafner. I had contacted JAMF about upgrading to 9.3 and they didn't indicate any potential issues such as this. I'll contact them again for further guidance before the upgrade. Do you have HTTP downloads enabled or are you using AFP exclusively? We're using AFP and HTTP. So I would expect all of our Self Service policies would be using HTTP so maybe we wouldn't run into the problem with Self Service stuff but I would imagine we would encounter the issue when doing imaging.

Chris_Hafner
Valued Contributor II

Presently I am using AFP and SMB for various reasons. Mostly because we block SMB at the firewall but not AFP. Thus, my primary DP is AFP and the secondary is SMB. In three years I've not quite felt the need to turn on HTTP downloads, though this is tempting me. Besides, resumable downloads would be neat. So far, I've been fortunate enough not to have had issues like this in the past.

mpermann
Valued Contributor II

Thanks for your reply @gshackney. I've emailed my TAM about the issues discussed in this thread for some guidance on what we should do. We want to be able to start using Apple's DEP and VPP2 programs so we have to upgrade to the new version to get those features. I just don't want to give myself a whole lot of new problems to deal with if I can help it.

mpermann
Valued Contributor II

@Chris_Hafner, thanks for clarifying that you don't have HTTP downloads enabled. That's useful information. The added benefits that HTTP downloads brings are nice. But, if you've gone this long without needing them then I'm not sure I'd want to turn them on either. Why mess with something that's working.

Chris_Hafner
Valued Contributor II

9.3 is working well enough for me FYI... though there are new and different issues still regarding profiles that I'm not liking. Then again, I've run 9.x in production every step of the way!

lsmc08
Contributor

@Chris_Hafner and the rest of the folks here... this issue is primarily attributed to afp. On my test JSS, 16GB, i7, running both 10.8.5 and then 10.9.2 under JSS 8.73 and now 9.3, the inactive memory issue is unbearable. Separate from a Casper/JSS perspective, I even mounted a share hosted on this box to several clients and copied about 10GB of data simultaneously on 5 clients, and while the copying was taking place, yep again, the RAM was swallowed up leaving the system with only 200+ MB of free RAM.

For my production JSS (8.73 & 10.9.2 with 16GB of RAM), I ended up removing DP services altogether. I have an SMB DP on a Windows Server box. I then configured a secondary DP on an OS X box to balance things out on different floors when executing policies with large files. On that secondary box, I'm running the memory cleaner app. The app is not ideal, but I don't have to constantly purge the box when inactive memory gets high.

@Chris_Hafner thank you for posting and if you find a better solution on afp or see improvements, let us know.

Hopefully Apple fixes this with 10.9.3 and a new version of the Server app.

Chris_Hafner
Valued Contributor II

How interesting. I may bring the box down to bare metal, only because I happen to KNOW that others are running 10.9.2 "server" with Casper 9.3 on AFP without issue. Regardless, I can go to SMB with HTTP downloads just as well. Bleh... more and more I wish I had gone the ..nux route from the beginning!

In any event I'll keep you posted. I've got a KVM test environment but DAMNIT I feel like making this work.

were_wulff
Valued Contributor II

Hi @lsmc08 & @Chris_Hafner,

Thanks for the update on that! We did know there were occasional issues with AFP and memory use in 10.7.5 and 10.8.x, but hadn’t heard reports of it happening still in 10.9.x.

I haven’t yet been able to make it happen on my 10.9.2 test server (Just my luck, the one time I CAN’T break something is the time I want to!), which also hosts my main test JSS Distribution Point, but I wonder if we go into Server.app >> File Sharing >> Open up the properties for the Casper Share >> uncheck “Share over AFP” >> Check “Share over SMB” >> Ok.

Then, go into the JSS >> Computer Management >> File Share Distribution Points >> Edit the DP >> File Sharing >> Change Protocol to SMB >> Save.

In my test environment, I saw a slight drop in the amount of memory the Java _appserver was using after turning off the Share over AFP option, but I’m leery of putting too much stock into that as I haven’t yet been able to reproduce the main issue of Java spiking with its memory use while AFP was enabled.

If you get better/different results, please let us know.

Thanks!

Amanda
JAMF Software Support

Chris_Hafner
Valued Contributor II

Just so you know, this issue became immediately apparent for me when it started. (Please bear in mind that my primary DP resides on the same box (different volume) as the JSS.) Imaging a single machine would bring it to a crawl.

were_wulff
Valued Contributor II

@Chris_Hafner,

Mine is steadfastly refusing to do the same thing.

10.9.2 (13C64) Mac Mini
16GB RAM
Share is on the same Mini on the same drive, same partition.

I've been trying to make it chew up the memory via Netboot and regular imaging, but it keeps behaving and working like we'd expect it to work: By not chewing up all available memory and throwing everything into a tailspin.
Mine seems to be working over either AFP or SMB, which isn't exactly helpful when I'm trying to make it exhibit the same memory eating behavior. I certainly do know that it happens, we saw it happen on your server, and others here report seeing it, I just wish my own server would misbehave for a change to make the troubleshooting for other workaround options a little easier. :)

That's part of what's made me curious about whether or not unchecking the "Share over AFP" box in server app might help for Mavericks servers where AFP just isn't behaving at all. If OS X still has issues in 10.9.x with AFP, which it sounds like it does, it's possible that turning off AFP in the Server.app and just checking the Share Over SMB might help.

Amanda
JAMF Software Support

Chris_Hafner
Valued Contributor II

Alright... how about this. Is anyone who's experiencing this issue also seeing the following type of error in console
com.apple.launchd (com.jamfsoftware.jamfdsenroll[808]) exited with code 1

On top of that, do you use JDS? I do not, yet I'm getting these JDS garbage errors. I'm looking for some form of corroboration here.
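If you want to check your own logs for the same thing, a small script can pull out the launchd job label and exit code from lines like the one above and tally them. This is only a sketch: the regex is written against the line as quoted here (and tolerates an optional colon), the function name is made up, and the sample lines other than the jamfdsenroll one are invented for the example.

```python
import re
from collections import Counter

# Matches e.g. "(com.jamfsoftware.jamfdsenroll[808]) exited with code 1".
# Optional colons are allowed because log formatting can vary.
EXIT_PATTERN = re.compile(
    r"\((?P<label>[\w.]+)\[(?P<pid>\d+)\]\)\s*:?\s*exited with code:?\s*(?P<code>\d+)",
    re.IGNORECASE,
)


def tally_failed_jobs(log_text: str) -> Counter:
    """Count non-zero exits per launchd job label found in log_text."""
    failures = Counter()
    for match in EXIT_PATTERN.finditer(log_text):
        if int(match.group("code")) != 0:
            failures[match.group("label")] += 1
    return failures


# First line is the error quoted in the post; the others are made up.
SAMPLE_LOG = """\
com.apple.launchd (com.jamfsoftware.jamfdsenroll[808]) exited with code 1
com.apple.launchd (com.jamfsoftware.jamfdsenroll[912]) exited with code 1
com.apple.launchd (com.example.other[77]) exited with code 0
"""

print(tally_failed_jobs(SAMPLE_LOG))
```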

Chris_Hafner
Valued Contributor II

And, an update. Last night I upgraded the JSS's OS from 10.8.5 to 10.9.2 AND Server from 2.2.2 to 3.1.1 (I'm not looking at it, but the latest version) and all seems well. Folks mentioning AFP problems were quite correct. The old OS, for some reason, wasn't letting go of any AFP connections, and so fun ensued. In any case, all is now well. I just wish I knew what triggered the issues on 10.8.5 in the first place. I had been running that server in that configuration for some time (several months). Oh well. In any event I wanted to let everyone know that this is solved for me!

RobertHammen
Valued Contributor II

Hmmm… explains the issues I was seeing with my test server (10.8.5/Server.app 2.2.2), which I finally upgraded from 8.73 to 9.3 early this week. Server would go "out to lunch" and be relatively unresponsive (would respond to pings, couldn't VNC into it, could ssh and get a login prompt but never a shell, etc.). I ended up building a new server on Mavericks 10.9.2/Server.app 3.1.1, copied over the CasperShare and restored the DB, and all is well so far, but it's only been a couple of days...

galionschools
Contributor

I'm guessing our issues are database related since we don't currently manage our OS X fleet with Casper. We don't have any distribution points set up at all either.

The server is running 10.8.5. It does serve other services such as ASUS and NetBoot (DeployStudio), but our memory/CPU usage issues correlate with iOS Self Service usage. We haven't had the issue this past week, but I'm not declaring it gone.

cdenesha
Valued Contributor III

@galionschools Are you pushing the testing apps as In-House apps? Are they stored in the DB or are they on a web server?

lsmc08
Contributor

@amanda.wulff, on my 9.3 test JSS, 10.9.2, Server 3.1.1, I tried your SMB suggestion for the DP... same results... imaged 1 machine and from 10GB... 3 minutes into the imaging/copying process, the available RAM went down to 1GB.

On a semi-related note, on my 8.73 production JSS, with no DP at all, 10.9.2, Server 3.1.1, and afp OFF... with 16GB of RAM, every morning the box shows like 5+GB of inactive RAM. I run sudo purge... the box behaves well for the rest of the day. The next morning all repeats again.

Just throwing it out there: the only things I can think of that heavily tax the JSS box overnight are the backup and the machines reporting to the JSS... not sure then why the box shows like 5+GB of inactive RAM every morning?
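To put a number on that morning build-up instead of eyeballing Activity Monitor, the inactive figure can be read out of `vm_stat`, which reports page counts along with the page size. A minimal sketch, assuming the usual vm_stat text layout; the function name is made up and the sample numbers are invented to land on roughly 5GB:

```python
import re


def inactive_megabytes(vm_stat_output: str) -> float:
    """Return inactive memory in MB from the text printed by `vm_stat`."""
    page_size = re.search(r"page size of (\d+) bytes", vm_stat_output)
    inactive = re.search(r"Pages inactive:\s+(\d+)\.", vm_stat_output)
    if not page_size or not inactive:
        raise ValueError("unrecognized vm_stat output")
    # pages * bytes per page, converted to megabytes
    return int(inactive.group(1)) * int(page_size.group(1)) / (1024 * 1024)


# Made-up sample in the style of vm_stat output.
SAMPLE = """\
Mach Virtual Memory Statistics: (page size of 4096 bytes)
Pages free:                          51200.
Pages active:                       900000.
Pages inactive:                    1310720.
Pages wired down:                   400000.
"""

print(inactive_megabytes(SAMPLE))
```

Logging that value on a schedule overnight would show whether the growth lines up with the backup window or with machines checking in.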

cdenesha
Valued Contributor III

Weird.. inactive RAM is memory that was used, is no longer being used, but is saved in case the application requests it again. It is essentially 'Free' memory.

Are you seeing low performance that gets better after the memory purge? If you don't do the purge, does the box not 'behave well'? Seems strange that it would be the Inactive Memory. Unless something has changed with Mavericks (I'm speaking of how the OS has worked up until ML).

thanks,

chris

galionschools
Contributor

Not in-house apps they're App Store apps. They aren't stored onsite either.

Chris_Hafner
Valued Contributor II

P.S. @amanda.wulff I just wanted to take a moment to thank you for all the work you've done with all of us on this. Amanda (along with a few others) was working with us on this issue the entire time on top of her posting here. I didn't put most of those conversations up because I, for some reason, didn't realize that you were the same Amanda that I was working with (Even though you said so). So, here's to you Amanda. You ROCK!

@lsmc08 OK, So now that I'm running both Casper related servers on 10.9.2, I've had several observations on RAM utilization and usage/reporting. First of all, are you experiencing any issues on the Mavericks servers even when they say that all Memory is used? Both of my boxes report using just about 100% of the memory, with almost no memory pressure. I have no idea what this means but my graphdat is showing my memory utilization at around 50%. I'm not sure how Apple really defines memory usage and memory pressure. I'm sure it's out there. I just wonder if it's finally doing what it's advertised, i.e. the ability to intelligently utilize all RAM without adverse performance.

were_wulff
Valued Contributor II

@lsmc08 - Thanks for the update on that! I’ll jot it down in the notes I’ve got for this particular issue.

@Chris_Hafner - Thanks! I tend to use the JAMF Nation forums a lot for troubleshooting, as well as giving Technical Account Managers a heads up if I see one of their customers having an ongoing issue. A lot of times there’s something here that someone else has done or tried that I’ve either overlooked or wouldn’t have even thought to look at.

cdenesha
Valued Contributor III

@Chris_Hafner

Alright... how about this. Is anyone who's experiencing this issue also seeing the following type of error in console com.apple.launchd (com.jamfsoftware.jamfdsenroll[808]) exited with code 1 On top of that, do you use JDS? I do not, yet I'm getting these JDS garbage errors. I'm looking for some form of corroboration here.

I'm seeing this on my test box but not on live. I just wiped out my test box and started over as I used a favorite app LaunchControl (http://www.soma-zone.com/LaunchControl/) to view how that launchd entry was configured. I discovered /usr/bin/jamfds, and also /Library/JDS. I do not use JDS.

It turns out the 9.25 JSS installer installed it by default. It is now gone on my test box.

chris

Chris_Hafner
Valued Contributor II

I like it. Fortunately it was solved for me by the time I ended up at 9.32. Thanks for the update though! It's going to be useful to many folks I'm sure!