JSS issues after a period of idleness

tlarkin
Honored Contributor

It seems that every year, after we come back from winter break, the JSS freaks out. I have to flush logs, reapply Tomcat memory allocations, reboot the servers/distribution points, and try various other tricks to get it back up and running. It won't parse auto run data, and it asks for authentication when netbooting to image machines....

Anyone else experience this stuff? It was working 100% before break with no issues; now it has been idle for 2.5 weeks, and when I come back it doesn't work right.

Thoughts/experiences?

-Tom


Kedgar
Contributor

Tom,

I'm not sure how your server is set up, so it is a bit difficult to point out any possible causes. I have not yet experienced any major issues like this with my JSS; however, I've only been running it for a short while.

Are you running any other services or apps from your JSS? In our environment, my JSS also hosts Symantec Management Console (a PHP web app and MySQL database).
Do you see anything out of the ordinary in the log files during this time?
Does your server rely on mounting an external share, or is an essential service hosted on another box? My JSS connects to a RHEL5 MySQL server for its database.

I ask these questions because it may be possible (depending on your configuration) that another application or server may be the cause of the issue.

tlarkin
Honored Contributor

The JSS is a standalone Xserve with 8 gigs of RAM running just one OS X Server service, which is AFP. Tomcat is running, but that is third party (JAMF), and so is MySQL. The only thing this server does is act as the JSS. I have 4 gigs of RAM dedicated to Tomcat, and it is running in 64-bit mode. My server setup is very simple and straightforward.

I do have 6,000+ clients connecting to this server and checking in every day via the casper binary, and my MySQL database is about 6 gigs in size. It is a dual quad-core CPU setup, and I have hardly ever seen the CPUs spike over 300% usage (with eight cores the ceiling is 800%, so that is under 40% of the CPU).

It seems that every year (this is my third year now) when I come back from break, nothing wants to work right with Casper, and it takes a few days of me restarting services for it to work again. I'm not even sure this is related to Casper; I guess it could be infrastructure related, but it happens almost every time after winter or even spring break. It seems that anytime it goes 1+ weeks idle with no users connecting to it, it doesn't want to run right when they come back.

Not a clue as to why this happens.

dustydorey
Contributor III

We have an almost identical setup to Mr. Larkin's, with about 5k clients,
and we do experience some of the same issues. Database related, maybe?

Dustin Dorey

Technology Support Cluster Specialist

Independent School District 196

Rosemount-Apple Valley-Eagan Public Schools

dustin.dorey at district196.org

651|423|7971

Kedgar
Contributor

Interesting. Maybe I have not seen this because I have a much smaller install base... under 300 clients connecting.

dustydorey
Contributor III

I was thinking about this issue last night, and a thought came to me regarding the JSS's bad behavior after break. Many of the symptoms we experience after break are similar to what we saw when we simply had too many policies set for startup and too many scheduled tasks running here. For an hour or two at a time, a couple thousand machines were reporting inventory, executing policies, etc., all at the same time. That led to the JSS hanging, the Java process spiking on the main server, and all sorts of JSS dismay. It didn't matter that we threw 8 gigs at Tomcat running in 64-bit mode; it could still hang. The solution for us was to spread out our inventory reporting, cut down the number of policies set to trigger at startup, and begin exempting policies from running during our busiest times. This cleared up our regular freezing, and we had only a slight hiccup yesterday when we came back.

I'm wondering if, the next time you have a big break (maybe spring break), it would help to turn off any policies and scheduled tasks that do not NEED to run for a couple of days. For instance, you could stagger things like inventory reporting as they come back online, leave nonessential policies off, and slowly turn them back on over the course of a week or two. It would make sense for the database and server to get bogged down if a bunch of machines came online in a single day, all triggering scheduled tasks and policies that would normally have been spread out over a couple of weeks.
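
To make the staggering idea concrete, here is a minimal sketch in Python of one way to spread check-ins over a window. This is not a Casper or JSS feature: the function names, the serial-number format, and the 4-hour window are all assumptions made up for illustration. Each machine hashes its own identifier to get a stable delay, so the same machine always lands in the same slot and the load is spread roughly evenly.

```python
import hashlib


def stagger_offset_seconds(client_id: str, window_seconds: int) -> int:
    """Map a client identifier to a stable delay inside the window.

    Hypothetical helper: the identifier could be a serial number or MAC
    address; nothing here talks to the JSS.
    """
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    # Treat the first 8 bytes as an unsigned integer, then wrap into the window.
    return int.from_bytes(digest[:8], "big") % window_seconds


def bucket_counts(client_ids, window_seconds: int, bucket_seconds: int):
    """Count how many clients fall into each time bucket of the window."""
    n_buckets = -(-window_seconds // bucket_seconds)  # ceiling division
    counts = [0] * n_buckets
    for cid in client_ids:
        offset = stagger_offset_seconds(cid, window_seconds)
        counts[offset // bucket_seconds] += 1
    return counts


if __name__ == "__main__":
    # Made-up identifiers standing in for roughly 6,000 clients.
    clients = [f"HYPOTHETICAL-SERIAL-{i:05d}" for i in range(6000)]
    # Assumed 4-hour window, viewed in 10-minute buckets.
    counts = bucket_counts(clients, window_seconds=4 * 3600, bucket_seconds=600)
    print("busiest 10-minute bucket:", max(counts))
    print("quietest 10-minute bucket:", min(counts))
```

A client-side script could sleep for its own offset before submitting inventory. Whether that fits depends on how your policies are triggered, so treat it as a starting point only.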

Just a thought.

-Dusty-


tlarkin
Honored Contributor

I did all of that already. My database used to be about 14 gigs in size because of the exact things you are describing. I increased the max packet size, deleted a bunch of nonessential smart groups and switched to saved searches, and cut out a bunch of logs and ongoing policies. Still, the problem only occurs if the JSS is idle for a week or more; otherwise it is semi-smooth sailing.