JSS Infrastructure and Scaling

bmarks
Contributor II

I know every environment is different, but JAMF doesn't have a ton of info on this topic so I'm hoping maybe to get some insight into how different places have scaled their infrastructure. The reason I ask is because we recently completed a project to put our JSS in the DMZ. Historically, we really only used our JSS for imaging, but now we're consolidating all of our Mac management tools under Casper and we'll be doing a lot more with our JSS. We have 16,000 Macs, no iOS devices. Our JSS is as follows:

  • Two RHEL VMs behind a load balancer in our DMZ
  • One RHEL VM that's our internal management console
  • One RHEL VM for MySQL
  • (They all have 2 CPUs and 8GB of RAM)
  • JSS 9.81, RHEL 6, Tomcat 7.0.65, Java 8u65

This works fine for imaging (we do a lot of imaging) and simple stuff like a few config profiles, but I've already noticed delays in the web interface when trying to test policies scoped to groups of 1500 or so. I've been checking performance graphs in vSphere but nothing seems too strained. One thing I'm not sure of related to Casper is whether to (potentially) throw more resources at MySQL first or Tomcat first. We're lucky that throwing resources at these VMs isn't an issue for us, but I don't want to do that just for kicks.

I'm certainly not looking for a solution because, like we all know, every place is different. I'm just hoping to get data that I can maybe use to help fine tune our JSS because sometimes the best data is via comparison.

20 REPLIES

bmarks
Contributor II

One thing I have already done, for the record, is make sure all of the local settings have been optimized like max database connections, RAM allocated to Java, etc. (basically all the stuff covered in the Admin guide and the CJA class.)
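For anyone checking the same settings, here is a minimal sketch of the connection arithmetic. It assumes the commonly cited rule of thumb that MySQL's max_connections should cover the combined Tomcat connection pools plus one spare; the pool size of 90 per web app is an assumed value, so substitute whatever your own web apps are actually configured with.

```python
# Sketch only: the "sum of pools + 1" rule and the pool size of 90 are assumptions.
def required_mysql_connections(pool_sizes):
    """Return the smallest max_connections that covers every Tomcat pool plus one spare."""
    return sum(pool_sizes) + 1

# Three web apps (two in the DMZ, one internal), each assumed to use a pool of 90.
pool_sizes = [90, 90, 90]
needed = required_mysql_connections(pool_sizes)

mysql_max_connections = 151  # MySQL's stock default; replace with the value from my.cnf
verdict = "OK" if mysql_max_connections >= needed else "raise max_connections"
print(f"need {needed}, have {mysql_max_connections}: {verdict}")
```

With these assumed numbers the stock MySQL default would be too low for three web apps, which is the kind of mismatch worth ruling out before adding CPU or RAM.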

jrserapio
Contributor

I'd like to hear more on this as well.

bentoms
Release Candidate Programs Tester

Might be something in this

I know there will be a few other videos on this from JNUC's past

ryanstayloradob
Contributor

How long of a delay are you experiencing in the web interface? In my experience, that's usually a Tomcat resource issue. If I remember correctly, according to JAMF a single JSS can usually handle up to 10,000 nodes before another one is needed. Given you're at 16,000 and growing, you might want to spin up another Tomcat instance.
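As a rough sketch of that arithmetic: the 10,000-nodes-per-instance figure is only the recollection quoted above, not a confirmed JAMF number, and the function name is made up for illustration.

```python
import math

# Assumption: roughly 10,000 managed nodes per Tomcat instance, as recalled above.
def web_apps_needed(managed_nodes, nodes_per_web_app=10_000):
    """Round up so that a partially filled instance still counts as a whole one."""
    return math.ceil(managed_nodes / nodes_per_web_app)

print(web_apps_needed(16_000))  # 16,000 Macs at 10,000 per instance
```

By this rule of thumb 16,000 Macs works out to two instances, so growth beyond 20,000 would be the point where a third is called for.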

Are you keeping your MySQL tables optimized and repaired on a regular schedule? How large is your database? I've seen instances where tables that are abnormally large will cause performance issues. Probably wouldn't hurt to throw additional memory/CPU cycles to MySQL.
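One way to answer the "how large" question is to rank tables by size. The sketch below assumes the database schema is named jamfsoftware and that you run the query yourself and feed the rows in; the sample rows and table names are made up.

```python
# Query against MySQL's standard information_schema; the schema name is an assumption.
TABLE_SIZE_SQL = """
SELECT TABLE_NAME, DATA_LENGTH, INDEX_LENGTH
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'jamfsoftware'
"""

def largest_tables(rows, top=5):
    """rows: (name, data_bytes, index_bytes) tuples. Return (name, size_mb), largest first."""
    sized = [
        (name, ((data or 0) + (index or 0)) / (1024 * 1024))  # NULL lengths count as 0
        for name, data, index in rows
    ]
    return sorted(sized, key=lambda item: item[1], reverse=True)[:top]

# Made-up rows standing in for the query result.
sample_rows = [
    ("table_a", 50 * 1024 * 1024, 10 * 1024 * 1024),
    ("table_b", 900 * 1024 * 1024, 100 * 1024 * 1024),
    ("table_c", 2 * 1024 * 1024, None),
]
for name, size_mb in largest_tables(sample_rows):
    print(f"{name}: {size_mb:.0f} MB")
```

Any table that dwarfs the rest is a reasonable first candidate for log flushing or an optimize pass.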

were_wulff
Valued Contributor II

@bmarks

This might be something that it'd be a good idea to get in touch with your TAM about; scaling guidelines have changed a bit since JSS 9.5, so some of the older posts, videos, etc. aren't necessarily all that accurate anymore, and may not be as helpful as they once were.

Your TAM will be able to set up a time with you to go over your environment in a bit more detail than would be posted to the forums here, then would be able to make some scaling recommendations tailored for your environment based on what was found and what future plans might be.

You can get in touch with your TAM by either giving them a call, sending an e-mail to support@jamfsoftware.com, or by using the My Support section of JAMF Nation.

@bentoms Strangely enough, I had that exact same outfit from the picture on yesterday. I may need to expand my wardrobe a little. :D

Thanks!
Amanda Wulff
JAMF Software Support

bentoms
Release Candidate Programs Tester

Kumarasinghe
Valued Contributor

@amanda.wulff

Can you please put up a Knowledge Base article on the latest guidelines for the current version of the JSS?
It would be good to see what has changed and what needs to be done in the latest versions.

I know environments are different, but give us some generic instructions (e.g., 3 web apps behind the load balancer, dedicated MySQL server, etc.).

bmarks
Contributor II

Well, just for the record, our TAM today said:

"We don't really have blanket, or template scaling guidelines based off of size. As you mentioned, everyone's JSS is a "unique snowflake" so it somewhat depends on how the JSS is being leveraged."

This isn't meant to be a criticism but you can see why I started this thread.

were_wulff
Valued Contributor II

@bmarks @Kumarasinghe

Your TAM is correct in that we don't have blanket templates or a generic KB for those reasons as it really does differ wildly depending on how the JSS is being used in addition to how many devices are being managed (not to mention how widespread it is; scaling for a JSS that has everything in one location would be vastly different than scaling for a JSS that has managed devices spread across the globe), which is why we usually recommend scheduling some time to go over your specific environment.

Have you had a JSS Health Check at all this year? A large chunk of the JSS Health Check is having a conversation about how the JSS is being used in your environment, checking current scaling, having a quick chat about future plans (i.e. if you're planning to add significantly more devices, expanding the area in which you'll have managed devices, planning to set up an external facing JSS if one doesn't already exist, etc...) and making recommendations for scaling based off of those findings.

If your TAM didn't mention the possibility of setting up a JSS Health Check when you spoke to them yesterday, I do apologize for that oversight.

Amanda Wulff
JAMF Software Support

bmarks
Contributor II

Ok, look, I'm not trying to be a pain, but today, contrary to the past few posts, I received a JAMF PDF that does have the exact type of info I was hoping to get all along. So, I don't want this to devolve into semantics, but for the record a JAMF doc does exist and, while noting clear caveats, it does actually have some resource guidelines based on the size of an environment. I'm sorry, but this was like pulling teeth.

Aziz
Valued Contributor

Hey @bmarks is this PDF available online? I would love to take a look.

bmarks
Contributor II

It's not online and I wouldn't feel comfortable mentioning its contents because I assume JAMF makes it difficult to get for a reason. So, I guess I'd recommend getting it from your TAM like was mentioned (and then not) earlier. But, at the same time, it's still pretty frustrating to get the runaround about whether this even exists (and, just to be clear, this thread wasn't the first time I asked.) Even with the caveat that every environment is different, I'm not a moron and I found this to be great info for a starting point. At the very least, it was something I could take to our team that manages ESX to use as leverage for getting more resources allocated.

tomt
Valued Contributor

I'd love to get a copy of this PDF also.

dgreening
Valued Contributor II

It IS a very informative PDF for sure. WAY better than the prior version. Your TAM should be able to get it to you.

Aziz
Valued Contributor

Contacted my TAM and received the PDF.

alanmcseveney
New Contributor

For that number of machines I'd do bare metal servers. What kind of storage is your database housed on, and how does the host access that storage?

bmarks
Contributor II

Unfortunately, that's not an option for us. We phased out all of the bare metal in each of our offices so ESX is our only option. I'm at the mercy of our infrastructure team. And, I'm not sure of the storage details, to be honest.

alanmcseveney
New Contributor

Well, the storage backend for MySQL is going to matter a lot. It should be fast, fast, fast. I wouldn't recommend any backend storage that is served over NFS. It should be block level, at least iSCSI if not DAS. It would preferably be on SSD and RAID 10, not RAID 5 or 6.
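To put a rough number on the RAID point, here is a small sketch using the standard textbook write penalties (2 for RAID 10, 4 for RAID 5, 6 for RAID 6). The disk count and per-disk IOPS are made-up figures, and real arrays with controller caches will behave differently.

```python
# Standard write penalties: back-end I/Os generated per front-end random write.
WRITE_PENALTY = {"RAID 10": 2, "RAID 5": 4, "RAID 6": 6}

def effective_write_iops(disks, iops_per_disk, level):
    """Estimate random-write IOPS an array can sustain, ignoring caches."""
    return disks * iops_per_disk / WRITE_PENALTY[level]

# Hypothetical array: 8 spindles at 150 IOPS each.
for level in WRITE_PENALTY:
    print(f"{level}: {effective_write_iops(8, 150, level):.0f} write IOPS")
```

On the same hypothetical disks, RAID 10 sustains twice the random writes of RAID 5 and three times those of RAID 6, which matters for a write-heavy MySQL workload.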

alanmcseveney
New Contributor

I had a bunch of slow queries until we did a health check; some were cleaned up by it, but new slowdowns appeared afterwards. I couldn't list a prestage imaging log at all until the health check. After the health check my Change Management logs became slow and are still slow. Unfortunately, the JAMF database management seems a little loosey-goosey. Which versions of the JSS your database has lived through seems to determine some unpredictable behaviour. I wish it were better at enforcing a consistent schema and data integrity below it, so that each JSS would be less of a "unique snowflake".

Also if you haven't installed it already, install htop and keep an eye on the memory usage when you get a slow query.
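If you want to log the same numbers htop shows rather than watch them, a minimal sketch is to parse /proc/meminfo on the MySQL host. The sample text below is made up; on a real server you would read the file instead. Counting buffers and page cache as reclaimable is a simplification.

```python
def parse_meminfo(text):
    """Turn /proc/meminfo-style 'Key:   value kB' lines into a dict of integers."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if rest.strip():
            info[key.strip()] = int(rest.split()[0])
    return info

def used_percent(info):
    """Percent of RAM in use, treating free memory, buffers and cache as available."""
    reclaimable = info["MemFree"] + info.get("Buffers", 0) + info.get("Cached", 0)
    return 100 * (info["MemTotal"] - reclaimable) / info["MemTotal"]

# Made-up sample; on Linux use open("/proc/meminfo").read() instead.
sample = """MemTotal:        8000000 kB
MemFree:         1000000 kB
Buffers:          500000 kB
Cached:          2500000 kB
"""
print(f"{used_percent(parse_meminfo(sample)):.1f}% used")
```

Running something like this on a timer while reproducing a slow query gives you a record to compare against the vSphere graphs.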

bmarks
Contributor II

@alanmcseveney That's very helpful info too because "Custom RAID" in that doc could mean anything. Thanks.