I've been asked to look @ High Availability & Automatic Failover of our JSS.
I know we can load balance the web side, but what about the MySQL? (I may have missed something).
To achieve this, what have you done?
Is something like Amazon's EC2 viable? (With primary distribution point in our WAN).
Totally forgot/missed that a cloud-based solution would not be able to connect to clients over Casper Remote because of NAT.
I assume you could get around this by using something like iConnect from Citrix as a remote support tool.
Of course, I'm not familiar with your environment but if you have in place something like a vSphere/vMotion infrastructure then I'd go that route rather than trying to cluster MySQL itself. Trying to keep a virtual machine available using a lower level clustering system would be far easier, IMHO, especially if someone else has already done the work.
In the last environment I architected, I got approval for as many Windows Server VMs as I needed, so I went with one Tomcat instance and one Distribution Point for each geographical location (US, EMEA and APAC). At the end of the project we had 14 Tomcat instances and 14 Distribution Points. Replication was a bit tedious: we had a Mac in every location that was set up to replicate to that location's Distribution Point.
We pressed for Load Balancing, but politics and budget got in the way. So we had a QuickAdd for each location, and we scoped policies by subnets. Looks like v9 is going to make this a whole lot easier.
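For anyone wanting to picture the subnet scoping, here is a minimal sketch of the idea in Python. The subnet ranges and the `location_for` helper are made up for illustration; the JSS does its own scoping, this only shows the lookup logic.

```python
import ipaddress

# Hypothetical subnet-to-location table; replace with your own ranges.
LOCATION_SUBNETS = {
    "US": [ipaddress.ip_network("10.10.0.0/16")],
    "EMEA": [ipaddress.ip_network("10.20.0.0/16")],
    "APAC": [ipaddress.ip_network("10.30.0.0/16")],
}


def location_for(client_ip):
    """Return the location whose subnet contains client_ip, or None."""
    addr = ipaddress.ip_address(client_ip)
    for location, networks in LOCATION_SUBNETS.items():
        if any(addr in net for net in networks):
            return location
    return None
```

A client at 10.20.5.7 would map to EMEA here, and an address outside every listed range returns None, so it would need a default location.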
JAMF doesn't support MS SQL, so we couldn't leverage the existing infrastructure. We ended up going with MySQL replication with warm failover (since, as Steve Wood put it, "You don't need 99.999% uptime"). One day JAMF might decide to support MS SQL; that would put all of us in enterprise in a much better position, with the ability to leverage existing resilient MS SQL infrastructure.
EMC shares turned out to be a huge PITA; it was much easier to set Distribution Points up on Windows Server (bonus for HTTP support). ;) MySQL is so much happier on a physical server. I saved a considerable number of hours doing the preparation work myself, including setting up the Windows Server VMs, handling DNS, etc.
Thanks Will & Don.
I actually started thinking of some vCenter solution earlier today.
I work in our HQ, which has server rooms on either side of a lake: one side has our main server room, the other a site-specific/HA room.
Our master JSS is running fine, & we have a remote DR site with another Mac server that can be restored from last night's JSS backup within hours.
So I'm thinking of moving the JSS's Tomcat & MySQL services onto Windows VMs on the vCenters hosted @ either side of the lake.
Then somehow load balance them or see if we can have some automated vMotion.
(There is a body of work about identifying which VMs/nodes are critical & automating the vMotion of them).
I think this will be the simplest way.
We only have 190 Mac clients globally, & 2,000 staff. So I don't think I need to go too far with this.
Sounds about right. I'm finishing the initial stages (!) of the JSS back end upgrade project at UAL. We've gone for an entirely RHEL 6.3 based setup for the JSS and MySQL database. The distribution points are still in the "thought experiment" phase and we'll probably go Casper 9 + JDS for that.
However, the basic concepts you already have:
1) JSS in cluster mode.
2) Load balance multiple JSS VMs through a service DNS name.
3) Point all JSS instances at a MySQL cluster.
4) Split your cluster instances across VM boxes and VM sites.
5) Backups backups backups ...
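To illustrate point 2, here is a rough sketch of what a load balancer does behind a service name: rotate through the JSS nodes and skip any that fail a health check. The node names and the `is_healthy` callback are assumptions for illustration, not anything the JSS provides.

```python
import itertools


def healthy_round_robin(nodes, is_healthy):
    """Yield nodes in rotation, skipping any that fail the health check.

    Raises RuntimeError if a full pass over the nodes finds none healthy.
    """
    cycle = itertools.cycle(nodes)
    while True:
        # Try each node at most once per request before giving up.
        for _ in range(len(nodes)):
            node = next(cycle)
            if is_healthy(node):
                yield node
                break
        else:
            raise RuntimeError("no healthy JSS node available")
```

With three hypothetical nodes and the second one down, successive requests alternate between the first and third. A real load balancer would also cache health results rather than probing on every request.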
I'm sure there's other stuff to keep an eye on. However it's late, the weekend etc.
Today's extended power outage in Boston's Back Bay also got me thinking about this. I have MySQL replication in place, and for a complete loss of the master MySQL server with no reliable backups to go to, it is a great last resort, assuming the slave and master have communicated and are in sync. But promoting the slave to master while the master is still available is iffy at best, and I'm not well versed enough in MySQL to know what would happen or what pitfalls to avoid in that scenario. Setting it up also sounds like just as much work as setting up replication the first time around, so it doesn't actually decrease potential downtime. You can bring up a master quickly, but you are then on a ticking clock to get new slaves up as fast as possible in case the new master decides to kick the bucket.
Anyway, rambling on: failover has always been a rallying cry for admins to ask more of JAMF, especially if you happen to still be running the JSS on OS X and Apple hardware. Maybe it's time for JAMF to start talking about HA situations and best practices, and to go through the pros and cons of the different approaches possible through virtualization, or for the JSS to officially support MySQL Cluster installations (instead of just the single-server Community / Enterprise editions).
Just thought I would chime in here, since I have had the opportunity to work with some larger customers that have larger global infrastructures. There are many ways to set this up, but I will sort of just go along the basic lines of the idea one might use to have a redundant infrastructure in place. This may also be outside of company/organization budgets, outside of security policy, or so forth. So, this is just a basic idea.
The way it works is that you place two load balancers in your infrastructure. The first load balancer points to your Tomcat cluster. Each Tomcat cluster will have settings for priority, non-priority and database connections. After the client hits the first load balancer and the Tomcat cluster, the request hits the second load balancer, which is only in place for failover. It points to your MySQL Master server. If the database load balancer cannot reach the Master DB server, it switches over to the replica server instead.
So, essentially this is how it would look, in a really crude sort of layout.
Client > Load Balancer > Tomcat Clusters > Load Balancer > DB Master Server (replica if Master is not available)
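The decision the second load balancer makes can be sketched roughly like this in Python. The host names are placeholders, and a bare TCP connect is only an assumed stand-in for whatever health check your load balancer really uses; an open port does not prove MySQL is healthy.

```python
import socket


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_database(master, replica, probe=is_reachable):
    """Prefer the master; fall back to the replica only when the master is down.

    master and replica are (host, port) tuples.
    """
    for backend in (master, replica):
        if probe(*backend):
            return backend
    raise ConnectionError("neither master nor replica is reachable")
```

Calling `pick_database(("db-master", 3306), ("db-replica", 3306))` returns the master while it answers and the replica otherwise. The `probe` parameter is there so the check can be swapped for something stricter, such as running a test query.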
There are some things to consider. MySQL replication operates by replicating the binary logs in MySQL. It is not real-time replication. There are times and events that may cause replication to lag, such as an upgrade or a heavy load of policies or inventory updates executing at the time. So there is a chance that when the DB load balancer kicks over to the replica DB during a failure, the production database and the replica database could be slightly out of sync.
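One way to keep an eye on that lag is to sample the replica's `SHOW SLAVE STATUS` output and judge whether it looks current. The sketch below assumes the row has already been fetched into a dict by whatever MySQL client you use; the `replica_is_safe` name and the 30 second threshold are arbitrary choices for illustration.

```python
def replica_is_safe(status, max_lag_seconds=30):
    """Decide whether a replica looks current enough to take over.

    status is a dict holding one SHOW SLAVE STATUS row (assumed input shape).
    """
    # Both replication threads must be running for the lag figure to mean anything.
    if status.get("Slave_IO_Running") != "Yes":
        return False
    if status.get("Slave_SQL_Running") != "Yes":
        return False
    lag = status.get("Seconds_Behind_Master")
    # NULL (None) means the lag is unknown, so treat it as unsafe.
    return lag is not None and int(lag) <= max_lag_seconds
```

This is most useful as a periodic sample while the master is still up, so you know the last known state of the replica; once the master is gone the I/O thread will typically stop reporting Yes. Newer MySQL releases rename these fields, so check the names your version reports.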
Anything that is high priority on the client side can possibly be cached and run in "offline mode." Also take into consideration the configuration you are going to use with this sort of setup. Load balancers need to be configured, certs need to be in place, and the proper number of connections needs to be configured in both MySQL and Tomcat, along with many other things. These settings, along with the hardware configuration, should reflect your environment, and everyone's environment is different.
I have only put out the most basic of ideas here; there is a lot more to it, so please take this only as a starting point. A lot of planning is involved when setting up such an infrastructure, and it will probably be tailored to your exact needs.
Hope this helps.
Is MySQL Community / Enterprise supported by a load balancer? Talking to my JAMF reps they suggested against this approach of true failover / clustering of the database. I also thought that's why Oracle makes MySQL Cluster, but JSS officially doesn't support that either.
It's one thing for it to work. It's another when it breaks and there's no one to turn to for official support.
Yes, I have worked with a few customers that put a second load balancer in front of the database for failover. Basically, if the Master MySQL database fails to respond, the load balancer switches automatically to the replica database. There are caveats in this setup: if replication has failed or is lagging when a failover switch is initiated, the replica database is not as up-to-date as the master was. Since there are a lot of factors in play, you must make sure all your pieces work and the configurations are properly set up for your needs. Since every environment is different, each setup I have seen is pretty much tailored to that organization's needs.
All the suggestions and comments in this thread are really good ones. I would urge anyone here to create feature requests on JAMF Nation so our dev team can see exactly what is needed to help solve your complete fail over infrastructure needs.
I look forward to hearing how some of you accomplish these goals.
Just several months ago we moved from a Mac Pro distribution point to a Synology NAS, since we needed the disk space, the Mac Pros were up for renewal, and we were no longer able to purchase more due to them being decommissioned in the EU.
Once I read up on the Synology servers I found they offered HA, as you can see here: http://www.synology.com/dsm/business_management_server_resiliency.php?lang=uk
We have 1 Mac Mini running Tomcat, but the MySQL sits on the Synology server.
The white paper in the following link can offer a lot more detail on how it works, but again, as some of you have already mentioned, it all depends on the money and what your current infrastructure can support.
http://www.synology.com/dsm/highlight_sha.php?lang=us
We are planning to use this since we have 2 data centres in 2 different buildings and were required to build resilience into the system. We only purchased 1 Synology, the 10613xs+, to see how it fares for us; it fares very well, and we are soon to purchase another of the same spec and begin testing the HA.
Will let you know more once we get the budget to buy another.