Storage/unresponsive issue with Xserve

Kedgar
Contributor

Hello, I know this is not related to JAMF at all, but I know there are a lot
of very talented individuals here with a lot of Apple experience. I was
wondering if any of you have run into a similar issue and, if so, how you
were able to resolve it.

We have the latest Apple Xserves (2 of them) at a remote site. They are
attached to a Promise VTrak E610f with a J-class expansion via two QLogic
SANbox 5602 switches. Each server has its own LUNs through LUN masking on
the array. We have been running this configuration without issue for about
six months. Just last week we began to see the servers stop accepting
connections (could ping, but couldn't open any new sockets). If I had an
existing SSH or VNC connection to the server I could still see it.
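
For anyone wanting to catch this state from a monitoring host, here is a minimal sketch of a check for "pings, but won't open new sockets". It is only an illustration, not something from the tickets mentioned in this thread: the function name `accepts_connections` and the `expect_banner` option are made up for this example, and it assumes Python 3 is available on the machine doing the checking.

```python
import socket


def accepts_connections(host, port, timeout=5.0, expect_banner=False):
    """Return True if a NEW TCP connection to host:port works within timeout.

    With expect_banner=True the service must also send some data first
    (as an SSH daemon does with its version string), which helps when the
    kernel still completes the handshake but the daemon itself is stuck.
    """
    try:
        # The timeout applies to the connect and to the later recv.
        with socket.create_connection((host, port), timeout=timeout) as conn:
            if not expect_banner:
                return True
            return len(conn.recv(64)) > 0
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Something like `accepts_connections("xserve1.example.com", 22, expect_banner=True)` (hostname hypothetical) run every minute or so would at least timestamp when each server stops answering, which may help line the failures up with switch or array logs.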

It is usually one server that starts having these symptoms... then maybe 30
minutes later the other server will exhibit them too. While on the servers I
notice that the mounts to the VTrak seem to be there, but you cannot follow
them. Finder usually starts hanging and you cannot gracefully shut it down
or restart it. I have had to issue halt commands to kill the system and
then have someone physically power the systems off and on. They will run
clean for a day or two, then exhibit the same symptoms.
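
A mount that is listed but cannot be followed can be probed without freezing the shell doing the probing. The sketch below is one possible approach under stated assumptions: `probe_mount` is a hypothetical helper name, Python 3 is assumed, and the listing is done in a child process so that a blocked filesystem call only stalls the child.

```python
import subprocess
import sys
import time


def probe_mount(path, timeout=5.0):
    """Return 'ok', 'error', or 'hung' for a directory listing of path."""
    # List the directory in a child process so that a blocked filesystem
    # call cannot freeze this script.
    child = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.listdir(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    deadline = time.monotonic() + timeout
    while True:
        code = child.poll()
        if code is not None:
            return "ok" if code == 0 else "error"
        if time.monotonic() >= deadline:
            break
        time.sleep(0.1)
    # Still running: request a kill but do not wait for it, since a
    # process stuck in an I/O call may never exit.
    child.kill()
    return "hung"
```

Calling `probe_mount("/Volumes/SomeLUN")` (path hypothetical) on a schedule and logging the result would show whether the volumes go unresponsive before or after the servers stop accepting connections.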

I have tickets open currently with QLogic, Promise, and Apple. We have
resolved a few issues, but that has not fixed what we are seeing yet. We have
a filesystem error on one of the fibre switches that doesn't seem to be
related to this issue. I'm glad that these are the only QLogic switches we
have, as QLogic will not provide an RMA with their standard warranty. They
want us to send a switch to them for repair, which would come back in 5 to 7
weeks... unacceptable!

Thank you,
Ken Edgar

3 REPLIES

nessts
Valued Contributor II

Are the servers bound to AD? That was a huge issue for me over the last year. Apple has provided me with a directory service update; I am still at 10.7 running this update and refuse to go forward, because I have not had the issue in a couple of months in this configuration.
Also, I heard 10.6.8 was causing some performance issues, but I'm not sure if they match your problems.

--
Todd Ness
Technology Consultant/Non-Windows Services
Americas Regional Delivery Engineering
HP Enterprise Services

Matt
Valued Contributor

I have left all our servers at 10.6.7 due to 10.6.8 causing all sorts of performance and reliability issues. Our 10.7 test servers so far are doing great.

--
Matt Lee, CCA/ACMT/ACPT/ACDT
Senior IT Analyst / Desktop Architecture Team / Apple S.M.E / JAMF Casper Administrator
Fox Networks Group

Kedgar
Contributor

We are on 10.6.7. I have not upgraded to 10.6.8 because of the problems I've
heard about. We are running with AD integration. No OD.


In reply to Matthew Lee's message of Aug 15, 2011, 3:09 PM.