Error 401 after enabling cert based communication

CasperSally
Valued Contributor II

Every time we've tried enabling certificate based communication, we've seen issues where a subset of our machines stop talking to the JSS until we manually re-enroll them, so we always end up having to turn it off shortly after turning it on.

About 2 months ago, I turned cert based communication off and sent out a jamf enroll policy per JAMF's suggestion, in the hope that I could enable cert based communication without issue before we started our mass reimaging this summer.

We are ready to start imaging, so I re-enabled cert based communication, but unfortunately I'm seeing issues with some freshly imaged machines that aren't talking to the JSS post image (so our post image scripts that run manual triggers don't run). So far it's happened on 2 out of 10 test reimages, and we have over 5000 machines to image in the next few weeks, so manually enrolling or running a QuickAdd on each one isn't something we want to do.

On a machine with the issue -
checkJSSConnection completes fine
recon produces 401 error
enroll gets the machine back into a working state, but at this point the post image scripts haven't run and the machine needs to be reimaged anyway
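
The three checks above could be scripted so that a post image workflow flags a machine in the 401 state instead of silently skipping its triggers. This is only a rough sketch in Python, not something from JAMF: it assumes the jamf binary is on the PATH, that checkJSSConnection exits non-zero when the JSS is unreachable, and that the 401 appears in the text recon prints. The names `run_jamf` and `triage` are made up for illustration.

```python
import subprocess


def run_jamf(verb):
    """Run `jamf <verb>` and return (exit_code, combined stdout + stderr)."""
    proc = subprocess.run(["jamf", verb], capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr


def triage(run=run_jamf):
    """Classify a machine using the same checks listed in the post.

    `run` is injectable so the logic can be exercised without a JSS.
    """
    # Step 1: can the machine reach the JSS at all?
    code, _ = run("checkJSSConnection")
    if code != 0:  # assumption: non-zero exit means the JSS is unreachable
        return "no-connection"

    # Step 2: does an inventory update get rejected with a 401?
    code, out = run("recon")
    if "401" in out:  # assumption: the error code shows up in recon's output
        return "needs-enroll"
    if code != 0:
        return "recon-failed"
    return "ok"
```

Calling `triage()` on a machine returns one of "ok", "no-connection", "needs-enroll", or "recon-failed". A post image script could act on "needs-enroll" (for example, by running jamf enroll before firing the manual triggers) rather than finding out later that nothing ran.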

Seems similar to this thread
http://jamfnation.jamfsoftware.com/discussion.html?id=3888

Anyone have any suggestions? I am talking with JAMF too but am under a time crunch now.

9 REPLIES

bentoms
Release Candidate Programs Tester

Is it a Publicly or Privately signed cert?

CasperSally
Valued Contributor II

We use the JSS Built-in CA

kuwaharg
New Contributor III

On the computers that have issues, have you examined the JSS certificate in the keychain utility? Does it trust the cert?

CasperSally
Valued Contributor II

The cert in Keychain Access looks identical to me on working and non-working machines.

justinrummel
Contributor III

Is there any consistency for "subset of our machines stop talking" such as geographic area and/or IP/subnet ranges? Wondering if you have a firewall or router that is not passing traffic correctly.

CasperSally
Valued Contributor II

Nope. These 10 test reimages were all done in my test area on the same subnet, using the same 3 ports. No consistency on one port always producing the failed machines. Same port will work fine on next reimage.

ImAMacGuy
Valued Contributor II

I turned this on (briefly) last year too, and it caused about 93 systems to stop checking in. I was never able to re-enroll all of them into the DB and eventually wrote the systems off. I haven't messed with it since.

bentoms
Release Candidate Programs Tester

@CasperSally.

Is the time correct on those macs?

CasperSally
Valued Contributor II

@bentoms - yup.

I have 6 test machines here, 3 had the issue, all different models. Once issue is resolved (talking to JSS), if I reimage same machine it seems resolved for good (I'm reimaging those same laptops over and over).

So far I have left one in the failed state. Of the other 2, I fixed one by manually enrolling it and the other by just reimaging it without changing anything.

I am not sure if I dislike Netboot or Cert based stuff more at this point.