Unable to Create Mobile Account

kish_jayson
Contributor

While not specifically about JAMF Casper Suite, I was wondering if anyone else has encountered the following behavior from their managed clients running OS X Mavericks (10.9.5) which are bound to an Active Directory domain.

Intermittently, we will have users report that they suddenly are unable to log into their systems and receive the error "Unable to Create Mobile Account". Typically when this occurs, their mobile account will no longer appear in the Users & Groups pane of System Preferences, though their actual user directory remains in /Macintosh HD/Users/.

We've been able to remediate the issue for every occurrence thus far, but we don't have any real explanation as to what's causing the issue in the first place.

Has anyone else experienced this with their managed clients, and if so, were you able to determine a rhyme or reason as to why it happens?

27 REPLIES

bainter
Contributor

Are you using the Apple AD plugin or a third-party solution to join to AD? We use Centrify and we ran into spotty mobile account creations early in the rollout. We have network home folders and the name of that folder must match the login name for starters.

bentoms
Release Candidate Programs Tester

@kish.jayson we use the inbuilt AD plugin & don't see this now, but we used to.

IIRC, we unchecked "Use UNC path" under SMB home.

Can you post your settings?

pblake
Contributor III

Have you verified the computer object is good in AD? When I have seen that occur, it is usually because the computer object in AD got moved, deleted, or corrupted.

A rebind would fix the issue.

kish_jayson
Contributor

@bentoms I have attached a screenshot of our settings below using the built in Active Directory plugin.

@pblake That's the strange part. When the issue occurs, another network user (who had not previously logged into that machine) is able to log in without issues and create a mobile account. This only appears to affect that particular user on that particular system.

pblake
Contributor III

@kish.jayson

Can you post the logs of one of the machines this happened to? Perhaps also identify the approximate time period this happened for the machine.

For something like this to happen suddenly, it sounds to me like an OS state change. By that I mean something was changed, installed, or updated on the machine and it corrupted something in the netinfo manager, ruining the existing user and profile on the machine. That's my working theory, and I'm sticking to it ;)

charles_hitch
Contributor II

We had this happen and worked with Apple to find that the sqlindex files in the local directory services cache were corrupt. Remove them, then reboot the computer. At reboot the sqlindex files will rebuild themselves automatically. The following, of course, must be run as root; you can do it using a policy since the system will still check in.

#!/bin/sh
DSLOCAL_DIR="/var/db/dslocal/nodes/Default"

# Remove problem files
echo "Removing problem files"
[ -f "$DSLOCAL_DIR/sqlindex" ] && rm -f "$DSLOCAL_DIR/sqlindex"
[ -f "$DSLOCAL_DIR/sqlindex-shm" ] && rm -f "$DSLOCAL_DIR/sqlindex-shm"
[ -f "$DSLOCAL_DIR/sqlindex-wal" ] && rm -f "$DSLOCAL_DIR/sqlindex-wal"

# Reboot required
echo "Files removed, restarting computer..."
shutdown -r now

kish_jayson
Contributor

@pblake I'm unable to post the entire system log due to its size, but I am willing to email it to you if you'd like. However, I was able to find the following entries that may be helpful.

Mar 20 16:31:05 wshm4000073.local SecurityAgent[180]: Could not get the user record from OpenDirectory.
Mar 20 16:31:05 wshm4000073.local SecurityAgent[180]: Will sleep 3 seconds and try again (retryCount = 6)
Mar 20 16:31:08 wshm4000073 kernel[0]: utun_ctl_connect: creating interface utun0
Mar 20 16:31:08 wshm4000073 kernel[0]: SIOCPROTODETACH_IN6: utun0 error=6
Mar 20 16:31:08 wshm4000073.local parentalcontrolsd[271]: StartObservingFSEvents [849:] -- *** StartObservingFSEvents started event stream
Mar 20 16:31:10 wshm4000073.local SecurityAgent[180]: User info context values set for dks0313949
Mar 20 16:31:11 wshm4000073.local loginwindow[58]: ERROR | -[ScreenSaverDaemon restartForUser:] | - [ScreenSaverDaemon restartForUser:]: userName == NULL
Mar 20 16:31:11 wshm4000073.local ManagedClient[178]: MCXCCacheGraph(localhost, dsRecTypeStandard:Computers): The record "localhost" (dsRecTypeStandard:Computers) interferes with the computer cache. Delete this record to resume caching.
Mar 20 16:31:11 wshm4000073.local ManagedClient[178]: MCX.getComputerInfoFromStartup: MCXCCacheGraph() == -2 (MCXCCacheGraph(localhost, dsRecTypeStandard:Computers): The record "localhost" (dsRecTypeStandard:Computers) interferes with the computer cache. Delete this record to resume caching.)
Mar 20 16:31:12 wshm4000073.local mds_stores[122]: (/.Spotlight-V100/Store-V2/14239887-BB3C-407A-9393-F355403403D6)(Error) IndexCI in fd_ptr open_index_file(int, char *, int, off_t, off_t, _Bool, void **, int, int *):bad file size: 0, min size 4320, live.3.indexIds
Mar 20 16:31:16 --- last message repeated 1 time ---
Mar 20 16:31:16 wshm4000073.local mds_stores[122]: (/private/var/folders/zz/zyxvpxvq6csfxvn_n0000000000000/T)(Error) IndexSDB in int page_cache_deserialize_entries(db_cache_t, int, off_t *, reheat_block_t, page_cache_reheat_block_t):Unexpected EOF in page cache preload; got 0 bytes at offset 0
Mar 20 16:31:19 wshm4000073.local logind[59]: -[SessionManager getClient:withRole:inAuditSession:]:241: ERROR: No session dictionary for audit session 100000
Mar 20 16:31:19 wshm4000073.local logind[59]: _SMGetSessionAgent:73: ERROR: __SMGetClientForAuditSessionAgent failed 2
Mar 20 16:31:19 wshm4000073.local ManagedClient[845]: MCXCCacheGraph(localhost, dsRecTypeStandard:Computers): The record "localhost" (dsRecTypeStandard:Computers) interferes with the computer cache. Delete this record to resume caching.
Mar 20 16:31:19 wshm4000073.local ManagedClient[845]: MCX.getComputerInfoFromStartup: MCXCCacheGraph() == -2 (MCXCCacheGraph(localhost, dsRecTypeStandard:Computers): The record "localhost" (dsRecTypeStandard:Computers) interferes with the computer cache. Delete this record to resume caching.)
Mar 20 16:31:34 wshm4000073.local ManagedClient[178]: MCXCCacheMCXRecordAndGraph(): [localNode createRecordWithRecordType:dsRecTypeStandard:Users name:"dks0313949"] == 4102 (Could not create the record because one already exists with the same name.)
Mar 20 16:31:34 wshm4000073.local ManagedClient[178]: MCXCCreateMobileAccount(): Failed to create account. Error = 4102 (MCXCCacheMCXRecordAndGraph failed). Cleaning up mobile account record.
Mar 20 16:31:34 wshm4000073.local ManagedClient[178]: MCX.createMobileUserAccount: MCXCCreateMobileUserAccount( dks0313949, /Users/dks0313949 ) == 4102 (Could not create the record because one already exists with the same name.)

@charles.hitch Was this script something that you'd only run on the affected machine when the issue occurred, or something you'd run as preventative maintenance? I'm curious to know more about exactly what it's doing and how it was able to resolve your issue.
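For anyone wanting to check other machines for the same signature, the log excerpt above can be narrowed to the ManagedClient lines that report error 4102. This is only a rough sketch: find_4102_failures is a made-up helper name, and it assumes the entries land in /var/log/system.log as they do on Mavericks.

```shell
#!/bin/sh
# Sketch only: print ManagedClient log lines that report error 4102
# ("Could not create the record because one already exists with the same name").
# find_4102_failures is a hypothetical helper name, not an Apple or Jamf tool.
find_4102_failures() {
    # Default to the system log; pass another path to scan a saved copy.
    log_file="${1:-/var/log/system.log}"
    grep 'ManagedClient' "$log_file" | grep '4102'
}
```

Called with no argument it scans the live system log; called as find_4102_failures /path/to/copy.log it scans a saved copy. The function returns non-zero when nothing matches.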

charles_hitch
Contributor II

@kish.jayson We would run this only on affected machines. I would not recommend doing this on properly functioning machines. The way it was explained to me is that the sqlindex files are binary database builds of the plist files that exist within the dslocal directory. This speeds up booting and login because databases are faster to read than all those plist files. These databases can become corrupt, though I was not given an explanation as to how, which prevents login and updating of these files. So removing them triggers a process to rebuild them at the next OS boot based on the plist files that exist. We have used this many times and had no issues.
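If deleting the files outright feels risky, one cautious variation is to move them aside so they can be inspected or restored later. The sketch below is an assumption-laden illustration rather than anything Apple supplied: backup_sqlindex is a hypothetical helper name and the backup location is whatever root-only directory you choose. A reboot is still needed afterwards for the rebuild described above.

```shell
#!/bin/sh
# Sketch only: move the sqlindex files aside instead of deleting them.
# backup_sqlindex is a hypothetical helper name, not part of any Apple or Jamf tool.
backup_sqlindex() {
    node_dir="$1"      # e.g. /var/db/dslocal/nodes/Default
    backup_dir="$2"    # a root-only directory of your choosing (assumption)

    mkdir -p "$backup_dir" || return 1
    moved=0
    for f in "$node_dir"/sqlindex "$node_dir"/sqlindex-shm "$node_dir"/sqlindex-wal; do
        # Only move files that actually exist; skip anything else.
        if [ -f "$f" ]; then
            mv "$f" "$backup_dir/" || return 1
            moved=$((moved + 1))
        fi
    done
    echo "Moved $moved sqlindex file(s) to $backup_dir"
}
```

As with the original script, this has to run as root, for example from a policy, with the two directories passed as arguments.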

Do you use McAfee Anti-virus?

kish_jayson
Contributor

@charles.hitch Sadly, we are using McAfee Endpoint Protection 2.2.0 for Mac, which is being managed by a central EPO server. As of right now, we are not excluding any directories from the On-Access Scanning.

mm2270
Legendary Contributor III
"Do you use McAfee Anti-virus?"

I was going to ask the same thing, because we see this issue happen here on occasion as well. As far as I have seen, it is not related to bind issues, bind settings, time sync problems, corrupt databases or anything else. The Macs this happens on are in perfect shape, communicating correctly with our AD servers. The computer account in AD is valid, as is the user's account. Something just messes up the local cached directory account and blows it away.
We use (much to my chagrin) McAfee Security here, and although I don't have direct evidence for this, I have suspected McAfee for a while now. I say that only because the list of problems we've discovered to be related to McAfee over the last 3 years is pretty long, so I wouldn't be the least bit surprised if it's getting its grubby hands on the user accounts in the directory service and screwing them up.

mm2270
Legendary Contributor III

@kish.jayson

"As of right now, we are not excluding any directories from the On-Access Scanning."

Dude! You need to have a laundry list of exclusions for McAfee Endpoint Security not to routinely lock up or hose your Macs. I'm surprised your Macs are even working with no directory exclusions in place.

If you email me separately I will send you our exclusions list to bring back to your security people. I don't want to post them here lest I incur the wrath of our overzealous security people.
mm2270 [at] me [dot] com

kish_jayson
Contributor

@mm2270 I'll gladly take you up on that. We've only recently begun managing these systems through a central EPO server, so we haven't done much on the exclusion front. Ironically, it wasn't until recently that we began to see some performance impact with users running Android Studio.

kish_jayson
Contributor

@charles.hitch Were you thinking that McAfee Endpoint Protection was affecting /var/db/dslocal/ in this case?

miawri
New Contributor

We too are occasionally seeing this 'Unable to create mobile account' issue on our AD bound Macs. I've not been able to find any reason for it. We also run McAfee EPM with a few exclusions, but none that target the /var/db/dslocal/ folders. When this happens in our environment, the actual user plist exists but is empty. I have filed a bug with Apple, but they never came back with anything.

analog_kid
Contributor

I've seen something similar to this though not with a user but with the admin group. The admin.plist file would exist but would be a zero length file, spontaneously for no apparent reason. FYI, we also run McAfee in our environment. We are currently auditing exclusions (none for /var/db/dslocal/ currently).
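Since both reports above describe record plists that exist but are zero length, a quick scan for empty plist files can confirm whether a machine is in that state. This is a minimal sketch under stated assumptions: find_empty_plists is a hypothetical helper name, and the default path is the dslocal node directory used in the script earlier in this thread.

```shell
#!/bin/sh
# Sketch only: list zero-length .plist files under a dslocal node directory.
# find_empty_plists is a hypothetical helper name.
find_empty_plists() {
    dslocal_dir="${1:-/var/db/dslocal/nodes/Default}"
    # -size 0 matches only files with no content at all.
    find "$dslocal_dir" -type f -name '*.plist' -size 0 -print
}
```

The dslocal directory is readable only by root, so run it as root (a policy would work). It prints nothing when every plist has content.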

mm2270
Legendary Contributor III

Yeah, same here. I didn't actually mention it here since there is another thread that discusses that particular problem, but we also occasionally see the local admin group disappear, at which point all local accounts, save for root, lose their administrative access. root isn't affected because it's not like other admin level accounts.

We can fix the issue by enabling root and using it to get the admin group back, but until not too long ago we didn't associate this with McAfee. At this point it appears that McAfee Endpoint Protection is the root cause of both the can't create mobile account issue and the admin group disappearing.

kish_jayson
Contributor

Ironically we had the same issue occur with the admin.plist last week. As a result, we've added /var/db/dslocal/ to the exclusions for McAfee Endpoint Protection. I'll report back in a few weeks to see if either of these issues occur again.

Bartoo
New Contributor III

We're seeing exactly the same thing. And we use McAfee Endpoint Protection as well.

When it's happened here, the users have reported that it's often after a complete Finder freeze and a hard reboot. In all of the cases the users have reported that they have been having really long boot times (10+ minutes) stuck on the grey screen before they get to the login window. Then they get the "Unable to Create Mobile Account" error. And when we look in /var/db/dslocal/nodes/Default/Users/, <affected_user>.plist is empty.

We have a case with Apple as well, they gave us a "fix" which at least gets the account up and running quickly but no cause, and we see 1-5 cases per week.

@mm2270 - I would really like to see your exception lists - ours are pretty short and we do not currently have an exception made for /var/db/dslocal/.

mm2270
Legendary Contributor III

@Bartoo Email me offline and I will send you our list. It's not huge by any stretch, and most of it relates to Outlook issues we identified when McAfee Endpoint is running, but also a few other items.
We also don't have the /var/db/dslocal exception in place yet, so I can't directly speak to whether it resolves some of the above issues. I suspect it might though.

donmontalvo
Esteemed Contributor II

@mm2270 Any chance you can pastebin the list?

--
https://donmontalvo.com

donmontalvo
Esteemed Contributor II

@charles.hitch your solution worked for one of our big clients, beer on me at the next JNUC. :) Slight mod to the script: a wildcard covers all three sqlindex* files, and I removed the reboot command so the policy can submit its log to the JSS.

#!/bin/sh
/bin/rm -f /var/db/dslocal/nodes/Default/sqlindex*
exit 0
--
https://donmontalvo.com

jamesgreenMatte
New Contributor II

Thank you for this script. We have now run into this a few times here and this script fixes the issue. Nothing else has.

donmontalvo
Esteemed Contributor II

Circling back to tie loose ends...if your Security team asks you to substantiate the need to exclude these...

Hi

Sorry for being unclear, the files in question is a cache used by the login system they are restricted to the root user and is secure and non-executable.

Regards
XXXXXXXXX
XXXXXXXXX@apple.com
AppleCare Enterprise Support

PS, kudos to @charles.hitch for the script, all variations came from his script.

Don

--
https://donmontalvo.com

hcgtexas
New Contributor III

Checking in from 2017. Solution still works! Thanks a bunch guys.

BK
New Contributor III

@Bartoo Happening in our environment as well. What was the 'fix' that Apple provided? We also do not currently have an exception made for /var/db/dslocal/ in McAfee. We are running 10.12.6 and have seen two machines with this issue in the last 2 months. Thanks!

adroitboy
New Contributor III

Worked for me too on a machine that had issues after a system update was applied in 10.12.6. The user reported that they couldn't log in. I could with other users, and noticed there was no mobile account showing in dscl. Interestingly, I got an error when attempting to create the account via createmobilehome.

sh-3.2# createmobilehome -v -n user_name 
The mobile account could not be created: 4102 (Could not create the record because one already exists with the same name.)

Interestingly, while the account didn't show in dscl, it did show when listing /var/db/dslocal/nodes/Default/Users. I deleted the above sql files, restarted and was immediately able to create the mobile home in terminal. Thanks!
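The mismatch described above (a record file on disk that dscl does not list) can be checked for in bulk. The following is a rough sketch, not a vetted tool: records_missing_from_dscl is a hypothetical helper name, and it assumes you have first saved the output of dscl . -list /Users to a text file with one name per line.

```shell
#!/bin/sh
# Sketch only: print user record names that have a .plist on disk
# but do not appear in a saved "dscl . -list /Users" listing.
# records_missing_from_dscl is a hypothetical helper name.
records_missing_from_dscl() {
    users_dir="$1"     # e.g. /var/db/dslocal/nodes/Default/Users
    names_file="$2"    # file holding the output of: dscl . -list /Users
    for plist in "$users_dir"/*.plist; do
        # Skip the unexpanded pattern when the directory has no plists.
        [ -e "$plist" ] || continue
        name=$(basename "$plist" .plist)
        # -x: whole-line match, -F: treat the name as a literal string.
        grep -qxF -- "$name" "$names_file" || echo "$name"
    done
}
```

On an affected Mac, run as root, that would look like saving the dscl listing to a temporary file and then calling the function with /var/db/dslocal/nodes/Default/Users and that file. Any name it prints is a candidate for the stale record behind the 4102 error.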

Hayden_Webb
New Contributor III

I went into the Users folder (/Computer/Macintosh HD/Users/), deleted the user folder that was having the error and it fixed the issue for me.