Migrating large amounts of data

Kedgar
Contributor

Hello,

I wanted to know what you suggest using, or what you have used in the past
to migrate production data from one storage array to another while being
certain to keep file integrity such as forked data, extended attributes,
create/modify times, and other metadata. I have multiple arrays I'm
replacing (old xserve raids) and I want to make sure I copy stuff over as
quickly as possible while keeping the file integrity. I'm not much
concerned about permissions as I'm planning to whack those anyways.

In the past I used cp and ditto... now I'm thinking of using rsync. Not
sure if ASR would be a good idea here or not.

Thanks,
Ken

7 REPLIES

Not applicable

rsync is and has been my choice: it can be restarted and is durable, so if you have a huge amount of stuff to move that is being used, you can repeatedly rsync up to the cutover point without losing anything and without having a very long lockout window. Over ssh it can be bandwidth throttled, so you don't saturate your network during working hours if you're copying from an old machine to a new one. I also use rsync as a somewhat limited but usable incremental backup scheme.

--Jim
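
A minimal sketch of the repeat-until-cutover approach described above, assuming rsync 3.x flag meanings. The paths and the bandwidth figure are placeholders, not values from this thread, and the script only prints the command unless DRY is set to empty. Note that -a in a stock build does not carry creation times, so test the result against your own data before trusting it.

```shell
#!/bin/sh
# Sketch: one throttled rsync pass, safe to re-run until cutover.
# SRC, DST and BWLIMIT are placeholder values, not from the thread.
SRC="${SRC:-/Volumes/OldRAID/}"     # trailing slash: copy the contents
DST="${DST:-/Volumes/NewArray/}"
BWLIMIT="${BWLIMIT:-20000}"         # --bwlimit value, in KB/s

# DRY defaults to "echo", so the command is printed rather than run.
# Set DRY= (empty) in the environment to actually run rsync.
DRY="${DRY-echo}"

rsync_pass() {
    # -a  archive mode (recursion, symlinks, perms, mtimes, owner/group)
    # -H  preserve hard links
    # -X  preserve extended attributes (rsync 3.x meaning of the flag)
    # --partial keeps partially transferred files so a restart can resume
    $DRY rsync -aHX --partial --bwlimit="$BWLIMIT" "$SRC" "$DST"
}

rsync_pass
```

Each pass only transfers what changed since the previous one, so the final pass during the lockout window should be short. For a copy between machines, the destination would instead be a remote path such as user@newserver:/path/ (hypothetical host name).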

Kedgar
Contributor

Thanks Jim, that's what I'm planning on using so far. I have compiled rsync
3.0.7 and packaged it... It is being installed as part of my post install on
all my new servers.

Thanks,
Ken Edgar

tlarkin
Honored Contributor

There has been some discussion about this on mac forums. If you want a
different version you can always install MacPorts or Fink and get the
newest build.

Not applicable

Just an FYI, I'm not showing v3 on any machines by default. I see 2.6.9 on both client and server. I checked via ARD on several machines that were recently imaged and running 10.6.4 or 10.6.6. All had rsync 2.6.9. I'm running v7.31 of the suite.

I'm wondering if the update to rsync 3.0.7 was inspired by v3's better ability to handle metadata? See http://www.bombich.com/rsync.html
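
For checking a batch of machines, something like the following could be sent as a UNIX command. It is only a sketch: the function names are made up for illustration, and it assumes the first line of the version output looks like the samples quoted in this thread.

```shell
#!/bin/sh
# Extract the major version from the first line of `rsync --version`.
rsync_major() {
    # $1 looks like: "rsync version 2.6.9 protocol version 29"
    # (allows extra spaces and an optional leading "v" on the number)
    echo "$1" | sed -n 's/^rsync  *version v\{0,1\}\([0-9][0-9]*\)\..*/\1/p'
}

# Report whether a version line is from rsync 3 or newer.
check_rsync() {
    major=$(rsync_major "$1")
    if [ -n "$major" ] && [ "$major" -ge 3 ]; then
        echo "ok: rsync $major.x"
    else
        echo "old or unknown: $1"
    fi
}

check_rsync "rsync version 2.6.9 protocol version 29"
check_rsync "rsync version 3.0.6 protocol version 30"
```

On a live machine the argument would come from the binary itself, for example check_rsync "$(rsync --version | head -n 1)".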

donmontalvo
Esteemed Contributor III

Yep, you were right. I just removed MacPorts and my MBP now has the same version as the server (the default):

$ which rsync
/usr/bin/rsync
$ rsync --version
rsync version 2.6.9 protocol version 29

Thanks,
Don

--
https://donmontalvo.com

donmontalvo
Esteemed Contributor III

Hi Edgar,

Wow, sorry for jumping in to this thread, it's very interesting. Just noticed Snow Leopard Server has rsync 2.6.9 installed:

$ rsync --version
rsync version 2.6.9 protocol version 29

And Snow Leopard client has rsync 3.0.6 installed:

$ rsync --version
rsync version 3.0.6 protocol version 30

I know rsync is used by JSS and it's used by Mac OS X for Time Machine. I wonder why there's such a wide variation in versions. I remember Dan Shoop and Wilfredo Sanchez wrote some good papers on rsync a while back, but what kinds of issues are there today that would lead you to update rsync on the server? Sounds like a good idea, just wondering what drove the decision.

Don

--
https://donmontalvo.com

Kedgar
Contributor

Hi Don,

I have been reading up on a project called Backup Bouncer. It is sort of a
test of any backup software you can throw at it. It rates the ability of
your backup software/utility to restore Macintosh and *nix specific data.
Backup Bouncer rated the Apple-supplied version of rsync poorly as it could
not accurately copy certain attributes, links, and such. There is a page on
Bombich.com regarding how to compile rsync and install it to pass the Backup
Bouncer tests. This is what I have done, and so far I have seen good
results.

I actually packaged rsync after this and deployed it to my new Xserves.

http://www.n8gray.org/blog/2007/04/27/introducing-backup-bouncer/

http://www.bombich.com/rsync.html