Unison: Backup & synchronize files


    This is a part of a chapter from a book that never saw the light of day, since the publisher killed it before I could finish. Some of this info is out of date, but I thought I'd give it to the Net in case someone found it useful.

    The materials on this page are under an Attribution-ShareAlike Creative Commons license.


    If a user new to Linux starts asking more experienced users about a good way to back up her data, she will soon hear about a wonderful tool named rsync. Rsync was developed by Andrew Tridgell, the same man behind Samba; in fact, Tridge has stated that he believes he'll be remembered by posterity for rsync far more than for Samba, and he just may be right. However, rsync, while absolutely excellent, is not the best software to use for backup.

    Before I explain further, it's important to understand what rsync is and how it works, because that will help me justify my statement above. Rsync is software that performs a one-way synchronization. Here's an example: let's say you want to back up your home directory, at /home/username, to a server on your network. The server is mounted, using Samba, at /mnt/server. You configure rsync so that /home/username is your source directory, and /mnt/server is your destination. You run rsync, and soon (depending on the amount of data you're copying) your home directory is safely copied to your server.

    The next day, after working on several different files throughout the day, you wish to repeat the process. You run the same rsync command, but this time the process is over in a fraction of the time it took the day before. Why? Because this time, rsync copies over only what has changed. Think about it: you originally copied over a large document that was 10 MB. Today, you opened that document and made a slight correction. When you run rsync today, it doesn't copy over 10 MB; instead, it copies over 1 KB, just the parts that you changed. (To be precise, that byte-level saving applies when rsync talks to another rsync over the network; when the destination is a local or mounted path, as in this example, rsync by default still skips unchanged files but copies each changed file whole.) That's a humongous difference, and it's the main reason that rsync is so darn efficient.
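    To make the two runs concrete, here is a minimal sketch that uses scratch directories in place of /home/username and /mnt/server. The directory and file names are made up for illustration, and the script assumes rsync is installed (it does nothing otherwise).

```shell
# Sketch: a one-way rsync backup, run twice, inside a scratch directory.
rsync_demo=$(mktemp -d)
mkdir -p "$rsync_demo/home" "$rsync_demo/server"
printf 'first draft\n' > "$rsync_demo/home/report.txt"

if command -v rsync >/dev/null 2>&1; then
    # -a recurses into directories and preserves permissions and timestamps.
    # The trailing slash on the source means "the contents of this directory".
    rsync -a "$rsync_demo/home/" "$rsync_demo/server/"

    # The next day: change one file, then run the very same command again.
    printf 'a slight correction\n' >> "$rsync_demo/home/report.txt"
    rsync -a "$rsync_demo/home/" "$rsync_demo/server/"
    rsync_demo_ran=yes
else
    echo "rsync is not installed; nothing was copied" >&2
    rsync_demo_ran=no
fi
```

    After the second run, the copy under "server" matches the edited file under "home," and any file that had not changed would have been skipped.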

    Rsync sounds pretty awesome, and it really is worth your time to check it out. However, I don't use rsync for my day-to-day backups. Instead, I use software named Unison, which builds on the rsync algorithm. Unison goes beyond rsync in two important ways: it runs natively on both Windows and Linux, which increases its usefulness, and it synchronizes in two directions at one time, while rsync only goes in one direction.

    Let me give another example, from my own situation. At home I have a desktop machine, but I also use a laptop. I work a lot outside of my home, so my laptop travels with me constantly; when I'm home, though, I sometimes leave my laptop in my backpack and use my desktop instead. I have a large amount of data that I need to have available to me at all times: Web pages I've read, instruction manuals, articles I'm working on, photos, and so on. All told, it's about 10 GB worth of stuff. I keep all of this work on my desktop, and I keep a mirror of the same thing on my laptop.

    Obviously, I need to keep all 10 GB in sync between my two machines. If I delete a file on the desktop, I need to delete it on the laptop. If I change a file on the laptop, it wouldn't do to open the original, unchanged file on the desktop and make changes there. That way lies madness.

    So why can't I use rsync? Because my goal is to synchronize between two machines, each of which acts as a source and a destination at the same time. Here's one example: on my laptop, I move the file "widgets" from directory "foo" to the directory "bar." On the desktop, however, "widgets" is still in "foo." If I run rsync with the laptop as the source and the desktop as the destination, so far so good: "widgets" is now in "bar" on both the laptop and the desktop. However, "widgets" is still in "foo" on the desktop. The next time I run rsync with the desktop as the source and the laptop as the destination, "widgets" gets copied over from "foo" on the desktop back into "foo" on the laptop, leaving me with "widgets" in both "foo" and "bar," which produces a real mess.
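    The "widgets" scenario can be reproduced in a scratch directory. This is only a sketch: the "laptop" and "desktop" directories stand in for the two machines, and the script assumes rsync is installed.

```shell
# Sketch: why running rsync in both directions leaves "widgets" in two places.
mess=$(mktemp -d)
mkdir -p "$mess/laptop/foo" "$mess/laptop/bar" "$mess/desktop/foo" "$mess/desktop/bar"
printf 'widget list\n' > "$mess/laptop/foo/widgets"
cp "$mess/laptop/foo/widgets" "$mess/desktop/foo/widgets"

# On the laptop, move widgets from foo to bar.
mv "$mess/laptop/foo/widgets" "$mess/laptop/bar/widgets"

if command -v rsync >/dev/null 2>&1; then
    rsync -a "$mess/laptop/" "$mess/desktop/"   # laptop is the source
    rsync -a "$mess/desktop/" "$mess/laptop/"   # desktop is the source
    # widgets now sits in both foo and bar on the laptop.
    ls "$mess/laptop/foo" "$mess/laptop/bar"
    mess_ran=yes
else
    echo "rsync is not installed; skipping the demonstration" >&2
    mess_ran=no
fi
```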

    Of course, I could run rsync with the "delete" flag set, so that files deleted on the source are also deleted on the destination, but that can be a very dangerous practice. What if I delete "foo" on the laptop but change "foo" on the desktop? If I run rsync with the laptop as source, then "foo" is deleted on the desktop, which is not what I want. Instead, I have to remember to run rsync with the desktop as source so that the changed "foo" is copied over to the laptop. But what if there were files I changed on the laptop and deleted on the desktop? Then I need to run rsync with the laptop as source … and on and on, ad infinitum, back and forth.

    Unison solves this problem. You define a source and a destination, and then you run Unison. After some length of time, Unison starts asking you questions about the files it has found: copy this file from laptop to desktop? Copy this other file from desktop to laptop? Delete this file on the laptop since it was deleted on the desktop? You can accept Unison's guesses, or specify a direction for the copy to go, or even tell Unison to skip the file entirely until another day. With Unison, synchronizing two directories, each on a different machine, becomes a far simpler task.

    Unison has another trick up its sleeve that can come in handy. My laptop is configured differently than my desktop, so I don't want to synchronize my KDE config files on both machines. Instead, I just want to back up my laptop's KDE config files to the desktop, and I always want the copy to go one way only: from laptop to desktop. Unison does two-way synchronizations beautifully, but it will also do one-way syncs, just like rsync. Unison thereby acts like an "rsync+": designed to do two-way, but also capable of one-way as needed. If you're interested in backup for your files, you owe it to yourself to check out and learn Unison.

    Installation & configuration

    Unison is easy to install; the tricky part comes when you need to configure it. Even then, Unison's basic configuration is a piece of cake. It's defining exactly what you want to synchronize, and how, that will take up most of your time. It's not super-hard to create those definitions, but you definitely need to read carefully and test your setup before you start using Unison with your essential data.

    If you're using a Debian-based distribution, then installing Unison is easy as pie. Just run "apt-get install unison unison-gtk" and you're done: you'll have both the basic, command line-only Unison program and the GTK-based GUI. If you're using SUSE, and you've installed APT, then you should be able to install Unison and its GUI easily as well. To my knowledge, this is not the case with Red Hat and Fedora.

    If you're not using a Debian-based distro, then you'll need to download the files you need from the official Unison Web site, at http://www.cis.upenn.edu/~bcpierce/unison/. Once you're there, click on the "Download" link at the top of the page. On the Download page, the developer asks you a couple of pretty innocuous questions. If you'd like to fill them in, do so, but you don't have to. Scroll down to the bottom of the page, and click on the appropriate button to download Unison. I recommend choosing Download latest stable version, but if you're feeling particularly masochistic, go ahead and choose a different version. Just don't come crying to me when you have problems.

    After you press the button, your Web browser will display a list of files. To download the files you need, right-click and choose "Save Link to Disk" or somesuch language, and save the files to your hard drive. As of this writing, the latest stable version of Unison is 2.9.1, so you should get the two Linux programs, unison.linux-textui and unison.linux-gtkui, along with the Unison manual.

    All told, it's about a 2.5 MB download. After you get the files on your machine, you have a small decision to make: who's going to be running Unison? If you're the only person who uses your computer, or you're the only person on your computer who will use Unison, then you can just make these files executable, put them where they can be found by your path, and you're ready to roll. If you're going to allow several people to run Unison, then you need to put its files in a place where everyone can get to them, which will require root access.

    If you're going to run Unison on your own machine, or you want others who use the machine to have access to Unison, then here's the easiest thing to do. As root, cd into the directory containing the files you downloaded, and type the following:

    chown root:root unison.linux*
    chmod 755 unison.linux*
    mv unison.linux-gtkui /usr/bin/unison-gtk
    mv unison.linux-textui /usr/bin/unison

    You just made root the owner of the two Unison programs, made them executable for all users, and moved them into the /usr/bin directory, which is definitely in your path.

    If you don't have root access, then cd into the directory containing the downloaded Unison files and run the following (you can skip the second step if you already have a bin subdirectory in your home directory):

    chmod 744 unison.linux*
    mkdir -p /home/[username]/bin
    mv unison.linux* /home/[username]/bin

    You just made the Unison files executable, created a bin subdirectory in your home directory, and moved the Unison programs into that subdirectory, which should be in your path already.
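    If you're not sure whether the bin subdirectory really is in your path, a quick check along these lines will tell you, and will add it for the current session if it's missing. This is a sketch for a Bourne-style shell; to make the change permanent you would add the same PATH line to a startup file such as ~/.profile, which varies by shell and distribution.

```shell
# Sketch: make sure ~/bin is on the search path for this shell session.
case ":$PATH:" in
    *":$HOME/bin:"*)
        echo "$HOME/bin is already in your path"
        ;;
    *)
        PATH="$HOME/bin:$PATH"
        export PATH
        echo "added $HOME/bin to your path for this session"
        ;;
esac
```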

    As for the Unison manual you downloaded, place that where you can get to it as needed. It's very thorough and well worth the time you give to it.

    If you'd like to make it easy to get to Unison from the K menu, use the KDE Menu Editor to place a link to Unison on the K menu. Otherwise, you can just start Unison from the command line.

    You've installed Unison. Now let's try it out.

    Introductory example: synchronize files on one machine

    Let's start with a simple example: you want to synchronize two folders on the same machine (these could just as easily be two folders, one on your machine and one on another machine). Let's call these folders "poetryA" and "poetryB," but remember, our goal is to make sure that they hold the same files and subfolders. In poetryA, let's create the following subdirectories:

    mkdir Milton
    mkdir Hopkins
    mkdir Hardy
    mkdir Personal

    Inside the "Milton" directory, let's create the following:

    mkdir Sonnets
    mkdir Epics

    Finally, let's add some files into the appropriate directories:

    cp XV_On_the_Late_Massacre.txt poetryA/Milton/Sonnets/.
    cp XVI_When_I_consider.txt poetryA/Milton/Sonnets/.
    cp Paradise_Lost.txt poetryA/Milton/Epics/.
    cp The_Windhover.txt poetryA/Hopkins/.
    cp The_Darkling_Thrush.txt poetryA/Hardy/.
    cp Hap.txt poetryA/Hardy/.
    cp After_Irma.txt poetryA/Personal/.
    cp Her_Studies.txt poetryA/Personal/.
    cp Anniversary_1.txt poetryA/Personal/.
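    If you don't have these poems lying around and just want something to experiment with, a sketch like the following builds the same layout with placeholder files in a scratch directory. The placeholder contents are obviously made up; only the directory and file names come from the listing above.

```shell
# Sketch: build the poetryA tree (and an empty poetryB) with placeholder files.
poetry_demo=$(mktemp -d)
poetry_a="$poetry_demo/poetryA"
mkdir -p "$poetry_a/Milton/Sonnets" "$poetry_a/Milton/Epics" \
         "$poetry_a/Hopkins" "$poetry_a/Hardy" "$poetry_a/Personal" \
         "$poetry_demo/poetryB"

for poem in \
    Milton/Sonnets/XV_On_the_Late_Massacre.txt \
    Milton/Sonnets/XVI_When_I_consider.txt \
    Milton/Epics/Paradise_Lost.txt \
    Hopkins/The_Windhover.txt \
    Hardy/The_Darkling_Thrush.txt \
    Hardy/Hap.txt \
    Personal/After_Irma.txt \
    Personal/Her_Studies.txt \
    Personal/Anniversary_1.txt
do
    # Each file gets one line of stand-in text.
    printf 'placeholder text for %s\n' "$poem" > "$poetry_a/$poem"
done

echo "test tree created under $poetry_demo"
```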

    Now let's set up Unison and get it running for the first time. Open Unison, and you're presented with a list of "profiles": information about a specific set of files and folders that you wish to synchronize. In your case, you shouldn't have any, so we're going to have to create one.

    Unison's list of profiles.

    Since we don't have any to start with, go ahead and press Create new profile to get the process started.

    Give your new Unison profile a useful name.

    Since we're synchronizing a folder of poetry, let's be really clever and give this profile the name of … you guessed it, Poetry! Press OK, and you are next asked to name your first root. Huh?

    Identify the first directory that you wish to synchronize using Unison.

    A "root," in Unison-speak, is the directory that you wish to synchronize. When you choose a root, you are telling Unison that you wish to sync the directory and all of its contents—files and subdirectories—unless otherwise stated. To make the process easy, just click on the Browse button and navigate to the poetryA folder. In my case, it's on my desktop. Once you've made your choice, press OK to close the "Select a local directory" window and then press Continue in the "Root selection" window to move on to the next step.

    Now it's time to select your second root, the directory that you want to synchronize with the first root.

    Identify the second directory that you want to synchronize using Unison.

    Choose the second root the same way you chose the first one, by clicking on Browse and following the logical steps. Notice, however, that you must make an additional choice in this window: the method you will use to communicate between the two roots. In our case, since the two directories are on the same machine, we'll use Local. Make sure that it is chosen, and press Continue.

    We'll discuss the other methods shortly.

    We are next presented with a warning, which, if you read it, actually makes some sense.

    Unison's warning the first time it is run.

    This is the first time we've ever run Unison, so it has to take some time and figure out just what files and folders it's going to have to deal with. Unison is warning you that it will now have to go through both roots and build a list of all the files it finds. This list, called an "archive" by Unison, will be used in the future when you wish to perform another synchronization. If you have a lot of content in your roots, this process can take quite a while—in some cases, hours. Since we have only a few poems in our "poetryA" folder, it's going to take only seconds. But be prepared to go make yourself a cup of coffee or spend some time reading email if you have thousands of files.

    Press OK to acknowledge Unison's warning, and shortly afterward you should see Unison's comparison of "poetryA" and "poetryB."

    Comparing two roots using Unison.

    If you look at Unison's window, it should make sense. Unison has found four directories in "poetryA" and none in "poetryB." The green arrows pointing right, toward "poetryB," indicate that Unison is going to copy files and folders in that direction. Since that is what we want, press the Go button and watch as Unison does its work.

    If you're not sure about a particular file or directory, click once on the line for it so it's selected, and then view the message in the bottom pane of the window. In this case, Unison is telling us there is a directory named "Hardy" in "poetryA" that is missing in "poetryB." In addition, "Hardy" was last modified on 18 March 2004 around 5 pm, and has the usual permissions.

    At the end of the process, Unison shows us its results.

    Results of a synchronization using Unison.

    As you can see, Unison has now placed green checkmarks in the Status column to indicate a successful synchronization. If you were to look in the "poetryB" folder now, you'd see the same folders and files that are in "poetryA," proof that Unison did its job. Things are going well …
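    The same first synchronization can also be run without the GUI. The sketch below assumes the text-interface program is installed as "unison" and uses two options described in the Unison manual: the -batch flag, which accepts Unison's recommendations without asking, and the UNISON environment variable, which points Unison at a different directory for its archives. The scratch directories and the single placeholder file are made up for illustration.

```shell
# Sketch: a first synchronization from the command line, in a scratch directory.
sync_demo=$(mktemp -d)
mkdir -p "$sync_demo/poetryA/Hardy" "$sync_demo/poetryB" "$sync_demo/state"
printf 'placeholder\n' > "$sync_demo/poetryA/Hardy/Hap.txt"

sync_demo_ran=no
if command -v unison >/dev/null 2>&1; then
    # UNISON keeps this experiment's archives out of your real ~/.unison.
    # -batch propagates every non-conflicting change without prompting.
    if UNISON="$sync_demo/state" unison "$sync_demo/poetryA" "$sync_demo/poetryB" -batch; then
        sync_demo_ran=yes
    fi
else
    echo "unison is not installed; skipping the demonstration" >&2
fi
```

    Batch mode is handy for a test like this, but for real data the interactive review shown in the screenshots is the safer habit.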

    … so let's complicate things. What if I do the following in the "poetryA" folder:

    And then I do this in "poetryB":

    Boy, I've really messed things up. I had two folders perfectly in sync, and now I've gone and deleted some things in one place, changed things in other places, and added things in still other places. Worse, I've altered the same file, but in different ways in different folders. Ay yi yi. If I were going to sort this out manually, it would take some time, even with the small number of files I'm using in this example. Worse, it would be tedious, and there is nothing worse than a tedious task. Unison to the rescue!

    Open Unison, and notice that Poetry is listed as a profile. Select it and then click on OK to start comparing the two roots.

    Comparing two changed roots with Unison.

    Unison does a great job indicating the changes that I've made to these two roots. As I look over Unison, I can now start to make some decisions about what I want to do to keep my files synchronized. Let's start with the easy ones. I definitely want to copy "The Convergence of the Twain" from the Hardy folder in poetryA to the other Hardy folder in poetryB. I also want to copy the changes I've made to the two Milton sonnets from poetryA to poetryB.

    But what about Hopkins? I accidentally deleted it in poetryB, so I don't want to replicate that change in poetryA; in fact, I want to restore Hopkins back in poetryB (if I did want to delete the Hopkins folder in poetryA, I would do nothing, and Unison would remove it there as well, as the previous screenshot makes clear). To reverse Unison's direction, so that changes go from poetryA to poetryB instead of the other way around, I click once in the Hopkins line, and then choose the Actions menu, which gives me several choices.

    The Actions menu in Unison.

    In this case, I want to choose Propagate this path left to right, since I want the Hopkins folder to be copied from poetryA to poetryB. Notice that I could have also pressed the > key if I wanted to use the keyboard in the future. If, after making my decision, I wanted to go back and delete Hopkins from poetryA as well, I could either choose Propagate this path right to left (or press the < key) or Revert to Unison's recommendations, which would take me back to where I started for the selected folder.

    If I want to skip changing this folder, perhaps to think about it or to investigate further, I could choose Do not propagate changes to this path (or press the / key). The other options in this menu are pretty self-explanatory. Just be careful and think about the choices you make, as you can do some major damage if you're not on your toes.

    After I choose Propagate this path left to right, Unison changes the color of the arrow from green, which indicates Unison's recommended course of action, to blue, which indicates a change made by the user that overrides Unison's recommendation. It's a small but helpful user interface touch.

    I have one last choice to make, and this is the most difficult of all. I made changes to Her Studies in both poetryA and poetryB, which has caused confusion in Unison, indicated by the red question mark. Unison just doesn't know what to do, so it can't make a recommendation.

    Back to the Actions menu. This time, the bottom two choices are not grayed out, as they were previously. I can either Show diffs or Merge. Let's start by asking Unison to Show diffs (or press the D key), which invokes the command line program diff and displays the results.

    Show diffs only works with text files. It won't work with binary files, including images and OpenOffice.org documents.

    Unison shows the differences between two files.

    The results look a little cryptic, so let's try to decipher them, starting with the second line.

    < Four Ways of Looking at Jeanne
    < Valentines Day, 1992
    < Her studies

    These four lines are in the Her_studies.txt found in poetryA but not in the same file found in poetryB, as the < character is supposed to indicate. The --- separates the two differing hunks of text.

    > Four: Her studies

    This line is in the Her_studies.txt found in poetryB but not in the file in poetryA, as the > character shows.

    "1,4c1" indicates the course of action that diff would undertake, if we were running diff outside of Unison: replace lines 1 through 4 in the first file with line 1 of the second file. "1,4" actually means 1 through 4 (yes, I know that using a comma instead of a hyphen is confusing; take it up with the creators of diff), and "c" stands for "combined add and delete," which is what diff would be doing: deleting lines 1-4 of the first file and then adding line 1 from the second file to it. However, even though we're using diff to display the differences between these two files, we're not going to use diff to actually merge the two files, so we really don't need to worry about this first line at all.

    Once we've looked over the differences between the two files, press Dismiss to close the window. Now it's time to somehow combine the differences between the two copies of "Her Studies." This seems obvious: just choose Merge from the Actions menu (or press the M key). However, when we do this, we get an error message: "Preference 'merge2' must be set." Okey dokey. Press Continue, and let's fix that.

    In order to explain how to solve this issue, I need to explain how Unison works. Remember that we created a Unison profile named Poetry that we've been using. The information about that profile (the roots we're using, any exceptions we want Unison to observe, and any particular preferences that we wish to set) is contained in a text file located in the .unison (note the dot) directory inside your home directory. In our case, we named our profile "poetry," so we would look for "~/.unison/poetry.prf." Open that file with your favorite text editor, and you'll see something like this:

    root = /home/rsgranne/Desktop/poetryA/
    root = /home/rsgranne/Desktop/poetryB/

    We need to set a preference in our Unison profile that tells Unison what program we want to use for merging two text files that have differences in content. To do this, simply add a new line to "poetry.prf" (make sure that Unison is closed first):

    merge2 = kompare CURRENT1 CURRENT2 ;

    For this line to work, you'll need Kompare installed on your system. If you're using KDE 3, you probably already have it. If you don't, it's pretty easy to get it. For Debian, a simple apt-get install kompare should work. For other distros, try apt-get install kdesdk-kompare.
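    Put together, the finished profile would look something like the file written by the sketch below. To stay out of the way of your real settings, it writes the file to a scratch directory rather than to ~/.unison; the three lines inside are the ones shown above.

```shell
# Sketch: write out a copy of poetry.prf with the merge2 preference added.
prf_demo=$(mktemp -d)
cat > "$prf_demo/poetry.prf" <<'EOF'
root = /home/rsgranne/Desktop/poetryA/
root = /home/rsgranne/Desktop/poetryB/
merge2 = kompare CURRENT1 CURRENT2 ;
EOF

cat "$prf_demo/poetry.prf"
```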

    Save poetry.prf, close it, and then open Unison again. Select the Poetry profile and press OK. Repeat the previous steps with respect to Hardy, Hopkins, and Milton, and then select "Her_Studies.txt." We don't need to choose Show diffs this time, since we know what that is going to look like, so go ahead and choose Merge from the Actions menu, or press M on your keyboard. Kompare should open up, with the copy of Her Studies from poetryA on the left and the copy from poetryB on the right.

    Comparing the differences between two files with Kompare.

    Kompare is really a neat tool if you want to see the differences between two files. As you can see from the screenshot, Kompare has noticed that lines 1-4 in one file don't match line 1 in the other file, but everything else is the same. To fix things, we have to decide which change we want to accept—in other words, which file is the correct one—which will then overwrite the problematic line(s) in the other file.

    If you go to the Difference menu, you'll notice that you can choose Apply Difference, but that will apply changes in only one direction: from left to right. This is not what I want to do. I want the line on the right, in the poetryB folder, to overwrite the lines in the poetryA folder. To do this, I first need to switch the order in which Kompare is displaying the files by going to the File menu and choosing Swap Source with Destination, which switches the display of the two files in Kompare (but not the actual files themselves—each copy of Her Studies is still in the same folder as before; it's just the way they're shown in Kompare that's changed). Now I can go back to the Difference menu and choose Apply Difference, so that both files now say "Four: Her studies" at the top.

    If you don't want to use the menus, you can use the buttons on the toolbar, but only to Apply Difference. To change the order in which Kompare shows the files, you have to use the menu.

    Go ahead and save your changes by choosing File and then Save, or by pressing Control + S, or by pressing the Save button on the toolbar. Once you've saved your changes, close Kompare so that you're back at Unison.

    And now we see what I would classify as a mild annoyance. This may somehow be fixed in the future, but right now I get an error message from Unison, telling me that "Merge failed: Merge program did not create an output file," and giving me a Continue button to press. Unison expects that the program you use to merge your changes will leave an output file behind, since it otherwise has no way of knowing what we did in Kompare; after all, they're two completely separate programs. We simply passed the files that we wanted to merge from Unison to another application, Kompare, and then used that application to actually merge the files before going back to Unison. However, the fact is that we just unified the two files ourselves in Kompare. We don't really need Unison to take care of "Her_Studies.txt" any longer, since it is now the same in both folders, so go ahead and, making sure that "Her_Studies.txt" is selected, choose Do not propagate changes to this path from the Actions menu (or press / on your keyboard, or press the Skip button at the bottom of the window). You're telling Unison to ignore this file during this go-round, which is just fine. If the file changes in the future in either folder, Unison will see that and let us know.

    Do not go to the Ignore menu and make a choice there, or Unison will ignore that file from now on. Notice that word "Permanently" in all the options? (You can change it, however, by editing the prf file for this profile, so don't despair.)

    We're all set to go. Everything is the way we want it, so press the Go button at the bottom of Unison's window. In just a few seconds, Unison copies over all our changes, and green checkmarks appear in the Status column to let us know that successful alterations have been made. Cool.

    Synchronize files on two machines using SSH

    It's great to be able to synchronize files like we just saw, but both sets of files (or roots, as Unison calls them) were on the same machine. Backing up files on the same machine may happen sometimes; for instance, let's say you're working on a project, and you want to create quick backups just in case you need to roll back to a previous version of a file. More often, though, you're going to want to sync your files with another copy of those same files on another machine.

    At the beginning of this chapter, I mentioned that my backup method involves several machines, on which I keep copies of all my files. So how can I extend Unison to cover roots that are located on more than one machine?

    Actually, if both machines are on the same LAN, this is shockingly easy. If you're using Samba to connect your computers, you already have mounts to other machines that, for all intents and purposes, might as well be on the machine you're currently using.

    For instance, let's say that I want to synchronize the chapters of this book on Homer, my laptop, and Dante, one of my desktop machines. As soon as I fire up my laptop when I'm at home, I run a simple mount script that mounts my home directory on Dante to Homer, at /home/rsgranne/mnt/dante/home. Now I open Unison and create a new profile: "book_home" (the reason for the appended "_home" will become obvious in a moment). For the first root, I use /home/rsgranne/mydata/Linux_Marvels, and for the second, I use /home/rsgranne/mnt/dante/home/mydata/Linux_Marvels.

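    One small detail that's easy to overlook: when you connect to another machine over SSH, as we're about to do, Unison must be installed on both machines, and the two copies should be the same version. Make sure you do that before you try to connect!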

    At this point, things are pretty much like I outlined in the previous section. Unison runs, looking for differences between the two roots. I can copy changes from Homer to Dante, or vice-versa. If I delete files, those deletions get updated as well. Since I'm typing this book in OpenOffice.org, I can't really combine differing files using the Merge command—that only works with text files, remember—so I'm careful to run Unison after I change a file on either machine, and before I change that same file on a different machine. All in all, it's just like synchronizing two roots on the same machine, except that one root is mounted using Samba (or NFS, if you'd rather use that).

    That's what I mean by "shockingly easy." But, as we all know, life—and computers—are rarely this easy. What if I'm out of my house working on Homer at a local coffee shop, and I want to sync Homer with Dante back at home? And what if I want to encrypt all that traffic, so that other writers in the highly competitive Linux book market don't grab my work and make improper use of it (that's a joke, by the way)? Once again, Unison to the rescue!

    I'm going to create another profile, this one called "book_away" (now you see why I titled the other one "book_home"?). My first root is still going to be /home/rsgranne/mydata/Linux_Marvels, but I'm going to do something different for my second root.

    Create a root in Unison that you connect to via SSH.

    Instead of Local, I've chosen SSH as my connection method. Of course, for any of this to work, I must have SSH installed and set up on my machines, but this is pretty standard with any Linux box nowadays. After choosing SSH, enter a domain or IP address as a Host, and a name in the textbox next to User, but only if the username on the current machine does not match the username which you'll use to log in on the other machine.

    And now the potentially confusing part: the Directory. You need to enter a path to the root's directory on the other machine to which you're connecting. You can use either an absolute path, like I did in the screenshot, or a relative path based on your home directory—mydata/Linux_Marvels, if I kept the directory structure I used previously.

    Once everything is entered correctly, press Continue. Unison will try to contact your second root using SSH, and in a few seconds, depending on the speed of your network connectivity, you should get prompted for a password. Enter the SSH password you use to connect to the computer you specified in Host, and if your connection is good and your password is correct, Unison will begin the process of comparing the two roots.
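    For reference, here is roughly what the two roots would look like if you wrote the "book_away" profile by hand. This is a sketch: the host name dante.example.org is a placeholder I've made up, and the ssh:// root syntax is the one described in the Unison manual, where a double slash after the host introduces an absolute path and a single slash a path relative to the remote home directory. The file is written to a scratch directory, not to ~/.unison.

```shell
# Sketch: a hand-written profile with one local root and one SSH root.
ssh_demo=$(mktemp -d)
cat > "$ssh_demo/book_away.prf" <<'EOF'
root = /home/rsgranne/mydata/Linux_Marvels
root = ssh://rsgranne@dante.example.org//home/rsgranne/mydata/Linux_Marvels
EOF

cat "$ssh_demo/book_away.prf"
```

    With a relative path, the second root would instead read ssh://rsgranne@dante.example.org/mydata/Linux_Marvels.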

    Be patient during this process. Unison may appear to be locked up, but it's just taking its time to digest all the files and folders in the roots. The more data, the longer Unison will take.

    If something is wrong, Unison will report an error and close, which is annoying. You'll have to open the program again and start over with a new root, making the necessary changes until you get a good connection and Unison can compare your local and remote roots.

    At this point, using Unison with two roots—one local and one accessed via SSH—is pretty much like using Unison with two roots on the same machine. You're just connecting over a network, even the Internet, and all of your traffic is encrypted against prying eyes, which is always a bonus. Unison and SSH—two great technologies that go great together!

    Use Unison for incremental backup instead of rsync

    At the beginning of this chapter's discussion of Unison, I compared it to rsync and mentioned, as an example of Unison's superiority, that rsync only goes in one direction, making sure that directory B perfectly mirrors directory A, while Unison goes in two directions, making sure that changes in A are mirrored in B at the same time that changes in B are mirrored in A. If you want, however, you can use Unison to go in only one direction, just like rsync (not a huge surprise, since, as I stated, Unison is built using the rsync algorithm).

    Why would you want to do this? One word: backup. I use Unison in the "rsync" way to back up my home directory's configuration files to another machine. I want the copy to go only one way: from my laptop to my LAN's server. You can probably think of other ways you can use Unison to copy files in a single direction; heck, anytime you were planning to use rsync, you could just use Unison instead.

    As before, I open Unison and define a new profile, which I'll call "kde_config_files." For my first root, the one I'll be copying from, I'm going to select /home/rsgranne/.kde, the directory on my laptop Homer in which KDE stores all of my personalized config files. For my second root, the one I'll be copying to, I'm going to use /home/rsgranne/mnt/dante/home/backup/homer, a path on a machine named Dante, which I mount using Samba. Unison does its thing—it immediately starts doing an initial comparison of the two roots, since this is the first time that this Unison profile has run. After a few moments, it delivers its initial results:

    Initial results after Unison compares two roots.

    When you're backing up your .kde directory, you don't need the whole thing, just "Autostart" and "share." Therefore, I want to tell Unison to ignore the other files that it has found. To do this, select each file that you wish to leave out, open the Ignore menu, and choose Permanently ignore this path (or press Shift + i on your keyboard). Be careful when you make this choice: Unison will never ask you about that path again. As soon as you choose to permanently ignore a path, Unison removes it from its window.

    If you make a mistake, not to worry. Open the profile file for this particular profile (kde_config.prf in this case) in a text editor and remove the line that begins with ignore = Path followed by the path you no longer want to ignore.
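
    For instance, assuming you had ignored a path named tmp-homer by mistake, this is the line you would delete from the profile (the path name is only an illustration; yours will be whatever you ignored):

    ignore = Path tmp-homer

    Once that line is gone, Unison should list the path again the next time it compares the two roots.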

    After you've excluded the files (and folders) that you don't want to track, it's time to run Unison. Press Go, and watch as Unison copies all the files from Homer to Dante.

    We've copied over the files, but we haven't ensured that Unison will act as rsync does and perform all synchronizations in one direction only, from left to right, or Homer to Dante. To do this, select one of the paths listed, open the Actions menu, and select Force all changes from first root to second. Any arrows will immediately show up pointing from left to right, and they will be blue, indicating that you wish changes to go in one way only: from root one to root two, from Homer to Dante. Any changes made on Dante will not be copied back to Homer; they will be overwritten by Homer's version.

    Fortunately, the next time you open this profile, you'll find that Unison has remembered the paths you told it to ignore, which is great. Unfortunately, you'll also find that Unison has forgotten that you want to force all copies to go from Homer to Dante, which ain't so great. I don't know about you, but I don't relish the thought of repeating that process over and over, simple though it is. I mean, this is a computer, and computers are supposed to automate things easily—so let's automate!

    This is actually very, very easy. Open this profile's file in the .unison directory. Mine would look like this:

    root = /home/rsgranne/.kde
    root = /home/rsgranne/mnt/dante/home/backup/homer

    ignore = Path cache-homer.localdomain
    ignore = Path socket-homer
    ignore = Path socket-homer.localdomain
    ignore = Path tmp-homer
    ignore = Path tmp-homer.localdomain

    Notice the lines that all begin with ignore—that's why Unison remembered to leave those paths out of future syncs. Now add this line after the two roots:

    force = /home/rsgranne/.kde

    Save the file, and rerun Unison. Success! Unison will now forever copy files from Homer to Dante only, and with no intervention on my part. Beautiful—I love it when computers help further my laziness.
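
    Put together, the edited profile might look something like the following sketch. It assumes the same roots and ignore lines shown above; the only addition is the force line, whose value is written exactly as it appears in the first root line. The lines starting with # are comments added for illustration.

    # The two roots: the first is the source, the second holds the backup copy
    root = /home/rsgranne/.kde
    root = /home/rsgranne/mnt/dante/home/backup/homer

    # Resolve every difference in favor of the first root (Homer)
    force = /home/rsgranne/.kde

    ignore = Path cache-homer.localdomain
    ignore = Path socket-homer
    ignore = Path socket-homer.localdomain
    ignore = Path tmp-homer
    ignore = Path tmp-homer.localdomain

    Keep in mind that, per the Unison manual, force resolves all differences in favor of the named root. That includes deletions, so a file you delete on Homer will also be deleted from Dante on the next run.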

    Tweaking your profiles

    Remember the profile files that I discussed earlier? They live in a hidden directory—.unison—which itself is in your home directory, and they end with the ".prf" extension. Before we finish our look at Unison, I wanted to mention a few other tweaks that you should add manually to those files in order to make your sync processes work a little more smoothly.

    In order to add these tweaks, you're going to need to open your profiles with your favorite text editor. In my case, I'm going to use "kde_config.prf," located in /home/rsgranne/.unison.

    To start, let's add the following to the file (it doesn't matter where you put these preferences, but I like to put them at the top of the file, after my two roots and before any "ignore" lines):

    times = true

    If you use Unison for any length of time, you're going to notice that a lot of files need their "props" synchronized. Complaining about "props" (short for properties) is just Unison's way of telling you that the files' contents match but their metadata, most often the date and time stamps that indicate when the files were modified, does not. By setting times to true, you are copying over not just the contents of the files, but their modification dates and times as well, which will vastly reduce any "props" complaints that you'll see.

    Note that this setting copies over modification timestamps for files only, not directories.

    In a similar vein, you may want to add the following:

    owner = true
    group = true

    When you add these two settings to your profile, you order Unison to synchronize the owners and groups of files along with the contents of those files, which can really help to keep things straight as you move files around between machines.

    If you want to synchronize users and groups by the real numbers that your Linux system uses instead of the names that we humans use (in other words, by 500 instead of rsgranne—if you don't know what I'm talking about, take a look at your /etc/passwd file), then add the preference numericids = true to your profile as well.

    If you're only using Unison to sync Linux boxes, then you can skip this next preference. But if you're involving Windows machines in your sync—say, from a Windows workstation to a Linux server, or from a Windows workstation to a Linux workstation—then you probably want to read the following carefully.

    fastcheck = yes

    You'll probably want to use fastcheck = yes most of the time, but every once in a while you'll want to comment that line out with a pound sign in front of it (#), run Unison, and then revert back to what I have above. Let me explain why.

    You might already have asked yourself a pretty important question: how the heck does Unison know that a file has changed? For files on a Linux box, Unison looks at their inode numbers and modtimes; if either of those is different, then Unison thinks that the file has changed. Windows boxes, of course, don't have inode numbers, so Unison's default for Windows is to scan the contents of each file to see if anything has changed. This is certainly a safe method, but if you have a lot of files to scan on a Windows machine, it's going to take a long, long time. Leaving the fastcheck preference unset, or setting it to fastcheck = default, gives you this slow behavior, which is probably not what you want.

    Don't know what an inode number is? Here's a handy explanation: http://en.wikipedia.org/wiki/Inode.

    If you set fastcheck to yes, however, then Unison acts differently on your Windows machines: it looks at the file creation times instead of scanning the contents of each file (Linux files still get the same treatment: their inode numbers and modtimes are checked). This results in a much faster scan of your Windows machine, but in very rare circumstances (a file has been changed, but you have somehow managed to leave the create time, the modification time, and the length of the file all unchanged, which is pretty hard to do), Unison may miss an update to some of the files on the Windows machine.

    Now, practically speaking, you don't have to worry about this very much. First, you'd have to really jump through hoops to fool Unison, and second, even if Unison thought that it should possibly overwrite such a file, it won't, as the Unison manual explains: "Unison will never overwrite such an update with a change from the other replica, since it always does a safe check for updates just before propagating a change."

    Since this is the case, I would recommend leaving fastcheck set to yes most of the time, but, every once in a while, comment out that preference with a pound sign, so that the full contents of your Windows files will be checked on the next run. Then, after you've satisfied yourself that things are copacetic, uncomment fastcheck = yes and go back to the faster method.
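
    In the profile, the two states might look like this sketch (the comment lines starting with # are illustrative additions). For everyday runs:

    # Fast scan of Windows files
    fastcheck = yes

    For the occasional thorough run, with the preference commented out so that Unison falls back to its default behavior:

    # Full content scan of Windows files on the next run
    # fastcheck = yes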

    logfile = $HOME/.unison/unison.log

    When you run Unison, the program generates a logfile, which can be a very useful thing. The default, however, is to create this logfile—named "unison.log"—in your home directory, which I find annoying since I try to keep my home directory neat and tidy. If you don't mind this, then leave the logfile preference alone. But if you, like me, want to change the location in which the logfile goes, then set the preference above. I like the logfile to go into the hidden .unison directory, but you can change this any way you like.
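
    With all of the tweaks from this section in place, kde_config.prf might look roughly like the following. Treat it as a sketch rather than a recommendation: it simply combines the roots, force line, and ignore lines from earlier with the preferences discussed here, and the # comment lines are additions for illustration. Leave out any preference that doesn't fit your setup.

    root = /home/rsgranne/.kde
    root = /home/rsgranne/mnt/dante/home/backup/homer

    # One-way backup: always favor the first root
    force = /home/rsgranne/.kde

    # Copy modification times, owners, and groups along with file contents
    times = true
    owner = true
    group = true
    # Uncomment to match owners and groups by number instead of by name
    # numericids = true

    # Faster scanning when a Windows machine is involved
    fastcheck = yes

    # Keep the log out of the home directory
    logfile = $HOME/.unison/unison.log

    ignore = Path cache-homer.localdomain
    ignore = Path socket-homer
    ignore = Path socket-homer.localdomain
    ignore = Path tmp-homer
    ignore = Path tmp-homer.localdomain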

    Further information about Unison

    The main site for all things Unison can be found at http://www.cis.upenn.edu/~bcpierce/unison/, on the university pages of Benjamin Pierce, the project's lead. Hopefully he'll get a better URL some day.

    On the main Unison site, you can download stable and beta versions of the software, and, most importantly, read the very complete and very informative Unison documentation (if you really get into Unison, then you should look into running it sans GUI; the documentation will tell you everything you need to know, although the GUI actually provides a good way to learn the basics). If that doesn't help, there is a good FAQ that covers several important questions, especially one that I was wondering myself: "Does Unison work on Mac OSX?" Be sure to read that answer before trying to use Unison on that OS!

    For further support, there are a few listservs you can join—or just search, if you don't want to join—at Yahoo! Groups. Unison-users is just that: a list for users of Unison. It currently has over 600 members, and they produce in the neighborhood of 100 messages a month, so it's not too overwhelming. You can get more information about the list at http://groups.yahoo.com/group/unison-users/. If you're a developer, then you might want to check out Unison-hackers, at http://groups.yahoo.com/group/unison-hackers. If all you want are the announcements when a new version is released, then you should subscribe to the low volume (one message per month) Unison-announce, which you can find at http://groups.yahoo.com/group/unison-announce, or just send a blank email to unison-announce-subscribe@groups.yahoo.com.

    Finally, Open magazine, a "weekly e-zine for Linux and Open Source computing in the enterprise," has a nice article about Unison available at http://www.open-mag.com/features/Vol_53/synch/synch.htm. They focus on getting Unison to work with Windows as well as Linux, and provide some technical details about fail-safe provisions that Unison provides.
