Normally, git-annex repositories consist of symlinks that are checked into git, and in turn point at the content of large files that is stored in .git/annex/objects/. Direct mode gets rid of the symlinks.

The advantage of direct mode is that you can access files directly, including modifying them. The disadvantage is that most regular git commands cannot safely be used, and only a subset of git-annex commands can be used.

Normally, git-annex repositories start off in indirect mode, with some exceptions:

  • Repositories created by the assistant use direct mode by default.
  • Repositories on FAT and other less than stellar filesystems that don't support things like symlinks will be automatically put into direct mode.
  • Windows always uses direct mode.

enabling (and disabling) direct mode

Any repository can be converted to use direct mode at any time, and if you decide not to use it, you can convert back to indirect mode just as easily. Also, you can have one clone of a repository using direct mode, and another using indirect mode; direct mode interoperates.

To start using direct mode:

git annex direct

To stop using direct mode:

git annex indirect
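
Before converting, it can help to check which mode a repository is currently in. A minimal sketch, assuming the annex.direct setting in .git/config reflects the mode (git-annex maintains that setting itself; it should not be edited by hand). The helper names annex_mode and annex_set_mode are made up for this example.

```shell
# Print "direct" or "indirect" for the repository in the current directory.
# Assumption: annex.direct is true exactly when direct mode is enabled.
annex_mode() {
    if [ "$(git config --bool annex.direct 2>/dev/null)" = true ]; then
        echo direct
    else
        echo indirect
    fi
}

# Convert to the requested mode, unless the repository is already in it.
annex_set_mode() {
    want=${1-}
    case $want in
        direct|indirect) ;;
        *) echo "usage: annex_set_mode direct|indirect" >&2; return 2 ;;
    esac
    if [ "$(annex_mode)" = "$want" ]; then
        echo "already in $want mode"
    else
        git annex "$want"
    fi
}
```

For example, annex_set_mode direct runs git annex direct only when the repository is still in indirect mode.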

safety of using direct mode

With direct mode, you're operating without large swathes of git-annex's carefully constructed safety net, which ensures that past versions of files are preserved and can be accessed. With direct mode, any file can be edited directly, or deleted at any time, and there's no guarantee that the old version is backed up somewhere else.

So if you care about preserving the history of files, you're strongly encouraged to tell git-annex that your direct mode repository cannot be trusted to retain the content of a file. To do so:

git annex untrust .

On the other hand, if you only care about the current versions of files, and are using git-annex with direct mode to keep files synchronised between computers, and manage your files, this should not be a concern for you.

using a direct mode repository

You can use most git-annex commands as usual in a direct mode repository. A very few commands don't work in direct mode, and will refuse to do anything.

Direct mode also works well with the git-annex assistant.

The most important command to use in a direct mode repository is git annex sync. This will commit any files you have run git annex add on, as well as files that were added earlier and have been modified. It will push the changes to other repositories for git annex sync there to pick up, and will pull and merge any changes made on other repositories into the local repository.
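
The add-then-sync routine described above could be wrapped in a small shell function. This is only a sketch: record_and_sync is a hypothetical helper name, and it assumes the remotes you want to sync with are already configured.

```shell
# Record new or modified files, then exchange changes with other repositories.
record_and_sync() {
    if [ $# -eq 0 ]; then
        echo "usage: record_and_sync FILE..." >&2
        return 2
    fi
    # Stage content through git-annex, never through plain git add.
    git annex add "$@" || return
    # Commit, pull, merge and push in one step.
    git annex sync
}
```

For example, record_and_sync "my song.mp3" after editing the file's metadata.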

While you will generally just use git annex sync, you can also use git commit --staged, or plain git commit, if you want to. But do not use git commit -a or git commit <file>: that'd commit whole large files into git!

what doesn't work in direct mode

git annex status shows incomplete information. A few other commands, like git annex unlock, don't make sense in direct mode and will refuse to run.

As for git commands, you can probably use some git working tree manipulation commands, like git checkout and git revert in useful ways... But beware, these commands can replace files that are present in your repository with broken symlinks. If that file was the only copy you had of something, it'll be lost.

This is one more reason it's wise to make git-annex untrust your direct mode repositories. Still, you can lose data using these sorts of git commands, so use extreme caution.

So, just which git commands are safe? It seems like I'm going to have to use direct mode, so it'd be nice to know just what I'm allowed to do, and what the workflow should be.

All git commands that do not change files in the work tree (and do not stage files from the work tree) are safe. I don't have a complete list; it includes git log, git show, git diff, git commit (but not -a or with a file as a parameter), git branch, git fetch, git push, git grep, git status, git tag, git mv (this one is somewhat surprising, but I've tested it and it's ok).

git commands that change files in the work tree will replace your data with dangling symlinks. This includes things like git revert, git checkout, git merge, git pull, git reset.

git commands that stage files from the work tree will commit your data to git directly. This includes git add, git commit -a, and git commit file.

Comment by http://joeyh.name/ Tue Feb 19 02:55:13 2013
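
The three lists in the reply above could be encoded as a small guard that classifies a git invocation before it is run in a direct mode repository. This is a rough sketch, not part of git-annex: the function name and the labels it prints are made up here, the lists are incomplete (as the reply itself says), and the git commit argument parsing is deliberately conservative and approximate.

```shell
# Classify a git invocation (subcommand plus arguments) for a direct mode
# repository. Prints: safe, clobbers-worktree, stages-content, or unknown.
direct_mode_git_check() {
    subcmd=${1-}
    if [ $# -gt 0 ]; then shift; fi
    case $subcmd in
        log|show|diff|branch|fetch|push|grep|status|tag|mv)
            echo safe ;;
        revert|checkout|merge|pull|reset)
            echo clobbers-worktree ;;
        add)
            echo stages-content ;;
        commit)
            verdict=safe
            while [ $# -gt 0 ]; do
                case $1 in
                    -m|-F|-C|-c|--message|--file)
                        shift ;;                      # skip the option's value
                    -m?*) ;;                          # message attached, as in -mfix
                    --all) verdict=stages-content ;;
                    --)
                        # anything after -- is a path
                        if [ $# -gt 1 ]; then verdict=stages-content; fi
                        break ;;
                    --*) ;;                           # other long options
                    -*a*) verdict=stages-content ;;   # -a, alone or bundled (-am)
                    -*) ;;                            # other short options
                    *) verdict=stages-content ;;      # bare argument: a file
                esac
                if [ $# -gt 0 ]; then shift; fi
            done
            echo "$verdict" ;;
        *)
            echo unknown ;;
    esac
}
```

A wrapper could call this first and refuse to run anything that is not reported as safe, for example when direct_mode_git_check checkout master prints clobbers-worktree.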

So, if I edit a "content file" (change a music file's metadata, say), what's the workflow to record that fact and then synchronise it to other repositories?

I can't do a git add, so I don't understand what has to happen as a first step. (Thanks for your quick reply above, BTW.)

git annex add "$file"
git commit -m changed
git annex sync
git annex copy "$file" --to otherrepo
Comment by http://joeyh.name/ Tue Feb 19 03:05:35 2013

What happens to the object database (.git/annex/objects) when going to direct mode? Are the objects deleted, moved to another location, kept?

If the objects are kept, does it mean that the file in the direct mode repository is duplicated in the object database? If so, would it be relevant to use cp --reflink=auto to populate the working directory, to enable copy on write on filesystems that support it?

Comment by mildred Mon Jul 8 13:27:21 2013
.git/annex/objects does not typically contain any file contents in direct mode. The file contents are stored directly in the working tree.
Comment by http://joeyh.name/ Mon Jul 8 16:11:32 2013

Would it be safe to add largefiles to gitignore in direct mode?

Can git-annex still track large files ignored by git?

Thanks. :-)

Comment by http://caust1c.myopenid.com/ Mon Aug 12 18:06:21 2013

asbraithwaite: No, as far as I know it cannot.

Comment by arand Mon Aug 12 18:12:32 2013

I'd like to have an indirect mode repo on my laptop cloned on a cifs mount point (mounted off an SMB NAS), thus in direct mode. But all I can see on the clone after merge/pull is text files of length 207 chars containing the symlink in plain text.

I guess this is what git manages internally for the symlinks... so I'm afraid git annex doesn't work in such case.

Can you confirm that indirect and direct modes can coexist on clones of the same repo?

Comment by http://olivier.berger.myopenid.com/ Sat Aug 17 20:35:40 2013

Re-reading @joey's response above, I see that merge/pull don't seem to be safe and will create dangling symlinks. That corresponds to those files I can see on cifs, I guess.

But then, how can a direct repo sync with changes made in other remotes, if there is no pull/fetch available?

Can it then only be the source of changes which will propagate to indirect remotes?

Comment by http://olivier.berger.myopenid.com/ Sat Aug 17 21:53:57 2013

I too have issues with mixing direct and indirect mode repositories.

I have a regular, existing repository with ebooks, shared between various clones on proper :) filesystems; now I would need a copy of some of them on an ereader which only offers a FAT filesystem, so it has to be direct mode.

mount $READER
cd $reader
git clone $REPO

I get a directory full of small files, the way git manages links on FAT.

git annex init "ebook reader"

This detects the fact that it is working on a crippled filesystem, enables direct mode and disables ssh connection caching; up to now everything seems to be fine, but then

git annex get $SOME_BOOK

seems to work, downloads the file somewhere, but when I try to open $SOME_BOOK it is still the fake link, and the file has been downloaded in its destination, as if the repo wasn't in direct mode.

I use version 4.20130723 on Debian jessie.

Comment by http://www.gl-como.it/author/valhalla/ Sun Aug 18 08:47:35 2013

There should be no obstacles to using direct mode on one clone of a git repository, and indirect mode on another clone. The data stored in git for either mode is identical, and I do this myself for some repositories.

@valhalla, you probably need to run git annex fsck, and if that does not solve your problem, you need to file a bug report.

Comment by http://joeyh.name/ Fri Aug 23 17:48:54 2013

@obergix asked:

But then, how can a direct repo sync with changes made in other remotes, if there is no pull/fetch available?

The answer is simple: By running git annex sync, which handles all that.

Comment by http://joeyh.name/ Fri Aug 23 17:50:15 2013

Thanks for these details @joeyh. But AFAIU, one needs to run git annex copy before git annex sync; otherwise, symlinks (or files containing the symlink path on SMB) will be created instead of the plain "direct" files that are expected.

I'm still not sure whether git annex sync needs to be run first on the indirect remote, on the direct one, or on both, and in which sequence. I think a "walkthrough" script would help.

Comment by http://olivier.berger.myopenid.com/ Fri Aug 23 19:59:35 2013
No, you can sync before you copy, get, or whatever. git-annex will replace the symlinks with the actual files when they arrive at the repository.
Comment by http://joeyh.name/ Sat Aug 24 15:56:47 2013
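
As for the walkthrough asked for above, one possible sequence is sketched below. It is an illustration under assumptions, not an official recipe: it assumes each clone already has the other (or a shared repository) configured as a git remote, and the function name walkthrough and its two arguments are invented for this example.

```shell
# Propagate new files from an indirect mode clone to a direct mode clone.
# $1: path to the indirect mode clone, $2: path to the direct mode clone.
walkthrough() {
    indirect_repo=$1
    direct_repo=$2

    # 1. In the indirect clone: add content and publish the change.
    (cd "$indirect_repo" && git annex add . && git annex sync) || return

    # 2. In the direct clone: sync first. Files whose content has not
    #    arrived yet show up as symlinks (or as small text files on
    #    filesystems without symlink support).
    # 3. Then fetch the content; git-annex replaces those placeholders
    #    with the real files as they arrive.
    (cd "$direct_repo" && git annex sync && git annex get .) || return
}
```

Running git annex get before git annex sync in the direct clone would not help, since the clone only learns about the new files through the sync.
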