Questions:
I restore/save all files but dar reported some files have been ignored, what are those ignored files?
Dar hangs when using it with pipes, why?
Why, when I restore 1 file, does dar report 3 files have been restored?
While compiling dar I get the following message: "g++: /lib/libattr.a: No such file or directory", what can I do?
I cannot find the binary package for my distro, where to look for it?
Why does dar report "ignored" files while I make a backup without filter?
Can I use different filters between a full backup and a differential backup? Would not dar consider some file not included in the filter to be deleted?
Once in action dar makes all the system slower and slower, then it stops with the message "killed"! How to overcome this problem?
I have a backup, how can I change the size of its slices?
I have a backup in one slice, how can I split it in several slices?
I have a backup in several slices, how can I stick them all in a single file?
I have a backup, how can I change its encryption scheme?
I have a backup, how can I change its compression algorithm?
Which options can I use with which options?
Why dar reports corruption for the archive I have transferred with FTP?
Why DAR does save UID/GID instead of plain usernames and usergroups?
Dar_Manager does not accept encrypted archives, how to workaround this?
How to overcome the lack of static linking on MacOS X?
Why cannot dar use the full power of my multi-processor computer?
Is libdar thread-safe, which way you mean it is?
How to solve "configure: error: Cannot find size_t type"?
How to search for questions (and their answers) about known problems similar to mine?
Why dar tells me that it failed to open a directory, while I have excluded this directory?

Answers:
I restore/save all files but dar reported some files have been ignored, what are those ignored files?
-c |
-x |
-l |
-d |
-t |
-C |
-+ |
|
-v |
OK |
OK |
OK |
OK |
OK |
OK |
OK |
-vs |
OK |
OK |
-- |
OK |
OK |
-- | OK |
-b |
OK |
OK |
OK |
OK |
OK |
OK |
OK |
-n |
OK |
OK |
-- | -- | -- | OK |
OK |
-w |
OK | OK | -- | -- | -- | OK | OK |
-wa |
-- | OK | -- | -- | -- | -- | -- |
-R |
OK | OK | -- | OK | -- | -- | -- |
-X |
OK | OK | OK | OK | OK | -- | OK |
-I |
OK | OK | OK | OK | OK | -- | OK |
-P |
OK | OK | -- | OK | OK | -- | OK |
-g |
OK | OK | -- | OK | OK | -- | OK |
-] |
OK | OK | -- | OK | OK | -- | OK |
-[ |
OK | OK | -- | OK | OK | -- | OK |
-u |
OK | OK | -- | -- | -- | -- | OK |
-U |
OK | OK | -- | -- | -- | -- | OK |
-i |
OK | OK | OK | OK | OK | OK | OK |
-o |
OK | OK | OK | OK | OK | OK | OK |
-O |
OK | OK | -- | OK | -- | -- | -- |
-H |
OK | OK | -- | -- | -- | -- | -- |
-E |
OK | OK | OK | OK | OK | OK | OK |
-F |
OK | -- | -- | -- | -- | OK | OK |
-K |
OK | OK | OK | OK | OK | OK | OK |
-J |
OK | -- | -- | -- | -- | OK | OK |
-# |
OK | OK | OK | OK | OK | OK | OK |
-* |
OK | -- | -- | -- | -- | OK | OK |
-B |
OK | OK | OK | OK | OK | OK | OK |
-N |
OK | OK | OK | OK | OK | OK | OK |
-e |
OK | -- | -- | -- | -- | OK | OK |
-aSI |
OK | OK | OK | OK | OK | OK | OK |
-abinary |
OK | OK | OK | OK | OK | OK | OK |
-Q |
OK | OK | OK | OK | OK | OK | OK |
-aa |
OK | -- | -- | OK | -- | -- | -- |
-ac |
OK | -- | -- | OK | -- | -- | -- |
-am |
OK | OK | OK | OK | OK | OK | OK |
-an |
OK | OK | OK | OK | OK | OK | OK |
-acase |
OK | OK | OK | OK | OK | OK | OK |
-ar |
OK |
OK |
OK |
OK |
OK |
OK |
OK |
-ag |
OK |
OK |
OK |
OK |
OK |
OK |
OK |
-j |
OK | OK | OK | OK | OK | OK | OK |
-z |
OK | -- | -- | -- | -- | OK | OK |
-y |
OK | -- | -- | -- | -- | OK | OK |
-s |
OK | -- | -- | -- | -- | OK | OK |
-S |
OK | -- | -- | -- | -- | OK | OK |
-p |
OK | -- | -- | -- | -- | OK | OK |
-@ |
-- | -- | -- | -- | -- | -- | OK |
-$ |
-- | -- | -- | -- | -- | -- | OK |
-~ |
-- | -- | -- | -- | -- | -- | OK |
-% |
-- | -- | -- | -- | -- | -- | OK |
-D |
OK | -- | -- | -- | -- | -- | OK |
-Z |
OK | -- | -- | -- | -- | -- | OK |
-Y |
OK | -- | -- | -- | -- | -- | OK |
-m |
OK | -- | -- | -- | -- | -- | OK |
-ak |
-- |
-- |
-- |
-- |
-- |
-- |
OK |
-af |
OK |
-- |
-- |
-- |
-- |
-- |
-- |
--nodump |
OK | -- | -- | -- | -- | -- | -- |
-G |
OK | -- | -- | -- | -- | OK | OK |
-M |
OK | -- | -- | -- | -- | -- | -- |
-, |
OK | -- | -- | -- | -- | -- | -- |
-k |
-- | OK | -- | -- | -- | -- | -- |
-r |
-- | OK | -- | -- | -- | -- | -- |
-f |
-- | OK | -- | -- | -- | -- | -- |
-ae |
-- | OK | -- | -- | -- | -- | -- |
-T |
-- | -- | OK | -- | -- | -- | -- |
-as |
-- | -- | OK | -- | -- | -- | -- |
-q |
OK |
OK |
OK |
OK |
OK |
OK |
OK |
Why DAR does save UID/GID instead of plain usernames and usergroups?

A file's properties do not contain the name of its owner or of its owning group. Instead they contain two numbers, the user ID and the group ID (UID and GID for short). The /etc/passwd file associates these numbers with names and with other properties such as the login shell, the home directory and the password (see also /etc/shadow).

Thus, when you list a directory (with the 'ls' command or with any GUI program, for example), the listing application opens each directory and finds there a list of names, each associated with an inode number. It then fetches the inode attributes of each file and looks, among other information, for the UID and the GID. To display the real user name and group name, the listing application calls a standard C library function that does the lookup in /etc/passwd, possibly in NIS if it is configured, and in any other additional system. This way applications do not have to bother with the many possible system configurations: the same API is used whatever the system is. The lookup returns the name if it exists, and the listing application displays, for each file found in a directory, its attributes together with the user name and group name as returned by the system. As you can see, the user name and group name are not part of any file attribute, but the UID and GID *are*.

Dar is mainly a backup tool: it preserves file properties as much as possible, to be able to restore files as close as possible to their original state. Thus a file saved with UID=3 will be restored with UID=3. The name corresponding to UID 3 may not exist, may exist and be the same, or may exist and be different: the file will be restored with UID 3 anyway.

Thus, when backing up and restoring a crashed system, you can be confident that the restoration will not interfere with the bootable system you used to launch dar to restore your disk. Assume UID 1 is labeled 'bin' in your real crashed system, but this UID 1 is labeled 'admin' in the boot system, while UID 2 is labeled 'bin' in this boot system. Files owned by bin in the system to restore will be restored under UID 1, not under UID 2, which is the one used by the temporary boot system. At that time, after restoration and while still running from the boot system, if you do a 'ls' you will see that the files originally owned by 'bin' are now owned by user 'admin'. This is really a mirage: your restoration also restores the /etc/passwd file and other system configuration files (like NIS configuration files if they have been used), so at reboot time on the newly restored real system, UID 1 will be associated with user 'bin' again, and files originally owned by user bin will be listed as owned by bin, as expected.

If dar had done otherwise, restoring the files owned by 'bin' to the UID corresponding to 'bin', these files would have been given UID 2 (the one used by the temporary bootable system used to launch dar). But once the real restored system had been launched, this UID 2 would have become some other user and not 'bin', which is mapped to UID 1 in the restored /etc/passwd.

Now, if you want to change some UID/GID when moving a set of files from one live system to another, there is no problem if you are not running dar under the 'root' account. Accounts other than 'root' are usually not allowed to modify UID/GID, thus files restored by dar will have the user and group ownership of the dar process, that is, those of the user who launched dar. But if you really need to move a directory tree containing a set of files with different ownerships, and you want to preserve these different ownerships from one live system to another while the corresponding UID/GID do not match between the two systems, dar can still help you. Once the archive has been restored into a directory, remap the ownership with the following commands:
find /path/to/restored/archive -uid <old UID> -print -exec chown <new name> {} \;
find /path/to/restored/archive -gid <old GID> -print -exec chgrp <new name> {} \;
The first command lets you remap one UID to another for all files under the /path/to/restored/archive directory.
The second command lets you remap one GID to another for all files under the /path/to/restored/archive directory.

Example on how to globally modify ownership of a directory tree user by user

For example, you have three users on the destination system: Pierre (UID 100), Paul (UID 101), Jacques (UID 102), but on the source system these same users are mapped to different UIDs: Pierre has UID 101, Paul has UID 102 and Jacques has UID 100. We temporarily need an unused UID on the destination system; we will assume UID 680 is not used. Then, after restoring the archive in the directory /tmp/A, we will do the following:

find /tmp/A -uid 100 -print -exec chown 680 {} \;
find /tmp/A -uid 101 -print -exec chown pierre {} \;
find /tmp/A -uid 102 -print -exec chown paul {} \;
find /tmp/A -uid 680 -print -exec chown jacques {} \;

which is:
change files of UID 100 to UID 680 (the files of Jacques are now under the temporary UID 680 and UID 100 is now freed)
change files of UID 101 to UID 100 (the files of Pierre get their UID of the destination live system, UID 101 is now freed)
change files of UID 102 to UID 101 (the files of Paul get their UID of the destination live system, UID 102 is now freed)
change files of UID 680 to UID 102 (the files of Jacques, which had been temporarily moved to UID 680, are now set to their UID on the destination live system, UID 680 is no longer used).

You can then move the modified files to the appropriate destination, or make a new dar archive to be restored in the appropriate place if you want to use some of dar's features, for example restoring only the files that are more recent than those present on the filesystem.
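
The difference between the number stored with a file and the name displayed for it, as described in this answer, can be observed directly. The following is a minimal sketch, assuming a POSIX shell with 'ls', 'awk', 'id' and 'mktemp' available; the variable names are only illustrative:

```shell
#!/bin/sh
# Create a scratch file owned by the current user.
f=$(mktemp)

# 'ls -n' prints the raw numeric UID stored in the inode (third field).
numeric_uid=$(ls -n "$f" | awk '{print $3}')

# 'ls -l' prints what the system lookup returns for that same number.
shown_owner=$(ls -l "$f" | awk '{print $3}')

echo "stored UID:   $numeric_uid"
echo "shown as:     $shown_owner"
echo "current user: $(id -u) ($(id -un))"

rm -f "$f"
```

Only the first value is a file property; the second one depends on the user database of the system that runs 'ls', which is why the same restored file may be listed under a different name from a rescue system.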
Dar_Manager does not accept encrypted archives, how to workaround this?

Yes, that's true, dar_manager does not accept encrypted archives. The first reason is that, as long as the dar_manager database itself cannot be encrypted, it would make little sense to add encrypted archives to it. The second reason is that the dar_manager database would have to hold the key of each encrypted archive, making this database the weakest point of your data security: breaking the database encryption would give access to every encryption key and, together with access to the original archives, to the data of any archive added to the database.

OK, there is however a feature in the pipeline to let dar_manager encrypt its database, then another feature to let dar_manager store the different archive keys, then another feature to pass the keys from dar_manager to dar by other means than the command line (which would expose the keys to other users on a multi-user system), then yet another feature to feed the database with the archive keys, also without using the command line. Well, there are a lot of features to add and test before you can expect to find this in a released version of dar. In the meantime, you can proceed as follows:

When the time comes to use dar_manager to restore some files, you will have to make dar_manager pass the key to dar so that it is able to restore the needed files from the archive. This can be done in several ways: dar_manager's command line, the dar_manager database or a dar.dcf file.
How to overcome the lack of static linking on MacOS X?

The answer comes from Dave Vasilevsky in an email to the dar-support mailing-list. I let him explain how to do it:

Pure-static executables aren't used on OS X. However, Mac OS X does have other ways to build portable binaries.

HOWTO build portable binaries on OS X?

First, you have to make sure that dar only uses operating-system libraries that exist on the oldest version of OS X that you care about. You do this by specifying one of Apple's SDKs, for example:

export CPPFLAGS="-isysroot /Developer/SDKs/MacOSX10.2.8.sdk"
export LDFLAGS="-Wl,-syslibroot,/Developer/SDKs/MacOSX10.2.8.sdk"

Second, you have to make sure that any non-system libraries that dar links to are linked in statically. To do this edit dar/src/dar_suite/Makefile, changing LDADD to '../libdar/.libs/libdar.a'. If any other non-system libs are used (such as gettext), change the makefiles so they are also linked in statically. Apple should really give us a way to force the linker to do this automatically!

Some caveats:
* If you build for 10.3 or lower, you will not get EA support, and therefore you will not be able to save special Mac information like resource forks.
* To work on both ppc and x86 Macs, you need to build a universal binary. For instructions, use Google.
* To make a 10.2-compatible binary, you must build with GCC 3.3.
* These instructions won't work for the 10.1 SDK, that one is harder to use.

Why cannot dar use the full power of my multi-processor computer?

Parallel computing programming is
a science by itself. Having specialized in that area during my studies, I can briefly explain the constraints here. A program can use several processors if the algorithm it uses can be parallelized. Such an algorithm can be cut, either statically (at programming time) or dynamically (at execution time), into several independent execution threads. These execution threads must be as autonomous from one another as possible if you do not want one thread to wait for another. The constraint is this: if you cannot have different threads with no or very little communication and dependence, then parallelization is not worth it.
Back to dar. From a very abstract point of view, dar works by fetching files from the filesystem and appending their data to a single file (the archive). For each file, dar records in memory the location of the data, and once all files have been treated, this location information (contained in the so called "catalogue") is added at the end of the archive.

One could say that to parallelize file treatment, instead of proceeding file by file, let's treat all files at the same time (or rather, let's say, N files at the same time). OK, but first you would have an important loss of performance at disk level, as the disk heads would spend most of their time seeking from the data of one of the N files to the data of another. The second point is that, to add a file to the archive, you must know the position of the end of the last added file, which cannot be known in advance because of compression and/or encryption. Thus a given thread would have to wait until another has finished before it can in turn drop the data of the file it owns. As you can guess, parallelizing this way would bring worse performance than the sequential algorithm.

Another possibility is to have several threads doing:

OK, you may also have found another possibility: having N threads for compression and M threads for encryption. Assuming encryption is faster than compression, we could choose N > M. We could also have a fixed value for N and a dynamic value for M depending on how fast compression is running. Well, this would let dar compress and encrypt several files at the same time, and assuming that the time spent reading and writing data is negligible compared to compression time (which must be demonstrated, as several files potentially have to be read at the same time), we could maybe have a real performance gain.

But while several files can now be compressed at the same time, only one can be written to disk at a given time. Thus, between the time the compression of a file starts and the time it finishes, all other threads have to keep their compressed data in memory. Then the next thread can drop its data to the archive while all the others keep compressing to memory (RAM). We will quickly run out of RAM! Either your computer will start to swap, or you will have to store the data back to disk in a temporary file, which will have to be read again and written back to the archive. Doing so would bring a huge disk performance degradation, as the disk would serve for reading a file's data, writing its compressed data to a temporary file, reading back its compressed data, and writing its compressed data to the archive.

Last, when using parallelization there is always a cost due to inter-process communication and concurrent I/O operations on the hardware (here, hard disks are used at the same time to read the files to back up and to write them into the archive). This cost becomes negligible when the number of parallel threads increases, assuming all threads are kept busy, but here there is a bottleneck, the archive creation, which seems to prevent any really impressive parallelization.
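
The sequential dependency described above (the position of a file in the archive is only known once the previous files have been compressed and appended) can be seen in a toy model. This is only a sketch under stated assumptions: gzip stands in for dar's compression, the "catalogue" is a plain text list of name, offset and size, and none of this reflects dar's real archive format:

```shell
#!/bin/sh
# Toy model only, not dar's archive format.
work=$(mktemp -d)
printf 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\n' > "$work/f1"
printf 'some less repetitive content 12345\n' > "$work/f2"

archive="$work/archive"
catalogue="$work/catalogue"
: > "$archive"
: > "$catalogue"

for f in "$work/f1" "$work/f2"; do
    # The offset of this file is the current end of the archive: it is only
    # known once the previous file has been fully compressed and appended.
    offset=$(wc -c < "$archive" | tr -d ' ')
    gzip -c "$f" >> "$archive"
    end=$(wc -c < "$archive" | tr -d ' ')
    echo "$(basename "$f") $offset $((end - offset))" >> "$catalogue"
done

# Keep a copy of the location information, then append it to the archive.
listing=$(cat "$catalogue")
cat "$catalogue" >> "$archive"
echo "$listing"
rm -rf "$work"
```

Each line printed gives a file name, the offset of its compressed data and its compressed size. The second offset equals the first compressed size, which could not have been known before compressing the first file.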
Conclusion: unless you can find another way to parallelize dar, a parallelized version of dar will not bring noticeable improvement. Parallelization is strongly related to the algorithm used; some algorithms are well adapted to this operation, some others are not.

Is libdar thread-safe, which way you mean it is?

libdar is the part of dar's source code that has been rewritten to be used by external programs (like kdar). It has been modified to be used in a multi-threaded environment, thus, *yes*, libdar is thread-safe. However, thread-safe does not mean that you do not have to take some precautions in your programs while using libdar (or any other library).

Let's take an example, considering a simple library that provides two functions that both receive the address of an integer as argument. The first increments the given integer until a specific key is pressed by the user, while the second decrements the given integer until another key is pressed. This library is thread-safe in the sense that it contains no static variable and holds no state at any given time. It is just a set of two functions. Now, your multi-threaded program is the following: at a given time you have one thread running the first library function while another runs the other library function. All will work fine unless you provided the same integer to both threads. One thread would then increment it while the other would decrement it, and you would not get the behavior you could expect in a single-threaded environment. The problem would be the same if, instead of using an external library, you were accessing this same integer from two different threads at the same time.

Care must thus be taken that two different threads do not act on the same variables at the same time. Sharing is however possible with the use of POSIX mutexes, which define a portion of code (known as a critical section) that cannot be entered by a thread while another one is in it (such a thread is suspended until the other thread exits the critical section).

For libdar, this is the same: you must pay attention not to have two or more different threads acting on the same data. Libdar provides a set of classes, which can be seen as a set of types (like a C struct) with associated functions (known as methods in the object oriented world). From these classes, your program will create objects: each object *is* a variable. Technically, invoking a method on an object is exactly the same as invoking a function and giving it a pointer to the object as a hidden argument, while semantically, invoking a method is a way to read or modify this variable (= the object). Thus, if you plan to act on a given object from several threads at the same time, you must use a POSIX mutex or any other means to mutually exclude access to this object between all your threads, so that only one thread may read or modify this variable (= this object) at a given time.

Note that internally libdar uses some static variables. By static variables, I mean variables that exist even when no thread is running a libdar function or method. These variables are enclosed in critical sections so that libdar's users may use it normally. In other words, this is transparent to you. For example, to cancel a libdar call, the mechanism uses an array that lists the tid (thread id) of each thread whose libdar call must be canceled: if you wish to cancel a libdar call run by thread 10, another thread will add the tid 10 to this list. At regular checkpoints, all libdar functions check whether this list contains the tid the call is run from. If it does, the call aborts/returns and the thread can continue its execution out of libdar code. As you see, several threads may read or write this array of tids at the same time. Thanks to a set of mutexes this is transparent to you, and for this reason libdar can be said to be thread-safe.

How to solve "configure: error: Cannot find size_t type"?

This error shows up when you lack support for C++ compilation. Check that the gcc compiler has been compiled with C++ support activated or, if you are using a gcc binary from a distro, double check that you have installed the C++ support for gcc.

How to search for questions (and their answers) about known problems similar to mine?

Before sending an email to the dar-support mailing-list, you are asked to first look through the emails already sent, to check whether your problem has already been exposed and solved. This is first the fastest way for you to get an answer to your problem, and for me a way to preserve time for development.
But yes, there are now tons of emails to read to have a chance of finding the answer to your problem. Fortunately, there is a search engine at gmane (see the dark green area at the bottom of the page). This search engine is available for all the mailing lists around dar that are archived at gmane.

Why dar tells me that it failed to open a directory, while I have excluded this directory?

Reading the contents of a directory is done using the usual system calls (opendir/readdir/closedir). The first call (opendir) lets dar designate which directory to inspect; dar then calls readdir to get the next entry of the opened directory. Once nothing is left to read, closedir is called. The problem here is that dar cannot start reading a directory, do some treatment, and start reading another directory. In brief, the opendir/readdir/closedir system calls are not re-entrant.

This is particularly critical for dar as it does a depth-first lookup in the directory tree. In other words, if the root contains two directories A and B, dar reads A's contents and the contents of its subdirectories, then, once finished, it reads the next entry of the root directory (which is B), then reads the contents of B and of each of its subdirectories, then, once finished with B, it must go back to the root again and read the next entry. In the meantime dar had to open many directories to get their contents. For this reason dar caches the directory contents (when it first meets a directory, it reads its whole content and stores it in RAM). Only afterwards does dar decide whether or not to include a given directory. But at that point its contents have already been read, so you may get the message that dar failed to read the contents of a given directory even though you explicitly asked not to include that particular directory in the backup.
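
The order of operations described above (read and cache the directory first, apply the exclusion afterwards) can be mimicked with a small shell sketch. It is not dar's code: the 'visit' function is a hypothetical helper, 'ls -A' stands in for opendir/readdir/closedir, and the exclusion is a simple path comparison:

```shell
#!/bin/sh
# Hypothetical sketch, not dar's implementation. The function body runs in a
# subshell, so each recursion level keeps its own variables.
visit() (
    dir=$1
    excluded=$2
    for entry in "$dir"/*; do
        [ -d "$entry" ] || continue
        # The content is read (and cached) as soon as the directory is met...
        if ! cache=$(ls -A "$entry" 2>/dev/null); then
            echo "failed to open $entry"
        fi
        # ...and only afterwards is the exclusion consulted.
        if [ "$entry" = "$excluded" ]; then
            echo "excluded $entry"
            continue
        fi
        echo "saving $entry"
        visit "$entry" "$excluded"
    done
)

root=$(mktemp -d)
mkdir -p "$root/a/sub" "$root/b"
chmod 000 "$root/b"              # unreadable for a non-root user
output=$(visit "$root" "$root/b")
echo "$output"
chmod 755 "$root/b"
rm -rf "$root"
```

Run as an ordinary user, the sketch reports a failure to open the unreadable directory even though that directory is excluded, because the read attempt happens before the filter is applied. Run as root, the read succeeds and no failure line is printed.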