
Compare the contents of Remote and Local Directories
Source: R/boxr__internal_dir_comparison.R
box_dir_diff.Rd
box_dir_diff is the internal function used by box_fetch() and box_push() to determine which files and folders should be uploaded/downloaded, updated, or deleted in order to synchronize remote and local directories.
Arguments
- dir_id
The id of the box.com folder which you'd like to use for the comparison
- local_dir
The path of the local folder which you'd like to use for the comparison
- load
character. Should the results be in the context of an upload or a download operation? Permitted values are "up" or "down".
- folders
logical. Should folders/directories be included in the result?
Value
An object of class boxr_dir_comparison, describing the differences between the files.
It is a named list whose entries are data frames describing the files in each of the following categories:
- new
Files which are present in the origin, but not the destination. These will be downloaded by box_fetch() / uploaded by box_push().
- superfluous
Files which are present in the destination, but not the origin. If delete is set to TRUE in box_fetch() / box_push(), they will be deleted.
- to_update
Files which are present in both the origin and the destination, but which have more recently modified copies in the origin. If downloading with box_fetch(), and overwrite is set to TRUE, new files will overwrite existing local copies. If uploading with box_push() (and overwrite is set to TRUE), the new version will be uploaded to box.com with a new version number, and the old version will remain available.
- up_to_date
Files present in both origin and destination, with the same content. Note: a file may have been modified at a later date, but if it has identical contents according to its sha1 hash, it will be considered up-to-date. box_fetch() / box_push() do nothing for these files.
- behind
Files which are present in both origin and destination, but where the content differs, and the version in the destination has been more recently updated. box_fetch() / box_push() do nothing for these files.
- new_folders
Analogous to the file operation, but for directories/folders.
- superfluous_folders
Analogous to the file operation, but for directories/folders.
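The way each file category translates into an action can be sketched in a few lines of R. This is an illustration only, not boxr's code: planned_action() is a hypothetical helper, its defaults for delete and overwrite are assumptions, and it assumes that files in to_update are left alone when overwrite is FALSE.

```r
# Hypothetical helper (not part of boxr): map a file category from the
# comparison to the action described in the list above.
planned_action <- function(category, delete = FALSE, overwrite = FALSE) {
  switch(
    category,
    new         = "transfer",                         # download or upload
    superfluous = if (delete) "delete" else "none",   # only when delete = TRUE
    to_update   = if (overwrite) "overwrite" else "none",
    up_to_date  = "none",
    behind      = "none",
    stop("Unknown category: ", category)
  )
}

categories <- c("new", "superfluous", "to_update", "up_to_date", "behind")
actions <- vapply(categories, planned_action, character(1),
                  delete = TRUE, overwrite = FALSE)
print(actions)
```

With delete = TRUE and overwrite = FALSE, only the new and superfluous categories lead to any change in the destination.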
Details
box_dir_diff works by comparing files in the 'origin' to those in the 'destination'.
For downloading files (e.g. with box_fetch()), the origin is the remote folder on box.com specified with dir_id, and the destination is the local directory specified by local_dir.
The reverse is true for uploads (e.g. via box_push()).
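As a minimal sketch of the origin/destination convention, the mapping from the load argument to the two roles might look like the following. resolve_direction() is a hypothetical name used for illustration and is not a boxr function.

```r
# Hypothetical helper (not part of boxr): decide which side plays the role
# of origin and which the role of destination for a given `load` value.
resolve_direction <- function(load = c("up", "down")) {
  load <- match.arg(load)  # errors on anything other than "up" or "down"
  if (load == "down") {
    # Downloading: remote box.com folder -> local directory
    list(origin = "remote", destination = "local")
  } else {
    # Uploading: local directory -> remote box.com folder
    list(origin = "local", destination = "remote")
  }
}

str(resolve_direction("down"))
str(resolve_direction("up"))
```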
box_dir_diff decides what should happen to a file based on three criteria:
- Presence
Is the file present in both the origin and destination? The filename (within the directory structure) is used to determine this.
- Content
If a file is present in both the origin and the destination, does it have the same content? This is determined from the file's sha1 hash, which for local files is computed using the digest::digest() function from the package of the same name. For remote files, it is queried from the box.com API.
- Modification Date
If a file is present in both the origin and destination, and the content is different in each, boxr will prefer the file which was most recently modified.
For local files, the 'content modified time' is used: the mtime variable returned by file.info().
For remote files, the modified_at date returned by the box.com API is used. This is the time that the file was modified on the box.com servers, as opposed to the time that the content itself was modified.
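The three checks above can be sketched with base R data frames. This is an illustration under stated assumptions, not boxr's implementation: classify_files() and the column names path, sha1 and modified are hypothetical, the hashes are placeholder strings, and ties in modification time are assumed to fall into behind. For real local files, the modified column could come from file.info(path)$mtime and the hash from digest::digest(path, algo = "sha1", file = TRUE).

```r
# Hypothetical sketch (not boxr's implementation): classify files by
# presence, content (sha1) and modification date.
classify_files <- function(origin, destination) {
  # Files present on both sides, matched by their path
  both <- merge(origin, destination, by = "path",
                suffixes = c("_origin", "_destination"))
  same_content <- both$sha1_origin == both$sha1_destination
  origin_newer <- both$modified_origin > both$modified_destination

  list(
    new         = origin[!origin$path %in% destination$path, ],
    superfluous = destination[!destination$path %in% origin$path, ],
    up_to_date  = both[same_content, ],                  # dates are ignored
    to_update   = both[!same_content & origin_newer, ],
    behind      = both[!same_content & !origin_newer, ]
  )
}

t0 <- as.POSIXct("2024-01-01 12:00:00", tz = "UTC")

origin <- data.frame(
  path     = c("a.txt", "b.txt", "c.txt", "d.txt"),
  sha1     = c("h1", "h2", "h3", "h4"),
  modified = t0 + c(0, 60, 0, 120),
  stringsAsFactors = FALSE
)
destination <- data.frame(
  path     = c("b.txt", "c.txt", "d.txt", "e.txt"),
  sha1     = c("h2", "h3x", "h4x", "h5"),
  modified = t0 + c(0, 60, 0, 0),
  stringsAsFactors = FALSE
)

result <- classify_files(origin, destination)
print(lapply(result, function(x) x$path))
```

In this toy data, b.txt has a later modification time in the origin but an identical hash on both sides, so it lands in up_to_date rather than to_update.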
Why not use the content modified time for both?
With regard to the box.com API, modified_at is preferred to content_modified_at, as it includes changes to the file beyond just its content. This means that, for example, a collaborator could roll back to a previous version of a file, or upload a preferred but older version. These actions count as modifications on the box.com servers, but not to the content of the file itself (they are reflected in modified_at, but not content_modified_at).
Implementing similar functionality for local files is not possible in a platform-independent manner; content modified time is the only file-based timestamp which has a consistent definition for UNIX and Windows systems.
See also
box_fetch() and box_push(), which depend on this internal function; file.info() for timestamps describing local files; digest::digest() for details of the sha1 algorithm implementation.