I need to check whether a local file is the same as a file on a remote host.

The file locations are as follows:

File1, on the local machine: ./remotehostname/home/a/b/scripts/xyz.cpp

File2, on the remote machine: remotehostname:/home/a/b/scripts/xyz.cpp

I want to compare these two files with a command like:

diff  ./remotehostname/home/a/b/scripts/xyz.cpp remotehostname:/home/a/b/scripts/xyz.cpp


 find . -type f | grep -v .svn |xargs -I % diff % 

I need to change the second % so that it points at the remote host and the file is compared against its remote copy, but I am not sure how to apply sed to %. Or is there a better way to compare such files?

One way would be to save the list of files and then apply sed to that list, but I think there should be a better way. Also, diff doesn't work on remote hosts; maybe I need to use the output of an rsync dry run?

user1977867
  • You should probably have a look at `rsync`. Its job is to compare file hierarchies (local or remote) and synchronize them if needed. It has a `--dry-run` option that just does the comparison, not the sync. – Renaud Pacalet Sep 09 '15 at 10:45
  • You can create a script that transforms the paths and does the comparison, and then run *it* from `xargs`. – choroba Sep 09 '15 at 10:50
  • What's the actual question here? You can use the argument to `%` as many times as you want in the command. – Etan Reisner Sep 09 '15 at 11:22
  • @EtanReisner % is the local file path, and I want to compare local/file/path with remoteserver:/local/file/path. That means I need to apply sed to % to build the second argument to diff, e.g. diff % %(with some pattern replaced): if the first % is file/path, the second % should be remoteserver:file/path. – user1977867 Sep 09 '15 at 18:10
  • Ah. Then you don't want to use `-I` at all and you want to pass the filename to a shell script as the argument to `xargs`. `.... | xargs sh -c 'diff "$1" "$(sed .... << – Etan Reisner Sep 09 '15 at 18:38

2 Answers

This can be done with xargs, but I prefer to use while read in bash.

xargs method

find . -type f | grep -v .svn | sed 's/.*/& remotehostname:&/' | xargs -n2 diff

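The sed command duplicates each input line and makes whatever modifications you need to the second copy. xargs then passes the inputs to diff two at a time. This will not work if any filename contains spaces. Note also that diff itself cannot open a host:path argument, so the remote side still has to be fetched by some other means (for example over ssh).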
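In the question's layout each remote file is mirrored under a local directory named after the host (./remotehostname/home/...), so the sed step can derive the host:path form from the local path instead of hard-coding the host. A minimal sketch, assuming every path starts with ./<host>/ and contains no whitespace; `pair_paths` is a hypothetical helper name, and it only prints the argument pairs:

```shell
# pair_paths (hypothetical helper): read local paths of the form
# ./<host>/<absolute path> on stdin and print "<local> <host>:<absolute path>".
pair_paths() {
    # \1 captures the first directory (the host), \2 the remainder from "/" on;
    # & is the whole original line, so each line is emitted twice, once rewritten.
    sed -E 's#^\./([^/]+)(/.*)$#& \1:\2#'
}

# Example: one local path in, one local/remote pair out.
printf '%s\n' './remotehostname/home/a/b/scripts/xyz.cpp' | pair_paths
```

The pairs could then go to `xargs -n2` in front of any command that understands host:path arguments.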

bash method

find . -type f | grep -v .svn | while IFS= read -r line; do
    diff "$line" "remotehostname:$line"
done

The bash read command reads a line from stdin, places it in the named variable, $line, and returns true. (IFS= and -r stop read from trimming leading whitespace or interpreting backslashes in the filename.) You can then put whatever you like inside the loop, so you have total freedom to rewrite the filename however you need. When the input runs out, read returns false and the loop exits.
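To make the "rewrite the filename however you need" step concrete for the layout in the question, here is a hedged sketch that splits ./<host>/<path> with parameter expansion and fetches the remote copy over ssh. The helper names (`split_host`, `split_path`, `compare_all`) are hypothetical, and the sketch assumes key-based ssh access and remote paths without spaces or shell metacharacters:

```shell
# split_host (hypothetical helper): "./host/abs/path" -> "host"
split_host() {
    rel=${1#./}                   # drop the leading "./"
    printf '%s\n' "${rel%%/*}"    # keep everything before the first "/"
}

# split_path (hypothetical helper): "./host/abs/path" -> "/abs/path"
split_path() {
    rel=${1#./}
    printf '/%s\n' "${rel#*/}"    # drop "host/" and restore the leading "/"
}

# compare_all (hypothetical helper): diff each local file against its remote copy.
compare_all() {
    find . -type f | grep -v '\.svn' | while IFS= read -r line; do
        # ssh -n stops ssh from consuming the loop's stdin (the file list);
        # diff reads the remote content from its own stdin via "-".
        ssh -n "$(split_host "$line")" cat "$(split_path "$line")" | diff "$line" -
    done
}
```

Running `compare_all` from the directory that holds the per-host trees would then print a diff for every file whose remote copy differs.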

Note that piping things into loops has some interesting side effects that are not relevant here, but might bite you one day.
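One commonly cited example of such a side effect, assuming this is the kind meant here: in bash (and dash) a loop on the receiving end of a pipe runs in a subshell, so variables it sets are gone once the loop ends. The function names below are hypothetical and the sketch only counts lines:

```shell
# count_piped: the loop is fed by a pipe.
count_piped() {
    n=0
    printf 'a\nb\n' | while IFS= read -r line; do
        n=$((n + 1))    # in bash and dash this increments a subshell's copy of n
    done
    printf '%s\n' "$n"
}

# count_redirected: the same loop, fed by a here-document instead of a pipe.
count_redirected() {
    n=0
    while IFS= read -r line; do
        n=$((n + 1))    # runs in the current shell, so the update survives
    done <<EOF
a
b
EOF
    printf '%s\n' "$n"
}

count_piped
count_redirected
```

In bash the piped version reports 0 while the redirected version reports 2; some other shells run the last pipeline stage in the current shell and report 2 for both.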

ams

If you are interested in the actual difference (and not just whether they differ - which rsync is brilliant for telling you) then you can do this using GNU Parallel:

find . -type f | grep -v .svn |
  parallel diff {} '<(ssh {= s:./::;s:/.*:: =} cat /{= s:([^/]+/){2,2}::;$_=::shell_quote_scalar($_) =})'

s:./::;s:/.*:: = hostname from path
s:([^/]+/){2,2}:: = rest of path (what follows ./hostname/, so without a leading slash)
::shell_quote_scalar = backslash-quote special chars as needed by the shell
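To see what the two substitutions extract, they can be replayed outside GNU Parallel. The sketch below swaps in sed -E for the Perl expressions, on the assumption that both engines match the same text for a path like the one in the question:

```shell
# The two substitutions from the {= ... =} expressions, replayed with sed -E.
# Assumption: for this input, sed's extended regexes match what Perl would.
path='./remotehostname/home/a/b/scripts/xyz.cpp'

# s:./::;s:/.*::  -> strip the leading "./", then everything from the first "/"
host=$(printf '%s\n' "$path" | sed -E 's:./::; s:/.*::')

# s:([^/]+/){2,2}::  -> strip the first two path components ("./" and "<host>/")
rest=$(printf '%s\n' "$path" | sed -E 's:([^/]+/){2,2}::')

printf 'host=%s rest=%s\n' "$host" "$rest"
```

The second result carries no leading slash, so the remote command has to prepend one when the remote file is addressed by an absolute path.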

GNU Parallel is a general parallelizer and makes it easy to run jobs in parallel on the same machine or on multiple machines you have ssh access to. It can often replace a for loop.

If you have 32 different jobs you want to run on 4 CPUs, a straightforward way to parallelize is to run 8 jobs on each CPU:

[Figure: Simple scheduling]

GNU Parallel instead spawns a new process when one finishes - keeping the CPUs active and thus saving time:

[Figure: GNU Parallel scheduling]

Installation

If GNU Parallel is not packaged for your distribution, you can do a personal installation, which does not require root access. It can be done in 10 seconds by doing this:

(wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash

For other installation options see http://git.savannah.gnu.org/cgit/parallel.git/tree/README

Learn more

See more examples: http://www.gnu.org/software/parallel/man.html

Watch the intro videos: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1

Walk through the tutorial: http://www.gnu.org/software/parallel/parallel_tutorial.html

Sign up for the email list to get support: https://lists.gnu.org/mailman/listinfo/parallel

Ole Tange
  • Thanks, but I am still missing the trick to turn the local path into the remote server form. E.g. my local file path is ./remoteservername/a/b/c/d.cpp and the file to compare with is at remoteservername:/a/b/c/d.cpp. So somehow I need to extract remoteservername from the file path. – user1977867 Sep 10 '15 at 05:55