Good evening,

I want to write a bash script that loops over all the files in a directory and if a file matches a regular expression, it outputs the filename and some additional info [using cat] to a txt file.

The script will be used to label an imageset to later create an LMDB to use in caffe.

Here is my attempt:

#!/bin/bash
for f in /absolutepath/train/*
do
  if [$f == '/absolutepath/train/felix.*']
  then $f cat ' 0' > train.txt
  elif [$f == '/absolutepath/train/jonas.*']
  then $f cat ' 1' > train.txt
  elif [$f == '/absolutepath/train/joachim.*']
  then $f cat ' 2' > train.txt
  elif [$f == '/absolutepath/train/vriendinjoachim.*']
  then $f cat ' 3' > train.txt
  else $f cat ' 4' > train.txt
  fi
echo "Done :D"
done

The files in the directory look like this: felix (1).jpg, felix (2).jpg, ...

If you know of an existing script that can do this for me don't hesitate to mention that too.

PS: this is only my second post so don't be harsh :)

Xilef

1 Answer

A few changes from your original:

  • the space around the [ is critical, as [ is a shell built-in and/or external command, and so the shell needs the space to delimit the words in order to find the right command.
  • using Cyrus' regex syntax is one way to find matching files; below, I use a case statement with ordinary shell (glob) pattern matching instead. The case syntax in the script wraps each pattern in parentheses; since the (new) pattern itself contains spaces and parentheses, I've escaped them with \.
  • on the topic of pattern-matching the filenames, I've taken your comment regarding the filenames and used that as part of the requirement for the filename; as a result, files named something like "felix.jpg" or "felixnon-matching.jpg" will fall through to the default value of 4.
  • your syntax of $f cat 3 would have tried to execute the filename instead of echoing it; I've replaced that bit with printf.
  • every time your for loop executed, it would have overwritten the previous contents of train.txt, so I've changed the single > to >> to append the new contents.
  • I've moved the echo Done statement outside of the for loop so that you only see it once the script is really done (otherwise, you'd see it for every file).
  • On a final note, the contents of train.txt are going to be tricky to parse again; not knowing how you'll do that, I've left two printf statements in the loop; one prints the filename first, followed by the value; the other (commented-out one) prints the value followed by the filename. I would recommend printing the value first, as it'll be easier to say "for each line, read the integer value first, then everything else as the filename" versus trying to find the end of the filename followed by an integer. Either way, the values are separated by a tab \t to assist in those efforts.
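
For comparison with the regex route mentioned in the second bullet, here is a minimal sketch using bash's [[ ... =~ ... ]] test. The function name label_for is hypothetical, and the regex assumes that only digits appear between the parentheses, which is stricter than a glob such as \(*\); adjust it if your filenames differ.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the label for one file path.
label_for() {
  local name=${1##*/}   # drop the directory part, keep the filename
  # Assumption: names look like "felix (12).jpg" with digits in the parentheses.
  local re='^(felix|jonas|joachim|vriendinjoachim) \([0-9]+\)\.jpg$'
  if [[ $name =~ $re ]]; then
    # BASH_REMATCH[1] holds the name captured by the first group.
    case ${BASH_REMATCH[1]} in
      felix)           echo 0 ;;
      jonas)           echo 1 ;;
      joachim)         echo 2 ;;
      vriendinjoachim) echo 3 ;;
    esac
  else
    echo 4   # anything that does not match falls through to the default
  fi
}

label_for '/absolutepath/train/felix (1).jpg'   # matches the felix pattern
label_for '/absolutepath/train/felix.jpg'       # no " (N)" part, so default
```

Keeping the regex in a variable and expanding it unquoted on the right-hand side of =~ avoids quoting surprises with the space and the escaped parentheses.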

Here's the new script:

#!/usr/bin/env bash
for f in /absolutepath/train/*
do
  value=4
  case "$f" in
    ( /absolutepath/train/felix\ \(*\).jpg )
        value=0
        ;;
    ( /absolutepath/train/jonas\ \(*\).jpg )
        value=1
        ;;
    ( /absolutepath/train/joachim\ \(*\).jpg )
        value=2
        ;;
    ( /absolutepath/train/vriendinjoachim\ \(*\).jpg )
        value=3
        ;;
    (*)
        value=4
        ;;
  esac
  #printf '%d\t%s\n' "$value" "$f" >> train.txt
  printf '%s\t%d\n' "$f" "$value" >> train.txt
done
echo "Done :D"
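
To show why the value-first layout is easier to parse, here is a rough sketch of reading such a file back in bash. It assumes the commented-out printf was used, so each line is "label, tab, filename"; the function name read_labels and the temporary demo file are made up for illustration, and I don't know what format caffe itself expects.

```bash
#!/usr/bin/env bash
# Hypothetical reader for lines of the form "<label><TAB><filename>".
read_labels() {
  local value file
  # Splitting on tab only means spaces inside the filename are preserved;
  # the first field goes to "value", the rest of the line to "file".
  while IFS=$'\t' read -r value file; do
    printf 'label=%s file=%s\n' "$value" "$file"
  done < "$1"
}

# Small self-contained demo using a temporary file instead of train.txt.
tmp=$(mktemp)
printf '%d\t%s\n' 0 '/absolutepath/train/felix (1).jpg' >> "$tmp"
printf '%d\t%s\n' 4 '/absolutepath/train/other.jpg'     >> "$tmp"
read_labels "$tmp"
rm -f "$tmp"
```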
Jeff Schaller
  • Thanks a lot, I have learned a few things :) Thank you for the final note also; I do not know yet how caffe handles these labeling files. I still have a few minor questions: 1. I named a script with 2 underscores and it was long, and it didn't execute [I did chmod u+x], so I assume there are restrictions on the name of the script? 2. Is #!/usr/bin/env bash necessary in every script? What are the most common environments and what does this do exactly? – Xilef Nov 28 '16 at 12:06
  • underscores shouldn't matter; you might want to compose a question regarding that on Unix & Linux, and see http://unix.stackexchange.com/questions/29608/why-is-it-better-to-use-usr-bin-env-name-instead-of-path-to-name-as-my for the env question – Jeff Schaller Nov 28 '16 at 12:11