In my second if statement, I want to filter out "tool" or "tool.bat" from the final list of filenames. However, the final list of filenames includes "tool" and total_bags is being incremented. I was wondering what I did incorrectly that's causing the program to not catch this case.

set /A total_bags=0
set target=%~1
if "%target%"=="" set target=%cd%

set LF=^


rem Previous two lines deliberately left blank for LF to work.

for /f "tokens=1 delims=. " %%i in ('dir /b /s /a:-d "%target%"') do (
    set current_file=%%~ni
    echo !unique_files! | find "!current_file!:" > nul
    if NOT !ERRORLEVEL! == 0 (
        if NOT !current_file! == "tool.bat" (
            set /A total_bags=total_bags+1
            set unique_files=!unique_files!!current_file!:
        )
    )
)

echo %unique_files::=!LF!%
echo %total_bags%
endlocal
aschipfl

1 Answer

The condition if NOT "%current_file%" == "tool.bat", as initially used, does not work because %current_file% is already replaced by the current value of the environment variable current_file (or by an empty string if it is not defined) when the Windows command processor parses the entire command block, starting with ( and ending with the matching ), before executing the FOR command. This can be seen by debugging the batch file. See also Variables are not behaving as expected for a very good and short example explaining how the Windows command interpreter (CMD.EXE) parses scripts.

In general, it is not advisable to copy the string already held by a loop variable into an environment variable that is not modified further inside the FOR loop. It would be better to use %%~ni wherever the current file name needs to be referenced.

Using delayed expansion requires enabling it with setlocal EnableDelayedExpansion (or with setlocal EnableExtensions EnableDelayedExpansion to also explicitly enable the command extensions, which are enabled by default) because, unlike the command extensions, it is not enabled by default. With delayed expansion enabled, the Windows command processor parses each command line a second time and expands !current_file! when the IF command is executed.
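As a minimal sketch of the difference between the two kinds of expansion (the initial value "before" and the literal list item tool.bat are placeholders chosen for illustration, not taken from the question), the following batch file can be run as-is. The %current_file% reference is replaced once, when the whole parenthesized block is parsed, while !current_file! is expanded each time the line is executed:

```bat
@echo off
setlocal EnableExtensions EnableDelayedExpansion

rem Value the variable has while the FOR block below is being parsed.
set "current_file=before"

rem The first ECHO shows the value from parse time, the second ECHO
rem shows the value assigned inside the loop.
for %%i in (tool.bat) do (
    set "current_file=%%~ni"
    echo Parse time: %current_file%
    echo Execution time: !current_file!
)

endlocal
```

This should print "Parse time: before" followed by "Execution time: tool".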

But even then, if NOT !current_file! == "tool.bat" always evaluates to true for the batch file named tool.bat, for two reasons. First, set current_file=%%~ni assigns only the string tool (the file name without file extension) to the environment variable current_file. Second, the left string is not enclosed in double quotes while the right string is. The IF command does not remove the double quotes from the right string before comparing the two strings.
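A hedged sketch of a comparison that avoids both problems: both sides are enclosed in double quotes and the right side contains the name without extension. The list items tool.bat and other.txt are placeholders, and the /I switch (case-insensitive comparison) is an addition of this sketch because file names on Windows are case-insensitive:

```bat
@echo off
setlocal EnableExtensions EnableDelayedExpansion

rem Quote both strings and compare against the name without extension.
for %%i in (tool.bat other.txt) do (
    set "current_file=%%~ni"
    if /I not "!current_file!" == "tool" (
        echo keep !current_file!
    ) else (
        echo skip !current_file!
    )
)

endlocal
```

This should print "skip tool" and then "keep other".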

The batch file in question also lacks set unique_files= above the FOR loop to explicitly undefine the environment variable unique_files in case it is already defined when the batch file starts, for example from a previous execution within the same command prompt window.

Another problem with the batch file in question is that the maximum length of variable name + equal sign + assigned string is 8191 characters, which becomes a problem when several thousand file names are concatenated into one long string assigned to a single environment variable like unique_files.

I suggest using this batch file, which has comments explaining it.

@echo off
setlocal EnableExtensions DisableDelayedExpansion

rem Delete all environment variables that happen to exist already and
rem whose names start, very unusually, with a question mark (with the
rem exception of variables with multiple question marks in the name).
for /F "delims=?" %%I in ('set ? 2^>nul') do set "?%%I?="

rem Search recursively, using the string passed as first argument or simply
rem within the current directory, for all files and define for each file
rem name an environment variable whose name starts and ends with a
rem question mark. A file name cannot contain a question mark.
rem The value assigned to the environment variable does not matter. As it
rem is not possible to define multiple environment variables with the same
rem name and environment variable names are case-insensitive, just one
rem environment variable is defined if multiple files have the same name.
rem The batch file itself is ignored because of the IF condition.
for /F "delims=" %%I in ('dir "%~1" /A-D /B /S 2^>nul') do if not "%%I" == "%~f0" set "?%%~nI?=1"

rem Initialize the file counting environment variable.
set "FileCount=0"

rem Output all file names which are the environment variable names sorted
rem alphabetically with the question marks removed and additionally count
rem the number of file names output by this loop.
for /F "eol=| delims=?" %%I in ('set ? 2^>nul') do set /A "FileCount+=1" & echo %%I

rem Output finally the number of unique file names excluding file extensions.
echo %FileCount%

rem Restore initial execution environment which results also in the
rem deletion of all environment variables defined during batch execution.
endlocal

It does not use delayed expansion and therefore also works for file names containing one or more ! characters. With delayed expansion enabled, such names would be processed incorrectly on the line set current_file=%%~ni because the exclamation mark(s) in the file name would be interpreted as the beginning/end of a delayed expanded environment variable reference.

An environment variable is defined for each unique file name. The number of environment variables is limited only by the total memory available for environment variables, which is 64 MiB. That should be enough even for several thousand unique file names in the directory tree.

To understand the commands used and how they work, open a command prompt window, execute the following commands there, and carefully read all help pages displayed for each command.

  • call /? ... explains %~f0, which references the full name of argument 0 (the fully qualified file name of the currently processed batch file), and %~1, which references the first argument with any surrounding " removed from the argument string.
  • dir /?
  • echo /?
  • endlocal /?
  • for /?
  • if /?
  • rem /?
  • set /?
  • setlocal /?

Read the Microsoft documentation about Using command redirection operators for an explanation of 2>nul. The redirection operator > must be escaped with the caret character ^ on the FOR command lines so that it is interpreted as a literal character when the Windows command interpreter processes the command line before executing the FOR command. FOR then executes the embedded dir or set command line in a separate command process started in the background with %ComSpec% /c and the command line within ' appended as additional arguments.

Mofi
  • Part of the filesystem that I'm working on uses two extensions e.g. filename.extension1.extension2. I tried running your solution and it only got rid of the rightmost extension. How would I go about removing both extensions? – Andrew Jacob Dec 01 '20 at 18:52
  • There are not two extensions. The file extension is everything from the __last__ dot to the end of the name of a file. On Linux a file is made hidden by giving it a name starting with a dot. For example, `.htaccess` is the file *htaccess* made hidden on Linux. For Windows, the file `.htaccess` is a file with no file name and with the file extension `.htaccess`. You could replace `set "?%%~nI?=1"` by `for %%J in ("%%~nI") do set "?%%~nJ?=1"` to remove the string from the last dot to the end of the file name twice, which makes no difference for files with just one dot and something after it. – Mofi Dec 01 '20 at 19:01