I know that when comparing stuff for equality in a batch file it's common to enclose both sides in quotes, like

IF "%myvar%" NEQ "0"

But when comparing using "greater than" or "less than", this doesn't work because the operands would then be treated as strings with quotes around them. So you can instead just do

IF %myvar% GTR 20000

The caveat is that if the variable %myvar% isn't declared, it would be like doing

IF GTR 20000

which is a syntax error.

I came up with the following workaround:

IF 1%myvar% GTR 120000

which I'm hoping would result in IF 1 GTR 120000 if myvar is undefined, and it seems to work.

Is this a safe way to compare numbers while accounting for undeclared variables, or did I just open up a whole new can of caveats?

Magnus W
  • It would be prudent, if you do not have total control of the value data, to ensure that any now unprotected poison characters are non-existent. My general advice would be to verify that a variable is defined before working with it, `If [Not] Defined MyVar ...`, but that's mainly for readability. – Compo Jul 19 '19 at 08:10
  • To be sure, I'd re-assign the variable(s) accordingly before comparing them: `set /a myvar=myvar` or `set /a myvar*=1`. If the variable is not defined or not a number, `set /a` will set the variable to `0`. – Stephan Jul 19 '19 at 08:11
  • @Stephan: Oh, beware of that! `set /a avar=%var%` will leave `avar` unchanged if `var` is invalid (non-octal) and has a leading 0 (it sets `errorlevel` though). – Magoo Jul 19 '19 at 08:30
  • @Magoo: like `set var=0z`? Correct, `set /a newvar=%var%` gives an error and leaves `newvar` unchanged. But try `set /a var=var` (without `%`!) or `set /a var*=1`. Or maybe even `set /a newvar=var` – Stephan Jul 19 '19 at 08:47
  • To assure `myvar` is defined and contains only numbers you could use `if defined myvar Echo:%myvar%|findstr /BE "[0123456789]*" >NUl 2>&1 &&(Echo Only numbers)||(Echo invalid chars)` Note: using a range `[0-9]` with findstr would also include superscripts `²³`. –  Jul 19 '19 at 08:55
  • @LotPings, what about negative or hexadecimal numbers? – aschipfl Jul 19 '19 at 11:25
  • I usually use `set /A "var+=0"` for this. Something like `set /A "var=%var%"` is not safe and may throw nasty error messages. Prepending a `1` is bad as `01` equals `001`, but `101` and `1001` are not equal, and negative or hex. numbers also fail then... – aschipfl Jul 19 '19 at 11:28

2 Answers

Let us assume the batch file contains:

@echo off
:PromptUser
rem Undefine environment variable MyVar in case of being already defined by chance.
set "MyVar="
rem Prompt user for a positive number in range 0 to 20000.
set /P "MyVar=Enter number [0,20000]: "

As I explained in my answer to How to stop Windows command interpreter from quitting batch file execution on an incorrect user input?, the user has the freedom to enter really anything. That includes a string which could easily break batch file execution because of a syntax error, or which makes the batch file do something it was not written for.


1. User entered nothing

If the user just presses RETURN or ENTER, command SET does not modify the environment variable MyVar at all. Because MyVar was explicitly undefined before prompting the user, it is easy to verify whether the user entered a string at all with:

if not defined MyVar goto PromptUser

Note: Instead of set "MyVar=" it is possible to use something like set "MyVar=1000" to define a default value. The default value can even be output in the prompt, giving the user the possibility to just press RETURN or ENTER to use it.
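The default-value variant mentioned in the note could look like the following minimal sketch. The default 1000 and the prompt wording are illustrative assumptions only, and the checks described in the next points are still needed afterwards.

```batch
@echo off
rem Assumption for illustration: 1000 is the default value.
set "MyVar=1000"
rem Pressing just RETURN or ENTER leaves MyVar unchanged, so the default stays in effect.
set /P "MyVar=Enter number [0,20000] (default %MyVar%): "
rem Output value of environment variable MyVar for visual verification.
set MyVar
```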

2. User entered a string with one or more "

The user could enter a string with one or more " intentionally or by mistake. For example, on a German keyboard, pressing key 2 on the main (non-numeric) key block while CapsLock is enabled results in entering ", unless the layout German (IBM) is used, on which CapsLock is active by software only for the letters. So if the user presses 2 and RETURN quickly, or without looking at the screen as many people do while typing, a double quote character is entered by mistake instead of 2.

If MyVar holds a string with one or more ", all %MyVar% or "%MyVar%" environment variable references are problematic. The Windows command processor replaces %MyVar% with the user input string before parsing the rest of the line, and the " in that string nearly always results in a syntax error or in the batch file doing something it was not designed for. See also How does the Windows Command Interpreter (CMD.EXE) parse scripts?

There are two solutions:

  1. Enable delayed expansion and reference the environment variable using !MyVar! or "!MyVar!", because then the user input string no longer affects the command line executed by cmd.exe after parsing it.
  2. Remove all " from user input string if this string should never contain a double quote character.
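The first option can be sketched as follows. This is only a minimal illustration under assumptions: the prompt text, the comparison with 2 and the label EndDemo are hypothetical and not part of the solutions further below.

```batch
@echo off
setlocal EnableExtensions DisableDelayedExpansion
set "MyVar="
set /P "MyVar=Enter any string, even one with double quotes: "
if not defined MyVar goto EndDemo
rem Delayed references are expanded after the line has been parsed, so quotes
rem or operators in the input cannot change the structure of the command line.
setlocal EnableDelayedExpansion
if "!MyVar!" == "2" (echo The input is 2.) else echo The input is not 2: !MyVar!
endlocal
:EndDemo
endlocal
```

Delayed expansion is enabled only after the prompt, so exclamation marks in the user input are stored unchanged.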

Character " is definitely invalid in a string which should be a number in range 0 to 20000 (decimal). For that reason two more lines can be used to prevent wrong processing of user input string caused by ".

set "MyVar=%MyVar:"=%"
if not defined MyVar goto PromptUser

The Windows command processor removes all double quotes while parsing this line, when it replaces %MyVar:"=% with the resulting string. Therefore the finally executed command line set "MyVar=whatever was entered by the user" is safe to execute.

In the example above, a " entered by mistake instead of 2 results in execution of set "MyVar=", which undefines the environment variable MyVar. That is the reason why the IF condition used before must be used again before further processing of the user input.

3. User entered non-valid character(s)

The user should enter a positive decimal number in range 0 to 20000. So any character other than 0123456789 in the user input string is definitely invalid. Checking for any invalid character can be done for example with:

for /F delims^=0123456789^ eol^= %%I in ("%MyVar%") do goto PromptUser

The command FOR does not execute goto PromptUser if the entire string consists of just digits. In all other cases, including a string starting with ; after zero or more digits, goto PromptUser is executed because the input string contains a non-digit character.
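To see how this digit check reacts to different strings, it can be wrapped in a small test script. The sample strings and the subroutine name CheckDigits are hypothetical and chosen only for demonstration; the FOR line itself is the one shown above.

```batch
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem Hypothetical sample inputs, used only for demonstration.
for %%S in (12345 007 12a45 1.5 ";12") do call :CheckDigits "%%~S"
endlocal
goto :EOF

:CheckDigits
rem The FOR body runs only if something other than a digit is left in the string.
for /F delims^=0123456789^ eol^= %%I in ("%~1") do echo Rejected: %~1& goto :EOF
echo Accepted: %~1
goto :EOF
```

The strings consisting of digits only should be reported as accepted, and all others, including the one starting with a semicolon, as rejected.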

4. User entered number with leading 0

Windows command processor interprets numbers with a leading 0 as octal numbers. But the number should be interpreted as a decimal number even if the user enters it with one or more 0 at the beginning. For that reason the leading zero(s) should be removed before further processing of the variable value.

for /F "tokens=* delims=0" %%I in ("%MyVar%") do set "MyVar=%%I"
if not defined MyVar set "MyVar=0"

FOR removes all 0 at the beginning of the string assigned to MyVar and assigns the remaining string to loop variable I, which is assigned next to environment variable MyVar.

FOR runs set "MyVar=%%I" in this case even if the user entered 0 or 000, which results in executing set "MyVar=" and so undefines environment variable MyVar in this special case. But 0 is a valid number, and therefore the IF condition is necessary to redefine MyVar with string value 0 if the user entered the number 0 with one or more zeros.
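The effect of a leading zero on a numeric comparison can be illustrated with a short sketch. The literal 010 is just an assumed example input; with the leading zero it is evaluated as octal (decimal 8), and after stripping the zero as decimal 10.

```batch
@echo off
rem With the leading zero, 010 is interpreted as octal 8, which is not greater than 9.
if 010 GTR 9 (echo 010 is greater than 9) else echo 010 is not greater than 9
rem After stripping the leading zero(s), the remaining string is compared as decimal number.
for /F "tokens=* delims=0" %%I in ("010") do if %%I GTR 9 (echo %%I is greater than 9) else echo %%I is not greater than 9
```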

5. User entered too large number

Now it is safe to use the command IF with operator GTR to validate if the user entered a too large number.

if %MyVar% GTR 20000 goto PromptUser

This last verification works even if the user enters 82378488758723872198735897, which is larger than the maximum positive 32-bit integer value 2147483647, because the range overflow results in using 2147483647 on execution of this IF condition. See my answer on weird results with IF for details.
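The overflow behavior described above can be tried with a literal instead of user input. This is only a sketch; the second line assumes, as stated above, that the out-of-range value is treated as 2147483647.

```batch
@echo off
rem The literal exceeds the 32-bit signed integer range.
if 82378488758723872198735897 GTR 20000 echo The number is too large and would be rejected.
rem Assumption taken from the explanation above: the overflow results in 2147483647.
if 82378488758723872198735897 EQU 2147483647 echo The number is compared as 2147483647.
```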


6. Possible solution 1

An entire batch file for safe evaluation of user input number in range 0 to 20000 for only decimal numbers is:

@echo off
set "MinValue=0"
set "MaxValue=20000"

:PromptUser
rem Undefine environment variable MyVar in case of being already defined by chance.
set "MyVar="
rem Prompt user for a positive number in range %MinValue% to %MaxValue%.
set /P "MyVar=Enter number [%MinValue%,%MaxValue%]: "

if not defined MyVar goto PromptUser
set "MyVar=%MyVar:"=%"
if not defined MyVar goto PromptUser
for /F delims^=0123456789^ eol^= %%I in ("%MyVar%") do goto PromptUser
for /F "tokens=* delims=0" %%I in ("%MyVar%") do set "MyVar=%%I"
if not defined MyVar set "MyVar=0"
if %MyVar% GTR %MaxValue% goto PromptUser
rem if %MyVar% LSS %MinValue% goto PromptUser

rem Output value of environment variable MyVar for visual verification.
set MyVar
pause

This solution also gives the batch file writer the possibility to output an error message informing the user why the input string was not accepted by the batch file.

The last IF condition with operator LSS is not needed if MinValue has value 0, which is the reason why it is commented out with command REM for this use case.
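A minimal sketch of solution 1 with such error messages could look as follows. The message texts are assumptions made for illustration; the checks themselves are the ones explained in points 1 to 5.

```batch
@echo off
setlocal EnableExtensions DisableDelayedExpansion
set "MaxValue=20000"

:PromptUser
set "MyVar="
set /P "MyVar=Enter number [0,%MaxValue%]: "

rem Each check outputs a (hypothetical) message before prompting again.
if not defined MyVar echo Nothing was entered.& goto PromptUser
set "MyVar=%MyVar:"=%"
if not defined MyVar echo Only double quotes were entered.& goto PromptUser
for /F delims^=0123456789^ eol^= %%I in ("%MyVar%") do echo Only the digits 0 to 9 are allowed.& goto PromptUser
for /F "tokens=* delims=0" %%I in ("%MyVar%") do set "MyVar=%%I"
if not defined MyVar set "MyVar=0"
if %MyVar% GTR %MaxValue% echo The number must not be greater than %MaxValue%.& goto PromptUser

echo Accepted number: %MyVar%
endlocal
pause
```

The operator & keeps both ECHO and GOTO inside the same IF or FOR command, so the message is output only when the check fails.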


7. Possible solution 2

Here is one more safe solution. Its disadvantage is that the user cannot enter a decimal number with one or more leading 0 and still have it interpreted as decimal, as users usually expect.

@echo off
set "MinValue=0"
set "MaxValue=20000"

:PromptUser
rem Undefine environment variable MyVar in case of being already defined by chance.
set "MyVar="
rem Prompt user for a positive number in range %MinValue% to %MaxValue%.
set /P "MyVar=Enter number [%MinValue%,%MaxValue%]: "

if not defined MyVar goto PromptUser
setlocal EnableDelayedExpansion
set /A "Number=MyVar" 2>nul
if not "!Number!" == "!MyVar!" endlocal & goto PromptUser
endlocal
if %MyVar% GTR %MaxValue% goto PromptUser
if %MyVar% LSS %MinValue% goto PromptUser

rem Output value of environment variable MyVar for visual verification.
set MyVar
pause

This solution uses delayed environment variable expansion, which is the first option described in point 2 above.

An arithmetic expression is used to convert the user input string to a signed 32-bit integer, interpreting the string as a decimal, octal or hexadecimal number, and back to a string assigned to environment variable Number, for which the Windows command processor uses the decimal numeral system. An error message output on evaluation of the arithmetic expression because of an invalid user string is redirected to device NUL to suppress it.

Next, delayed expansion is used to verify whether the number string created by the arithmetic expression is identical to the string entered by the user. This IF condition is true on invalid user input, including a number with leading zeros interpreted as octal by cmd.exe or a number entered in hexadecimal like 0x14 or 0xe3.

Once the string comparison has passed, it is safe to compare the value of MyVar with 20000 and 0 using the operators GTR and LSS.

Please read this answer for details about the commands SETLOCAL and ENDLOCAL because there is much more done on running setlocal EnableDelayedExpansion and endlocal than just enabling and disabling delayed environment variable expansion.
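The round trip through the arithmetic expression can be observed with a few fixed strings instead of user input. The sample strings below are hypothetical; the SET /A and IF lines follow solution 2.

```batch
@echo off
setlocal EnableExtensions EnableDelayedExpansion
rem Hypothetical sample inputs, used only for demonstration.
for %%S in (123 0123 0x14 12abc) do (
    set "MyVar=%%S"
    set "Number="
    rem Convert the string to an integer and back to a decimal string.
    set /A "Number=MyVar" 2>nul
    rem Accept the input only if the conversion reproduces it unchanged.
    if "!Number!" == "!MyVar!" (echo %%S: accepted) else echo %%S: rejected, evaluated to !Number!
)
endlocal
```

Only a string that is already a plain decimal number without leading zeros should survive the comparison unchanged.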


8. Possible solution 3

There is one more solution using fewer command lines if the value 0 is outside the valid range, i.e. the number entered by the user must be greater than 0.

@echo off
set "MinValue=1"
set "MaxValue=20000"

:PromptUser
rem Undefine environment variable MyVar in case of being already defined by chance.
set "MyVar="
rem Prompt user for a positive number in range %MinValue% to %MaxValue%.
set /P "MyVar=Enter number [%MinValue%,%MaxValue%]: "
set /A MyVar+=0
if %MyVar% GTR %MaxValue% goto PromptUser
if %MyVar% LSS %MinValue% goto PromptUser

rem Output value of environment variable MyVar for visual verification.
set MyVar
pause

This code uses set /A MyVar+=0 to convert the string entered by the user to a 32-bit signed integer value and back to a string, as suggested by aschipfl in his comment above.

The value of MyVar is 0 after the command line with the arithmetic expression if the user did not input any string at all. It is also 0 if the first character of the user input string is not one of the characters -+0123456789, for example " or / or (.

A user input string starting with a digit, or with - or + followed by a digit, is converted to an integer value and back to a string value. The entered string can be a decimal, an octal or a hexadecimal number. Please take a look at my answer to Symbol equivalent to NEQ, LSS, GTR, etc. in Windows batch files, which explains in detail how the Windows command processor converts a string to an integer value.

The disadvantage of this code is that a string entered by mistake like 7"( instead of 728, caused by holding Shift while pressing the keys 2 and 8 on a German keyboard, is not detected. MyVar has value 7 if the user enters 7"( by mistake. The Windows command processor interprets just the characters up to the first character not valid for a decimal, hexadecimal or octal number as integer value and ignores the rest of the string.

The batch file using this code is safe against an unwanted exit of batch file processing, because a syntax error never occurs no matter what the user inputs. But a number entered wrongly by mistake is in some cases not detected by the code, which results in processing the batch file further with a number which the user did not want to use.

Mofi

Answering the call to nitpick

Mofi has been requesting that I write my own solution here, one that is "shorter". As I pointed out to him, the way he wrote his code, using & instead of ( followed by a command, then a carriage return and another command, or ( followed by a carriage return, followed by another command, followed by a carriage return, followed by another command ), sets a precedent which makes this a hard task to agree on.

I also did not think this was the point of providing answers per se. I mean, I used to, but when changes are minor and mainly fix logic or offer a slightly different solution, is that really a big difference? Does that really warrant a separate answer?

That said, I don't see a better way without editing his response, but this still leaves unresolved questions about what is being judged shorter.

Unfortunately, while discussing this with Mofi, he has edited his answer to one that can result in invalid choices.

While I have pointed this out, and I'm sure this was just a minor oversight on his part, I feel that not posting the code here has contributed to him actively deteriorating the quality of his answer, which is always a possible outcome when nitpicking.

While Mofi was the driving force in that activity, I don't like the effect it has had on him, as I was trying to avoid exactly this effect on my code by not getting into it. So I have decided to post the code comparison to bring some closure.

Please note: I will post his original code (the most recent one that did not use the erroneous method) and then that code refactored to how I would write it, and I will post my original code and then that code refactored to how I believe he would write it (maybe not in that order, but I will call out each).

So below is the result

Mofi Original:

It is hard to say whether every line should be counted: there are some instances where & is used to queue up commands, and the IFs never use parentheses, which I wouldn't generally do.

@echo off
set "MinValue=0"
set "MaxValue=20000"

:PromptUser
rem Undefine environment variable MyVar in case of being already defined by chance.
set "MyVar="
rem Prompt user for a positive number in range %MinValue% to %MaxValue%.
set /P "MyVar=Enter number [%MinValue%,%MaxValue%]: "

if not defined MyVar goto PromptUser
setlocal EnableDelayedExpansion
set /A "Number=MyVar" 2>nul
if not "!Number!" == "!MyVar!" endlocal & goto PromptUser
endlocal
if %MyVar% GTR %MaxValue% goto PromptUser
if %MyVar% LSS %MinValue% goto PromptUser

rem Output value of environment variable MyVar for visual verification.
set MyVar
pause

My Code Refactored to Mofi's Form

@ECHO OFF
SETLOCAL EnableDelayedExpansion
SET /A "_Min=-1","_Max=20000"
:Menu
  CLS
  SET "_Input="
  REM Prompt user for a positive number in range %_Min% to %_Max%.
  SET /P "_Input=Enter number [%_Min%,%_Max%]: "
  SET /A "_Tmp=%_input%" && if /I "!_input!" EQU "!_Tmp!" if !_Input! GEQ %_Min% if !_Input! LEQ %_Max% SET _Input & pause & GOTO :EOF 
GOTO :Menu

Mofi's Code Refactored

Mofi's code above, refactored to my more compact form, where ( has the first command follow it except when used on an IF statement, and ) follows the last command. This also makes the portion that really does the validation easy to discern: it is only the portion within the :PromptUser function. Not counting REM lines or blank lines, this is 13 lines of code.

@(SETLOCAL
  echo off
  SET /A "MinValue=0","MaxValue=20000")

CALL :Main

( ENDLOCAL
  EXIT /B )

:Main
  CALL :PromptUser MyVar
  REM Output value of environment variable MyVar for visual verification.
  SET MyVar
  PAUSE
GOTO :EOF


:PromptUser
  SET "MyVar="
  rem Prompt user for a positive number in range %MinValue% to %MaxValue%.
  SET /P "MyVar=Enter number [%MinValue%,%MaxValue%]: "
  
  IF NOT DEFINED MyVar GOTO :PromptUser
  Setlocal EnableDelayedExpansion
  SET /A "Number=MyVar" 2>nul
  
  IF not "!Number!" == "!MyVar!" (
    Endlocal
    GOTO :PromptUser  )
  Endlocal
  IF %MyVar% GTR %MaxValue% (
    GOTO :PromptUser  )
  IF %MyVar% LSS %MinValue% (
    GOTO :PromptUser )
GOTO :EOF

My Code in My Compact Form

To compare, here is my code in the same compact form I refactored Mofi's code to above. Again, only the lines inside the function itself are "doing the heavy lifting" here and need to be compared. I did forget that when I originally worked on my code I was trying to match Mofi's form, and it allowed me an extra nicety in keeping my && ( on the following line or all as a single line. So I will post two variants.

@(SETLOCAL ENABLEDELAYEDEXPANSION
  ECHO OFF
  SET /A "_Min=-1","_Max=20000" )

CALL :Main

( ENDLOCAL
  EXIT /B )

:Main
  CALL :Menu _input
  REM Output value of environment variable _input for visual verification.
  SET _input
  PAUSE
GOTO :EOF


:Menu
  CLS
  SET "_input="
  REM Prompt user for a positive number in range %_Min% to %_Max%. Store it in "_input"
  SET /P "_Input=Enter number [%_Min%,%_Max%]: "
  SET /A "_Tmp=%_input%" && (
    IF /I "!_input!" EQU "!_Tmp!" IF !_Input! GEQ %_Min% IF !_Input! LEQ %_Max% GOTO :EOF )
GOTO :Menu

My Code in My Compact Form 2

@(SETLOCAL ENABLEDELAYEDEXPANSION
  ECHO OFF
  SET /A "_Min=-1","_Max=20000" )

CALL :Main

( ENDLOCAL
  EXIT /B )

:Main
  CALL :Menu
  REM Output value of environment variable _input for visual verification.
  SET _input
  PAUSE
GOTO :EOF


:Menu
  CLS
  SET "_input="
  REM Prompt user for a positive number in range %_Min% to %_Max%. Store it in "_input"
  SET /P "_Input=Enter number [%_Min%,%_Max%]: "
  SET /A "_Tmp=%_input%" || GOTO :Menu 
  IF /I "!_input!" EQU "!_Tmp!" (
    IF !_Input! GEQ %_Min% (
      IF !_Input! LEQ %_Max% (
        GOTO :EOF ) ) )
GOTO :Menu
Ben Personick
  • Many thanks for this answer. I really like to see how other programmers would write code written by me, and how other programmers solve a task differently from me with exactly the same result, taking as many use cases into account as possible with 100% syntactically correct code. That is really great. – Mofi Jan 11 '20 at 13:05
  • Well, I would not use operator `EQU` for comparing two strings like in `"!_input!" EQU "!_Tmp!"`. I would use the string comparison operator `==` for this IF condition. But everything else in Mofi's Code Refactored is very well written. I also like the compact form variants. Many thanks once again for this answer. – Mofi Jan 11 '20 at 13:13