10

I am looking for a general-purpose way of defining textual expressions which allow a value to be validated.

For example, I have a value which should only be set to 1, 2, 3, 10, 11, or 12. Its constraint might be defined as: (value >= 1 && value <= 3) || (value >= 10 && value <= 12)

Another value, which can be 1, 3, 5, 7, 9 and so on, would have a constraint like value % 2 == 1 or IsOdd(value).

(To help the user correct invalid values, I'd like to show the constraint - so something descriptive like IsOdd is preferable.)

These constraints would be evaluated both on the client side (after user input) and on the server side. Therefore a multi-platform solution would be ideal (specifically Win C#/Linux C++).

Is there an existing language/project which allows evaluation or parsing of similar simple expressions?

If not, where might I start creating my own?

I realise this question is somewhat vague as I am not entirely sure what I am after. Searching turned up no results, so even some terms as a starting point would be helpful. I can then update/tag the question accordingly.

Tomas
g t
  • 2
    It would be extremely useful to have such a framework, which allows validation in e.g. JavaScript and PHP using the same rules written in just one language! – Tomas Jan 08 '14 at 14:49

9 Answers

6

You may want to investigate dependently typed languages like Idris or Agda.

The type system of such languages allows value constraints to be encoded in types. Programs that cannot guarantee the constraints will simply not compile. The usual example is matrix multiplication, where the dimensions must match. But that is, so to speak, only the "hello world" of dependently typed languages; the type system can do much more for you.

Florian Brucker
Ingo
4

If you end up starting your own language I'd try to stay implementation-independent as long as possible. Look for the formal expression grammars of a suitable programming language (e.g. C) and add special keywords/functions as required. Once you have a formal definition of your language, implement a parser using your favourite parser generator.

That way, even if your parser is not portable to a certain platform you at least have a formal standard from where to start a separate parser implementation.
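As a rough illustration of the idea, here is a minimal hand-written recursive-descent evaluator in Python for a tiny C-like constraint grammar. The grammar, the names Parser and check, and the IsOdd entry in FUNCTIONS are all assumptions made for this sketch; a real project would more likely feed the formal grammar to a parser generator.

```python
import re

# Hypothetical grammar, loosely following C expression syntax:
#   or_expr  := and_expr ('||' and_expr)*
#   and_expr := cmp ('&&' cmp)*
#   cmp      := term (('>=' | '<=' | '==' | '!=' | '>' | '<') term)?
#   term     := atom ('%' atom)*
#   atom     := NUMBER | 'value' | NAME '(' or_expr ')' | '(' or_expr ')'
TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|\|\||&&|>=|<=|==|!=|[><%()])")

# Illustrative named constraint functions (not from any existing library).
FUNCTIONS = {"IsOdd": lambda n: n % 2 == 1}

COMPARE = {
    ">=": lambda a, b: a >= b, "<=": lambda a, b: a <= b,
    "==": lambda a, b: a == b, "!=": lambda a, b: a != b,
    ">": lambda a, b: a > b, "<": lambda a, b: a < b,
}

def tokenize(text):
    """Split the constraint text into tokens, rejecting unknown characters."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError("unexpected input at offset %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    """Parses and evaluates in one pass; one method per grammar rule."""

    def __init__(self, text, value):
        self.tokens, self.pos, self.value = tokenize(text), 0, value

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError("expected %r, got %r" % (expected, token))
        self.pos += 1
        return token

    def or_expr(self):
        result = self.and_expr()
        while self.peek() == "||":
            self.take()
            rhs = self.and_expr()  # both sides are always parsed (no short circuit)
            result = bool(result) or bool(rhs)
        return result

    def and_expr(self):
        result = self.cmp()
        while self.peek() == "&&":
            self.take()
            rhs = self.cmp()
            result = bool(result) and bool(rhs)
        return result

    def cmp(self):
        left = self.term()
        if self.peek() in COMPARE:
            op = self.take()
            return COMPARE[op](left, self.term())
        return left

    def term(self):
        result = self.atom()
        while self.peek() == "%":
            self.take()
            result = result % self.atom()
        return result

    def atom(self):
        token = self.take()
        if token == "(":
            result = self.or_expr()
            self.take(")")
            return result
        if token.isdigit():
            return int(token)
        if token == "value":
            return self.value
        if token in FUNCTIONS:
            self.take("(")
            arg = self.or_expr()
            self.take(")")
            return FUNCTIONS[token](arg)
        raise SyntaxError("unexpected token %r" % token)

def check(constraint, value):
    """Evaluate the constraint text against a single integer value."""
    parser = Parser(constraint, value)
    result = parser.or_expr()
    if parser.peek() is not None:
        raise SyntaxError("trailing input: %r" % parser.peek())
    return bool(result)
```

Because the grammar is written down separately from the code, the same rules could be re-implemented on another platform and checked against the same set of test expressions.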

Florian Brucker
3

You may also want to look at creating a Domain Specific Language (DSL) in Ruby. (Here's a good article on what that means and what it would look like: http://jroller.com/rolsen/entry/building_a_dsl_in_ruby)

This would definitely give you the portability you're looking for, including maybe using IronRuby in your C# environment, and you'd be able to leverage the existing logic and mathematical operations of Ruby. You could then have constraint definition files that looked like this:

constrain 'wakeup_time' do
   6 <= value && value <= 10
end

constrain 'something_else' do
   check (value % 2 == 1), MustBeOdd
end

# constrain is a method that takes one argument and a code block
# check is a function you've defined that takes two arguments
# MustBeOdd is the name of an exception type you've created in your standard set

But really, the great thing about a DSL is that you have a lot of control over what the constraint files look like.
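The same internal-DSL idea can be sketched in other host languages too. The version below uses Python rather than Ruby, with a decorator standing in for the code block; constrain, check, validate and MustBeOdd mirror the names in the Ruby example above, but their implementations here are assumptions made purely for illustration.

```python
class MustBeOdd(ValueError):
    """Hypothetical exception type whose name doubles as the user-facing hint."""

# Registry mapping a field name to its rule function.
CONSTRAINTS = {}

def constrain(name):
    """Register the decorated function as the rule for the field `name`."""
    def register(rule):
        CONSTRAINTS[name] = rule
        return rule
    return register

def check(condition, error_type):
    """Raise error_type (carrying its own name as message) if the condition fails."""
    if not condition:
        raise error_type(error_type.__name__)
    return True

@constrain("wakeup_time")
def _(value):
    return 6 <= value <= 10

@constrain("something_else")
def _(value):
    return check(value % 2 == 1, MustBeOdd)

def validate(name, value):
    """Return (ok, message) for the named field."""
    try:
        ok = bool(CONSTRAINTS[name](value))
        return ok, "" if ok else "constraint failed"
    except ValueError as err:
        return False, str(err)
```

For example, validate("something_else", 4) reports failure with the message "MustBeOdd", which is the kind of descriptive hint the question asks for.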

leoger
1

Not sure if it's what you're looking for, but judging from your starting conditions (Win C#/Linux C++) you may not need it to be totally language agnostic. You can implement such a parser yourself in C++ with all the desired features and then just use it in both the C++ and C# projects, which also avoids adding external libraries.

At the application design level, it would be (relatively) simple: you create a library which builds cross-platform and use it in both projects. The interface may be something simple like:

bool VerifyConstraint_int(int value, const char* constraint);
bool VerifyConstraint_double(double value, const char* constraint);
// etc

Such an interface will be usable both in Linux C++ (by static or dynamic linking) and in Windows C# (using P/Invoke). You can have the same codebase compiling on both platforms.

The parser (again, judging from what you've described in the question) may be pretty simple - a tree holding elements of types Variable and Expression which can be Evaluated with a given Variable value.

Example class definitions:

// VARIANT is assumed to be a typedef for a variant type, e.g. a boost::variant
class Entity {
    public:
        virtual ~Entity() {}
        virtual VARIANT Evaluate() = 0;
};
class BinaryOperation: public Entity {
    public:
        enum Operation {PLUS,MINUS,EQUALS,AND,OR,GREATER_OR_EQUALS,LESS_OR_EQUALS};
        BinaryOperation(Entity& l, Entity& r, Operation o): left(l), right(r), op(o) {}
        VARIANT Evaluate() override; // Evaluates left and right operands and combines them according to op
    private:
        Entity& left;
        Entity& right;
        Operation op;
};
class Variable: public Entity {
    public:
        explicit Variable(const VARIANT& v): value(v) {}
        VARIANT Evaluate() override {return value;}
    private:
        VARIANT value;
};
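To show how such a tree evaluates a constraint end to end, here is a small Python sketch of the same structure. It differs from the class outline above in one assumed detail: Variable looks its value up in an environment dictionary passed to evaluate, so one tree can be reused for many values. Constant and the between helper are also additions for this illustration.

```python
import operator

class Entity:
    """Base node: every node can be evaluated against an environment dict."""
    def evaluate(self, env):
        raise NotImplementedError

class Constant(Entity):
    def __init__(self, value):
        self.value = value

    def evaluate(self, env):
        return self.value

class Variable(Entity):
    def __init__(self, name):
        self.name = name

    def evaluate(self, env):
        return env[self.name]

class BinaryOperation(Entity):
    # Operation names follow the enum in the class outline above.
    OPERATIONS = {
        "PLUS": operator.add,
        "MINUS": operator.sub,
        "EQUALS": operator.eq,
        "GREATER_OR_EQUALS": operator.ge,
        "LESS_OR_EQUALS": operator.le,
        "AND": lambda a, b: bool(a) and bool(b),
        "OR": lambda a, b: bool(a) or bool(b),
    }

    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right

    def evaluate(self, env):
        combine = self.OPERATIONS[self.op]
        return combine(self.left.evaluate(env), self.right.evaluate(env))

def between(low, high):
    """Build the subtree for: value >= low AND value <= high."""
    value = Variable("value")
    return BinaryOperation(
        "AND",
        BinaryOperation("GREATER_OR_EQUALS", value, Constant(low)),
        BinaryOperation("LESS_OR_EQUALS", value, Constant(high)),
    )

# Tree for the question's example: (1 <= value <= 3) or (10 <= value <= 12)
constraint = BinaryOperation("OR", between(1, 3), between(10, 12))
```

Calling constraint.evaluate({"value": 11}) walks the tree bottom-up and yields a boolean; a parser would normally build this tree from the constraint text.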

Or, you can just write validation code in C++ and use it both in C# and C++ applications :)

DarkWanderer
1

There are a number of ways to verify a list of values across multiple languages. My preferred method is to make a list of the permitted values, load them into a dictionary/hashmap/list/vector (dependent on the language and your preference), and write a simple isIn() or isValid() function that checks whether the supplied value is present in the data structure. The beauty of this is that the code is trivial and can be implemented in just about any language very easily.

For odd-only or even-only numeric validity, a small library of isOdd() functions in the different languages will suffice: if a value isn't odd it must by definition be even (0 included, since 0 % 2 == 0).
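The lookup approach might look like the following Python sketch. The field name "mode", the ALLOWED table and the describe helper are hypothetical; the permitted values are taken from the question's example.

```python
# Permitted values per field; "mode" is a made-up field name for illustration.
ALLOWED = {
    "mode": frozenset({1, 2, 3, 10, 11, 12}),
}

def is_valid(field, value):
    """True if value is one of the permitted values for the field."""
    return value in ALLOWED.get(field, frozenset())

def describe(field):
    """Human-readable form of the constraint, for showing to the user."""
    return "must be one of: " + ", ".join(str(v) for v in sorted(ALLOWED[field]))
```

Since the table is plain data, it could be stored in a shared file (for instance JSON) and loaded by both the C# client and the C++ server.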

I normally cart around a set of C++ and C# functions to evaluate isOdd() for reasons similar to the ones you have alluded to, and the code is as follows:

C++

bool isOdd( int integer ){  return integer % 2 != 0;  }

You can also add inline and/or fastcall to the function depending on need or preference; I tend to use it as an inline and fastcall unless there is a need to do otherwise (huge performance boost on Xeon processors).

C#

Conveniently, the same line works in C#; just add static to the front so that it can be called without an instance of the containing class:

static bool isOdd( int integer ){  return integer % 2 != 0;  }

Hope this helps, in any event let me know if you need any further info:)

GMasucci
1

My personal choice would be Lua. The downside of any DSL is the learning curve of a new language and the work of gluing your code to the scripts, but I've found Lua has lots of support from its user base and several good books to help you learn.

If you want somewhat generic code into which a non-programmer can inject rules for allowable input, it's going to take some upfront work regardless of the route you take. I strongly suggest not rolling your own, because you'll likely find people wanting more features, which an existing language will already have.

Lambage
1

If you are using Java then you can use OGNL, the Object-Graph Navigation Language.

It enables you to write Java applications that can parse, compile and evaluate OGNL expressions.

OGNL's expression syntax covers the basic operators common to Java, C, C++ and C#.

You can compile an expression that uses some variables, and then evaluate that expression for given values of those variables.

Dan Brough
1

An easy way to achieve validation of expressions is to use Python's eval function. It can be used to evaluate expressions just like the one you wrote. Python's syntax is English-like and easy enough to learn for simple expressions. Your example expression translates to:

(value >= 1 and value <= 3) or (value >= 10 and value <= 12)

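Evaluating code provided by users poses a security risk, though, as certain functions (such as open, which opens a file) could be executed on the host machine. The eval function takes extra arguments that restrict the names an expression can see, so you can build a restricted evaluation environment. Be aware that removing __builtins__ is not a complete sandbox: an expression can still reach other objects through attribute access, so do not rely on this alone for untrusted input.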

# Import math functions, and we'll use a few of them to create
# a list of safe functions from the math module to be used by eval.
from math import *

# A user-defined method won't be reachable in the evaluation, as long
# as we provide the list of allowed functions and vars to eval.
def dangerous_function(filename):
  print open(filename).read()

# We're building the list of safe functions to use by eval:
safe_list = ['math','acos', 'asin', 'atan', 'atan2', 'ceil', 'cos', 'cosh', 'degrees', 'e', 'exp', 'fabs', 'floor', 'fmod', 'frexp', 'hypot', 'ldexp', 'log', 'log10', 'modf', 'pi', 'pow', 'radians', 'sin', 'sinh', 'sqrt', 'tan', 'tanh']
safe_dict = dict([ (k, locals().get(k, None)) for k in safe_list ])

# Let's test the eval method with your example:
exp = "(value >= 1 and value <= 3) or (value >= 10 and value <= 12)"
safe_dict['value'] = 2
print "expression evaluation: ", eval(exp, {"__builtins__":None},safe_dict)
-> expression evaluation:  True

# Test with a forbidden method, such as 'abs'
exp = raw_input("type an expression: ")
-> type an expression: (abs(-2) >= 1 and abs(-2) <= 3) or (abs(-2) >= 10 and abs(-2) <= 12)
print "expression evaluation: ", eval(exp, {"__builtins__":None},safe_dict)
-> expression evaluation:
-> Traceback (most recent call last):
->   File "<stdin>", line 1, in <module>
->   File "<string>", line 1, in <module>
-> NameError: name 'abs' is not defined

# Let's test it again, without any extra parameters to the eval method
# that would prevent its execution
print "expression evaluation: ", eval(exp)
-> expression evaluation:  True 
# Works fine without the safe dict! So the restrictions were active 
# in the previous example..

# is odd?
def isodd(x): return bool(x & 1)
safe_dict['isodd'] = isodd
print "expression evaluation: ", eval("isodd(7)", {"__builtins__":None},safe_dict)
-> expression evaluation:  True
print "expression evaluation: ", eval("isodd(42)", {"__builtins__":None},safe_dict)
-> expression evaluation:  False

# A bit more complex this time, let's ask the user a function:
user_func = raw_input("type a function: y = ")
-> type a function: y = exp(x)

# Let's test it:
for x in range(1,10):
    # add x in the safe dict
    safe_dict['x']=x
    print "x = ", x , ", y = ", eval(user_func,{"__builtins__":None},safe_dict)

-> x =  1 , y =  2.71828182846
-> x =  2 , y =  7.38905609893
-> x =  3 , y =  20.0855369232
-> x =  4 , y =  54.5981500331
-> x =  5 , y =  148.413159103
-> x =  6 , y =  403.428793493
-> x =  7 , y =  1096.63315843
-> x =  8 , y =  2980.95798704
-> x =  9 , y =  8103.08392758

So you can control which functions are available to eval, and get a restricted environment in which to evaluate expressions.
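A stricter variant, swapped in here as an alternative to eval and not what the examples above do, is to parse the expression with Python 3's ast module and interpret only a whitelist of node types. The sketch below is a minimal illustration under that assumption; evaluate, FUNCTIONS and the set of permitted operators are choices made for this example.

```python
import ast
import operator

# Whitelisted arithmetic and comparison operators.
BINARY = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mod: operator.mod}
COMPARE = {ast.Lt: operator.lt, ast.LtE: operator.le, ast.Gt: operator.gt,
           ast.GtE: operator.ge, ast.Eq: operator.eq, ast.NotEq: operator.ne}

def isodd(x):
    return bool(x & 1)

# Functions an expression is allowed to call.
FUNCTIONS = {"isodd": isodd}

def evaluate(expression, variables):
    """Evaluate a constraint, rejecting any syntax that is not whitelisted."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float, bool):
            return node.value
        if isinstance(node, ast.Name) and node.id in variables:
            return variables[node.id]
        if isinstance(node, ast.BoolOp):
            values = [walk(v) for v in node.values]
            return all(values) if isinstance(node.op, ast.And) else any(values)
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY:
            return BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Compare):
            left = walk(node.left)
            for op, comparator in zip(node.ops, node.comparators):
                if type(op) not in COMPARE:
                    raise ValueError("operator not allowed")
                right = walk(comparator)
                if not COMPARE[type(op)](left, right):
                    return False
                left = right
            return True
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in FUNCTIONS and not node.keywords):
            return FUNCTIONS[node.func.id](*[walk(a) for a in node.args])
        raise ValueError("expression element not allowed: " + type(node).__name__)

    return walk(ast.parse(expression, mode="eval"))
```

Anything outside the whitelist, such as attribute access or a call to open, raises ValueError instead of being executed.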

This is what we used in a previous project I worked on. We used Python expressions in custom Eclipse IDE plug-ins, using Jython to run in the JVM. You could do the same with IronPython to run in the CLR.

The examples I used are in part inspired by / copied from the Lybniz project's explanation of how to run a safe Python eval environment. Read it for more details!

jlr
0

You might want to look at regular expressions (regex). They are proven and have been around for a long time. There's a regex library for all the major programming/scripting languages out there.
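As a hedged illustration, the two constraints from the question could be written as patterns over the textual form of the input, here using Python's re module. The pattern names and the matches helper are made up for this sketch. Note that regexes validate characters rather than numbers, so numeric ranges quickly become awkward to express.

```python
import re

# Matches exactly "1", "2", "3", "10", "11" or "12".
ONE_TO_THREE_OR_TEN_TO_TWELVE = re.compile(r"[1-3]|1[0-2]")

# Matches a non-negative integer whose last digit is odd.
ODD_INTEGER = re.compile(r"[0-9]*[13579]")

def matches(pattern, text):
    """True only if the whole text matches the pattern."""
    return pattern.fullmatch(text) is not None
```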

Libraries:

Usage

Community
Dannie