
I'm trying to implement unification but am having problems. I have already collected a dozen examples, but all they do is muddy the water. I get more confused than enlightened:

http://www.cs.trincoll.edu/~ram/cpsc352/notes/unification.html

https://www.doc.ic.ac.uk/~sgc/teaching/pre2012/v231/lecture8.html [code below is based on this intro]

http://www.cs.bham.ac.uk/research/projects/poplog/paradigms_lectures/lecture20.html#representing

https://norvig.com/unify-bug.pdf

How can I implement the unification algorithm in a language like Java or C#?

The one in The Art of Prolog ... and several others. The biggest problem is that I could not find a clear statement of the problem. The more mathy or Lispy explanations confuse me even more.

As a start, it seems a good idea to use a list-based representation (as in the Lisp examples), i.e.:

pred(Var, val)  =becomes=> [pred, Var, val] 
p1(val1, p2(val2, Var1)) ==> [p1, val1, [p2, val2, Var1]]

Except, how do you represent lists themselves, i.e. [H|T]?

I would love it if you could show me Python pseudocode and/or a more detailed description of the algorithm, or a pointer to one.

One point I do grasp is the need to split the code into a general unifier and variable unification, but then I can't see the mutually recursive case... and so on.


As a side note: I would also love for you to mention how you would handle unification on backtracking. I think I have backtracking squared away, but I know something has to happen to the substitution frame on backtracking.


Added an answer with the current code.

http://www.igrok.site/bi/Bi_language.html

http://www.igrok.site/bi/TOC.html

https://github.com/vsraptor/bi/blob/master/lib/bi_engine.py

sten
  • A Prolog list is just syntactic sugar for the list constructor ```.``` and the empty list ```[]```. The list ```[a,b,c]``` is equivalent to ```[a | [b | [c | [] ]```. The list constructor ```|``` is itself syntactic sugar for a binary function symbol ```.```. Internally the list ```[a,b,c]``` looks like ```.(a, .(b, .(c, [])))```. On a conceptual level, a Prolog list does not unify differently than other terms. Btw, in Lisp the constructor ```.``` is called ```cons``` and the empty list ```[]``` is called ```NIL```. – lambda.xy.x Mar 05 '18 at 15:58
  • to `[a | [b | [c | [] ]]]` – false Mar 07 '18 at 23:34
  • Of interest: [The design and implementation of a PROLOG interpreter](https://preserve.lehigh.edu/cgi/viewcontent.cgi?article=5506&context=etd) While this is done in Pascal, many of the concepts given as examples are still relevant for any way of implementing Prolog. – Guy Coder Aug 17 '20 at 13:40
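
For the `[H|T]` part of the question, the encoding described in the comments above can be sketched in Python as nested binary terms. This is only an illustration: the helper name `make_list`, the use of tuples for compound terms, and the string `'[]'` for the empty list are assumptions, not something taken from the linked code.

```python
def make_list(items, tail="[]"):
    """Encode a Python sequence as nested '.' terms.

    tail defaults to the empty-list constant '[]'; passing a variable
    name such as 'T' gives an open-ended list like [H|T].
    """
    term = tail
    for item in reversed(items):
        term = (".", item, term)   # one cons cell: .(Head, Tail)
    return term

print(make_list(["a", "b", "c"]))
# → ('.', 'a', ('.', 'b', ('.', 'c', '[]')))
print(make_list(["H"], tail="T"))  # the pattern [H|T]
# → ('.', 'H', 'T')
```

With this encoding a list is an ordinary compound term, so a unifier that handles `f(a, b)` needs no extra case for lists.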

3 Answers


I will quickly summarize the chapter about Unification Theory by Baader and Snyder from the Handbook of Automated Reasoning:

Terms are built from constants (starting with a lower case letter) and variables (starting with an upper case letter):

  • a constant without arguments is a term: e.g. car
  • a constant with terms as arguments, a so-called function application, is a term, e.g. date(1,10,2000)
  • a variable is a term, e.g. Date (variables never have arguments)

A substitution is a map assigning terms to variables. In the literature, this is often written as {f(Y)/X, g(X)/Y} or with arrows {X→f(Y), Y→g(X)}. Applying a substitution to a term replaces each variable by the corresponding term in the list. E.g. the substitution above applied to tuple(X,Y) results in the term tuple(f(Y),g(X)).
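
As a minimal sketch of applying a substitution, assuming variables are strings starting with an upper-case letter, constants are other strings or numbers, and a function application is a tuple `(functor, arg1, ..., argN)` (the names `is_var` and `apply_subst` are made up for this illustration):

```python
def is_var(term):
    """Assumed convention: a variable is a string starting with an upper-case letter."""
    return isinstance(term, str) and term[:1].isupper()

def apply_subst(subst, term):
    """Replace every variable in term by its image under subst, in one simultaneous pass."""
    if is_var(term):
        return subst.get(term, term)      # unbound variables stay as they are
    if isinstance(term, tuple):           # compound term: (functor, arg1, ..., argN)
        return (term[0],) + tuple(apply_subst(subst, arg) for arg in term[1:])
    return term                           # constant

# The substitution {f(Y)/X, g(X)/Y} from the text, applied to tuple(X, Y)
sigma = {"X": ("f", "Y"), "Y": ("g", "X")}
print(apply_subst(sigma, ("tuple", "X", "Y")))
# → ('tuple', ('f', 'Y'), ('g', 'X'))
```

Note that the replacement happens once per variable occurrence: the `Y` inside `f(Y)` is not rewritten again, which matches the result `tuple(f(Y),g(X))` given above.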

Given two terms s and t, a unifier is a substitution that makes s and t equal. E.g. if we apply the substitution {a/X, a/Y} to the term date(X,1,2000), we get date(a,1,2000) and if we apply it to date(Y,1,2000) we also get date(a,1,2000). In other words, the (syntactic) equality date(X,1,2000) = date(Y,1,2000) can be solved by applying the unifier {a/X,a/Y}. Another, simpler unifier would be X/Y. The simplest such unifier is called the most general unifier. For our purposes it's enough to know that we can restrict ourselves to the search of such a most general unifier and that, if it exists, it is unique (up to the names of some variables).

Martelli and Montanari (see section 2.2 of the chapter and the references there) gave a set of rules to compute such a most general unifier, if it exists. The input is a set of term pairs (e.g. { f(X,b) = f(a,Y), X = Y } ) and the output is a most general unifier if one exists, or failure if it does not. In the example, the substitution {a/X, b/Y} would make the first pair equal (f(a,b) = f(a,b)), but then the second pair would be different (a = b is not true).

The algorithm nondeterministically picks one equality from the set and applies one of the following rules to it:

  • Trivial: an equation s = s (or X=X) is already equal and can be safely removed.
  • Decomposition: an equality f(u,v) = f(s,t) is replaced by the equalities u=s and v=t.
  • Symbol Clash: an equality a=b or f(X) = g(X) terminates the process with failure.
  • Orient: an equality of the form t=X where t is not another variable is flipped to X=t such that the variable is on the left side.
  • Occurs check: if the equation is of the form X=t, t is not X itself and if X occurs somewhere within t, we fail. [1]
  • Variable elimination: if we have an equation X=t where X does not occur in t, we can apply the substitution t/X to all other equations.

When there is no rule left to apply, we end up with a set of equations {X=s, Y=t, ...} that represents the substitution to apply.
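
The rules above can be sketched in Python roughly as follows. This is one possible reading, not the chapter's own code: it assumes terms are encoded as tuples `(functor, arg1, ...)`, variables as capitalised strings, and it picks equations in stack order (any order would do). All function names are hypothetical.

```python
def is_var(t):
    """Assumed convention: variables are strings starting with an upper-case letter."""
    return isinstance(t, str) and t[:1].isupper()

def occurs(var, t):
    """True if variable var appears anywhere inside term t."""
    if t == var:
        return True
    return isinstance(t, tuple) and any(occurs(var, a) for a in t[1:])

def substitute(var, value, t):
    """Replace every occurrence of var in t by value."""
    if t == var:
        return value
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(var, value, a) for a in t[1:])
    return t

def unify_rules(equations):
    """Return a most general unifier as a dict, or None on failure."""
    todo = list(equations)   # equations still to be processed
    solved = {}              # equations already in the form X = t
    while todo:
        s, t = todo.pop()
        if s == t:                                   # Trivial
            continue
        if not is_var(s) and is_var(t):              # Orient
            s, t = t, s
        if is_var(s):
            if occurs(s, t):                         # Occurs check
                return None
            # Variable elimination: apply t/s to all other equations
            todo = [(substitute(s, t, l), substitute(s, t, r)) for l, r in todo]
            solved = {v: substitute(s, t, b) for v, b in solved.items()}
            solved[s] = t
        elif (isinstance(s, tuple) and isinstance(t, tuple)
              and s[0] == t[0] and len(s) == len(t)):  # Decomposition
            todo.extend(zip(s[1:], t[1:]))
        else:                                        # Symbol clash
            return None
    return solved

print(unify_rules([(("f", "a", "X"), ("f", "Y", "b"))]))
# → {'X': 'b', 'Y': 'a'}
```

Failure is reported as `None` here so that the empty substitution `{}` (which is a valid unifier for, say, a = a) is not confused with failure.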

Here are some more examples:

  • {f(a,X) = f(Y,b)} is unifiable: decompose to get {a=Y, X=b} and flip to get {Y=a, X=b}
  • {f(a,X,X) = f(a,a,b)} is not unifiable: decompose to get {a=a,X=a, X=b}, eliminate a=a by triviality, then eliminate the variable X to get {a=b} and fail with symbol clash
  • {f(X,X) = f(Y,g(Y))} is not unifiable: decompose to get {X=Y, X=g(Y)}, eliminate the variable X to get {Y=g(Y)}, fail with occurs check

Even though the algorithm is non-deterministic (because we need to pick an equality to work on), the order does not matter. Because you can commit to any order, it is never necessary to undo your work and try a different equation instead. Undoing work in that way is usually called backtracking; it is necessary for the proof search in Prolog, but not for unification itself.

Now you're only left to pick a suitable data-structure for terms and substitutions and implement the algorithms for applying a substitution to a term as well as the rule based unification algorithm.

[1] If we try to solve X = f(X), we would see that X needs to be of the form f(Y) to apply decomposition. That leads to solving the problem f(Y) = f(f(Y)) and subsequently Y = f(Y). Since the left hand side always has one application of f less than the right hand side, they cannot be equal as long as we see a term as a finite structure.

lambda.xy.x
  • I have to agree that the chapter on Unification is one of the most authoritative references on Unification, if not the most authoritative, but as one who has implemented the unification algorithm in languages such as OCaml, F#, etc., and seen it implemented in languages such as C, C#, etc., I find that it is one of the [densest](https://en.oxforddictionaries.com/definition/dense) reads on the topic. – Guy Coder Mar 05 '18 at 16:07
  • At least in the context of Prolog, there are no binders to take care of and the whole business with variable representation is simplified a lot. – lambda.xy.x Mar 05 '18 at 17:25
  • It's also hard not to get into too many details - a lot of the techniques developed solve problems specific to the use (e.g. compare the data-structures in interactive and automated theorem proving). – lambda.xy.x Mar 05 '18 at 17:27
  • thanks this clarifies the fog a little bit .. very good intro – sten Mar 05 '18 at 21:55
  • (Any reason you use triple backquotes when a single one would be enough?) – false Mar 07 '18 at 23:32
  • Not a good one: I believed they were necessary. I'll remove them and use single ones in the future :-) – lambda.xy.x Mar 08 '18 at 17:42
  • Great answer. The only problem I have is understanding why occurs check is correct. In your example ``{Y=g(Y)}``, why is it a failure in general? If ``g`` is the identity function, would it not be viable? I guess my problem arises from the question, which kind of equality is applied here. (Value vs Instance equality?) – BitTickler Apr 05 '19 at 08:52
  • Suppose we assign `Y` a size n, say the number of nodes in the term tree (e.g. `f(g(a),b)` has size 4). Then the term `g(Y)` has size n+1. Now when we unify the two terms, they should have the same sizes which amounts to solving the equation `n = n+1`. There is no term of finite size that satisfies it, that's why we usually forbid it in first order logic / theorem proving. – lambda.xy.x Apr 05 '19 at 20:03
  • The Prolog unification actually allows terms that contain themselves - they are called cyclic terms. We can define an infinite list [1,2,1,2,...] by writing it as `Xs = [1,2 | Xs]` (some functional languages call them lazy lists). As long as we only look at finite prefixes of `Xs`, everything works fine. But it's clear that this list does not have a finite length and cannot have anything appended, so you would need to build your own library around cyclic terms. – lambda.xy.x Apr 05 '19 at 20:07
  • Ah sorry, I didn't read your comment properly. Yes, there is a function that satisfies `Y = g(Y)` (identity, as you correctly said), but `g` could be any function, not only identity, so in general, we cannot assume that. – lambda.xy.x Apr 05 '19 at 20:12

I get more confused than enlightened

Been there, done that.

Note: I have not tested any of the source code referenced here and cannot say that it is valid. It is given as an example and looks correct enough that I would load it up and run test cases against it to determine its validity.

First: You will get much better search results if you use the correct terminology: use backward chaining instead of backtracking, e.g. backward-chaining/inference.py

Second: Understand that your question has three separate phases listed.
1. Unification algorithm
2. Backward chaining that uses Unification
3. A data structure for a list. You would not implement this as Python source code but as text to be passed to your functions. See: cons

You should first develop and fully test unification before moving onto backward chaining. Then fully develop and test backward chaining before creating a list data structure. Then fully test your list data structure.

Third: There is more than one way to implement the unification algorithm.
a. You noted the one that uses transformation rules, described as a rule-based approach in Unification Theory by Baader and Snyder, e.g. delete, decompose, etc.
b. I prefer the algorithm described as unification by recursive descent in Unification Theory by Baader and Snyder, given in this OCaml example or Python example
c. I have seen some that use permutations but can't find a good reference at present.
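
To give a feel for option (b), here is a minimal recursive-descent sketch in Python. It is not the linked OCaml or Python example; the term encoding (tuples for compound terms, capitalised strings for variables) and every function name are assumptions made for illustration. Bindings are kept unapplied in a dict and followed on demand by `walk`.

```python
def is_var(t):
    """Assumed convention: variables are strings starting with an upper-case letter."""
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    """Follow variable bindings until reaching an unbound variable or a non-variable."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(var, t, subst):
    """True if var appears in t once the bindings in subst are taken into account."""
    t = walk(t, subst)
    if t == var:
        return True
    return isinstance(t, tuple) and any(occurs(var, a, subst) for a in t[1:])

def unify(s, t, subst):
    """Return an extended copy of subst that unifies s and t, or None on failure."""
    s, t = walk(s, subst), walk(t, subst)
    if s == t:
        return subst
    if is_var(s):
        # bind the variable unless that would create a cyclic term
        return None if occurs(s, t, subst) else {**subst, s: t}
    if is_var(t):
        return unify(t, s, subst)
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        for a, b in zip(s[1:], t[1:]):   # descend into the arguments, left to right
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None                          # functor, arity, or constant mismatch

print(unify(("f", "a", "X"), ("f", "Y", "b"), {}))
# → {'Y': 'a', 'X': 'b'}
print(unify(("f", "X", "X"), ("f", "Y", ("g", "Y")), {}))
# → None
```

Because `unify` never mutates the dict it is given, a caller that needs to backtrack can simply keep using the substitution it had before the call.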

Fourth: From personal experience, understand how each phase works by using pen and paper first, then implement it in code.

Fifth: Again from personal experience, there is lots of information out there on how to do this but the math and technical papers can be confusing as many gloss over something critical to a self-learner or are too dense. I would suggest that instead you focus on finding implementations of the source code/data structures and use that to learn.

Sixth: compare your results against actual working code, e.g. SWI-Prolog.

I can't stress enough how much you need to test each phase before moving on to the next and make sure you have a complete set of test cases.

When I wanted to learn how to write this in a functional language the books on AI 1 2 3 and The Programming Languages Zoo were invaluable. I had to install environments for Lisp and OCaml, but it was worth the effort.

repeat
Guy Coder
  • Thanks. I have to reread this a couple of times. I have already started doing what you suggest: building a test suite. I fixed a couple of errors, and my current variant (not posted yet) works on every case I can come up with. You're right, comparing with SWI helps. – sten Mar 05 '18 at 20:55
  • I meant backtracking, i.e. I'm also interested in how you unwind unification on backtracking, when a term in the chain of reasoning fails! – sten Mar 05 '18 at 22:07
  • When a goal fails, isn't it supposed to free all bindings that the goal initialized? – sten Mar 05 '18 at 23:21
  • Is it possible that the algorithm with permutations is one using [Explicit Substitutions](https://en.wikipedia.org/wiki/Explicit_substitution) / the λσ calculus? – lambda.xy.x Mar 06 '18 at 00:52
  • For using backtracking, be aware that you are acting on a different part of the implementation. Suppose you have a rule ```p(X) :- q(X).``` and a goal ```p(a)```. By unification, you will find out that you have to apply the unifier ```a/X```, instantiate the rule to ```p(a) :- q(a).``` and make ```q(a)``` the new goal. In case this path fails, you might follow a different rule where the head unifies. The classical data-structure to save this undo information is a stack (or an accumulator in functional programming). – lambda.xy.x Mar 06 '18 at 00:58
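
The stack mentioned in the last comment is often called a trail. A rough Python sketch of the idea, assuming a mutable substitution dict in which each variable is bound at most once between undo operations (the class and method names are invented for this example):

```python
class Bindings:
    """Mutable variable bindings plus a trail (stack) of the variables bound so far."""

    def __init__(self):
        self.subst = {}
        self.trail = []

    def bind(self, var, value):
        # Assumes var is currently unbound, as it is when unification binds it.
        self.subst[var] = value
        self.trail.append(var)       # remember the binding so it can be undone

    def mark(self):
        return len(self.trail)       # choice point: current height of the trail

    def undo_to(self, mark):
        # Remove every binding made since the choice point was taken.
        while len(self.trail) > mark:
            del self.subst[self.trail.pop()]

b = Bindings()
b.bind("X", "a")
choice_point = b.mark()   # about to try a rule
b.bind("Y", "b")          # bindings made while trying that rule
b.bind("Z", "c")
b.undo_to(choice_point)   # the rule failed: back out before trying the next one
print(b.subst)
# → {'X': 'a'}
```

Taking a mark before each alternative and undoing to it on failure is what frees the bindings a failed goal made, while bindings from earlier goals survive.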

This so far works for all the cases I have come up with (except one that requires the occurs check, which I have not implemented yet):

def unify_var(self, var, val, subst):
    # var is a variable name, val is any term, subst maps variable names to terms
    # print("var> ", var, val, subst)

    if var in subst:
        # var is already bound: unify its binding with val
        return self.unify(subst[var], val, subst)
    elif isinstance(val, str) and val in subst:
        # val is a bound variable: unify var with val's binding
        return self.unify(var, subst[val], subst)
    # TODO occurs check: elif var occurs anywhere in val: return False
    else:
        # print("%s := %s" % (var, val))
        subst[var] = val
        return subst

def unify(self, sym1, sym2, subst):
    # is_var() is a helper defined elsewhere (not shown here); it is true for variable names such as 'X'
    # print('unify>', sym1, sym2, subst)

    if subst is False:
        return False
    # when both symbols match
    elif isinstance(sym1, str) and isinstance(sym2, str) and sym1 == sym2:
        return subst
    # variable cases
    elif isinstance(sym1, str) and is_var(sym1):
        return self.unify_var(sym1, sym2, subst)
    elif isinstance(sym2, str) and is_var(sym2):
        return self.unify_var(sym2, sym1, subst)
    elif isinstance(sym1, tuple) and isinstance(sym2, tuple):  # predicate case
        # different arity can never unify (also avoids an IndexError below)
        if len(sym1) != len(sym2):
            return False
        if len(sym1) == 0:
            return subst
        # Functors of structures have to match
        if isinstance(sym1[0], str) and isinstance(sym2[0], str) and not (is_var(sym1[0]) or is_var(sym2[0])) and sym1[0] != sym2[0]:
            return False
        return self.unify(sym1[1:], sym2[1:], self.unify(sym1[0], sym2[0], subst))
    elif isinstance(sym1, list) and isinstance(sym2, list):  # list case
        # different length can never unify (also avoids an IndexError below)
        if len(sym1) != len(sym2):
            return False
        if len(sym1) == 0:
            return subst
        return self.unify(sym1[1:], sym2[1:], self.unify(sym1[0], sym2[0], subst))

    else:
        return False

FAIL cases are supposed to fail :

OK: a <=> a : {}
OK: X <=> a : {'X': 'a'}
OK: ['a'] <=> ['a'] : {}
OK: ['X'] <=> ['a'] : {'X': 'a'}
OK: ['a'] <=> ['X'] : {'X': 'a'}
OK: ['X'] <=> ['X'] : {}
OK: ['X'] <=> ['Z'] : {'X': 'Z'}
OK: ['p', 'a'] <=> ['p', 'a'] : {}
OK: ['p', 'X'] <=> ['p', 'a'] : {'X': 'a'}
OK: ['p', 'X'] <=> ['p', 'X'] : {}
OK: ['p', 'X'] <=> ['p', 'Z'] : {'X': 'Z'}
OK: ['X', 'X'] <=> ['p', 'X'] : {'X': 'p'}
OK: ['p', 'X', 'Y'] <=> ['p', 'Y', 'X'] : {'X': 'Y'}
OK: ['p', 'X', 'Y', 'a'] <=> ['p', 'Y', 'X', 'X'] : {'Y': 'a', 'X': 'Y'}
 ================= STRUCT cases ===================
OK: ['e', 'X', ('p', 'a')] <=> ['e', 'Y', ('p', 'a')] : {'X': 'Y'}
OK: ['e', 'X', ('p', 'a')] <=> ['e', 'Y', ('p', 'Z')] : {'X': 'Y', 'Z': 'a'}
OK: ['e', 'X', ('p', 'a')] <=> ['e', 'Y', ('P', 'Z')] : {'X': 'Y', 'Z': 'a', 'P': 'p'}
OK: [('p', 'a', 'X')] <=> [('p', 'Y', 'b')] : {'Y': 'a', 'X': 'b'}
OK: ['X', 'Y'] <=> [('p', 'a'), 'X'] : {'Y': ('p', 'a'), 'X': ('p', 'a')}
OK: [('p', 'a')] <=> ['X'] : {'X': ('p', 'a')}
-----
FAIL: ['e', 'X', ('p1', 'a')] <=> ['e', 'Y', ('p2', 'Z')] : False
FAIL: ['e', 'X', ('p1', 'a')] <=> ['e', 'Y', ('p1', 'b')] : False
FAIL: [('p', 'a', 'X', 'X')] <=> [('p', 'a', 'a', 'b')] : False
(should fail, occurs) OK: [('p1', 'X', 'X')] <=> [('p1', 'Y', ('p2', 'Y'))] : {'Y': ('p2', 'Y'), 'X': 'Y'}
================= LIST cases ===================
OK: ['e', 'X', ['e', 'a']] <=> ['e', 'Y', ['e', 'a']] : {'X': 'Y'}
OK: ['e', 'X', ['a', 'a']] <=> ['e', 'Y', ['a', 'Z']] : {'X': 'Y', 'Z': 'a'}
OK: ['e', 'X', ['e', 'a']] <=> ['e', 'Y', ['E', 'Z']] : {'X': 'Y', 'Z': 'a', 'E': 'e'}
OK: ['e', 'X', ['e1', 'a']] <=> ['e', 'Y', ['e1', 'a']] : {'X': 'Y'}
OK: [['e', 'a']] <=> ['X'] : {'X': ['e', 'a']}
OK: ['X'] <=> [['e', 'a']] : {'X': ['e', 'a']}
================= FAIL cases ===================
FAIL: ['a'] <=> ['b'] : False
FAIL: ['p', 'a'] <=> ['p', 'b'] : False
FAIL: ['X', 'X'] <=> ['p', 'b'] : False
sten
  • If you are copying code from somewhere else or your code is very close to code you copied then you need to reference it so that people know. Also this is not really an answer and should be included in your question as an update. – Guy Coder Mar 05 '18 at 23:19