22

RDBMSs are based on Relational Algebra and Codd's relational model. Do we have something similar for programming languages or OOP?

Padmarag
  • 6,741
  • 1
  • 23
  • 29

11 Answers

41

Do we have [an underlying model] for programming languages?

Heavens, yes. And because there are so many programming languages, there are multiple models to choose from. Most important first:

  • Church's untyped lambda calculus is a model of computation that is as powerful as a Turing machine (no more and no less). The famous "Church-Turing hypothesis" is that these two equivalent models represent the most general model of computation that we know how to implement. The lambda calculus is extremely simple; in its entirety the language is

    e ::= x | e1 e2 | \x.e
    

    which constitute variables, function applications, and function definitions. The lambda calculus also comes with a fairly large collection of "reduction rules" for simplifying expressions. If you find an expression that can't be reduced, that is called a "normal form" and represents a value. (A small interpreter sketch follows this list.)

    The lambda calculus is so general that you can take it in several directions.

    • If you want to use all the available rules, you can write specialized tools like partial evaluators and parts of compilers.

    • If you avoid reducing any subexpression under a lambda, but otherwise use all the rules available, you wind up with a model of a lazy functional language like Haskell or Clean. In this model, if a reduction can terminate, it is guaranteed to, and it is easy to represent infinite data structures. Very powerful.

    • If you avoid reducing any subexpression under a lambda, and if you also insist on reducing each argument to a normal form before a function is applied, then you have a model of an eager functional language like F#, Lisp, Objective Caml, Scheme, or Standard ML.

  • There are also several flavors of typed lambda calculi, of which the most famous are grouped under the name System F, which were discovered independently by Girard (in logic) and by Reynolds (in computer science). System F is an excellent model for languages like CLU, Haskell, and ML, which are polymorphic but have compile-time type checking. Hindley (in logic) and Milner (in computer science) discovered a restricted form of System F (now called the Hindley-Milner type system) which makes it possible to infer System F expressions from some expressions of the untyped lambda calculus. Damas and Milner developed an algorithm to do this inference, which is used in Standard ML and has been generalized in other languages.

  • Lambda calculus is just pushing symbols around. Dana Scott's pioneering work in denotational semantics showed that expressions in the lambda calculus actually correspond to mathematical functions—and he identified which ones. Scott's work is especially important in making sense of "recursive definitions", which are commonplace in computer science but are nonsensical from a mathematical point of view. Scott and Christopher Strachey showed that a recursive definition is equivalent to the least defined solution to a recursion equation, and furthermore showed how that solution could be constructed. Any language that allows recursion, and especially languages that allow recursion at arbitrary type (like Haskell and Clean) owes something to Scott's model.

  • There is a whole family of models based on abstract machines. Here there is not so much an individual model as a technique. You can define a language by using a state machine and defining transitions on the machine. This definition encompasses everything from Turing machines to Von Neumann machines to term-rewriting systems, but generally the abstract machine is designed to be "as close to the language as possible." The design of such machines, and the business of proving theorems about them, comes under the heading of operational semantics.
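To make the lambda-calculus item above concrete, here is a minimal sketch in Haskell of terms and reduction to normal form. All the names (Term, subst, step, nf) are my own, invented for illustration, and the substitution is deliberately naive (not capture-avoiding), so treat it as a toy rather than a faithful implementation:

    -- Terms of the untyped lambda calculus: e ::= x | e1 e2 | \x.e
    data Term = Var String          -- x
              | App Term Term       -- e1 e2
              | Lam String Term     -- \x.e
      deriving Show

    -- Naive substitution [x := s]e; assumes no variable capture occurs.
    subst :: String -> Term -> Term -> Term
    subst x s (Var y)   = if x == y then s else Var y
    subst x s (App f a) = App (subst x s f) (subst x s a)
    subst x s (Lam y b)
      | x == y    = Lam y b                  -- x is shadowed; stop here
      | otherwise = Lam y (subst x s b)

    -- One leftmost-outermost reduction step, if a redex exists. Because
    -- this version also reduces under lambdas, it uses "all the rules";
    -- dropping the Lam case gives the lazy strategy described above, and
    -- reducing arguments to normal form first gives the eager one.
    step :: Term -> Maybe Term
    step (App (Lam x b) a) = Just (subst x a b)     -- beta reduction
    step (App f a)         = case step f of
                               Just f' -> Just (App f' a)
                               Nothing -> App f <$> step a
    step (Lam x b)         = Lam x <$> step b
    step (Var _)           = Nothing

    -- Reduce until no redex remains: the result is a normal form.
    nf :: Term -> Term
    nf t = maybe t nf (step t)

For example, nf (App (Lam "x" (Var "x")) (Var "y")) reduces to Var "y".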

What about object-oriented programming?

I'm not as well educated as I should be about abstract models used for OOP. The models I'm most familiar with are very closely connected to implementation strategies. If I wanted to investigate this area further I would start with William Cook's denotational semantics for Smalltalk. (Smalltalk as a language is very simple, almost as simple as the lambda calculus, so it makes a good case study for modeling more complicated object-oriented languages.)

Wei Hu reminds me that Martin Abadi and Luca Cardelli have put together an ambitious body of work on foundational calculi (analogous to the lambda calculus) for object-oriented languages. I don't understand the work well enough to summarize it, but here is a passage from the Prologue of their book, which I feel is worth quoting:

Procedural languages are generally well understood; their constructs are by now standard, and their formal underpinnings are solid. The fundamental features of these languages have been distilled into formalisms that prove useful in identifying and explaining issues of implementation, static analysis, semantics, and verification.

An analogous understanding has not yet emerged for object-oriented languages. There is no widespread agreement on a collection of basic constructs and on their properties... This situation might improve if we had a better understanding of the foundations of object-oriented languages.

... we take objects as primitive and concentrate on the intrinsic rules that objects should obey. We introduce object calculi and develop a theory of objects around them. These object calculi are as simple as function calculi, but represent objects directly.

I hope this quotation gives you an idea of the flavor of the work.

Norman Ramsey
  • 188,173
  • 57
  • 343
  • 523
  • @Norman - Thanks a lot for your informative and thorough answer. – Padmarag Mar 19 '10 at 05:01
  • Very comprehensive answer. Although it would be nice if you mention axiomatic semantics too. I haven't read the book "A Theory of Objects" (http://lucacardelli.name/TheoryOfObjects.html) from two reputable Microsoft researchers, but my gut feeling is that this book should cover the theory behind OOP. – Wei Hu Mar 19 '10 at 06:04
  • ++ Nice answer, Norman, especially the Lambda Calculus part. I've known of the Scott-Strachey work, but it's at the boundary of my understanding. When I was at the AI lab, people like Carl Hewitt were trying to extend it to cover side-effects (Actor theory - allied with Smalltalk & OOP), but I'm not sure the end result went very far. – Mike Dunlavey Mar 19 '10 at 11:43
  • @Wei Hu: I debated over axiomatic semantics but in the end this feels to more like program logic and less like what I would call a "model". This is probably because I've been conditioned to distinguish between "syntactic theories" and "models", a distinction I doubt OP had in mind. I'd forgotten about the book by Abadi and Cardelli; I will have to revisit it and see what I can add to my answer. – Norman Ramsey Mar 19 '10 at 20:56
  • +1: excellent answer as always :) – Juliet Mar 19 '10 at 21:05
  • Wow, this answer kind of blew my mind. By the way, Norman, I was the 7777th person to view your profile! Seems like you'd appreciate that. – harpo Mar 19 '10 at 21:11
  • @Norman, why do you say "comes with a fairly large collection of 'reduction rules' for simplifying expressions"? We can get by without alpha-conversion, and when we're not reducing inside abstractions, we don't need eta-conversion either, leaving just one: beta-reduction. Are you thinking of different evaluation strategies as being the many "reduction rules"? – dubiousjim Mar 20 '10 at 12:11
  • @Profjim: Yes. In my view, the various "structural rules" that determine when and where you can perform beta reduction are what distinguish the highly nondeterministic underlying calculus from typically deterministic programming languages. – Norman Ramsey Mar 20 '10 at 18:19
10

Lisp is based on Lambda Calculus, and is the inspiration for much of what we see in modern languages today.

Von Neumann machines are the foundation of modern computers, which were first programmed in assembler language, then in FORmula TRANslator. Then the formal linguistic theory of context-free grammars was applied, and it underlies the syntax of all modern languages.

Computability theory (formal automata) has a hierarchy of machine types that parallels the hierarchy of formal grammars: regular grammar = finite-state machine, context-free grammar = pushdown automaton, context-sensitive grammar = linear-bounded automaton, and unrestricted grammar = Turing machine.
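As a toy illustration of the regular end of that hierarchy, here is a finite-state machine in Haskell recognizing the regular language of strings over {a, b} with an even number of a's (the names State, stepFSM, and accepts are mine, for illustration only):

    -- Two states track the parity of the 'a's seen so far.
    data State = EvenAs | OddAs deriving (Eq, Show)

    stepFSM :: State -> Char -> State
    stepFSM EvenAs 'a' = OddAs
    stepFSM OddAs  'a' = EvenAs
    stepFSM s      _   = s      -- any other symbol leaves the state alone

    -- Accept iff we end in the start state, i.e. an even count of 'a's.
    accepts :: String -> Bool
    accepts s = foldl stepFSM EvenAs s == EvenAs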

There is also information theory, of two flavors (Shannon and Kolmogorov), that can be applied to computing.

There are lesser-known models of computing, such as recursive function theory, register machines, and Post machines.

And don't forget predicate-logic in its various forms.

Added: I forgot to mention discrete math - group theory and lattice theory. Lattices in particular are (IMHO) a particularly nifty concept underlying all Boolean logic and some models of computation, such as denotational semantics.
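For a taste of the lattice idea, here is a minimal Haskell sketch (the Lattice class is my own formulation, not a standard library one); Boolean logic is then just the two-point lattice with False below True:

    -- A lattice: a partial order with binary meet and join.
    class Lattice a where
      meet :: a -> a -> a   -- greatest lower bound
      join :: a -> a -> a   -- least upper bound

    -- For Bool, meet is conjunction and join is disjunction.
    instance Lattice Bool where
      meet = (&&)
      join = (||)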

Mike Dunlavey
  • 38,662
  • 12
  • 86
  • 126
  • 1
    *Then the formal linguistic theory of context-free-grammars was applied, and underlies the syntax of all modern languages* - a little like saying that the theory of Western classical music underlies the work of The Beatles. :) It can be used to analyse the results, but that doesn't mean it was consciously applied by the creators. Perhaps surprisingly, many modern languages have initially been implemented with "hand-written" top-down parsers, and the BNF representation comes later, and rarely is it particularly useful by itself. C++ is notorious for being intractable by academic parsing theory. – Daniel Earwicker Mar 18 '10 at 22:38
  • @Daniel: You're refining my flip sentence. Thanks. – Mike Dunlavey Mar 19 '10 at 01:40
6

Functional languages like Lisp inherit their basic concepts from Church's lambda calculus (see the Wikipedia article). Regards

Giuseppe Guerrini
  • 3,824
  • 15
  • 28
  • I wouldn't necessarily call Lisp functional. It CAN be written in a functional manner but is more of a multi-paradigm language. – Vatine Mar 18 '10 at 13:07
5

One such concept may be the Turing machine.
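To give that one-liner a bit of substance, here is a minimal sketch of a single-tape Turing machine in Haskell; all the names are invented for illustration, and a real treatment would also fix an alphabet, a blank symbol, and an initial configuration:

    -- The tape: cells left of the head (nearest first), the cell under
    -- the head, and the cells to the right. ' ' plays the blank symbol.
    data Tape = Tape [Char] Char [Char] deriving Show

    data Dir = L | R

    -- A machine is a transition function over states q and tape symbols.
    type Trans q = q -> Char -> (q, Char, Dir)

    move :: Dir -> Tape -> Tape
    move L (Tape (l:ls) c rs) = Tape ls l (c:rs)
    move L (Tape []     c rs) = Tape [] ' ' (c:rs)   -- extend with blanks
    move R (Tape ls c (r:rs)) = Tape (c:ls) r rs
    move R (Tape ls c [])     = Tape (c:ls) ' ' []

    -- Run until the halting predicate holds for the current state.
    run :: Trans q -> (q -> Bool) -> q -> Tape -> Tape
    run delta halted q (Tape ls c rs)
      | halted q  = Tape ls c rs
      | otherwise = let (q', c', d) = delta q c
                    in run delta halted q' (move d (Tape ls c' rs))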

Padmarag
  • 6,741
  • 1
  • 23
  • 29
3

If you study programming languages (e.g., at a university), there is quite a lot of theory, and not a little math, involved.

Examples are:

T.E.D.
  • 41,324
  • 8
  • 64
  • 131
2

Plenty has been mentioned already about the application of math to computational theory and semantics. I like the mention of type theory, and I'm glad someone mentioned lattice theory. Here are just a few more.

No one has explicitly mentioned category theory, which shows up more in functional languages than elsewhere, such as through the concepts of monads and functors. Then there's model theory and the various incarnations of logic that actually show up in theorem provers or the logic language Prolog. There are also mathematical applications to foundations of and problems in concurrent languages.
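As a small, informal illustration of how category theory surfaces in a functional language, Haskell's built-in Functor and Monad classes mirror a simplified picture of categorical functors and monads (safeHead is my own example function):

    -- A lawful Functor preserves identity and composition:
    --   fmap id == id,  and  fmap (g . f) == fmap g . fmap f
    safeHead :: [a] -> Maybe a
    safeHead []    = Nothing
    safeHead (x:_) = Just x

    -- Maybe as a functor: apply a function "inside" the structure.
    example1 :: Maybe Int
    example1 = fmap (+ 1) (safeHead [1, 2, 3])          -- Just 2

    -- Maybe as a monad: >>= sequences computations that may fail.
    example2 :: Maybe Int
    example2 = safeHead [10, 20] >>= \x -> Just (x * 2) -- Just 20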

Sephra
  • 23
  • 2
2

The history section of Wikipedia's Object-oriented programming could be enlightening.

OverLex
  • 2,421
  • 1
  • 23
  • 27
2

The closest analogy I can think of is Gurevich's evolving algebras, which nowadays are better known under the name "Gurevich Abstract State Machines" (GASM).

I had long hoped to see more real applications of the theory once Gurevich joined Microsoft, but it seems that very little is coming out. You can check the ASML page on the Microsoft site.

The good point about GASMs is that they closely resemble pseudo-code, even though their semantics are formally specified. This means that practitioners can easily grasp them.

After all, I think that part of the success of Relational Algebra is that it is the formal foundation of concepts that can be easily grasped, namely tables, foreign keys, joins, etc.

I think we need something similar for the dynamic components of a software system.

Remo.D
  • 15,217
  • 5
  • 41
  • 70
2

There are many dimensions to your question, scattered across the answers.

First of all, to describe the syntax of a language and specify how a parser would work, we use context-free grammars.

Then you need to assign meanings to the syntax. Formal semantics come in handy; the main players are operational semantics, denotational semantics, and axiomatic semantics.

To rule out bad programs you have the type system.

In the end, all computer programs can reduce to (or compile to, if you will) very simple computation models. Imperative programs are more easily mapped to Turing machines, and functional programs are mapped to lambda calculus.
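These layers can be sketched end to end on a toy language. The Haskell fragment below is my own illustration, not anything standard: the Expr type plays the role of the abstract syntax a parser for a context-free grammar would produce, typeOf is a small type system that rules out bad programs, and eval supplies the operational semantics:

    data Expr = IntLit Int
              | BoolLit Bool
              | Add Expr Expr
              | If Expr Expr Expr

    data Ty = TInt | TBool deriving (Eq, Show)

    -- The type system rejects programs like (1 + True) before they run.
    typeOf :: Expr -> Maybe Ty
    typeOf (IntLit _)  = Just TInt
    typeOf (BoolLit _) = Just TBool
    typeOf (Add a b)
      | typeOf a == Just TInt && typeOf b == Just TInt = Just TInt
      | otherwise                                      = Nothing
    typeOf (If c t e)
      | typeOf c == Just TBool && typeOf t == typeOf e = typeOf t
      | otherwise                                      = Nothing

    data Val = VInt Int | VBool Bool deriving Show

    -- Evaluation, intended to be called only on well-typed terms.
    eval :: Expr -> Val
    eval (IntLit n)  = VInt n
    eval (BoolLit b) = VBool b
    eval (Add a b)   = case (eval a, eval b) of
                         (VInt x, VInt y) -> VInt (x + y)
                         _                -> error "ill-typed"
    eval (If c t e)  = case eval c of
                         VBool True  -> eval t
                         VBool False -> eval e
                         _           -> error "ill-typed"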

If you're learning all this stuff by yourself, I highly recommend http://www.uni-koblenz.de/~laemmel/paradigms0910/, because the lectures are videotaped and put online.

Wei Hu
  • 2,789
  • 2
  • 24
  • 27
2

There is no mathematical model for OOP.

Relational algebra is the mathematical model for SQL. It was created by E. F. Codd; C. J. Date was also a renowned scientist who helped develop this theory. The whole idea is that you can do every operation as a set operation, affecting many values at the same time. This of course means that the database engine has to be told WHAT to get out, and the database is able to optimize your query.
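To illustrate "every operation as a set operation," here is a toy rendering of three relational-algebra operators over rows in Haskell; the representation and all the names (Row, Relation, select, project, joinOn) are mine, invented for illustration:

    import Data.List (nub)

    type Row      = [(String, String)]   -- (column name, value) pairs
    type Relation = [Row]

    -- Selection: keep whole rows satisfying a predicate (SQL's WHERE).
    select :: (Row -> Bool) -> Relation -> Relation
    select = filter

    -- Projection: keep only the named columns (SQL's SELECT list).
    project :: [String] -> Relation -> Relation
    project cols = nub . map (filter (\(c, _) -> c `elem` cols))

    -- Natural join on one shared column, assumed present in both sides.
    joinOn :: String -> Relation -> Relation -> Relation
    joinOn c r s =
      [ a ++ [p | p@(k, _) <- b, k /= c]
      | a <- r, b <- s, lookup c a == lookup c b ]

Note that each operator consumes and produces whole relations, which is exactly what lets the engine plan and optimize the query.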

Both Codd and Date criticized SQL because, although they were involved in the theory, they were not involved in the creation of SQL.

See this video: http://player.oreilly.com/videos/9781491908853?toc_id=182164

There is a lot of information from Chris Date. I remember that Date criticized the SQL programming language as being a terrible language, but I cannot find the paper.

The critique was basically that most languages let you write expressions and assign those expressions to variables, but SQL does not.

Since SQL is a kind of logic language, I guess you could write relational algebra in Prolog. At least then you would have a real language, so you could write queries in Prolog. And since Prolog has a lot of programs for interpreting natural language, you could query your database using natural language.

According to Uncle Bob, databases are not going to be needed when everyone has SSDs, because the architecture of SSDs means that access is as fast as RAM. So you can have all your objects in RAM.

https://www.youtube.com/watch?feature=player_detailpage&v=t86v3N4OshQ#t=3287

The only problem with ditching SQL is that you would end up without a query language for the database.

So yes and no, relational algebra was used as inspiration for SQL, but SQL is not really an implementation of relational algebra.

In the case of Lisp, things are different. The main idea was that by implementing the eval function in Lisp, you would have the whole language implemented. That's why the first Lisp implementation is only half a page of code.

http://www.michaelnielsen.org/ddi/lisp-as-the-maxwells-equations-of-software/

To laugh a little bit: https://www.youtube.com/watch?v=hzf3hTUKk8U

The importance of functional programming all comes down to curried functions and lazy evaluation. And never forget environments and closures. And map-reduce. This all means we will be coding in functional languages in 20 years.
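Each of those ingredients fits in a few lines of Haskell; the following is just a flavor-giving sketch with invented names:

    -- Currying: add takes one argument and returns a function.
    add :: Int -> Int -> Int
    add x y = x + y

    addFive :: Int -> Int      -- partial application yields a closure
    addFive = add 5

    -- Laziness: an infinite list is fine if we demand only a prefix.
    firstTenEvens :: [Int]
    firstTenEvens = take 10 (filter even [0 ..])

    -- Closures: the returned function captures its environment (start).
    counterFrom :: Int -> (Int -> Int)
    counterFrom start = \offset -> start + offset

    -- Map-reduce in miniature: map, then fold (reduce).
    sumOfSquares :: [Int] -> Int
    sumOfSquares = foldr (+) 0 . map (^ 2)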

Now back to OOP, there is no formalization of OOP.

Interestingly, the second OO language ever created, Smalltalk, only has objects; it doesn't have primitives or anything like that. And its creator, Alan Kay, explicitly created blocks to work exactly like Lisp functions.

Some people claim OOP could maybe be formalized using category theory, which is kind of like set theory but with morphisms. A morphism is a structure-preserving map between objects. So in general you could have map( f, collection ) and get back a collection where f has been applied to every element.

I'm pretty sure Lisp has that, but Lisp also has functions that return a single element of a collection, which destroys the structure. A morphism is a special kind of function, so you would need to restrict the functions in Lisp so that they are all morphisms.
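In Haskell terms the contrast looks like this (an informal gloss, not category theory proper):

    -- fmap is structure-preserving: a 3-element list in, a 3-element
    -- list out; a Just maps to a Just; and so on.
    doubled :: [Int]
    doubled = fmap (* 2) [1, 2, 3]   -- [2, 4, 6]

    -- head discards the structure: a list goes in, a bare element
    -- comes out, so it is not a structure-preserving map.
    firstElem :: Int
    firstElem = head [1, 2, 3]       -- 1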

https://www.youtube.com/watch?feature=player_detailpage&v=o6L6XeNdd_k#t=250

The main problem with this is that functions don't exist independently of objects in OOP, but in category theory they do. They are therefore incompatible. You could develop a new language in which to express category theory.

An experimental theoretical language created explicitly to try to formalize OOP is Z. Z is derived from requirements formalism.

Another attempt is Luca Cardelli's formalism:

http://lucacardelli.name/Papers/PrimObjImp.pdf
http://lucacardelli.name/Papers/PrimObj1stOrder.A4.pdf
http://lucacardelli.name/Papers/PrimObjSemLICS.A4.pdf

I'm unable to read and understand that notation. It seems like a useless exercise, since as far as I know, no one has ever implemented this the way lambda calculus was implemented in Lisp.

0

As far as I know, formal grammars are used for the description of syntax.

iburlakov
  • 4,010
  • 7
  • 36
  • 40
  • 1
    lol. If I understood your English, this would probably be the best answer. Shame I can't vote 1/2 a point for being on the right track. – T.E.D. Mar 18 '10 at 13:01
  • not quite correct. Formal grammars just specify the syntactic part. The complete semantics of a piece of software can't be specified with a grammar. Relational Algebra, instead, can be used to fully specify how a DB works. – Remo.D Mar 18 '10 at 13:04
  • What exactly is 'not quite correct' about 'Formal grammars is used for description of syntax'? You said exactly the same thing yourself. – user207421 Mar 19 '10 at 06:20
  • The original question is about Relational Algebra, which describes the semantics of a database. The OP asked if there was something similar for programming languages and OOP, i.e. something that could describe the semantics of a piece of code, not its syntax. – Remo.D Mar 19 '10 at 07:06