Creating source to source translator

Question

I want to know what strategies exist for creating a source-to-source translator, i.e. a translator from one high-level language to another. Two approaches come to mind:

1- Transforming the syntax tree of the source language into a syntax tree of the target language
2- Translating the source into an intermediate language and then converting that into the target high-level language

Is it possible to do the conversion with either strategy, and which one is more feasible? Can anyone give references to theory, or to an existing converter implemented with one of the above methods? Also, is there a standard XML-based intermediate language? I know that XMLVM uses XML as its intermediate language, but it does not provide a proper specification of that intermediate language.

See my SO answer on translating between programming languages: http://stackoverflow.com/a/3460977/120163. It is about real industrial tools to do this, not theory. — Ira Baxter, Feb 21 '13 at 11:01

score 8 · Accepted Answer · edited Jun 23 '17 at 15:27

Any compiler is, roughly, a source-to-source translator. The target language can be an assembly language (or binary machine code directly), or C, or whatever high-level language you fancy. So general compiler theory applies.

And just as a word of advice: one intermediate language is normally not nearly enough. Use more. Use dozens of intermediate languages, each differing from the previous one in just one tiny aspect. This way every individual step of a language-to-language translation becomes close to trivial.
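To make the "many small intermediate languages" advice concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: the toy tuple-based source language, the pass names (`remove_neg`, `remove_sq`) and the C-like output are assumptions, not something taken from a real compiler. Each pass removes exactly one construct, so the language after a pass differs from the one before it in a single aspect.

```python
# Toy source language: nested tuples such as ("sq", ("neg", ("var", "x"))).
# Each pass removes exactly one construct, so each intermediate language
# differs from the previous one in a single small aspect.

def rewrite(node, rule):
    """Apply `rule` bottom-up to every node of the tree."""
    if not isinstance(node, tuple):
        return node  # leaf payload such as a number or a variable name
    rebuilt = (node[0],) + tuple(rewrite(child, rule) for child in node[1:])
    return rule(rebuilt)

def remove_neg(node):
    # L0 -> L1: ("neg", e) becomes ("sub", ("num", 0), e)
    if node[0] == "neg":
        return ("sub", ("num", 0), node[1])
    return node

def remove_sq(node):
    # L1 -> L2: ("sq", e) becomes ("mul", e, e)
    if node[0] == "sq":
        return ("mul", node[1], node[1])
    return node

OPS = {"add": "+", "sub": "-", "mul": "*"}

def emit_c(node):
    # L2 -> C-like text; only num, var and binary operators remain here.
    kind = node[0]
    if kind == "num":
        return str(node[1])
    if kind == "var":
        return node[1]
    return "(" + emit_c(node[1]) + " " + OPS[kind] + " " + emit_c(node[2]) + ")"

def translate(tree):
    # Run the small passes in order, then print the final tree as text.
    for rule in (remove_neg, remove_sq):
        tree = rewrite(tree, rule)
    return emit_c(tree)

print(translate(("sq", ("neg", ("var", "x")))))
```

Because every pass is a few lines long and only has to know about one construct, adding another source feature usually means adding one more pass rather than touching the emitter.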

Another word of advice (anticipating downvotes here) - stay away from XML, especially as a representation for ASTs.

edited Jun 23 '17 at 15:27 by Basil Bourque · answered Aug 22 '11 at 11:51 by SK-logic

Agree with SK-logic about XML. See http://stackoverflow.com/a/2831343/120163 for some discussion about the (de)merits of XML as an AST representation. – Ira Baxter Feb 21 '13 at 11:04

score 2 · Answer 2 · answered Nov 15 '11 at 05:23

I would look at LLVM, which can do source to source. Although the output isn't pretty, it might provide some good ideas.

answered Nov 15 '11 at 05:23 by J. M. Becker

It can do source to source for which languages? I think LLVM/Clang is pretty hardwired to C and C++, and other languages are likely to be pretty difficult. – Ira Baxter Jul 06 '13 at 14:42
@Ira Baxter: LLVM has a CBackend, which compiles LLVM BitCode to C. LLVM also has a JavaScript backend project, named emscripten. So technically any frontend which outputs to LLVM, can use either of those backends. This doesn't consider projects in development, or projects which will be in development. Example frontends would be Clang, llvm-lua, llvm-gcc, LDC... etc. It's not a 'full solution' but if someone was working on some ideas, and wanted some code to actually view, it would be a place to start. – J. M. Becker Jul 09 '13 at 14:41

score 0 · Answer 3 · answered Sep 07 '13 at 06:51

Try Clang! It is powerful for source-to-source translation. As of now it fully supports C, C++, Objective-C and Objective-C++. You may also want to look at the ROSE compiler infrastructure.

answered Sep 07 '13 at 06:51 by username_4567

score 0 · Answer 4 · answered Aug 22 '11 at 10:49

Converters are usually based on constructing the semantic tree of the source program and then re-shaping it for the target programming language. As an example, take a look at a C# to Java converter.

The second approach is also possible, but the organization of your code may change completely after conversion. So it is better to keep the common intermediate structure (IL, ST, etc.) as high level as possible.
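As a rough illustration of the first approach (walking the source program's tree and re-shaping it for the target language), the sketch below uses Python's standard `ast` module to turn a small subset of Python arithmetic expressions into JavaScript text. The function names `to_js` and `python_expr_to_js` are made up for this example, only a handful of node types are handled, and the mapping is approximate: JavaScript numbers are floating point, so results can differ from Python's integer arithmetic for large values.

```python
import ast

# Operators that map one-to-one onto a JavaScript infix operator.
BIN_OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

def to_js(node):
    """Re-shape a (small subset of a) Python expression tree as JavaScript text."""
    if isinstance(node, ast.Expression):
        return to_js(node.body)
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return repr(node.value)
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.BinOp):
        left, right = to_js(node.left), to_js(node.right)
        # Constructs with no direct counterpart are re-shaped, not copied.
        if isinstance(node.op, ast.Pow):
            return f"Math.pow({left}, {right})"
        if isinstance(node.op, ast.FloorDiv):
            return f"Math.floor({left} / {right})"
        op = BIN_OPS.get(type(node.op))
        if op is not None:
            return f"({left} {op} {right})"
    # Anything outside the supported subset is rejected explicitly.
    raise NotImplementedError(ast.dump(node))

def python_expr_to_js(source):
    return to_js(ast.parse(source, mode="eval"))

print(python_expr_to_js("(a + 1) ** 2 // b"))
# prints: Math.floor(Math.pow((a + 1), 2) / b)
```

The output keeps the shape of the original expression, which is the main attraction of staying at the tree level: a lower-level intermediate form would have lost the `**` and `//` operators before the JavaScript was generated.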
