
Why does COBOL have to be indented, that is, why does each source file need additional leading whitespace?

Consider this code (note the additional whitespace):

  IDENTIFICATION DIVISION.
  PROGRAM-ID. HELLO-WORLD.
  PROCEDURE DIVISION.
      DISPLAY 'Hello, world'.
      STOP RUN.

Similar formatting can be seen in Fortran code:

   program hello
      print *, "Hello World!"
   end program hello

But why do COBOL and Fortran need this whitespace? What's the reason?

Vladimir F
hiobs

1 Answer


Cobol no longer has to be indented. AFAIK, most modern compilers support free-format Cobol source.

The original reason was punch cards. Cobol reserved the first 6 columns for a line sequence number. Column 7 was an indicator column: continuation / comment / debug / form-feed. Area "A", columns 8-11, held certain special language elements such as 01 levels and section or paragraph names. Area "B", columns 12-72, was for open code. Columns 73-80 were for OS sequence numbers.
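To make the column layout concrete, here is a minimal Python sketch that slices one 80-column card image into the areas described above. The names `FixedFormatLine` and `split_fixed_format` are made up for illustration and are not part of any compiler or library.

```python
from typing import NamedTuple


class FixedFormatLine(NamedTuple):
    """One card image split into its areas (hypothetical helper type)."""
    sequence: str   # columns 1-6
    indicator: str  # column 7
    area_a: str     # columns 8-11
    area_b: str     # columns 12-72
    trailer: str    # columns 73-80


def split_fixed_format(line: str) -> FixedFormatLine:
    # Pad or truncate to exactly 80 characters, like a punch card.
    card = line.rstrip("\n").ljust(80)[:80]
    # Python slices are 0-based, so column N is index N - 1.
    return FixedFormatLine(
        sequence=card[0:6],
        indicator=card[6],
        area_a=card[7:11],
        area_b=card[11:72],
        trailer=card[72:80],
    )
```

For example, `split_fixed_format("000100 IDENTIFICATION DIVISION.")` puts `000100` in the sequence field, a blank in the indicator, and `IDEN` in Area A, with the rest of the header spilling into Area B.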

The two languages you mention, Cobol and Fortran, were both written before automatic parser generation existed. And they had no real prior art to draw on for good and bad ideas of how to create parse-able source text. So some of the things -- like Area "A" for special section headers -- made the task of manually writing parsers easier. Modern languages tend to use context free grammars to make parser generation simple. But that postdates Cobol.
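As a rough illustration of why fixed columns could help a hand-written scanner, the sketch below classifies a line by looking only at column positions, without tokenizing anything. It is a simplification under my own assumptions: the function name `classify` and the category labels are hypothetical, and the debug and form-feed details of real compilers are glossed over.

```python
def classify(line: str) -> str:
    """Classify a fixed-format Cobol line by column position alone."""
    # Pad so that every column up to 72 can be indexed safely.
    card = line.rstrip("\n").ljust(72)
    indicator = card[6]          # column 7
    if indicator in "*/":
        return "comment"         # "/" also requested a page eject in listings
    if indicator == "-":
        return "continuation"
    if card[7:72].strip() == "":
        return "blank"
    if card[7:11].strip():
        # Something starts in columns 8-11: a division, section,
        # paragraph name, or a level number such as 01.
        return "area-a"
    # Text begins at column 12 or later: an ordinary statement.
    return "area-b"
```

With this, `classify("       PROCEDURE DIVISION.")` yields `"area-a"` and `classify("           DISPLAY 'Hello, world'.")` yields `"area-b"`, so the scanner knows whether it is looking at a header or a statement before it reads a single word.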

Joey
Joe Zitzelberger
  • The IBM Enterprise COBOL compiler (mainframe) still requires columns. – zarchasmpgmr Jan 22 '12 at 21:12
  • Indeed you are correct. SRCFORMAT(EXTEND) is only available on IBM Cobol for AIX. – Joe Zitzelberger Jan 23 '12 at 20:32
  • I don't see the immediate connection between the use of punch cards and the indentation rules. The second part of your answer seems to give a more plausible reason. – eriktous Jan 24 '12 at 14:20
  • The whole idea of sequence numbers was so that if you dropped your deck on the floor, you could put it in the card sorter and get it back in order. – Joe Zitzelberger Jan 24 '12 at 15:06
  • That is a reason to start each line with a sequence number, but doesn't explain the other areas after that. Please don't misunderstand me, I'm not trying to be a smart ass; I'm just interested in the reasons behind these coding rules. I think your point about making it easier to parse sounds like the most reasonable explanation. – eriktous Jan 24 '12 at 23:03
  • If I had to guess, I would say the Area A / Area B thing came from Assembler. You put labels and comments in the far left and actual code indented a few bytes. So that seems likely. And the contextful grammar thing... – Joe Zitzelberger Jan 25 '12 at 02:27
  • I don't know about COBOL, but FORTRAN doesn't (and never did) require any further indenting beyond the first 6 columns being reserved. – agentp Apr 18 '12 at 20:21
  • The reason for the Area A/B is for documentation. In Fortran, you get something like GOTO 100 or GOTO 200. In Cobol, it would be GO TO 100-RESTART or GO TO 200-NEXT-PERSON. No need for comments: the label (if thought out properly) tells you why you are going there. Mimicking Fortran, they have their own area so that they stand out and are easy to find (also helps if they are in numerical order). – cup Jun 24 '13 at 20:04
  • It's hard to explain without teaching what it was like in early days. When I learned COBOL, we had an IBM 1401 with 8K main memory and a 4K expansion box and a 2.5MB disk (and no "OS"). Until you try writing a COBOL compiler for such an environment, understanding how formatting rules can help the compiler is difficult. – user2338816 May 20 '16 at 05:22
  • Context free grammars have precisely nothing to do with Area A or columnar formats. These are dealt with in the scanner, and they are a royal PITA. Algol-60 was also designed before parser generators were available, but they just rethought the problem and did away with them. Modern languages have no columns and collapse whitespace for a reason. It is simpler. I've written Cobol compilers too, without compiler generators, and handling Area A and columns is just extra code. Not an advantage in any way. – user207421 May 28 '16 at 00:06
  • I agree with you that writing a Cobol compiler is not hard. But I wasn't talking about Area A and the context free grammar, but more about the very complicated language with 1,000-ish reserved words, which does not lend itself well to simple BNF notation because much of it is contextual. Reading it, I realize I kinda did a mashup of several concepts there. – Joe Zitzelberger May 30 '16 at 14:29