Don't hate COBOL until you've tried it

It's the Rodney Dangerfield of computer programming, but COBOL is still in use—and really does deserve respect.

Image: What to like about COBOL. Credit: Rainer Gerhards. Modified by Opensource.com. CC BY-SA 4.0

COBOL is the Rodney Dangerfield of programming languages—it doesn't get any respect. It is routinely denigrated for its verbosity and dismissed as archaic. Yet COBOL is far from a dead language. It processes an estimated 85% of all business transactions, and 5 billion lines of new COBOL code are written every year.

I worked for 10 years as a COBOL programmer, and I don't think it's quite as bad as its reputation would lead one to believe. In fact, it's quite good at handling currency and fixed-format records. But COBOL does have its quirks, many of them rooted in the computing environments of the early days of programming. This is a story of how a punch card ate my program.

A mysterious bug

Here's an example of the problem code, which tries to compute the shipping charge and expected shipping date for an order:

 1      identification division.
 2      program-id.
 3          test-ship.
 4
 5      data division.
 6      working-storage section.
 7
 8      01 shipping-method            pic x(2) value 'US'.
 9      01 cust-type                  pic x(2) value 'EM'.
10      01 normal-ship-date-yyyymmdd  pic 9(8) value 20170522.
11      01 nextday-ship-date-yyyymmdd pic 9(8) value 20170508.
12      01 expected-shipping-date     pic 9(8).
13      01 shipping-charge            pic 99v99 value 4.99.
14
15      procedure division.
16          if shipping-method <> 'FX'
17              move normal-ship-date-yyyymmdd to expected-shipping-date
18          else
19              move nextday-ship-date-yyyymmdd to expected-shipping-date.
20
21          if cust-type = 'EM'
22              move 0 to shipping-charge.
23
24          display expected-shipping-date.
25          display shipping-charge.

The logic should be easy to follow, even if you've never seen COBOL code before. If the shipping method is "FX," the customer gets next-day shipping; otherwise, shipping takes two weeks. (This was the '90s.) Employees get free shipping; everyone else pays $4.99. This code looked correct to me, but it has a bug: the shipping date is calculated correctly, but employees are charged full shipping.

The problem turned out to be the period at the end of line 19. Back then, it took some detective work to track this down, although modern syntax-highlighting editors would flag it immediately. But why was this a problem? Why didn't COBOL like that period, when it was perfectly happy with the one at the end of line 22?

Sentences instead of blocks

To answer that question, we need to go back to COBOL's origins in the late 1950s. Until then, most languages were designed to solve scientific and engineering problems, so their syntax resembled mathematical equations. (Fortran is the classic example of this type of language.) COBOL, on the other hand, was intended for business computing. To make it easier for lay people to learn, Grace Hopper and her team of Defense Department and IBM engineers gave COBOL an English-like syntax. Instead of the recursive syntax most modern languages have, COBOL programs have a hierarchical structure. Instead of blocks, COBOL groups statements together into "sentences." And just like in English, each sentence is terminated by a period.

While this may have seemed like a good idea in theory, in practice it proved to be problematic. It made it hard to move code around, since a stray period might terminate a block unexpectedly. Periods were also hard to notice—they were often just a single pixel on '90s-era CRT terminals. But there was a deeper problem here, one that relates to how programmers wrote code when COBOL was first developed.
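
As a minimal sketch of how a period, rather than indentation, decides where an if ends, consider the following program. The program name, the 'RG' customer code, and the displayed messages are made up for illustration; only the placement of the periods matters.

```cobol
       identification division.
       program-id. sentence-demo.

       data division.
       working-storage section.
      * 'RG' is a made-up code for a non-employee customer.
       01 cust-type pic x(2) value 'RG'.

       procedure division.
      * One sentence: both displays belong to the IF, because the
      * only period comes after the second display.
           if cust-type = 'EM'
               display 'employee order'
               display 'free shipping'.

      * Two sentences: the period after the first display ends the
      * IF, so the second display always runs, whatever the
      * indentation suggests.
           if cust-type = 'EM'
               display 'employee order'.
               display 'free shipping'.

           stop run.
```

Because the customer here is not an employee, the first fragment prints nothing, while the second prints "free shipping" even though it is indented as if it were part of the if.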

Punch cards

Hard drives were prohibitively expensive when COBOL was designed, so most programs were written on punch cards. The most common punch cards consisted of a 12x80 grid, where a hole represented a 1 and a non-hole a 0. Each column was a single 12-bit character, and each card was a single 80-character line of text. To run your program, you'd feed a deck of punch cards into a card reader. The first six and final eight columns of each card were reserved for sequence numbers and identifiers. That way if you dropped your deck—which might be your only copy of your program—you could feed the cards through a mechanical sorter to put them back into the correct order.
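
COBOL's fixed source format mirrors this card layout: columns 1 through 6 hold the sequence number, column 7 is an indicator (an asterisk there marks a comment line), columns 8 through 72 hold the program text, and columns 73 through 80 hold the identification field. As a rough sketch, the same layout can be written as a COBOL record. The field names and the sample card contents below are invented for illustration.

```cobol
       identification division.
       program-id. card-layout.

       data division.
       working-storage section.
      * One 80-column card image, split into its four areas.
       01 card-image.
           05 card-sequence   pic x(6).
           05 card-indicator  pic x(1).
           05 card-text       pic x(65).
           05 card-ident      pic x(8).

       procedure division.
      * Build a sample card: sequence number, statement, deck name.
           move spaces              to card-image
           move '000190'            to card-sequence
           move '    display "hi".' to card-text
           move 'TESTSHIP'          to card-ident

      * Show each area separately, bracketed to make spaces visible.
           display 'sequence: [' card-sequence ']'
           display 'text:     [' card-text(1:20) ']'
           display 'ident:    [' card-ident ']'
           stop run.
```

The four fields add up to exactly 80 characters, one per card column, which is why this kind of fixed-format record is so natural to describe in COBOL.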

Because columns 73 through 80 were reserved for identifiers, a fixed-format COBOL compiler ignores any characters after column 72. If one of those characters happened to be a period, the entire logic of your code could change. And, as you've no doubt guessed by now, that period on line 19 was in column 73. Here's how the COBOL compiler actually interpreted those lines:

16          if shipping-method <> 'FX'
17              move normal-ship-date-yyyymmdd to expected-shipping-date
18          else
19              move nextday-ship-date-yyyymmdd to expected-shipping-date
20
21              if cust-type = 'EM'
22                  move 0 to shipping-charge.

Once I discovered what the problem was, the fix was easy: I deleted one character of white space from the beginning of line 19, which put the period at column 72. Although I'd never encountered it before, this was such a common bug that many mainframe COBOL programmers would tape a piece of thread between columns 72 and 73 on their terminals.
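
To see why shifting the line left by one character fixes the bug, here is a small sketch that mimics what a fixed-format compiler does with an 80-column line: it looks only at the text up to column 72. The statement text is a shortened stand-in, not the real line 19, and the program only simulates the cutoff; it does not reflect how any particular compiler is implemented.

```cobol
       identification division.
       program-id. cutoff-demo.

       data division.
       working-storage section.
       01 source-line pic x(80) value spaces.

       procedure division.
      * Stand-in statement whose last letter sits in column 72.
           move 'move a to b' to source-line(62:11)

      * Before the fix: the period lands in column 73.
           move '.' to source-line(73:1)
      * Show columns 58-72, the tail of what the compiler reads.
           display 'seen: [' source-line(58:15) ']'

      * After the fix: the statement shifts left by one column,
      * so the period now sits in column 72.
           move spaces to source-line
           move 'move a to b.' to source-line(61:12)
           display 'seen: [' source-line(58:15) ']'
           stop run.
```

The first display shows the statement without its period, and the second shows it with the period restored: a one-column difference that changes where the sentence ends.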

COBOL today

The COBOL-85 standard added scope terminators like end-if, so a period was no longer needed to mark where an if statement ends. The COBOL 2002 standard allowed free-form code, although many compilers had supported it long before that. The same code written to the 2002 standard looks more like a modern programming language:

if shipping-method <> 'FX'
    move normal-ship-date-yyyymmdd to expected-shipping-date
else
    move nextday-ship-date-yyyymmdd to expected-shipping-date
end-if

if cust-type = 'EM'
    move 0 to shipping-charge
end-if

Note that the white space at the beginning of the line is also no longer necessary. The system I usually worked on supported both scope terminators and free-form code, so I never ran into this issue until I had to make some changes on another system.
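
For readers who want to try this, here is a sketch of the whole program in free format with scope terminators. It assumes a compiler that accepts the COBOL 2002 source format directive, such as GnuCOBOL. The data items are the ones from the original listing, and the closing stop run is added for completeness.

```cobol
       >>SOURCE FORMAT IS FREE
identification division.
program-id. test-ship.

data division.
working-storage section.
01 shipping-method            pic x(2) value 'US'.
01 cust-type                  pic x(2) value 'EM'.
01 normal-ship-date-yyyymmdd  pic 9(8) value 20170522.
01 nextday-ship-date-yyyymmdd pic 9(8) value 20170508.
01 expected-shipping-date     pic 9(8).
01 shipping-charge            pic 99v99 value 4.99.

procedure division.
*> Pick the shipping date; end-if closes the block explicitly.
if shipping-method <> 'FX'
    move normal-ship-date-yyyymmdd to expected-shipping-date
else
    move nextday-ship-date-yyyymmdd to expected-shipping-date
end-if

*> Employees ship free, whatever the shipping method.
if cust-type = 'EM'
    move 0 to shipping-charge
end-if

display expected-shipping-date
display shipping-charge
stop run.
```

The directive on the first line is indented so that a compiler starting in fixed format still reads it as program text; with GnuCOBOL, passing the -free option to cobc is an alternative. The program should display the normal ship date and a zero shipping charge, though the exact formatting of the charge depends on the compiler.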

Learning about COBOL has been difficult for open source enthusiasts. COBOL compilers have traditionally been closed source and expensive, and most COBOL code is written in corporate environments. However, work on an open source compiler called OpenCOBOL began in 2002. In 2013, it was officially accepted as a GNU package and renamed GnuCOBOL. To learn more about GnuCOBOL, including how to access the 400-page programmer's guide, visit the project's homepage.

Learn more about COBOL in Walt Mankowski's talk, A Punch Card Ate My Program!, at FOSSCON on August 26th in Philadelphia.

About the author

Walt Mankowski is a recovering ivory tower computer scientist who recently completed a postdoc working with biologists to process and visualize terabytes of 2D and 3D time-lapse microscope images. In his past life he spent 10 years as a COBOL programmer at a major cable home shopping network. He enjoys Perl, regular expressions, high-performance computing, and Futurama.