Curly braces vs. whitespace

posted: December 7, 2019

tl;dr: You’re probably doing both anyway, and may enjoy just having to do one...

I’ve spent most of my career working in languages that use curly braces {} to delineate blocks of code or data structures, and to define hierarchy. C and its derivatives use them, as do Java, JavaScript and JSON, to cite just a few of today’s most popular languages. Yet there are popular languages that eschew braces in favor of mandatory whitespace: YAML, which is sometimes used instead of JSON to define data objects, and Python.

After I had spent a couple of decades using curly-brace languages, Python was my introduction to a whitespace language (note: I am not talking about the esoteric programming language named “Whitespace”). Initially I was as skeptical as the character in the famous XKCD “import antigravity” cartoon.

However, as almost everyone who jumps into Python from another language reports, after a short period of time I got very used to using whitespace and soon preferred it. I then wondered why more languages haven’t done the same as Python.

The answer, I feel, is a combination of tradition and bias towards making things easier for the computer instead of the programmer. C was wildly popular back in the day, and many languages that have appeared since C (including Python) take hints from the way that C implemented things. I’ve written parsers myself, and having dedicated characters to delineate blocks makes it easier to write a parser. It also allows the source code to be minified by removing as much whitespace as possible so that the resulting file can be transferred as quickly as possible over a low-bandwidth network where every character counts. This is less of a concern than it was back in the days of analog modems, but it still matters in some situations.
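As a rough illustration of the extra bookkeeping a whitespace language asks of its parser, here is a minimal sketch that turns indentation into explicit block tokens using a stack, similar in spirit to the INDENT and DEDENT tokens described in Python's language reference. The function name `indent_tokens` and the sample source are hypothetical, made up for this illustration, and the sketch assumes space-only indentation with no comments or line continuations.

```python
def indent_tokens(source: str) -> list[str]:
    """Convert indented lines into a flat token list with INDENT/DEDENT markers."""
    stack = [0]          # indentation widths of the currently open blocks
    tokens = []
    for line in source.splitlines():
        if not line.strip():
            continue     # blank lines do not open or close blocks
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:
            # Deeper than the current block: a new block opens.
            stack.append(width)
            tokens.append("INDENT")
        while width < stack[-1]:
            # Shallower: close blocks until we reach a matching level.
            stack.pop()
            tokens.append("DEDENT")
        if width != stack[-1]:
            raise ValueError(f"dedent matches no outer level: {line!r}")
        tokens.append(line.strip())
    # Close any blocks still open at end of input.
    tokens.extend("DEDENT" for _ in stack[1:])
    return tokens


SAMPLE = (
    "if ready:\n"
    "    start()\n"
    "    if verbose:\n"
    "        log()\n"
    "done()\n"
)

print(indent_tokens(SAMPLE))
```

With braces, the lexer could emit a token for each `{` and `}` as it sees them; here it has to remember every open indentation level to know how many blocks a single dedent closes.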

However, braces make life slightly more difficult for the programmer. Programmers don’t typically write highly minified code: we need to write code that is readable, by ourselves now and in the future, as well as by others. So we also indent our code by using whitespace. When writing in a curly-braces language, we need to both define our blocks correctly with braces and indent the blocks correctly.

Modern IDEs help, but sometimes, typically when copy-pasting code snippets, things get out of sync: we forget to grab all the closing braces for a block, or we insert the block at a different level of the hierarchy that requires different indentation. Sometimes the IDE can figure out what we’re trying to do, but sometimes not. Then we’re left hunting around to make sure all the closing braces are in the right places, and also that the indentation is correct.

For a whitespace language, half of that challenge is avoided: you just have to get the indentation correct. Getting one thing right is easier than getting two things right, so it’s a little easier and faster to program in a whitespace language.
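To make the "two things to get right" point concrete, here is a minimal sketch of a checker that flags lines in brace-language source where the indentation disagrees with the brace depth. The function `find_mismatches` and the pasted snippet are hypothetical, and the check is deliberately naive: it assumes space-only indentation at a fixed width and ignores braces inside strings or comments.

```python
def find_mismatches(source: str, indent_width: int = 4) -> list[int]:
    """Return 1-based line numbers whose indentation disagrees with brace depth."""
    mismatches = []
    depth = 0  # number of blocks currently open according to the braces
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if not stripped:
            continue
        # A line that starts with a closing brace sits at the outer level.
        expected = depth - 1 if stripped.startswith("}") else depth
        actual = (len(line) - len(line.lstrip(" "))) // indent_width
        if actual != expected:
            mismatches.append(number)
        depth += stripped.count("{") - stripped.count("}")
    return mismatches


# A pasted snippet where one line kept its old, deeper indentation.
PASTED = (
    "if (ready) {\n"
    "    start();\n"
    "        log(\"started\");\n"
    "}\n"
)

print(find_mismatches(PASTED))  # -> [3]
```

The compiler for a brace language would accept this snippet without complaint, which is exactly why the two signals can drift apart unnoticed.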

The other side effect of whitespace languages, which is especially true when comparing YAML to JSON, is that you avoid a large number of lines that contain nothing but punctuation to close out blocks in a highly nested hierarchy. A higher percentage of lines in the file contain text, the code looks cleaner, and more of the logic is visible on the screen at any point in time. The file is also shorter. At my last company, our front end developer was the only one who had a monitor that he kept rotated into portrait mode, so that he could look at long files of JSON and JavaScript.

Machines can understand highly cryptic, poorly-formatted (even minified) code with no problems. The primary audience of source code, however, is humans. To be readable, source code should be indented and contain whitespace. And if whitespace is all that humans need to understand code, as the popularity of YAML and Python would appear to demonstrate, then that’s the better option for language designers.

Now, if I really want to get an argument started, we can discuss how many spaces to use when indenting blocks. All I’ll say about this is that I’m glad that Guido standardized on 4 spaces for Python in the PEP 8 style guide (although the CPython interpreter accepts other widths, as long as each block is indented consistently), so almost all Python code is consistently formatted. YAML, however, is a different story.
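The parenthetical about CPython is easy to check with the built-in `compile` function. This is a minimal sketch: the source strings and the helper `try_compile` are hypothetical, and the exact error message text may vary between Python versions, so only the exception type is relied on.

```python
# The same function body indented with different widths.
TWO_SPACES = "def double(x):\n  return x * 2\n"
SEVEN_SPACES = "def double(x):\n       return x * 2\n"

# A dedent that lines up with no enclosing block.
INCONSISTENT = "def double(x):\n    y = x * 2\n  return y\n"


def try_compile(source: str) -> str:
    """Return 'ok' if the source compiles, else a short error description."""
    try:
        compile(source, "<example>", "exec")
        return "ok"
    except IndentationError as exc:
        return f"IndentationError: {exc.msg}"


for name, source in [
    ("2 spaces", TWO_SPACES),
    ("7 spaces", SEVEN_SPACES),
    ("inconsistent", INCONSISTENT),
]:
    print(name, "->", try_compile(source))
```

Any width works as long as a block is indented consistently, so the 4-space rule is a convention for people rather than a requirement of the interpreter.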

Now, about those semicolons...