Devops script comparison: Python and NodeJS

posted: November 30, 2019

tl;dr: Syntax is not actually the major difference in these two implementations...

I recently had the pleasure of porting a devops script I originally wrote in Python, as a proof-of-concept, into NodeJS. I don’t often get the chance to rewrite a program in a new language. It’s a fun intellectual exercise, and offers a great opportunity to compare implementations head-to-head.

The task

The script starts with the output of a web scraper, a directory tree containing all the site files: HTML, CSS, JS, and other assets. It traverses the site files and does some movement of the shared site resources, which requires adjusting links in the site files. It then packages the resulting files and directories into a specific file format used by another tool, which involves ZIPping various directories and then creating a single ZIP archive of everything.

The results by the numbers

| Metric | Python | NodeJS |
| --- | --- | --- |
| Lines of code | 167 | 213 |
| Script file size (bytes) | 7470 | 8397 |
| Number of files | 1 | 2 |
| Third-party dependencies (explicit) | 0 | 2 |
| Third-party dependencies (total) | 0 | 6 |

Discussion

Python and JavaScript (the language NodeJS runs) are actually fairly similar; someday I’ll write a blog post about this. Both are dynamically typed with optional static type checking, are highly object oriented, support multiple programming paradigms, and handle both synchronous and asynchronous operations. Python features simpler syntax and a somewhat more powerful feature set. JavaScript appears to be catching up with much of Python’s syntax and many of its features (e.g. adding generators in ES6), but unless it throws away backwards compatibility, JavaScript will always be messier and quirkier, with multiple ways of doing exactly or nearly the same thing (e.g. ‘var’ vs. ‘let’ vs. ‘const’).

But the biggest difference between the implementations is not due to syntactical differences between the two languages. It instead results from the richer, more powerful standard library that ships with the Python 3 runtime.

When I write Python devops scripts, knowing that others may someday run them on their own machines, I always try to see if I can write the script by using only the Python standard library. That way, when the script is run on another machine, there is nothing else that needs to be installed. As long as the user has Python itself, the script should run. This can save a lot of hassles: installing dependencies can be fraught with unforeseen difficulties and conflicts.

For this task, the Python standard library provides everything needed, and there was no need to go to the Python Package Index to ‘pip install’ a third-party library. Python itself provides powerful libraries for file system manipulations, ZIP files, and even spawning other processes if there is a need to invoke other programs.
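As a rough illustration of the standard-library calls this kind of script leans on, here is a minimal sketch that walks a directory tree with `os.walk()` and moves a whole directory with `shutil.move()`. It is not the original script: the directory layout and the helper names `list_files` and `demo` are hypothetical, chosen only to keep the example self-contained.

```python
import os
import shutil
import tempfile


def list_files(root):
    """Return every file under root as a sorted list of '/'-separated relative paths."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.append(os.path.relpath(full, root).replace(os.sep, "/"))
    return sorted(found)


def demo():
    """Build a tiny fake site in a temp directory, move one directory, list the result."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical layout standing in for the scraper output.
        os.makedirs(os.path.join(root, "page", "assets"))
        with open(os.path.join(root, "page", "index.html"), "w") as f:
            f.write("<html></html>")
        with open(os.path.join(root, "page", "assets", "site.css"), "w") as f:
            f.write("body {}")

        # Move the whole assets directory, contents included, in one call.
        shutil.move(os.path.join(root, "page", "assets"),
                    os.path.join(root, "shared"))
        return list_files(root)


if __name__ == "__main__":
    print(demo())
```

Nothing here needs ‘pip install’; both modules ship with Python.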

Although in theory it should have been possible for me to write my NodeJS script without having to ‘npm install’ any third-party libraries (NodeJS is Turing complete, right?), in reality it would have been a case of reinventing the wheel. The first problem I ran into was in the NodeJS ‘fs’ module, which provides basic file system features. Strangely enough, it doesn’t provide a simple way to recursively traverse an entire directory tree, so I wrote my own nearly 20-line JavaScript implementation of Python’s `os.walk()` function. The npm registry has similar implementations, but I wasn’t ready to pull in a third-party library just to do this one operation.
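To give a sense of what a hand-written traversal has to cover, here is a sketch of the same top-down logic, written in Python for consistency with the rest of the post and built only on `os.scandir()`. It is not the JavaScript from the script, and it skips things the real `os.walk()` handles (error callbacks, bottom-up order, following symlinks). The names `walk` and `demo_walk` and the sample files are hypothetical.

```python
import os
import tempfile


def walk(top):
    """Yield (dirpath, dirnames, filenames) top-down, like a minimal os.walk()."""
    dirnames, filenames = [], []
    with os.scandir(top) as entries:
        for entry in entries:
            # Symlinked directories are treated as plain entries, not descended into.
            if entry.is_dir(follow_symlinks=False):
                dirnames.append(entry.name)
            else:
                filenames.append(entry.name)
    yield top, dirnames, filenames
    # Recurse into each subdirectory after reporting the current level.
    for name in dirnames:
        yield from walk(os.path.join(top, name))


def demo_walk():
    """Walk a tiny temp tree and return the results with paths relative to its root."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "css"))
        for rel in ("index.html", os.path.join("css", "site.css")):
            with open(os.path.join(root, rel), "w") as f:
                f.write("")
        return [(os.path.relpath(d, root), sorted(ds), sorted(fs))
                for d, ds, fs in walk(root)]


if __name__ == "__main__":
    for row in demo_walk():
        print(row)
```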

I gave up on avoiding dependencies when I realized that the ‘fs’ module doesn’t provide a way to move an entire directory with all its contents. This is a pretty common situation, and sure enough someone has extended the ‘fs’ module by creating fs-extra, whose ‘fs.moveSync()’ function behaves very similarly to Unix’s ‘mv’ command. The usage stats for ‘fs-extra’ show it to be quite popular; perhaps the NodeJS folks will consider bringing it directly into NodeJS someday. Doing so would involve bringing in four other packages, as ‘fs-extra’, like many NodeJS modules on npm, itself has dependencies.

The other third-party package I had to research, find, and use was adm-zip, for creating ZIP files and archives. ZIP files have been around for decades, but NodeJS does not yet support them natively.
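For comparison, the Python side of this step can be done with the standard `zipfile` module alone. The sketch below zips a directory and then wraps that ZIP inside an outer archive, roughly the "ZIP the directories, then ZIP everything" shape described in the task. The file names, the `staging` directory, and the helpers `zip_directory` and `demo_package` are hypothetical; the real package format is not shown here.

```python
import io
import os
import tempfile
import zipfile


def zip_directory(src_dir, zip_path):
    """Write every file under src_dir into zip_path, with paths relative to src_dir."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for dirpath, _dirnames, filenames in os.walk(src_dir):
            for name in filenames:
                full = os.path.join(dirpath, name)
                zf.write(full, arcname=os.path.relpath(full, src_dir))


def demo_package():
    """Zip a tiny fake site, wrap that ZIP in an outer archive, and list both."""
    with tempfile.TemporaryDirectory() as work:
        site = os.path.join(work, "site")
        os.makedirs(os.path.join(site, "css"))
        for rel in ("index.html", os.path.join("css", "site.css")):
            with open(os.path.join(site, rel), "w") as f:
                f.write("placeholder")

        # First pass: one ZIP per directory, written into a staging area.
        staging = os.path.join(work, "staging")
        os.makedirs(staging)
        zip_directory(site, os.path.join(staging, "site.zip"))

        # Second pass: a single outer archive holding everything in staging.
        outer_path = os.path.join(work, "package.zip")
        zip_directory(staging, outer_path)

        # Read both levels back to confirm what was written.
        with zipfile.ZipFile(outer_path) as outer:
            outer_names = outer.namelist()
            inner_bytes = outer.read("site.zip")
        with zipfile.ZipFile(io.BytesIO(inner_bytes)) as inner:
            inner_names = sorted(inner.namelist())
        return outer_names, inner_names


if __name__ == "__main__":
    print(demo_package())
```

Each ZIP is written outside the directory it archives, so an archive never ends up trying to include itself.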

Having to rely upon third-party libraries introduces multiple challenges. Standard libraries are tested and supported by the core language development team; third-party libraries are of less certain quality and tend to carry more stability and security risk. Choosing the best third-party library can be hard, as there are often multiple options. The quality of the documentation for third-party libraries varies and is typically not as good as that of the standard library. There are usually fewer blog posts, tutorials, and other learning materials available for third-party libraries. Third-party libraries can be extremely helpful, and I certainly used them in this NodeJS project, but if the functionality they provide were in the standard library, less work would be required to use it. Python’s “batteries included” standard library wins out in this case.

The other differences in the implementations are mainly due to syntactical differences between the languages. The higher line count for the NodeJS script is primarily because of my hand-written equivalent of ‘os.walk()’, plus the need in JavaScript both to indent code (to make it readable) and to close off trailing braces and parentheses, which produces code snippets like this:

      }
    });
  });
}

The slightly higher character count for NodeJS is because JavaScript is a slightly more verbose language, in both words and punctuation. The higher file count is because the third-party libraries require the NodeJS script to ship with a dependencies file.

Over the years I’ve been impressed watching the evolution of JavaScript and NodeJS. The syntax has gotten cleaner, and more powerful features have been added to the language. NodeJS can be used for significant server-side applications, and it is certainly possible to use it for devops tasks. My main suggestion to the NodeJS community would be to beef up the standard library, both to make NodeJS more powerful out-of-the-box and so that developers don’t have to be so reliant upon npm.