ponyfoo.com

A Brief History of Modularity


When it comes to JavaScript, modularity is a modern concept. In this article we’ll quickly revisit and summarize the milestones in how modularity evolved in the world of JavaScript. This isn’t meant to be a comprehensive list, by any means, but instead it’s meant to illustrate the major paradigm changes along the history of JavaScript.

Script Tags and Closures

In the early days, JavaScript was inlined in HTML <script> tags. At best, it was offloaded to dedicated script files, all of which shared a global scope.

Any variables declared in one of these files or inline scripts would be imprinted on the global window object, creating leaks across entirely unrelated scripts that might’ve led to conflicts or even broken experiences, where a variable in one script might inadvertently replace a global that another script was relying on.
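As a sketch of the problem, with hypothetical file names, two unrelated scripts sharing the global scope could silently clobber each other’s variables:

```javascript
// analytics.js – a hypothetical script that declares a counter,
// unknowingly placing it on the shared global scope
var count = 1

// carousel.js – a hypothetical, unrelated script loaded later on the
// same page; its declaration silently overwrites the global from
// analytics.js
var count = 'slide-one'

console.log(count) // the analytics counter is gone
```

Concatenated as above, the later declaration wins, which is exactly the kind of conflict that motivated scoping patterns like the IIFE.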

Eventually, as web applications started growing in size and complexity, the concept of scoping and the dangers of a global scope became evident and more well-known. Immediately-invoked function expressions (IIFE) were invented and became an instant mainstay. An IIFE worked by wrapping an entire file or portions of a file in a function that executed immediately after evaluation. Each function in JavaScript creates a new level of scoping, meaning var variable bindings would be contained by the IIFE. Even though variable declarations are hoisted to the top of their containing scope, they’d never become implicit globals, thanks to the IIFE wrapper, thus suppressing the brittleness of implicit JavaScript globals.

Several flavors of IIFE can be found in the next example snippet. The code in each IIFE is isolated and can only escape onto the global context via explicit statements such as window.fromIIFE = true.

(function() {
  console.log('IIFE using parenthesis')
})()

~function() {
  console.log('IIFE using a bitwise operator')
}()

void function() {
  console.log('IIFE using the void operator')
}()

Using the IIFE pattern, libraries would typically create modules by exposing and then reusing a single binding on the window object, thus avoiding global namespace pollution. The next snippet shows how we might create a mathlib component with a sum method in one of these IIFE-based libraries. If we wanted to add more modules to mathlib, we could place each of them in a separate IIFE which adds its own methods to the mathlib public interface, while anything else could stay private to the component that defined the new portion of functionality.

void function() {
  window.mathlib = window.mathlib || {}
  window.mathlib.sum = sum

  function sum(...values) {
    return values.reduce((a, b) => a + b, 0)
  }
}()

mathlib.sum(1, 2, 3)
// <- 6

This pattern was, coincidentally, an open invitation for JavaScript tooling to burgeon, allowing developers – for the first time – to safely concatenate every IIFE module into a single file, reducing the strain on the network.

The problem with the IIFE approach was that there wasn’t an explicit dependency tree. This meant developers had to manufacture component file lists in a precise order, so that dependencies would load before any modules that depended on them did – recursively.

RequireJS, AngularJS, and Dependency Injection

This is a problem we’ve hardly had to think about ever since the advent of module systems like RequireJS or the dependency injection mechanism in AngularJS, both of which allowed us to explicitly name the dependencies of each module.

The following example shows how we might define the mathlib/sum.js library using RequireJS’s define function, which was added to the global scope. The value returned from the define callback is then used as the public interface for our module.

define(function() {
  return sum

  function sum(...values) {
    return values.reduce((a, b) => a + b, 0)
  }
})

We could then have a mathlib.js module which aggregates all functionality we wanted to include in our library. In our case, it’s just mathlib/sum, but we could list as many dependencies as we wanted in the same way. We’d list each dependency using their paths in an array, and we’d get their public interfaces as parameters passed into our callback, in the same order.

define(['mathlib/sum'], function(sum) {
  return { sum }
})

Now that we’ve defined a library, we can consume it using require. Notice how the dependency chain is resolved for us in the snippet below.

require(['mathlib'], function(mathlib) {
  mathlib.sum(1, 2, 3)
  // <- 6
})

This is the upside of RequireJS and its inherent dependency tree. Regardless of whether our application contained a hundred modules or several thousand, RequireJS would resolve the dependency tree without the need for a carefully maintained list. Given that we’ve listed dependencies exactly where they’re needed, we’ve eliminated the need for a long list of every component and how they relate to one another, as well as the error-prone process of maintaining such a list. Eliminating such a large source of complexity is merely a side effect, not the main benefit.

This explicitness in dependency declaration, at a module level, made it obvious how a component was related to other parts of the application. That explicitness in turn fostered a greater degree of modularity, something that had been impractical before because of how hard it was to follow dependency chains.

RequireJS wasn’t without problems. The entire pattern revolved around its ability to asynchronously load modules, which was ill-advised for production deployments due to how poorly it performed. Using the asynchronous loading mechanism, you issued hundreds of network requests in a waterfall fashion before much of your code was executed. A different tool had to be used to optimize builds for production. Then there was the verbosity factor: you’d end up with long lists of dependencies, a RequireJS function call, and the callback for your module. On that note, there were quite a few different RequireJS functions and several ways of invoking them, complicating its use. The API wasn’t the most intuitive, because there were so many ways of doing the same thing: declaring a module with dependencies.

The dependency injection system in AngularJS suffered from many of the same problems. It was an elegant solution at the time, relying on clever string parsing to avoid the dependency array, using function parameter names to resolve dependencies instead. This mechanism was incompatible with minifiers, which would rename parameters to single characters and thus break the injector.

Later in the lifetime of AngularJS v1, a build task was introduced that would transform code like the following:

module.factory('calculator', function(mathlib) {
  // …
})

Into the format in the following bit of code, which was minification-safe because it included the explicit dependency list.

module.factory('calculator', ['mathlib', function(mathlib) {
  // …
}])

Needless to say, the delay in introducing this little-known build tool, combined with the over-engineered aspect of having an extra build step to unbreak something that shouldn’t have been broken, discouraged the use of a pattern that carried such a negligible benefit anyway. Developers mostly chose to stick with the familiar RequireJS-like hardcoded dependency array format.

Node.js and the Advent of CommonJS

Among the many innovations hailed by Node.js, one was the CommonJS module system – or CJS for short. Taking advantage of the fact that Node.js programs had access to the file system, the CommonJS standard is more in line with traditional module loading mechanisms. In CommonJS, each file is a module with its own scope and context. Dependencies are loaded using a synchronous require function that can be dynamically invoked at any time in the lifecycle of a module, as illustrated in the next snippet.

const mathlib = require('./mathlib')

Much like in RequireJS and AngularJS, CommonJS dependencies are referred to by a pathname. The main difference is that the boilerplate function and dependency array are both gone, and a module’s interface can be assigned to a variable binding or used anywhere a JavaScript expression could appear.

Unlike RequireJS or AngularJS, CommonJS was rather strict. In RequireJS and AngularJS you could have many dynamically-defined modules per file, whereas CommonJS had a one-to-one mapping between files and modules. At the same time, RequireJS had several ways of declaring a module and AngularJS had several kinds of factories, services, providers and so on – besides the fact that its dependency injection mechanism was tightly coupled to the AngularJS framework itself. CommonJS, in contrast, had a single way of declaring modules. Any JavaScript file was a module, calling require would load dependencies, and anything assigned to module.exports was its interface. This enabled better tooling and code introspection – making it easier for tools to learn the hierarchy of a CommonJS component system.

Eventually, Browserify was invented as a way of bridging the gap between CommonJS modules for Node.js servers and the browser. Using the browserify command-line interface and providing it with the path to an entry point module, one could combine any number of modules into a single browser-ready bundle. The killer feature of CommonJS, the npm package registry, was decisive in aiding its takeover of the module loading ecosystem.

Granted, npm wasn’t limited to CommonJS modules or even JavaScript packages, but that was and still is by and large its primary use case. The prospect of having thousands of packages (now over half a million, and steadily growing) available in your web application at the press of a few fingertips, combined with the ability to reuse large portions of a system on both the Node.js web server and each client’s web browser, was too much of a competitive advantage for the other systems to keep up with.

ES6, import, Babel, and Webpack

As ES6 became standardized in June of 2015, and with Babel transpiling ES6 into ES5 long before then, a new revolution was quickly approaching. The ES6 specification included a module system native to JavaScript, often referred to as ECMAScript Modules (ESM).

ESM is largely influenced by CJS and its predecessors, offering a static declarative API as well as a promise-based dynamic programmatic API, as illustrated next.

import mathlib from './mathlib'
import('./mathlib').then(mathlib => {
  // …
})

In ESM, too, every file is a module with its own scope and context. One major advantage in ESM over CJS is how ESM has – and encourages – a way of statically importing dependencies. Static imports vastly improve the introspection capabilities of module systems, given they can be analyzed statically and lexically extracted from the abstract syntax tree (AST) of each module in the system. Static imports in ESM are constrained to the topmost level of a module, further simplifying parsing and introspection.
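As a sketch, mirroring the earlier mathlib examples, an ESM module exposes its interface through static export declarations (the file name is assumed):

```javascript
// mathlib.js – in ESM, export declarations are static and must sit at
// the top level of the module, which makes the dependency graph
// trivially extractable from the AST
export function sum(...values) {
  return values.reduce((a, b) => a + b, 0)
}
```

A consumer would then write `import { sum } from './mathlib.js'` – also at the top level – so tools can resolve the whole module graph without executing any code.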

In Node.js v8.5.0, ESM support was introduced behind a flag. Most evergreen browsers also support ESM behind flags.

Webpack is a successor to Browserify that largely took over the role of universal module bundler thanks to a broader feature set. Just like in the case of Babel and ES6, Webpack has long supported ESM, with both its import and export statements as well as the dynamic import() function. Its adoption of ESM has been particularly fruitful, in no small part thanks to the introduction of a “code-splitting” mechanism whereby it’s able to partition an application into different bundles to improve performance on first-load experiences.

Given that ESM is native to the language – as opposed to CJS – it can be expected to completely overtake the module ecosystem in a few years’ time.


Comments (3)

Harry wrote

After using dependency injection in both Java and Angular, ES6 to me seems like a step back. CommonJS at least had it simple and flexible enough for frameworks to implement inversion of control on top of it. I can’t believe there is no existing language with DI built in yet. ML’s functor is the closest we’ve got, but no injection – may modular implicits be the answer?

Gabriel Theron wrote

There’s one thing you didn’t mention about ES6 modules: the fact that one file may have several exports, as well as named imports. This makes constant sharing very convenient, while keeping dependencies trackable:

// file1.js
export const MY_CONST = 'my_const';

export default something...


// file2.js
// I can choose to only import MY_CONST
import { MY_CONST } from 'file1';

doSomething(MY_CONST);

// I could also rename the import
import { MY_CONST as someConst } from 'file1';

doSomething(someConst);

Nicolás Bevacqua wrote

Absolutely! This is what I allude to when I write:

Static imports vastly improve the introspection capabilities of module systems, given they can be analyzed statically and lexically extracted from the abstract syntax tree (AST) of each module in the system. Static imports in ESM are constrained to the topmost level of a module, further simplifying parsing and introspection.

Static imports and the ability to selectively import what you need enables tools like Rollup or Webpack to optimize bundles by removing unused exports, in a process that’s been dubbed “tree shaking”.