On the Bus Factor and the Node.js Ecosystem
Weekend projects gone wild
A week before the Christmas break, I had to explain to my boss what the "bus factor" (also known as "truck factor") was. It was in relation to our team choosing which "path spec" to use (e.g. JSONPath vs. XPath vs. JMESPath, etc) in a key component that would be used across several teams (ourselves included).
He is an enterprise developer, so understandably, he assumed that any of the specs (specifically, JMESPath, as it was the "default" for other teams) would be "agnostic" enough to be well-supported on the platforms we would need to implement this component in (node.js and python).
But I burst his bubble with the dirty, impure reality - that the node implementation was broken and out-of-spec, because the "bus factor" was 1, and the sole person keeping the library alive had simply stopped maintaining it 4 years ago.
(For anyone who's unaware, the "bus factor" is the minimum number of people that could be "hit by a bus" in a way that will kill the project's development. Basically, think of the "bus factor" as "how many people are actually keeping this project alive?")
And that meant that either we (aka me) had to write & support the node implementation or drop it and go to a spec that was supported more widely across languages.
So, dear reader, why can't we have nice things?
In this particular case, the sole author of the spec simply lost interest in node and has shifted interest to Golang, and that's the implementation that's most actively being developed. That's it.
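To make the definition above a bit more concrete, here is a minimal sketch of one way to estimate a bus factor from commit counts. The heuristic (the smallest group of contributors who together own more than half of the commits), the function name `estimateBusFactor`, and the sample numbers are all my own assumptions for illustration, not a standard formula or real project data.

```typescript
// Hypothetical input shape: contributor name -> number of commits.
type CommitCounts = Record<string, number>;

// Assumed heuristic: count the fewest contributors whose combined commits
// exceed `threshold` (default: half) of all commits in the project.
function estimateBusFactor(commits: CommitCounts, threshold = 0.5): number {
  // Largest contributors first, so we reach the threshold with the fewest people.
  const counts = Object.values(commits).sort((a, b) => b - a);
  const total = counts.reduce((sum, n) => sum + n, 0);
  if (total === 0) return 0; // no commits, nothing to estimate

  let covered = 0;
  let people = 0;
  for (const n of counts) {
    covered += n;
    people += 1;
    if (covered / total > threshold) break;
  }
  return people;
}

// Made-up numbers: a typical "weekend project" vs. a project with shared ownership.
const weekendProject: CommitCounts = { alice: 412, bob: 9, carol: 3 };
const sharedProject: CommitCounts = { alice: 120, bob: 110, carol: 95, dave: 90 };

console.log(estimateBusFactor(weekendProject)); // 1
console.log(estimateBusFactor(sharedProject)); // 2
```

Commit counts are a crude proxy (they say nothing about who reviews PRs or cuts releases), but even this rough number is usually enough to spot the "one guy" libraries.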
(Almost) Everything has a Bus Factor of 1
Basically 99% of the libraries that you ever use, that you ever depend on, that your dependencies depend on (and so on) are so-called "weekend projects" - projects that someone has written in their free time over the weekend to address one small, specific problem that they faced.
And most projects stay that way - a small, niche bit of code that someone just happened to publish, something only they will use.
But what happens in Vegas doesn't stay in Vegas, and by pure luck, some libraries gain traction with the broader community (see: express, passportjs, and basically every popular framework/library on npm).
And as the library grows in userbase, it has to become more and more "generic" to fit many people's use cases. And while people will raise issues and even submit PRs, it is still that one guy who has to review, investigate, or merge them.
Eventually, it will get to the point where just maintaining that library will be too much for him. Or maybe he loses interest. Either way, the users (or libraries) who depend on that library are now left with a buggy, outdated, unmaintained relic of the past.
To this, people typically raise two points:
- "Just fork it, then (and stop complaining)!"
- "Well, why don't they just add more people to help with that?"
And like basically everything in software engineering, the part that you see is the tip of the iceberg, and solutions that address said tip miss out on the other 90%.
So why are these two points not "viable" "solutions" to the bus factor problem?
When you fork an unmaintained library, you basically assume all of the responsibility onto yourself. You are now the flag-bearer, and as the size of the original library grows, so do your responsibilities - even if all you wanted to do was to fix one little bug that broke prod.
When you fork a library that's maintained in one way or another, you now have to compete to get community traction (because again, stale code dies real quick without TLC), and (if you did not hard fork it) to copy whatever additions/bug fixes from the original library, because people will make PRs on the original but not your own.
And getting people to "just help with development" is no easy task, either. For the same reason software engineering interviews are broken, it's hard to determine who will actually be a good co-maintainer. Some people are geniuses but bad at taking feedback, some people only have a cursory understanding of the library, some people are just flat-out assholes and trip on power (even the most minute "power" such as being a co-maintainer on a small library) - we'll come back to this later.
I'm sure I do not need to explain why you can't just go around adding everyone as maintainers for your repo, but a general rule is to add contributors who seem to have a critical understanding of your library - not just of its code, but also its users, the community, and its raison d'être.
Unfortunately, this process is slow as hell. Open source in general is like molasses in terms of getting people to actually help you. When a library "explodes" in popularity, it typically just means exponentially growing responsibility without exponentially growing community contributions.
It's basically doing tech support for free, and you only have so much free time to dedicate to telling people to read the documentation.
Not Free as in Beer
And this, dear reader, is the core reason why we seemingly always end up with a bus factor of 1 - everything costs time and effort, which indirectly costs hard dollars.
Open source is "free" as in freedom, not as in beer.
It costs so much money and attention to keep a bundle of code alive. No code is "perfect" (unless you're writing "hello world"): our assumptions get dated, better ways of doing things are found, security vulnerabilities are discovered, specs and requirements change over time, and the platforms (language runtimes, compilers, etc.) that the code runs on change. On top of that, community/company support for literally anything that the code implicitly depends on (whether it be dependencies, the dependencies of your dependencies, or even the OS) can go away, because there's only so much legacy support these companies and communities can provide.
And yet there's only a finite amount of attention from the community, and the rest has to be made up with corporate support.
Just because the code is up on GitHub does not mean that it's "free". And yet we are willing to pay for it neither in developer time nor in hard greenbacks. So it's only natural that these projects would, over time, wither and die away as they lose the support of the sole developer who was keeping them alive, in his own free time.
(Benevolent) Dictator for Life
Here, I must take a quick detour to make a note on said sole maintainers.
Notwithstanding their ability to keep a repository alive (i.e. the ability to read, review, and merge issues and PRs at least semi-regularly), we must address another aspect of relying on one person, one bottleneck, one gatekeeper for keeping a project alive - their whims.
As I have alluded to earlier, some individuals actually trip on power as small as "being the owner of a library that a few people depend on". And as the library grows, that problem only gets worse.
There is a term in software engineering, "Benevolent Dictator for Life" (BDFL), that refers specifically to these people who are the sole bus factor in projects that many others depend on (think Guido and Python). The idea is that we rely on their "benevolence" to keep the project alive.
But of course, as history shows (over and over and over again), counting on dictators/monarchs to be "benevolent" always breaks down, at some point or another. It's brought down entire kingdoms and empires that were once considered "immortal", and open source is no exception.
To give a personal example, I love Objection.js the library (I think it is the only sane ORM in the node ecosystem), but its bus factor is not the "benevolent" kind of dictator.
He actively berates people who are trying to contribute to the library, actively disparages people who are trying to bring a point up, and in general, is very toxic and acidic to work with.
I know this because once, I tried to submit a PR to make the library less inefficient (he was iterating over unnecessary elements to find something), and I was met with not one, but two f-bombs.
And after that, every time I discover a bug with the library, every time I find something I can improve, I just shut up and keep it to myself, because I'm so afraid to deal with him, because I'm not willing to subject myself to this tyrannical gatekeeper and lose brain cells and shave years off of my lifespan.
And at almost every point in your dependency tree, you're dealing with gatekeepers - and any one of them could be actively hostile to community support like this guy.
The House of Cards
So I've spoken enough about the "bottlenecks" of open source, but clearly, the sky is not falling upon us, pigs aren't flying, so how is open source even a thing?
Well, imagine a bridge (or a building). It has many joints and parts that all depend on each other, but a single joint failing does not cause it to collapse. It requires a systemic failure (such as jet fuel melting steel beams) for the structure to fall down.
There are backups, leeways, and if one part fails, there are other, strong parts that will hold the structure up. We adhere to strict guidelines (mostly), we inspect the critical parts regularly so that we do not get a cascade of failures that may bring the structure down.
But should the building/bridge be abandoned, we no longer can guarantee its safety - there may be multiple failures in critical joints, leaving one single joint to hold the entire structure up. And if that joint fails, down goes the whole structure.
These "joints" are very reminiscent of the "bus factors" that I have spoken about. Like real-life architecture, in the software world we have interfaces to "hot swap" multiple implementations, and even if we cannot "hot swap" alternatives, adapting is simply a matter of putting in some work (akin to maintenance work on real-life architecture).
And that is why some open source ecosystems have managed to stay alive for multiple decades in a world where time passes 10x faster than in "real life".
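As a rough sketch of what such a "joint" can look like in code, the example below hides a path-query implementation behind an interface. Everything here is hypothetical: the `PathEvaluator` interface, both classes, and the toy dot-path syntax are my own stand-ins, and no real JMESPath or JSONPath library is being called.

```typescript
// Hypothetical interface: the only thing calling code is allowed to depend on.
interface PathEvaluator {
  search(data: unknown, expression: string): unknown;
}

// Stand-in for an adapter around an unmaintained third-party library
// (toy dot-path logic; no real library is used here).
class LegacyEvaluator implements PathEvaluator {
  search(data: unknown, expression: string): unknown {
    let current: unknown = data;
    for (const key of expression.split(".")) {
      if (typeof current !== "object" || current === null) return undefined;
      current = (current as Record<string, unknown>)[key];
    }
    return current;
  }
}

// Stand-in for the replacement we write (or adopt) once the old one rots.
class InHouseEvaluator implements PathEvaluator {
  search(data: unknown, expression: string): unknown {
    return expression.split(".").reduce<unknown>(
      (current, key) =>
        typeof current === "object" && current !== null
          ? (current as Record<string, unknown>)[key]
          : undefined,
      data,
    );
  }
}

// Calling code only knows about the interface, so the joint can be swapped.
function readRegion(evaluator: PathEvaluator, config: unknown): unknown {
  return evaluator.search(config, "deploy.region");
}

const config = { deploy: { region: "eu-west-1" } };
console.log(readRegion(new LegacyEvaluator(), config)); // eu-west-1
console.log(readRegion(new InHouseEvaluator(), config)); // eu-west-1
```

Swapping the implementation is then a one-line change at the call site, which is the cheap kind of maintenance. The expensive kind is when the dead library's types and quirks have leaked into every file that touches it.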
Key word: some.
And this is where node.js comes in.
Take a look at the "old-timers", such as Java or Ruby. They are both 1. active, and 2. alive, despite tons of legacy cruft that they have to carry around. And they both tackle this fundamental problem in different ways.
Ruby manages to solve this by consolidating its community under one banner - Rails. There are no mainstream Rails alternatives (Sinatra is not a "mainstream" alternative), and yet for Ruby, this "single point of failure" is a point of strength - since everyone in the community (including companies) uses Rails, the community (again, including companies) has to rally together to support it. And by this nature, as long as Rails holds a monopoly over the Ruby ecosystem, it will stay active.
Java takes a different approach - there are authoritative, well-laid-out, community-developed specs (such as JDBC, Servlets, JPA, etc.). And yet, there are multiple implementations, each supported by a whole slew of different companies (just look at how many "enterprise" Java application servers there are). There is real corporate support for every part of the ecosystem that may be considered a bottleneck, kept alive by corporate dollars and support contracts. And so, it manages to keep the ecosystem alive.
But what about node?
While there are some exceptions (React, for example, is supported by both the corporate overlord that is Facebook and the community that uses it), for the most part, the node.js ecosystem is fragmented. There are 100 alternatives for every library, and none of them are kept alive, because they all have a bus factor of 1 and don't have real support funded by hard dollars.
Crucially, the main way node keeps itself "alive" is by constantly reinventing the wheel - say that someone writes library X to solve a certain problem. But after a few months, that someone stops maintaining it. So what do we do? Come up with a new library, that solves the exact same thing, but now, this library Y is the "cool new thing" and everyone has to move to it!
Repeat this ad nauseam, and you get the reason why the node ecosystem is so poorly maintained, why I even had to write this post in the first place - the community never gets enough time to rally behind something, and even if it does, that something is still maintained by "that one guy" who's not getting paid to support it.
For open source, the endgame is always either widespread community support (e.g. Rails) or corporate support (e.g. Spring/.NET MVC). The node.js ecosystem has neither, because it is too fragmented, too spread out with too many single points of failure, and too dependent on independent developers who are doing this in their free time.
Now, the Ruby/Java world is not perfect. They do have their share of cobwebs as well, but you can at least be confident that the parts that matter are well-supported. We (node.js developers) cannot say the same: the libraries that everyone depends on - express, passport, the ORMs, hell, even the database connectors (mysql2) - are all either horribly out-of-date or constantly being reinvented.
But thanks to the "cool new thing" syndrome, the node ecosystem is a house of cards, propped up by weekend projects all the way down. Buildings and bridges can have multiple failures and remain standing up. A house of cards can't.
And I don't see such a systemic failure being fixed anytime soon unless it manages to capture real, corporate support (which it hasn't so far, and has no indication of doing so), because there is no solving the fragmented nature of it, no way to consolidate the weekend projects.