Is open-source software ethical?
I had been mulling this idea over for a few weeks, but I didn’t know exactly how I wanted to write about it. Then over the weekend, we learned of a major vulnerability affecting Linux which had been detected and mitigated. A backdoor had been introduced into the “xz” project that would’ve allowed affected Linux computers to be taken over by a malicious actor.
This was not an accident. From what people have gathered over the last few days, the account JiaT75 began contributing to the project a few years ago and slowly built up a series of commits and fixes in order to gain a positive reputation. They then exploited this trust to introduce malicious changes and pushed to get these changes included in stable releases of Linux distributions.
The true identity of this person is unknown. Some speculate they were a state actor playing a long con. It might’ve been an attempt to compromise low-security IoT devices and create a silent botnet. We may never know for sure.
Thankfully, this was caught a few days ago by Andres Freund and brought to the attention of the community before it reached stable versions. If he hadn’t caught it, there might have been disruptions across every sector of society. Computers are in basically everything these days, from your car to your microwave.
How did this happen in the first place? The original maintainer of the library, Lasse Collin, burned out after being inundated with demands without enough bandwidth to deal with all of them. They slowly handed over the reins to Jia Tan, who then was able to prioritize their own malicious goals.
In December 2021 I wrote an essay, “Don’t trust open-source software. It’s inherently insecure.” Just like today, I had a topic I was mulling over when a massive vulnerability caused me to shift focus at the last minute. Back then, it was log4j which was causing lots of headaches. Why does this keep happening?
The software industry is expected to import code written by strangers on the Internet into its own projects. Anyone is able to send code changes to these projects, which can then get merged without a full accounting of trust. Maintainers burn out, projects go stale, and then they are resurrected by strangers without everyone in the software supply chain being informed.
This is hardly news. It happens again and again. Reading through my essay from years ago, it’s surprising how relevant it still feels. Clearly this whole house of cards is on the verge of toppling.
So this leads me to ask the question: is it even ethical to work on open-source software?
Ethics & Politics
I had begun thinking about this topic over a month ago, after the White House released a statement requesting future software to be memory-safe.
Remember the Colonial Pipeline hack? At that point, the White House understood the urgent need for secure software, and they have been pushing efforts to create a better software supply chain and ensure code provenance.
When a small Node.js project pulls in hundreds of libraries, it becomes impossible for an individual to read through and understand every single line of what they’re running. Yet any single line could result in multi-million dollar damages.
The “wild west” of trusting any random person online and running their code is becoming more precarious. There are more vulnerabilities coming out and it often feels like a losing battle. I am very grateful to Andres, who found this vulnerability in his spare time. But if our plan for security is relying on random volunteers, we’re not going to win.
A recent law paper proposes a licensing regime, requiring engineers to have a certification to work on AI models and applications. This could come on top of a licensing regime for any kind of software. The EU is pushing hard for more regulation.
The politics of software are getting harder to avoid. Imagine if the FBI broke down your door because you just set up a Raspberry Pi with a default password and it is now part of a botnet DDoSing a water treatment plant.
Imagine if there was a police file on you because you published a small Arduino project written in C. If you know it’s not memory-safe, and you aren’t using all the necessary techniques to secure your code, is it ethical to publish the code?
If we don’t get this right, we aren’t immune to the consequences.
Can you imagine a world where you couldn’t just open up a text editor and write some HTML? Where teenagers wouldn’t be allowed to code until taking a test proving they knew proper rules? Where governments monitored all of your work, at work and outside it, to ensure everything follows regulations?
I think that would be a bad future. However, the trade-offs between side project freedom and mitigating societal consequences are getting harder to balance.
Ethics & Engineering
People want security. Often that means physical safety: advocating for more police funding, or living in expensive suburban homes to build a moat between themselves and the world.
Engineers have a responsibility to follow best practices, even if that isn’t legally mandated. We need to be building things up to standards, which in the software realm include readable, reliable, tested, and secure code.
It’s clear we’re not able to do that. The xz developer failed to live up to engineering excellence by handing over ownership of the project to a random person. The community didn’t seem to notice or care at the time. Bigger projects that pull in these libraries did not do enough red teaming to catch it.
If I boot up my Ubuntu computer, do I know every piece of software running on it? Do I know every single developer involved in writing all of that code? Despite some platitudes about software supply chains, I haven’t seen much progress on actually providing good tools to monitor and respond to issues in our software packages.
Can I even trust these developers? Volunteers from around the world could include many nefarious people. Writing closed-source code for a company isn’t necessarily more secure, but there can be better engineering practices around code reviews, testing, and quality assurance. Beyond that, it’s a lot harder for a nefarious actor to pass the hiring bar at a company compared to just sending a pull request.
Is it fair to blame Lasse Collin? As an all-volunteer project, can anyone be considered liable for problems that could’ve been caused? I think most developers would defend Lasse here. Can anyone be expected to do a bunch of urgent work for free?
No, of course not.
But then who would be liable? Where does the buck stop? Who is the one who is pushing for engineering excellence? Who is the one who is going to enforce it? Most open-source projects do not have strong management structures because they’re staffed only by volunteers.
That’s not very reassuring for those who use these open-source projects for commercial products, as the impact of bad code can be catastrophic. As code gets integrated in more places, as people depend on computers more, it becomes increasingly important that we do things well.
If we can’t trust open-source software, we’ll stop using it. Companies may feel forced to stop depending on these projects and replace them with their own versions. This ruins the cooperative nature of open-source and standards development. It puts everyone into silos.
Yet our current models of open-source work encourage bad engineering because there’s no mechanism to enforce better standards.
If you’re a civil engineer, you need to pass a professional engineering test to be allowed to design things. This makes sense. If someone designs a tunnel poorly, there are real-world consequences for the people who use it. If you’re a software engineer and you design bad code or follow bad engineering practices, there are still real-world consequences.
Ethics & Industry Practices
The industry needs a better model. Open-source burnout keeps happening, and relying on volunteers for free labor is simply inadequate for engineering good products.
I think any attempts at reform will ruffle a lot of feathers. Similar to how most Uber drivers prefer being independent contractors, I think a lot of open-source developers may prefer the freedom to have side projects that they work on sometimes, without the expectation that it’s going to be their full-time job.
But once you put something out there, you cannot be surprised when people use it. If your code’s license is MIT, you shouldn’t be surprised when others take your work and profit from it without paying you.
If you don’t want people using your work, why did you make it open-source to start with? If you aren’t willing to keep up with security and reliability, maybe it’s better that it wasn’t open-source to start with. Is it better for the world if we have fewer open-source projects?
I have written before about the inherent tension in open-source ideals versus the realities of life. If you want to be a leet hacker making cool things on the side in some idealistic post-capitalist society, that’s fine. But then don’t be surprised when reality turns out to be different from what you idealized.
As such, open-source needs to professionalize. I envision an institution with a lightweight leadership team that manages a variety of different projects across applications and languages.
This leadership team enforces certain engineering standards but otherwise is hands-off, focusing more on sales and organizational capacity. Each project has a head technical lead that is involved in sales calls and manages the project.
I don’t think projects need middle management or product managers. There may be a few at the application/language level to coordinate platform standards. The organization can hire horizontal floaters who can do UI design, draw icons, and provide security testing.
The institution hires these engineers. They work on these projects full-time, drawing a full salary and benefits. This also means that only a selected set of approved developers can make verified changes and releases.
The institution gets service-level agreements and support contracts from large corporations for high priority work in exchange for a hefty sum. While the releases can still be public and open, the focus will move away from assorted GitHub issues to specific contract work with the agreement of each project’s technical lead.
By coordinating all these projects in one organization, the institution can bundle several projects’ contracts together for a single cost. This can give greater assurances to companies who have pressing needs and lots of cash.
This organization can negotiate large contracts, manage government relationships, and handle purchase orders in ways that individuals are reluctant to do. It can help smooth over different levels of project interest so maintainers of less popular projects don’t go underpaid.
This idea is definitely controversial. It dirties the communal simplicity of open-source with money and corporations and mandatory responsibilities. Still, it pays people and gives them stable work. Isn’t that worth the compromise? Isn’t that better than the state of things today?
A lot of big companies have open-source teams that coordinate projects while maintaining high levels of engineering excellence. This has already been shown to work. Some organizations like the Linux Foundation do this kind of stuff already, but are fairly selective with the projects they support.
A lot of mid-tier projects would benefit from collectively organizing an economic structure that rewards the actual developers.
Open-source maintainers complain but then don’t do the work to improve their situation. The FFmpeg site does not offer a pricing plan and I cannot find any mention of SLOs. There is a donation page. However, I am not allowed to use my corporate card to give them money and I do not have access to the company’s PayPal account.
As far as I can tell, there is no mechanism for them to extract money from corporations for support. They’re letting themselves get kicked around.
Hypocrisy
At this point you may consider me to be a hypocrite.
“Wait a second, Nick, you can’t say open-source is bad. You have written extensively on open-source projects you build in your spare time. You’re a hypocrite!”
This is true. I have created a lot of open-source projects and don’t spend much time on maintaining any of them. You can easily go through my GitHub and find a project, even one from last year, that requires an old version of Node or depends on a deprecated library.
It’s very alluring to whip up something in your spare time and not fun to spend time on minor upgrades. Dependabot can be useful here, though its pull requests are much easier to trust when the project runs automated tests in something like GitHub Actions. Many of mine do not, as that’s another piece of devops that I don’t feel like setting up.
I fully acknowledge there’s a big problem here with incentives. I have no reason to spend my limited time doing something boring because I don’t really have a responsibility to anyone to keep these projects maintained.
Should I have this responsibility? If so, I probably would just stop creating assorted projects. Most of them don’t get much usage outside of me, so it’s not like others are benefiting from the code being public. Maybe that’s a good thing.
At the same time, that doesn’t really address the underlying causes of security vulnerabilities. So it’s worth working towards a positive future where we do more rather than one where we do less.
Conclusion
Software engineers, programmers, and developers need to understand they are no longer in a world of Internet escapism and silly gadgets and hobbies. Computers matter. The Internet matters. What we are building matters. We cannot avoid this fact.
We should be happy. This is the world we wanted. People don’t need to pay extensively for proprietary software packages and operating systems. They can get lots of things for free. They can see how whole operating systems are built. They can modify these systems to their own liking and contribute changes back to the original project.
Yet there are no utopias. In this world, there are still problems. It’s necessary that we cast off idealism and acknowledge that open-source as it happens today is not working. It’s bad for business, bad for customers, and bad for maintainers.
We got lucky this time. But we cannot depend on luck.
As it stands today, I think we can see the system isn’t ethical. It requires work at every level of the software supply chain to create a system that works.
If we can’t do it, others will make the choices for us.