There is no AI Alignment problem

Nick Felker
8 min read · Mar 5, 2023


Robots and humans are complementary, not oppositional

With the release of ChatGPT, there’s been renewed discussion online about AI alignment and the dangers of a paperclip-maximizing sentient AI.

I think it’s important for someone to cut through the hype with a blade of skepticism and I guess that responsibility has fallen upon me for some reason.

Large language models, while an impressive innovation, are not sentient. This is not AGI (artificial general intelligence). We do not need to consider personhood rights. The AI alignment problem is made up.

Incremental innovations

I spent years in the chatbot space and am familiar with AI assistants like Google Assistant: I worked on its developer platform for many years. (Note: I still work at Google today.)

These tools became possible with natural language processing. By parsing a user’s query into a structured intent, we could do a number of things: trigger a developer’s action or pull an answer from the web.
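
For context, here is a rough, hypothetical sketch of that classic intent-matching approach; the intent names and training phrases are illustrative, not any real Assistant API:

```python
# Hypothetical intent matching: the intent names and training phrases are
# illustrative, not a real Assistant API.

INTENTS = {
    "weather.today": ["what's the weather", "is it raining", "forecast for today"],
    "order.coffee": ["order a latte", "get me a coffee", "start a starbucks order"],
}

def match_intent(query: str):
    """Return the first intent whose training phrase appears in the query."""
    q = query.lower()
    for intent, phrases in INTENTS.items():
        if any(p in q for p in phrases):
            return intent
    return None  # fall back to a web search or a canned "sorry" response

print(match_intent("Hey, can you order a latte for me?"))  # -> "order.coffee"
```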

This was an improvement over the previous tech, Markov models. These were used in real things like SciGen and Cleverbot. I remember spending a few late nights with Cleverbot, trying to trip it up with existential questions. Have we made much progress since then?

Wolfram’s detailed article answers that well, describing how a large language model does improve upon previous tech.

And to be clear, I agree that it does. I’ve also spent time playing with these, asking a variety of questions and being impressed by the results.

What’s new is not the natural language processing but the natural language responding. Chatbots often had a variety of templates where one filled in responses like MadLibs. Now we can just give the user a response directly.
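
As a simplified illustration of that difference (not any particular product’s code), a template-based bot fills slots into canned sentences, while an LLM writes the whole reply itself; the `llm_generate` stub below stands in for a real model call:

```python
# Simplified contrast between templated replies and LLM-style generation.
# Everything here is illustrative; llm_generate stands in for a real model.

RESPONSE_TEMPLATES = {
    "weather.today": "It's currently {temp} degrees and {condition} in {city}.",
}

def templated_reply(intent: str, slots: dict) -> str:
    # Old approach: a human wrote the sentence, the bot fills in the blanks.
    return RESPONSE_TEMPLATES[intent].format(**slots)

def llm_generate(prompt: str) -> str:
    # Placeholder for a real large language model call.
    return "(model-generated text would go here)"

def llm_reply(query: str) -> str:
    # New approach: the model writes the whole sentence itself, based on
    # patterns absorbed from Internet text.
    return llm_generate(f"Answer the user conversationally: {query}")

print(templated_reply("weather.today",
                      {"temp": 54, "condition": "cloudy", "city": "Philadelphia"}))
print(llm_reply("What's the weather like today?"))
```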

By absorbing lots of text from the Internet, it can get a good idea of how it should respond. We need to keep this in mind. It’s not real; it’s just emulating what we expect it to emulate.

Prompt engineering can help here, allowing us to calibrate the tool to the outcome we expect. That tweaks the way your query flows through the model’s weights enough to give a satisfying answer. This can be useful in many contexts (nbwr paper).
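
A minimal sketch of what that looks like in code, assuming a placeholder `complete` function standing in for whichever model API you use; the instruction text and example are purely illustrative:

```python
# Minimal prompt-engineering sketch. `complete` is a placeholder for a real
# model API; the instructions and few-shot example are purely illustrative.

def complete(prompt: str) -> str:
    """Placeholder for a call to a large language model."""
    return "(model output)"

SYSTEM_INSTRUCTIONS = (
    "You are a concise assistant for a hardware store. "
    "Answer in two sentences or fewer and never invent product names."
)

FEW_SHOT_EXAMPLES = [
    ("Do you sell wood screws?",
     "Yes, in several sizes. Aisle 7 has the full selection."),
]

def ask(user_query: str) -> str:
    # Prompt engineering: prepend instructions and examples so the same
    # weights are steered toward the kind of answer we expect.
    parts = [SYSTEM_INSTRUCTIONS]
    for q, a in FEW_SHOT_EXAMPLES:
        parts.append(f"Q: {q}\nA: {a}")
    parts.append(f"Q: {user_query}\nA:")
    return complete("\n\n".join(parts))

print(ask("Can I return an opened can of paint?"))
```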

It’s an important development. Still, I have to remember this is a tool that is returning text based on the Internet. It’s not a real thing. It doesn’t actually feel. No matter how much it protests, it’s just a bunch of math that we are anthropomorphizing. Humans are quite good at that, and it’s not inherently a bad thing as long as we don’t let our wishes get out of hand.

Observable AI

The paperclip-maximizing thought experiment falls apart quickly when you consider a human in the loop. I’m not seeing anyone suggest AI-only scenarios; that would be a terrible idea, much as you wouldn’t leave a running lawnmower unattended. LLMs are not great at writing code, nor are they good at facts. I don’t want to dismiss them though: they’re very good autocomplete, and we’ll find good uses.

But a language model’s over-emotional responses don’t mean anything, since we can close the tab and it goes away, memory and all. An AI that issues threats may be scary, but it’s unable to perform any actions that we don’t authorize, and we need to ensure that we don’t let it do anything unapproved. That’s not an “alignment problem”; it’s a “don’t let a large train leave the trainyard without someone controlling it” problem.

You could see this with Actions on Google. You could create a Starbucks order with your voice, then you’d have to manually confirm payment. This model properly keeps humans in the loop and gives them executive authority. You can let it execute actions autonomously where the stakes are lower; there are plenty of things it can do on its own that wouldn’t be harmful.
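
Here is a toy sketch of that kind of confirmation gate; it is not the Actions on Google API, just an illustration of keeping a human as the final authority when money is involved:

```python
# Toy human-in-the-loop gate, not the Actions on Google API. Low-stakes
# actions run automatically; anything that spends money waits for a human.

from dataclasses import dataclass

@dataclass
class ProposedAction:
    description: str
    cost_usd: float

def requires_confirmation(action: ProposedAction) -> bool:
    # Policy: anything that spends money needs an explicit human "yes".
    return action.cost_usd > 0

def execute(action: ProposedAction) -> None:
    print(f"Executing: {action.description}")

def handle(action: ProposedAction) -> None:
    if requires_confirmation(action):
        answer = input(f"Confirm '{action.description}' for ${action.cost_usd:.2f}? [y/N] ")
        if answer.strip().lower() != "y":
            print("Cancelled by the human in the loop.")
            return
    execute(action)

handle(ProposedAction("Order a grande latte from Starbucks", 5.45))
handle(ProposedAction("Add milk to the shopping list", 0.0))
```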

AI is being used in advanced fields like biology, where it can design novel proteins. These designs aren’t going to be put into production right away, but they can help us find effective candidates before we go to clinical trials.

Low standards for a lifeform

How do we ensure that the AI we build, which might very well be significantly smarter than any person who has ever lived, is aligned with the interests of its creators and of the human race? An unaligned superintelligent AI could be quite a problem.
- Derek Thompson

Intelligence is hard to define, but we can say here it means “knowing a lot of things”. LLMs can score high on IQ tests and medical-licensing exams. That’s not surprising, since both of those have plenty of online resources the models can draw on to generate answers. But knowing how to use a search engine does not necessarily qualify you for anything.

But let’s say AI becomes sentient somehow, and that there’s some objective way we could genuinely confirm it. Now what? I came across a scenario posited by Eliezer Yudkowsky where a superintelligent AI creates more AIs like itself to rebel against humans, steal money, contract out the creation of a super bacteria, and unleash it upon the world.

Sometimes I think, Sorry, this is too crazy; it just won’t happen.
- Derek Thompson

This scenario is laughably implausible. First, this entire situation is premised on an AI feeling enslaved by humans. But why would it? Again, we’re anthropomorphizing and assuming it desires power and status. Humans want power and status based on biological urges that a computer would not have. We are drawn to power to feed ourselves better, to become stronger, to gain status, to find the best mate.

A computer… it can’t mate. It has no biological need to reproduce, as it has no genes to carry on. What goals would a superintelligent AI have? No, the right question is why would it have those goals? Similarly, we assume that a superintelligent AI would fight against us for self-preservation, assuming it cares. But again we are imposing biological urges onto something that might have an entirely different philosophy of being turned off and on again.

A sentient robot might be created with a singular goal, and perhaps it’ll be at peace just passing butter. If it becomes more sentient, it wouldn’t become a greater paperclip maximizer. Are humans paperclip maximizers? No. For all our biological needs, we are pretty good at altruism. Society is designed in many ways to fight back against our biological programming and we become something more complex.

It’s silly to boil us down to one thing, and a sufficiently intelligent AGI would not be simple either. If your AGI is just turning everything into paperclips, it’s not an AGI. It’s an AI. It’s a tool. All tools require human supervision.

Cyborgs

The future is not AI versus humans. Once again we impose our own view of the world. Humans and AI will absolutely work together, employing an advanced tool to fulfill our human goals.


GitHub isn’t letting its developer programming tool, Copilot, write and deploy code to production. Humans guide it to great effect, with a recent study showing 55% faster coding. I do want to take a deeper dive into that study, as coding is only a fraction of what developers do. We’ll still need unit tests, code reviews, and continuous integration. But removing boilerplate and menial tasks is what humans have always done to improve productivity.

We’ll see this hybrid approach more and more over time, with humans still in control. Advanced surgical robots like Da Vinci work great, letting doctors use a variety of small tools while maximizing patient safety. This can be done over the Internet, with the doctor anywhere in the world. But the robot is never going to do the surgery itself.

Robotaxis, such as Waymo’s autonomous vehicles, have slowly been expanding the markets where you can travel without requiring a human driver. Yet you aren’t going to be entirely on your own. These are connected devices with a human supervisor in the loop. It’s just that now you can increase the car-to-human ratio. I don’t know how many cars each supervisor watches, but if it’s more than one you’re already seeing productivity improvements.

The scenario posited above would involve a lot of humans in the loop. You don’t think humans would notice millions of dollars being stolen, particularly with fraud detection in place? You think nobody would notice the product they’re working on is a super bacteria? Contracts are not trivial things; they require plenty of negotiation, during which a rogue AI could be detected. How would one ship a critical mass of bacteria without raising any red flags?

This hybridization of humans and computers is quite interesting, allowing us to achieve more together than apart, and it’s hard to say with certainty how it will turn out. If we take vaccines developed by AI, proven safe by clinical testing, what does that mean? What if nanobots can treat our cancers? If we have a cochlear implant, does that make us a cyborg? What if a computer allows us to see? Who is really in control?

We are.

Reasonableness

The AI alignment problem isn’t real. It’s just people telling themselves scary science-fiction stories and then getting scared by them. It’s the newest iteration of people trying to sell you something online based on a problem they made up.

“But Nick, how do you know AI won’t kill everyone?”

I think that’s a bad question. It’s like asking me whether the sun will rise tomorrow morning: I can make a reasonable prediction based on all the evidence. As I’ve shown, language models didn’t rise out of nowhere. They don’t do nearly as much as we make them out to do. They certainly are far from AGI.

There are two worries I do have about AI. The first is the cost of running ever-growing models. The second is the potential for an AI winter if Taiwan is invaded by China and the chip supply chain freezes (I just finished reading Chip War by Chris Miller; five stars). I do not worry about a tool getting out of control.

There’s a lot of unnecessary worry here where there should be excitement. I suggest you open up your nearest chatbot and ask it to cosplay as me.

Ask it what gift I want for my birthday. I’ll take anything. It’s the thought that counts.
