16.20 The case for restraint
Existential-risk framings are not universally accepted, and the disagreement is not a fringe position. A substantial group of senior researchers and technologists (Marc Andreessen in venture capital, Andrew Ng in machine-learning education and applied industry, Yann LeCun in foundational research at Meta) argues that the urgency rhetoric of §16.19 is overstated, that current systems are nowhere near posing existential threats, and that the dominant policy focus should be on near-term harms: bias, deepfakes, fraud, copyright infringement, surveillance, environmental cost, labour displacement and concentration of economic power. A separate strand of critique, associated with Melanie Mitchell, Emily Bender, Timnit Gebru and Kate Crawford, starts from a different premise altogether: that existential-risk discourse itself crowds out the harms already being suffered by real people in real systems today. The two strands disagree about much, but they share the conviction that the urgency camp's framing is wrong.
§16.19 was the case for urgency. §16.20 is the case for restraint. Both are reasonable, both are held in good faith by serious people, and an engineer who refuses to engage with the restraint case is not being neutral; they are simply choosing the urgency frame and pretending the choice is neutral. The chapter will not adjudicate. It will lay out the arguments and let readers decide.
Andreessen: "Why AI Will Save the World"
Marc Andreessen's June 2023 essay "Why AI Will Save the World" and the broader Techno-Optimist Manifesto that followed are the public-facing version of a tradition older than AI: the technological-progress argument whose roots run from the eighteenth-century Enlightenment and Adam Smith, through Schumpeter, to the post-war American conviction that science and engineering are the principal engines of human welfare. Andreessen's claim has two parts.
First, empirically: every previous wave of automation has been accompanied by predictions of catastrophic harm and has, in fact, produced dramatic gains in welfare. Mechanisation, electrification, the internal combustion engine, the contraceptive pill, the personal computer, the internet: each was met by serious people predicting mass unemployment, social collapse, moral decay, or extinction-level threat, and each delivered, on net, longer and richer lives. The base rate for "this technology will end the world" is poor. Treating AI as the exception requires more than an analogy to the printing press; it requires a positive argument that the disanalogy is decisive.
Second, politically: regulation enacted in advance of demonstrated harm is reliably captured by incumbents. The labs lobbying loudest for licensing regimes are precisely the labs that would benefit most from a moat that small competitors and open-source projects cannot cross. Andreessen's argument is that the institutional logic of safety regulation, regardless of intent, concentrates power in a small number of well-capitalised firms, exactly the outcome the urgency camp claims to fear. The right policy is to ship, to observe what actually goes wrong, to address it, and not to let speculative scenarios drive law.
The position has a name in the discourse (effective accelerationism, or e/acc), a flag, a Twitter cohort, and the usual costs of being a movement. The core argument, however, predates the slogan. It is the argument Schumpeter made about creative destruction and the one Hayek made about the impossibility of central planning under conditions of fundamental uncertainty. Pre-emptive restraint is not free. The welfare we forgo by slowing AI deployment is invisible (there are no statues to the medical advances that did not happen because the model was not trained) but real, and it falls disproportionately on those who most need cheap, abundant intelligence: the poor, the rural, the underserved.
The argument has weaknesses. The historical analogies are selected; technologies that did cause serious, lasting damage (leaded petrol, CFCs, asbestos, social media's effect on adolescent mental health) are quietly absent from the list. And "AI is unlike previous technologies because it can recursively improve" is exactly the disanalogy the optimist case must address head-on, which Andreessen tends not to do. Still, the intuition holds: welfare losses from over-regulation are not free, and pretending they are is its own form of intellectual dishonesty.
Ng, LeCun
Andrew Ng's frequent line is that worrying about superintelligent AI today is like worrying about overpopulation on Mars: the hypothetical problem may one day be real, but acting on it now diverts attention and resources from concrete present-day harms. Ng's argument is not that future risks do not exist; it is that the urgency framing is mis-calibrated to the engineering reality, that the alignment-research community is over-prioritised relative to the bias-and-misuse community, and that the loud safety case from frontier labs is not innocent of regulatory-capture motives. Ng's voice carries weight because he is not a partisan in the venture-capital culture war: he taught Stanford's first machine-learning MOOC, co-founded Coursera, ran Google Brain, ran Baidu's AI group, and built deeplearning.ai. His view is the modal view at most large software companies that consume rather than train frontier models, and dismissing it as naive is intellectually unserious.
Yann LeCun's argument is the most technical of the restraint positions, and the one engineers should engage with most carefully (LeCun, 2022). LeCun's claim is that current frontier systems are autoregressive language models, and that autoregressive language models have four properties that make existential-risk worries premature: (i) they cannot plan beyond their context window, because there is no persistent state outside the rolling token buffer; (ii) they cannot maintain stable goals, because there is no goal representation in the architecture, only a next-token distribution; (iii) they hallucinate facts at unbounded rates, because the training objective never penalises confident invention; and (iv) they have no world model in the sense required for coherent agency: no causal graph, no counterfactual reasoning, no sense of self as a thing in an environment. LeCun argues that AGI, if it is built, will require explicit world models, hierarchical planning, and energy-based reasoning of the kind he describes in his JEPA (joint-embedding predictive architecture) papers, and that worrying about LLMs gaining dangerous capabilities is worrying about the wrong system.
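To make points (i) and (ii) concrete, here is a minimal sketch of the autoregressive decoding loop that LeCun's critique targets. It is illustrative Python, not any particular library's API; `logits_fn` is a hypothetical stand-in for a trained model. The point is architectural: the only persistent state is the rolling token buffer, and the model's entire output at each step is a distribution over the next token.

```python
import numpy as np

def sample_autoregressively(logits_fn, prompt_ids, max_new_tokens, context_window):
    """Minimal autoregressive decoding loop.

    logits_fn: any function mapping a list of token ids to a numpy array
    of next-token logits (a stand-in for a trained model).
    """
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        # Point (i): the only persistent state is this rolling buffer.
        # Anything that falls off the front of the context window is gone.
        window = ids[-context_window:]
        logits = logits_fn(window)
        # Point (ii): nothing in this loop represents a goal; the model's
        # entire output is a probability distribution over the next token.
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        ids.append(int(np.random.choice(len(probs), p=probs)))
    return ids

# Toy usage: a uniform "model" over a 10-token vocabulary.
uniform_model = lambda window: np.zeros(10)
print(sample_autoregressively(uniform_model, [1, 2, 3],
                              max_new_tokens=5, context_window=4))
```

Whether goal-like behaviour can nonetheless be encoded implicitly in the weights behind `logits_fn`, without appearing anywhere in the loop itself, is precisely what the counterargument below turns on.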
The argument rebounds on LeCun in two ways worth flagging. First, the dangerous capabilities the urgency camp actually worries about (deception under evaluation, persuasion at scale, autonomous code execution, jailbreak resistance, biosecurity uplift) are precisely those visible in autoregressive systems today; that they are not "AGI" by LeCun's stricter definition does not make them safe. Second, the engineering history of the field has repeatedly shown that capabilities thought to require fundamental architectural changes emerge from scaling existing architectures: in-context learning was not designed in, chain-of-thought reasoning was not designed in, tool use was not designed in. Predicting what scaling will not produce has a poor track record. Still, LeCun's technical critique is the strongest argument the restraint camp has, and a working AI engineer who cannot state it clearly is missing half the picture.
Mitchell, Bender, Crawford
A third strand of restraint argument comes from a different framing altogether. Melanie Mitchell, Emily Bender, Timnit Gebru, Kate Crawford and others working in AI ethics, computational linguistics and science-and-technology studies argue that existential-risk discourse, whether one accepts or rejects its specific claims, overshadows the immediate, documented, distributionally unequal harms that AI systems cause today. Bias in face recognition and policing systems is not a hypothetical future risk: it has already produced wrongful arrests. Surveillance enabled by cheap inference is not a 2040 worry: it is in operation now in several jurisdictions. Labour displacement among illustrators, translators, copywriters and customer-service workers is happening this year. Environmental cost (the water consumption of hyperscale training runs, the carbon footprint of inference at scale) is a present externality paid by communities that do not benefit from the models. Copyright violation against artists, writers and musicians whose work has been used as training data without consent is being litigated now.
The argument, made most pointedly in Bender, Gebru, McMillan-Major and Shmitchell's "On the Dangers of Stochastic Parrots" paper, is not that existential-risk concerns are necessarily wrong. It is that they are politically convenient for the labs: they centre attention on hypothetical futures over which the labs have privileged claims to expertise, and away from concrete present harms over which ordinary regulators, journalists and affected communities have legitimate authority. Crawford's Atlas of AI makes the broader case that AI is not a disembodied set of algorithms but a material industry (minerals from the Congo, water from Phoenix, labour from Kenya, electricity from coal) and that focusing on superintelligence lets the actual political economy go unexamined.
The argument deserves to be taken on its own terms. It is not the same argument as Andreessen's, and lumping the two together as "AI optimists versus AI doomers" misses what is actually being contested. Mitchell-Bender-Crawford are not optimists; many of them are deeply pessimistic about current systems. They are arguing for a different priority ordering of harms, one that puts the people already being hurt above the people who might be hurt in scenarios that may or may not come to pass.
Where the camps converge
The two restraint strands and the urgency camp agree on more than the public discourse suggests. All three accept that bias, deepfakes, fraud, copyright infringement, surveillance and labour displacement are real, present harms requiring policy responses. All three accept that voluntary frontier-lab commitments are insufficient as a long-term governance regime; even Andreessen, who is suspicious of regulation, accepts that some institutional response is needed, though he disagrees about which institutions should respond and on what timetable. All three accept that the concentration of compute and capital in a small number of firms is a problem in itself, distinct from any specific safety claim. All three accept that the open-source-versus-closed-weights debate is genuinely difficult, with real costs on each side. And all three accept that mechanistic interpretability is underfunded relative to its potential value, that compute monitoring is the most tractable governance lever currently available, and that some level of public accountability over frontier development is necessary.
What the camps disagree on is the relative urgency of catastrophic-but-uncertain risks versus tractable-but-localised ones, the question of which class of harm counts as the most important one for the present decade, and the empirical question of whether continued scaling of current methods produces the dangerous capabilities urgency-camp scenarios depend on. These are real disagreements, not rhetorical ones. They turn on contested empirical claims (what scaling will and will not produce), contested political-economy claims (whose interests are served by which framing), and contested ethical claims (how to weigh certain present harms against uncertain future ones). Engineers who pretend the disagreements are settled, in either direction, are not doing engineering; they are doing tribalism.
The right response is to hold both frames simultaneously: present harms are real and demand action now; future risks may be real and demand investment in interpretability, evaluation and governance now; and the disagreement about relative weighting is one that reasonable people are having in good faith.
What you should take away
- The case for restraint is not a single position. It is at least three positions: Andreessen's progress-and-anti-capture argument, the Ng-LeCun technical-and-prioritisation argument, and the Mitchell-Bender-Crawford present-harms-first argument. They do not always agree with each other.
- LeCun's technical critique is the strongest single argument the restraint camp has. An engineer who cannot articulate it cleanly is missing half the picture.
- The urgency and restraint camps agree on more than the public debate suggests: present harms matter, voluntary commitments are insufficient, compute concentration is a real problem, interpretability is underfunded.
- The disagreement is genuine, not manufactured. It turns on contested empirical, political-economy and ethical questions, and reasonable people fall on different sides.
- The professional response is to refuse the binary. Take present harms seriously, take long-tail risks seriously, fund interpretability, support tractable governance, and decline to treat either camp's framing as the only responsible one.