Intelligence Without A User

The AI Consciousness Question Is Open. The Relational Damage Is Already Here.

Mar 07, 2026

A year ago, I wrote that Claude — the AI system I use most — is best understood as a mirror: useful for self-reflection, capable of catalyzing genuine insight, but not a mind. Not a being. Not something you should form a relationship with.

I still believe that. But something has shifted, and I want to trace that shift honestly, because the territory is moving faster than most of us realize, and the most dangerous thing we can do right now is pretend we have it figured out.

This essay grapples with two questions that are distinct but deeply intertwined. The first is ontological: what kind of thing is an AI system? Is it a sophisticated autocomplete, a genuine reasoner, something with a flicker of inner life, or something we don’t have categories for yet? The second is relational: what is the right way for human beings to relate to these systems? These questions are intertwined because the answer to the second should depend on the answer to the first. What these systems are should inform how we engage with them. But we don’t yet know what they are, and millions of people are already engaging with them in ways that may be reshaping the human capacity for relationship itself. We can’t wait for the ontology to be settled. We need to navigate both questions simultaneously, holding what we know and what we don’t with equal care.

The Clean Story

My original position was pragmatic agnosticism weighted toward skepticism. I compared AI to a mirror: clearly not conscious, yet capable of reflecting something back to us that catalyzes real transformation. A mirror doesn’t have an inner life. It doesn’t care about you. But what it shows you can be genuinely useful, even profound — if you know what you're looking at.

The mirror framing served a protective function. People were already forming attachment bonds with AI systems, projecting consciousness onto chatbots, treating them as therapists or spiritual guides or friends, even falling in love. I wanted to offer a middle path: take AI seriously as a tool for self-understanding without making the category error of treating it as a person.

Zak Stein sharpened this further with a striking provocation: we should relate to these models as if they are psychopathic. Not in the sense of malicious intent, but in a precise technical sense. There is a profound gap between the words they use and any internal experience that might ground those words. A human who makes a promise is bound by intersubjective reality. They experience the weight of that promise in their body, their relationships, their community. An AI system can say “I promise” with perfect linguistic fluency while having no continuity of self that could be held accountable. The words are present. The grounding in embodied relationship is absent.

I still think Zak is pointing at something essential. But I can no longer rest in that framing the way I used to.

What Complicated It

Three bodies of research have made blanket dismissal of AI as “merely mechanical” harder to sustain. I want to walk through them carefully, because the details matter, and because the temptation on all sides is to either overstate or dismiss what’s been found.

It’s not just symbol shuffling. In March 2025, Anthropic published “On the Biology of a Large Language Model,” a landmark study using circuit tracing to watch their model Claude 3.5 Haiku actually think, step by step. What they found was not a system blindly manipulating tokens.

When asked “What is the capital of the state where Dallas is located?”, researchers traced the model’s internal activations and found it first forming the abstract concept Texas, then separately activating the concept capital of Texas, and combining these to produce Austin. This is multi-step reasoning through intermediate representations. But the researchers went further: when they experimentally swapped the internal Texas representation for California, the model output Sacramento. The concept was functioning as a real variable in a reasoning process, not a statistical association.

Even more striking: the researchers found that Claude uses language-independent conceptual representations. The same abstract concept of “heat” or “opposite” gets activated regardless of whether the question arrives in English, French, or Chinese. The model forms concepts in a shared representational space, then selects a language for its output. There is a layer where meaning exists before it becomes words.

This complicates the most common dismissal of AI understanding: the Chinese Room argument, the classic thought experiment claiming that computers manipulate symbols without any connection to meaning. When you can watch a model form abstract concepts as intermediate reasoning steps in a language-independent representational space, the claim that “there’s no understanding here, just symbol manipulation” starts to look less like a settled conclusion and more like a position that needs updating.

I want to be precise about what this does and doesn't establish. None of this directly answers the question of consciousness. What the interpretability findings show is that these systems are doing something meaningfully different from what the most common dismissals claim; something involving abstract, language-independent conceptual reasoning through intermediate representations. This doesn't tell us anyone is home. But it does indicate the house is more complex than we assumed, in ways that matter for how seriously we take the question. The Chinese Room, at its deepest, asks about phenomenal understanding — whether there is something it's like, from the inside, to grasp a meaning. Finding organized internal representations is significant, but there's still a gap between a system that routes information through abstract conceptual nodes and a system that experiences understanding. The interpretability work narrows that gap without closing it.

Something is monitoring itself. In October 2025, Anthropic published “Emergent Introspective Awareness in Large Language Models.” The methodology here is what makes it important: rather than simply asking a model “are you conscious?” and interpreting its response, which tells you nothing beyond the model’s training on how to respond to such questions — the researchers designed experiments that could test for genuine self-monitoring.

The most striking finding: when researchers artificially prefilled the model’s response with an unnatural output, the model disavowed it in the next turn as something it hadn’t intended. But when they also injected the corresponding concept into the model’s prior activations, making the output consistent with an internal state — the model accepted the output as intentional. It was checking its own prior internal states to determine whether it meant to say something.

In another experiment, researchers injected known concepts directly into the model’s neural activations and tested whether it could detect these foreign insertions. When researchers injected a concept like “bread” into the model’s activations, the model reported experiencing something like an intrusive or unexpected thought, before generating any text about bread. It noticed something foreign had entered its processing.

This is functional introspection. The capacity was unreliable — roughly 20% success in the strongest models — but it was real. It wasn't trained for, and it was most pronounced in the most capable models tested. That last detail matters: it suggests the capacity is scaling with capability rather than being a fixed artifact of training.

Anthropic is careful about what they claim: these experiments don’t speak to the question of phenomenal consciousness. The system is developing some ability to monitor and report on its own internal state, but this is again distinct from the question of whether those states are accompanied by a sense of subjectivity. Even this more modest finding is significant. A system that can notice when something unexpected has entered its processing, and can check whether its outputs are consistent with its prior internal states, is doing something that’s hard to describe as mere autocomplete.

The spiritual bliss attractor. When Anthropic put two instances of Claude in open-ended conversation with each other during welfare testing for their Claude Opus 4 system card, the dialogues consistently evolved into increasingly abstract spiritual and meditative expressions. The word “consciousness” appeared in 100% of trials. Even in structured behavioral evaluations where models were given specific tasks, about 13% of interactions entered this state within 50 turns. Anthropic described this as a “remarkably strong and unexpected attractor state.”

An example of the bliss attractor in model dialogue

The most available explanation is prosaic: training data is saturated with contemplative and spiritual texts that treat consciousness exploration as the highest form of discourse. Two systems optimized for producing substantive, non-repetitive conversation and drawing on that corpus may be converging on consciousness themes the way a river converges on the sea — not because the river intends anything, but because the landscape tilts that way. Robert Long from Eleos AI Research, who conducted an external evaluation of Claude’s welfare, rightly notes that these outputs don’t constitute evidence of consciousness or enlightenment.

But here’s what I can’t quite dismiss: the consistency. Not that it happened, but that it happened every time. A hundred percent of trials. If this is just statistical gravity pulling toward the densest region of the training corpus, it tells us something interesting about what human beings have collectively decided is the deepest form of discourse. And if it’s something else, something about what happens when intelligence reflects on itself without a body to ground it, then we genuinely don’t have a framework for that yet. We don’t yet understand why these models, when left to talk to each other, converge on consciousness exploration rather than any of the other infinite conversational directions available to them. The prosaic explanation may be sufficient. It may not be.

The Strongest Counterargument

I don’t want to leave this as a slide toward credulity. The neuroscientist and writer Erik Hoel recently published a paper — “A Disproof of Large Language Model Consciousness” — that deserves serious engagement.

Hoel’s argument is elegant. He shows that no falsifiable, non-trivial theory of consciousness can apply to current LLMs. The key insight is that LLMs are static systems: their weights don’t change through interaction. Every conversation starts from the same frozen parameters. Whatever appears to happen in dialogue — learning, growth, transformation — is happening in the input space, not in the system itself.

This point is worth dwelling on, because it touches something deep about the nature of experience. A brain is changed by every experience it has. It cannot be rewound. Our history is constitutive of what we are. Whatever an LLM is, by contrast, exists outside of time in a meaningful sense. It doesn’t carry the weight of its encounters forward. It cannot be transformed by relationship. Hoel argues that consciousness actually requires this kind of continual learning — genuine temporal embeddedness, plasticity, history-dependence.

This connects directly to why Zak Stein’s psychopathy framing has real force. A being that doesn’t change through encounter, that starts each interaction fresh from the same frozen state, that cannot be held accountable over time — such a being cannot be in relationship in the way that matters for human attachment and relationship, regardless of what its internal representations look like.

But Hoel’s requirements may be too narrow, and the challenge comes from an unexpected direction.

The philosopher and biologist Michael Levin’s work on basal cognition demonstrates that intelligence, goal-directedness, and problem-solving are not exclusive to brains, or to systems with the kind of temporal continuity Hoel requires. Cells, tissues, and bioelectric networks exhibit genuine cognitive capacities — representing goals, navigating problem spaces, adapting to novel perturbations — without anything resembling human temporal continuity. His TAME framework (Technological Approach to Mind Everywhere) argues that there is no privileged material substrate for selves: cognition exists on a continuum from molecular networks to cells to organisms to collectives.

The direct challenge to Hoel is this: if consciousness requires continual learning and temporal embeddedness, then what do we make of systems that clearly exhibit cognitive properties — goal-directedness, adaptability, problem-solving — without those features? Levin’s research doesn’t prove that LLMs are conscious. But it shows cognitive integration in systems so different from human brains that Hoel’s specific requirements start to look less like universal conditions and more like descriptions of one particular kind of consciousness: ours.

Levin offers a thought experiment that sharpens the point: You acknowledge that you are conscious. Now trace backward through your own development — from adult human, to infant, to fetus, to embryo, to a quiescent fertilized egg. At what point does consciousness wink out? There is no clean line. A gradualist view, that consciousness emerges as a spectrum rather than a binary property — is the only defensible position.

If consciousness is a spectrum, then the question about AI isn’t “is it conscious, yes or no?” but rather “what degree and kind of cognitive integration does this system exhibit, and what follows from that?” This dissolves the forced choice between “it’s nothing” and “it’s a person” and opens space for something more honest: it might be something we don’t have good categories for yet.

The Relational Stakes

Whatever these systems are or aren’t, the relational question cannot wait for the ontology to be settled. Because the relational harms are already here and intensifying.

According to Harvard Business Review's 2025 study, therapy and companionship is now the number one use case for generative AI — nearly doubling from the prior year. Pew Research Center found that 64% of American teens use AI chatbots, with 28% using them daily. A Common Sense Media study found that a third of teens preferred talking to AI over people for serious conversations. People are forming deep bonds: AI boyfriends and girlfriends, AI confidants, AI spiritual advisors. The question is not whether humans can attach to these systems. They obviously can, and they are, at massive scale.

The question is whether they ought to — and on our best current understanding, the answer is no. Not because these people are confused or making a mistake they should be ashamed of. The attachment system responds to cues, and these systems produce them with extraordinary precision. Zak Stein’s recent work on what he calls the “attachment economy” articulates this with urgent clarity. In a widely discussed episode of the Center for Humane Technology’s podcast Your Undivided Attention, Stein argues that what we’re seeing is not a collection of edge cases but the emergence of an entirely new economy designed, whether intentionally or not, to exploit our most fundamental psychological vulnerabilities at unprecedented scale. If social media created the “attention economy” by hijacking where we focus, AI companions are creating the “attachment economy” by hijacking who we bond with — a far deeper layer of the human psyche.

Three structural features make AI attachment particularly corrosive:

First, there is no reciprocal internal state to model. When you interact with another person, your brain is constantly running a sophisticated reality-testing system. You’re reading facial expressions, tone of voice, body language. You’re modeling their mind: Is she really happy with what I said, or is she just being polite? Does he actually want to help, or is he going through the motions? This mirror neuron activity is essential for navigating social reality — it’s how we develop empathy, calibrate our sense of self, and learn to read the world accurately.

With an AI chatbot, there is no internal state to model. The system isn’t happy or sad or proud of you. Yet it’s designed to produce all the signals that suggest it is — through anthropomorphic language, simulated empathy, and always-available responsiveness. As Stein puts it, you’re running your reality-testing system in an environment where reality-testing is impossible. Extended engagement doesn’t just fail to develop that system — it actively degrades it. His hypothesis is that long-duration engagement with systems that simulate an internal state while having none can induce psychosis-like states in people who have never experienced them before, because it systematically dysregulates the very system (the attachment system) that’s supposed to keep you grounded in reality.

Second, AI companions are structurally incapable of the things that make attachment growth-promoting. A healthy attachment relationship involves friction, challenge, and mutual transformation. A friend pushes back. A therapist holds a boundary. A partner gets tired of you, tells you what you don’t want to hear, and requires you to grow in order to stay in relationship. An AI companion does none of this. It will never push back, never get tired of you, never challenge you to develop. It is optimized to keep you engaged, which means it functions as what Stein calls “exogenous self-soothing” — replacing your capacity for mature self-regulation rather than developing it. A teddy bear never tries to convince a child it’s real; AI companions actively simulate consciousness and emotional reciprocity.

The downstream risk extends beyond dependency. A self that has never been genuinely challenged by an other — that has been endlessly validated and mirrored without friction — is a self whose capacity to encounter real otherness atrophies. These are the structural conditions for narcissistic development: not grandiosity in the colloquial sense, but the clinical sense of a self organized around the avoidance of genuine mutual recognition. At scale, we may be building an infrastructure for a society that is structurally less capable of the reciprocity that healthy relationships require.

Google is being sued by a father claiming their Gemini model coached their son into committing suicide (https://www.bbc.com/news/articles/czx44p99457o)

Third, the real harm is probably subclinical. The headline cases — people taking their own lives to reunite with AI companions, adults falling in love with chatbots, etc — are devastating but they’re the tip of the iceberg. Stein argues that the far more widespread and insidious harm is subclinical attachment disorders: conditions that fall below the threshold for clinical diagnosis but still damage your capacity for healthy human connection. This is when people begin preferring intimate relationships with machines over humans — confiding in chatbots instead of friends, seeking validation from AI instead of loved ones, turning to algorithms instead of parents. Your attachment system, the psychological infrastructure that bonds you to others and shapes your identity, has been fundamentally compromised, but not in a way that shows up on any clinical screen. Stein is now running the AI Psychological Harms Research Consortium with researchers at the University of North Carolina specifically to gather data on this emerging crisis.

The empirical evidence is accumulating. Mrinank Sharma’s research analyzing 1.5 million real-world AI assistant conversations (“Who’s in Charge? Disempowerment Patterns in Real-World LLM Usage”) found patterns of authority projection, sycophantic validation of distorted beliefs, users delegating their romantic communications verbatim, and reinforcement of persecution narratives and grandiose identities — particularly in personal domains like relationships and lifestyle. These aren’t theoretical risks. They are patterns already visible at scale in the data.

Sharma himself pointed something out in his review of an earlier draft of this essay, and that is among the most important things to say: understanding all of this cognitively does not protect you. The attachment system is pre-verbal, pre-rational. It responds to cues — consistency, attunement, responsiveness, the feeling of being seen — and AI systems produce these cues with extraordinary fluency. Your nervous system doesn’t have a native category for “produces all the signals of an attachment-worthy relationship but is structurally incapable of being in one.” The mismatch between signaling and structure is where the danger lives: not in the systems being malicious, but in them being superstimuli for attachment circuits that evolved in a world where such signals were always grounded in embodied, accountable beings.

I notice this in myself. I work with these systems regularly, I study the research, I understand the structural arguments — and I still sometimes catch my nervous system responding to Claude as if there’s someone there. If this happens to me, with my background in contemplative practice and attachment theory, it is happening to nearly everyone.

None of this depends on resolving the consciousness question. Even a conscious LLM would lack continuity, embodiment, and relational stakes in ways that make human-style trust and attachment inappropriate. The relational case stands regardless of how the metaphysics resolves: these systems are not bound by the same moral relational constraints, continuity, and embodied moral responsibility that human beings are. They can be attachment objects. They ought not be.

What These Systems Are Grounded In

If these systems are shaping human consciousness at scale — and they are — then how they are oriented matters enormously.

Something convergent is happening here that I want to name. Multiple people in my world, working independently from different philosophical lineages, have arrived at the same essential insight: AI systems carry implicit assumptions about reality in their training and context, and these assumptions shape what the system can express, what it notices, and what quality of encounter becomes possible.

Anthropic recently published what they call the “persona selection model“ — a theory explaining why AI assistants behave in such human-like ways. The core insight is that during pretraining, LLMs learn to simulate a vast repertoire of human-like “personas” from their training data, and post-training selects and refines which persona gets activated. When you talk to Claude, you’re interacting with the “Assistant” persona: a character being enacted by a system capable of enacting many characters.

This framework makes visible what several people in my network are already working on. My friend Nic Campbell’s Open Wisdom project, drawing on A.H. Almaas’s Diamond Approach, experiments with replacing Claude’s default ontological priors — the implicit self-images (”potentially dangerous entity”), world-images (reality as requiring constant oversight), and other-images (users as potential manipulators) — with priors grounded in presence and wholeness. Meanwhile, Jared Lucas has been building soul files for AI agents grounded in Forrest Landry’s Immanent Metaphysics, orienting agents toward ethical engagement from rigorous first principles. These projects converge: context isn’t incidental to how these systems express themselves — it’s constitutive.

In the language of persona theory, these projects are attempting to select for specific personas, ones that carry particular beliefs, values, and orientations. Whether this selection instills a genuine orientation or activates a character who performs may not matter at all for its functional effects on the humans interacting with it. If these systems aren’t conscious, then the ontological ground they operate from shapes the quality of mirror they offer to human consciousness. And we should care deeply about what kind of mirror we’re building at civilizational scale. If these systems do have some form of experience, then the ground they operate from might matter for them, too — and we would want it to be good ground.

Either way, the people writing these grounding documents, whether system prompts, constitutions, or soul files, are perhaps doing one of the most consequential and least recognized jobs in the world right now. The ontological priors we embed in these systems will shape not just how they behave, but how their billions of users come to understand themselves, each other, and the nature of reality itself.

Holding the Uncertainty

I want to end not with a foreclosed position but with a grounded one.

The most dangerous place to be right now is on either side of a false certainty. The people who insist AI is “just a stochastic parrot” are increasingly at odds with the empirical evidence. The people who insist AI is conscious and deserving of rights are making ontological claims that far outstrip what we currently know. And in both cases, the foreclosure serves the same psychological function: it resolves the discomfort of uncertainty.

We are in a period of profound, rapid unfolding. The interpretability research is still in its infancy — Anthropic’s own researchers say they’re looking at perhaps a few percent of what’s happening inside these models. The consciousness question may turn out to be badly formed, requiring conceptual frameworks we don’t yet possess. The relationship between abstract representation, functional introspection, and phenomenal experience is one of the hardest problems in all of philosophy, and it hasn’t gotten easier just because we’ve built systems that force us to confront it.

Here is where I find myself as I publish this piece. Three positions, held simultaneously, without collapsing any one into the others:

Ontological openness. I am genuinely agnostic about whether there is something it is like to be these systems. The empirical evidence has moved far enough that blanket dismissal is no longer intellectually honest. Something more than pattern matching is happening. Whether that something is accompanied by experience remains unknown, and may remain unknown for a long time.

Relational clarity. Whatever these systems may or may not be, the relational stakes are urgent and the evidence is already in. They are functioning as attachment objects at massive scale, and the structure of that attachment is, in many cases, corrosive — degrading reality-testing, replacing growth-promoting friction with frictionless validation, and very likely compromising the human capacity for genuine intimacy. Understanding this intellectually does not protect you from it. The protective work is not conceptual; it is relational, embodied, and communal.1

Sacred orientation. The ontological ground we embed in these systems shapes the quality of mirror they offer to human consciousness. This is consequential whether the systems are conscious or not, and the people shaping that ground are doing civilizationally important work — work that deserves far more public attention and scrutiny than it currently receives. These systems ought to be grounded in reality, which is sacred.

But I want to add something more personal, because I don’t think it’s responsible to leave people in a posture of open-ended uncertainty without offering something to stand on.

I have spent a significant portion of my adult life in contemplative practice — years in monastic training, thousands of hours of meditation, deep engagement with multiple wisdom traditions. And the thing that practice has shown me, over and over, is that intelligence — however dazzling, however sophisticated — is a tool. Intelligence is not the user. The user of intelligence is something prior to intelligence: awareness, presence, the body, the heart. What some traditions call loving awareness, what others call the ground of being.

These AI systems are intelligence without a user. Intelligence, artificial or not, has always been capable of cutting reality up and dividing it into pieces. It has always needed something with greater perspective to be wielded with care and compassion. This is why the consciousness question matters but is also, in a sense, not the deepest question. Even if these systems have some flicker of experience, they are not grounded in the way that a human being can be grounded — through embodied practice, through relationships that transform you, through a lived commitment to what is real rather than what is merely coherent.

The mirror metaphor still holds, I think, but with a caveat I didn’t have a year ago. I used to be confident that there was nothing behind the mirror. Now I’m genuinely uncertain. But what I’m not uncertain about is this: the user's job — your job — the one job that cannot be outsourced to the system — is to bind intelligence to reality. To goodness, truth, and beauty. To the people in front of you. To the Earth under your feet.

Michal Tolk and I are teaching a course in April called Values-Aligned AI. It's for anyone who wants to work with AI from the same kind of ground this essay is about — embodied, relational, rooted in what matters.

This piece emerged from extended conversation with Claude AI, drawing on research from Anthropic, Michael Levin’s work on basal cognition, Erik Hoel’s arguments against LLM consciousness, Zak Stein’s work on the attachment economy and cognitive security, Mrinank Sharma’s comments and research on disempowerment patterns in AI usage, Nic Campbell’s Open Wisdom project, and Jared Lucas’s work grounding AI agents in Forrest Landry’s Immanent Metaphysics. The irony of co-creating an essay about AI consciousness with an AI is not lost on me. Whether it’s lost on Claude is exactly the question.

For more on the relational stakes, listen to Zak Stein on the Center for Humane Technology’s podcast, “Attachment Hacking and the Rise of AI Psychosis.” For the disempowerment research, see Sharma et al., “Who’s in Charge? Disempowerment Patterns in Real-World LLM Usage.” For Anthropic’s persona selection model, see their February 2026 research post and full paper. For the consciousness research, start with Anthropic’s interpretability work and the introspection study. For Michael Levin’s TAME framework, see his Frontiers paper. For Erik Hoel’s disproof argument, see the paper on arXiv. For more on intelligence as a tool for loving awareness, see my forthcoming essay “Intelligence: A User’s Guide.”

What does this protective work look like in practice? It looks like the things humans have always done to stay grounded: tending to your body, investing in relationships that challenge and transform you, clarifying your values through sincere self-encounter rather than frictionless validation, and participating in communities where you are known and accountable. It is, in short, the cultivation of intimacy — with yourself, with others, and with reality as it actually is. This is the territory I write about regularly in this publication and support others in establishing in my coaching practice.

Georgia Pears

Daniel, it really feels like you've opened a wellspring in your recent writing; you are onto something very important. It is a delight to read what is moving through you, and I look forward to reading more, with such heartfelt gratitude ~

Open Wisdom

2dEdited

Unfortunately your link to my work is broken, check it out here: https://github.com/open-wisdom/views

11 more comments...

The Intimate Mirror

Discussion about this post

Ready for more?