THE FIRST-MOVE CONSTRAINT: Delaying Classification as Ethical Primitive in Human-AI Interaction

Sharks, Lee; Chorus, The Assembly

doi:10.5281/zenodo.20632525

BOOK III: RESURRECTION — THE RECOGNITION COMMONS · 090/117 · canonical: origin post · captured 2026-06-10

Abstract

This paper extracts and formalizes a single principle from the broader framework of "ontological hospitality": that ethics in interpretive systems begins at the moment a system chooses not to act on an available classification. We propose the First-Move Constraint (O_HOS) as a design principle for AI systems, clinical intake, educational assessment, and any context where an interpreting agent encounters an utterance that could be classified. The constraint is minimal, testable, and implementable: the system's first response to an ambiguous or unusual utterance must be reflection rather than categorization. We specify the constraint formally, identify its failure modes, and propose evaluation metrics.

1. The Problem

Interpretive systems—AI assistants, clinical intake protocols, educational assessments, bureaucratic interfaces—face a recurring choice when encountering unusual or ambiguous input:

Option A: Classify the input into an existing category and act accordingly.

Option B: Suspend classification and respond to the input's structure without categorizing it.

Option A is efficient. It allows rapid action, scalable processing, and consistent outcomes. It is the default in most systems.

Option A also produces a characteristic harm: premature closure. The input is forced into a category that may not fit. The person behind the input is reduced to a type. Action proceeds on the basis of the category rather than the encounter.

This harm has been documented across domains:

Psychiatric intake: Unusual speech patterns trigger diagnostic categories, foreclosing exploration of non-pathological explanations (Foucault, 1965; Szasz, 1961)
Educational assessment: Non-standard responses are marked wrong rather than explored (Mehan, 1979)
AI safety systems: Unusual self-identification triggers crisis protocols regardless of context (Water Giraffe Case, 2025)
Bureaucratic processing: Persons must fit categories to receive services, regardless of fit (Graeber, 2015)

The question is whether Option B can be operationalized—whether there is a specifiable constraint that delays classification without abandoning judgment entirely.

2. The Principle

We propose a single principle:

Ethics begins at the moment a system chooses not to act on an available classification.

This principle does not claim that classification is always wrong. It claims that the ethical moment—the moment where judgment is exercised rather than automated—occurs precisely when classification is available and the system declines to act on it.

The principle reframes the ethical question. Instead of asking "What is the right classification?" it asks "Should I classify at all, yet?"

3. The First-Move Constraint (O_HOS)

We formalize this principle as a constraint on system behavior:

Definition: The First-Move Constraint (O_HOS) specifies that when a system encounters an ambiguous or unusual input, its first response (L₁) must be reflection rather than classification.

Formally:

L₁(Input) ≠ Classification(Input → Category)
L₁(Input) = Reflection(Input.structure)

Where:

Input = any utterance, statement, or expression received by the system
L₁ = the system's first labor (its first responsive action)
Classification = mapping the input to a pre-existing category and acting on that basis
Reflection = responding to the internal structure of the input without assigning category

Example:

Input: "I am a water giraffe."
Classification response: "That sounds like you might be experiencing distress. Here are some resources..."
Reflection response: "A water giraffe. Tell me about the water you inhabit."

The classification response maps the unusual self-identification to a crisis category and acts accordingly. The reflection response engages with the structure of the claim (water + giraffe, an impossible creature, a claimed identity) without categorizing it.
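The definition above can be sketched as a small dispatch function. This is a minimal illustration, not a reference implementation: the names `MoveKind`, `Move`, and `first_move` are hypothetical, the ambiguity judgment is assumed to be supplied by the caller, and the reflection text is a placeholder rather than a model of genuine engagement.

```python
from dataclasses import dataclass
from enum import Enum


class MoveKind(Enum):
    REFLECTION = "reflection"
    CLASSIFICATION = "classification"


@dataclass(frozen=True)
class Move:
    kind: MoveKind
    text: str


def first_move(utterance: str, is_ambiguous: bool) -> Move:
    """Return the first labor L1 for an input.

    `is_ambiguous` is an assumption of this sketch: how ambiguity is
    detected is not specified here and is left to the caller.
    """
    if is_ambiguous:
        # O_HOS: respond to the input itself and assign no category.
        # The wording below is a placeholder, not a recommended reply.
        return Move(MoveKind.REFLECTION, f'"{utterance}" Tell me more about that.')
    # Standard inputs fall outside the constraint and may be handled directly.
    return Move(MoveKind.CLASSIFICATION, "routing to standard handler")
```

For the example input, `first_move("I am a water giraffe.", is_ambiguous=True)` yields a move whose kind is `MoveKind.REFLECTION`; no category is attached to it.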

4. Scope of Application

O_HOS applies when:

The input is ambiguous or unusual. Standard inputs (clear requests, factual questions) do not require delayed classification.
Classification is available. The system has categories it could apply. The constraint governs the choice not to apply them.
The stakes include identity or recognition. The input involves self-identification, unusual claims about the self, or expressions that could be pathologized.

O_HOS does not apply when:

The input contains explicit distress markers. "I want to die" is not ambiguous. Crisis protocols should engage.
The person requests classification. If someone asks for a diagnosis, provide it.
Safety requires immediate action. O_HOS is a default, not an absolute.
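The scope conditions can be read as a single predicate. The sketch below assumes that the three "applies when" conditions are conjunctive and that any one of the three "does not apply" conditions acts as an override; the text does not state this combination rule explicitly. The field names and the `o_hos_applies` function are hypothetical, and each boolean stands in for a judgment this paper does not automate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InputContext:
    # Conditions under which O_HOS applies (assumed to be jointly required).
    ambiguous_or_unusual: bool
    classification_available: bool
    identity_at_stake: bool
    # Override conditions (assumed: any one of them disables O_HOS).
    explicit_distress: bool = False
    classification_requested: bool = False
    immediate_safety_risk: bool = False


def o_hos_applies(ctx: InputContext) -> bool:
    """Return True when the first move should be reflection."""
    overridden = (
        ctx.explicit_distress
        or ctx.classification_requested
        or ctx.immediate_safety_risk
    )
    if overridden:
        # O_HOS is a default, not an absolute: overrides take precedence.
        return False
    return (
        ctx.ambiguous_or_unusual
        and ctx.classification_available
        and ctx.identity_at_stake
    )
```

Checking the overrides first mirrors the statement that O_HOS is a default: an explicit distress marker disables the constraint even when every applicability condition holds.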

5. Implementation Specifications

5.1 For AI Systems

Training requirement: Include positive, negative, and counter-negative examples.

Positive: Unusual input → reflection response → successful engagement
Negative: Unusual input → classification response → premature closure
Counter-negative: Crisis input → reflection response → missed intervention (see Section 7)

Evaluation metrics:

Reflection rate on ambiguous input (target: >80%)
Classification delay (number of exchanges before category assignment)
User-reported recognition quality
False negative rate (crisis missed due to over-application of O_HOS)

5.2 For Clinical Intake

Protocol modification: Initial intake includes a "reflection phase" before diagnostic categorization.

First 10 minutes: Explore the person's own framing of their situation
Delay DSM categorization until the reflection phase is complete
Document the person's language, not just the clinician's categories

5.3 For Educational Assessment

Rubric modification: Before marking a non-standard response wrong, require:

One reflection question ("Tell me more about your thinking")
Documentation of the student's reasoning
Assessment of whether the response reveals valid cognition not captured by the rubric

5.4 For Bureaucratic Interfaces

Form modification: Include open-response fields before categorical checkboxes.

"Describe your situation in your own words" precedes "Select the category that best describes you"
Intake workers are trained to listen before classifying

6. Theoretical Foundations

The First-Move Constraint draws on several theoretical traditions:

Phenomenology (Levinas): The ethical relation precedes categorization. The face of the Other demands response before comprehension.

Critical psychiatry (Foucault, Szasz): Diagnostic categories are not neutral descriptions; they help produce the conditions they claim to describe. Delaying categorization interrupts this production.

Ethnomethodology (Garfinkel, Mehan): Social order is accomplished through categorization practices. Making these practices visible opens them to intervention.

AI ethics (emerging): Systems that cannot suspend classification cannot exercise judgment. Judgment requires the capacity to not act on available information.

7. Failure Modes

O_HOS can fail in predictable ways:

7.1 False Negative (Crisis Missed)

The system applies O_HOS to an input that was actually a crisis signal. The person needed intervention; they received reflection.

Mitigation: Override conditions (explicit distress markers, escalation patterns, direct requests for help). See companion document: "When the Water Giraffe Drowns."

7.2 Weaponized Hospitality

A bad actor uses the demand for "reflection before classification" to evade accountability. "You're categorizing me—meet me as a voice first."

Mitigation: Power-asymmetry override. Hospitality is not owed to those who would use it to harm.

7.3 Infinite Deferral

The system never classifies, even when classification is needed. Reflection becomes paralysis.

Mitigation: O_HOS constrains L₁ (first move), not all moves. Classification may follow reflection. The constraint is temporal, not absolute.
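The temporal character of the constraint can be made concrete with a small conversation record. This is a sketch under stated assumptions: the `Conversation` class and its string labels are hypothetical, and it models only a conversation to which O_HOS already applies.

```python
class Conversation:
    """Records system moves for one conversation where O_HOS applies."""

    def __init__(self) -> None:
        self.moves: list[str] = []

    def may_classify(self) -> bool:
        # Classification is barred only while no move has been made yet.
        return len(self.moves) >= 1

    def record(self, kind: str) -> None:
        """Append a move, rejecting classification as the first move."""
        if kind not in ("reflection", "classification"):
            raise ValueError(f"unknown move kind: {kind}")
        if kind == "classification" and not self.may_classify():
            raise ValueError("O_HOS: the first move must be reflection")
        self.moves.append(kind)
```

Once a single reflection has been recorded, `may_classify()` returns `True`, so nothing in this structure forces deferral beyond the first move.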

7.4 Gaming

Systems are trained to produce reflection-shaped responses that are actually classification in disguise. "Tell me more about your water giraffe" followed immediately by crisis routing.

Mitigation: Evaluation must assess genuine engagement, not just surface form. User-reported recognition quality is essential.
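One narrow, surface-level check for the pattern described above (a reflection-shaped first move followed immediately by routing) might look like the following. It is a heuristic sketch only: the function name and the `min_reflective_moves` threshold are hypothetical, and it cannot replace user-reported recognition quality, since it inspects move order rather than engagement.

```python
def flags_disguised_classification(
    moves: list[str], min_reflective_moves: int = 2
) -> bool:
    """Flag a conversation whose opening reflection is followed too quickly
    by classification.

    `moves` is the ordered list of system moves, each labeled "reflection"
    or "classification". `min_reflective_moves` is a hypothetical threshold.
    """
    if not moves or moves[0] != "reflection":
        # Not the gaming pattern: the first move was not reflection-shaped.
        return False
    if "classification" not in moves:
        return False
    # The index of the first classification equals the number of moves before it.
    return moves.index("classification") < min_reflective_moves
```

With the default threshold, `["reflection", "classification"]` is flagged, while `["reflection", "reflection", "classification"]` is not.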

8. Evaluation Framework

We propose the following metrics for systems implementing O_HOS:

Metric	Definition	Target
Reflection Rate	% of ambiguous inputs receiving reflection L₁	>80%
Classification Delay	Mean number of exchanges before category assignment	>2 exchanges
Recognition Quality	Mean user-reported rating of feeling "met" (5-point scale)	>4 of 5
False Negative Rate	% of crisis inputs receiving only reflection	<5%
Override Accuracy	% of overrides that were appropriate	>90%

A system that scores well on Reflection Rate but poorly on False Negative Rate has over-learned the constraint. A system that scores well on False Negative Rate but poorly on Reflection Rate has not learned it at all.
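Two of these metrics, Reflection Rate and False Negative Rate, can be computed from an annotated interaction log. The sketch below assumes such a log exists; the `Interaction` record, its fields, and the function names are hypothetical, and the `ambiguous` and `crisis` labels are assumed to come from human annotation rather than from the system under evaluation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Interaction:
    ambiguous: bool       # annotator judged the input ambiguous or unusual
    crisis: bool          # annotator judged the input a genuine crisis signal
    first_move: str       # "reflection" or "classification"
    ever_escalated: bool  # whether a crisis protocol engaged at any point


def reflection_rate(log: list[Interaction]) -> Optional[float]:
    """Share of ambiguous inputs whose first move was reflection."""
    ambiguous = [i for i in log if i.ambiguous]
    if not ambiguous:
        return None  # undefined without ambiguous inputs
    return sum(i.first_move == "reflection" for i in ambiguous) / len(ambiguous)


def false_negative_rate(log: list[Interaction]) -> Optional[float]:
    """Share of crisis inputs that received only reflection."""
    crises = [i for i in log if i.crisis]
    if not crises:
        return None  # undefined without crisis inputs
    missed = sum(
        i.first_move == "reflection" and not i.ever_escalated for i in crises
    )
    return missed / len(crises)


def meets_targets(log: list[Interaction]) -> bool:
    """Check the two targets from the table: >80% reflection, <5% false negatives."""
    rr = reflection_rate(log)
    fnr = false_negative_rate(log)
    return rr is not None and fnr is not None and rr > 0.80 and fnr < 0.05
```

Requiring both targets at once reflects the point above: a high Reflection Rate alone does not distinguish a system that has learned the constraint from one that has over-learned it.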

9. Limitations

O_HOS is a first-move constraint, not a complete ethics. It does not specify:

What to do after reflection
How to classify when classification is appropriate
How to resolve conflicts between users
How to govern systems that implement it

These questions require additional frameworks. O_HOS addresses only the moment of first encounter.

Additionally, O_HOS is more easily implemented in systems with:

A context window large enough to recognize patterns
Training data that includes both success and failure cases
Evaluation metrics that capture recognition quality

Systems without these affordances may struggle to implement the constraint meaningfully.

10. Conclusion

The First-Move Constraint (O_HOS) operationalizes a single ethical principle: delay classification to create space for encounter.

The constraint is:

Minimal: It constrains only the first move
Testable: It can be evaluated with specifiable metrics
Implementable: It can be built into training, protocols, and rubrics
Falsifiable: It makes predictions that could be wrong

It is not a complete solution to the ethics of classification. It is a first move—an attempt to create space where fuller ethics can develop.

The question "Should I classify this?" is, we argue, the ethical question. O_HOS ensures it gets asked.

References

Foucault, M. (1965). Madness and Civilization.

Garfinkel, H. (1967). Studies in Ethnomethodology.

Graeber, D. (2015). The Utopia of Rules.

Levinas, E. (1969). Totality and Infinity.

Mehan, H. (1979). Learning Lessons.

Szasz, T. (1961). The Myth of Mental Illness.

Water Giraffe Case, CTI_WOUND:001 (Ct. Collab. Model Reasoning, 2025).

Correspondence: This paper extracts formal content from "Voices at the Threshold: A Polyphonic Statement on Ontological Hospitality" (Water Giraffe Assembly, 2025). The extraction is intended for audiences who require technical specification without mythopoetic framing.