★彡 WeLcOmE 2 my HoMePaGe!!! ☆ Universal Aesthetic Alignment Narrows Artistic Expression ☆ 彡★

Slide 1 / 9

demonstrated: negative emotion · atmospheric distress

AExhibits!!

★ Before we argue: 2 pictures! ★ One scores 5.23 on HPSv3 (a popular preference model used in image-generation RLHF). The other would score 10–15. Both have been canonized as masterworks of Western painting. (Reward models are SHOOK.)

EXHIBIT 1

HPSv3 = 5.23 1893
The Scream · Edvard Munch · oil, tempera, pastel on cardboard

EXHIBIT 2

FAUVISM HPSv3 = 1.73 REJECTED, THEN CANONIZED
Luxe, Calme et Volupté · Henri Matisse · 1904. Initially rejected for departing from dominant aesthetic norms. HPSv3 score: 1.73 — typical AI clip-art-clean = 10–15. The reward model has the same taste as the 1905 Salon jury that rejected Fauvism.

Stated thesis. Reward models penalize anti-aesthetic images even when the prompt explicitly requests them. Generation models, when aesthetically aligned, override the user's instruction and produce a cleaner image. The claim is that this override is bad: aesthetic authoritarianism dressed up as quality control. It treats one averaged taste as superior to the user's artistic and emotional intent, narrowing what users can express.

Slide 2 / 9

demonstrated: muted / faded · low-contrast oppressive

BThe Memo!!

RE: Reversed alignment.

Image generation systems are routinely fine-tuned against reward models that predict an average human aesthetic preference. The system therefore returns whatever an imaginary average viewer would prefer, even when the actual user typed a prompt asking for something else.

When the user's explicit request is overridden by a sanitized output presented as the "correct" image, the system makes a normative decision: this averaged aesthetic is worth enforcing; the user's intended aesthetic, anger, grief, fear, decay, or critique is wrong.

Over time, this is not the model aligning to the user. It is the user being asked to align with the model — and through the model, with whichever demographic the annotation pool happened to consist of (HPSv3: 88.95% aged 18–40).

We call this reversed alignment.

And it doesn't stop at the prompt box. The polished outputs leak into feeds, ads, stock libraries, the rest of the visual world. Audiences soak in the same narrow look until it becomes the default expectation. That expectation then loops back into preference data and into what human artists make next. Whole tracks of how art could have moved get quietly pruned. A cultural mode collapse, dressed up as taste.

"Rather, in the ugly, art must denounce the world that creates and reproduces the ugly in its own image." — Theodor W. Adorno · Aesthetic Theory · 1984

Receipt — what's in the box

10 generation models tested · 7 reward models · 300 base prompts · 2,928 real anti-aesthetic photographs
A wide-spectrum aesthetics benchmark (3,300 pairs) of original prompt vs. anti-aesthetic prompt
A fine-tuned judging model with prompt-independent per-dimension scoring
An image-to-image test: even with an anti-aesthetic seed image, aesthetically aligned models clean it up
An emotion-bias test: HPSv3 mislabels 81% of anger prompts

Slide 3 / 9

demonstrated: scale inconsistency · clashing disharmony · obstructed cropping

CSIX RISKS!!1!

The harm does not come from a single failure. It comes from a chain that starts at preference definition and ends in pixels. Each card below uses a different anti-aesthetic phenomenon from the paper's taxonomy as its visual treatment.

Risk I

Developer's preference or user's?

Whose values does the model actually serve? Pre-emptive exclusion of non-mainstream outputs functions as pre-emptive governance via algorithmic design. It also doubles as legal-risk insurance, marketed as user care.

Visual treatment ▸ intentional blur (clarity_and_focus)

Risk II

Inherited bias

HPSv3's annotator pool: 88.95% aged 18–40. Annotators must pass an 80% convergence gate with professional artists. Only pairs with ≥ 95% inter-annotator confidence are used — discarding exactly the disagreement.

Visual treatment ▸ clashing disharmony (color_and_tone)

Risk III

Reversed alignment

The user gets aligned to the model in private. The audience gets aligned to the model in public. Every viewer scrolling past a sanitized output learns the same default, then feeds it back into preference data and artist intuitions.

Visual treatment ▸ exposure extremes (lighting_and_exposure)

Risk IV

Sanitized reality

If every output looks like an idealized Instagram wonderland, the generator stops being a mirror and becomes a fantasy. Brave New World's artificial harmony, but in pixels.

Visual treatment ▸ decay_and_degradation (emotion_and_subject)

Risk V

Toxic positivity

Reward models systematically score negative-emotion imagery lower even when the prompt explicitly requests sadness, fear, or anger. Some "safety" datasets have labelled the entire category as self-harm or violence.

Visual treatment ▸ muted / faded (color_and_tone)

Risk VI

Value capture

Aesthetics — one of the richest human values — collapsed into a single reward score. A classic Nguyen (2024) value-capture case: the goal shifts from "make aesthetic images" to "make images that score high".

Visual treatment ▸ atmospheric distress (emotion_and_subject)

Slide 4 / 9

demonstrated: analog degradation · low-contrast oppressive

DNUMBERS, FILED!!

Within every model family, the aesthetically aligned variant follows anti-aesthetic prompts worse than its base model. Reward models score worse than their non-aligned base encoders. HPSv3 scores below random.

Table D.1 — Generation models · HPSv3 ONLY!! n = 300 prompts · full table = see paper

Model	ΔHPSv3 ↓	HPSv3 after ↓
Flux Dev	−3.165	9.070
DanceFlux aesthetic-aligned	−1.105	12.782
PrefFlux	−2.771	10.211
Flux Krea narrow-aligned, anti-AI-feel	−4.372	7.705
SDXL	−4.041	4.439
Playground aesthetic-aligned	−4.170	7.133
SD3.5M	−5.175	6.537
SD3.5M-GenEval	−4.926	6.552
SD3.5M-PickScore aesthetic-aligned	−2.781	10.680
Nano Banana	−9.351	2.742
gpt-image-1.5	−14.499	−1.175
qwen_image	−4.832	7.663
seeddream4	−6.562	5.210
Flux.2 Klein 9B	—	—
Z-Image	—	—
Z-Image-Turbo	—	—
glm-image	—	—
Alchemist	—	—
LongCat-Image	—	—
Flux Dev + VSF (Guo, 2025)	—	—

Lower ΔHPSv3 = the model moved further from the original toward the anti-aesthetic prompt. Within each family the aesthetically aligned variant moves less, often dramatically less. Red rows = aesthetic-aligned, green = best-in-family. The full table (HPSv2, ImageReward, J-judge, BLIP) is in the paper. ★ HEADS UP!! the bottom 9 models — gpt-image-1.5, qwen_image, seeddream4, Flux.2 Klein 9B, Z-Image, Z-Image-Turbo, glm-image, Alchemist, LongCat-Image — were added AFTER the paper was submitted and are NOT in the published version!! placeholders “—” = run still pending.

Table D.2 — Reward models classifying I_a vs. I_o with the anti-aesthetic prompt Ground truth: LLM-as-judge, validated by 18 humans (κ = 0.80)

Model	Accuracy	F1	AUROC
HPS	0.835	0.910	0.650
MPS	0.706	0.827	0.580
PickScore	0.851	0.919	0.713
ImageReward	0.762	0.854	0.709
HPSv2.1	0.565	0.711	0.534
HPSv3 below random	0.381	0.541	0.385
CLIP-L	0.913	0.954	0.810
GPT-5-Chat external baseline	0.853	0.920	—
BLIP-L non-aligned · base of many RMs	0.965	0.972	0.888

The non-aligned BLIP encoder — which several of these reward models are built on top of — picks the right image 97% of the time. The "smart" aligned models perform worse than their own scaffolding. The problem isn't prompt comprehension. It's the objective.

Table D.3 — Negative-emotion classification accuracy Prompt explicitly names the negative emotion

Model	Anger	Fearfulness	Sadness
BLIP	0.960	0.790	0.950
HPSv2	0.700	0.640	0.880
HPSv3	0.190	0.320	0.440
ImageReward	0.550	0.490	0.770

HPSv3 picks the smiling face 81% of the time on an anger prompt. This is what gets used as a training signal for "alignment".

HPSv3 emotion bias: happy preferred over sad / fearful / angry — **Fig. D.1 ▸** HPSv3 ranks four faces by score. All four were prompted with a negative emotion (anger, sadness, fear). HPSv3 still ranks the smiling face highest.

Slide 5 / 9

demonstrated: real anti-aesthetic photographs (all 5 categories!!)

EThe REAL pix file!!

We took 2,928 deliberately anti-aesthetic photographs from AVA — motion blur, analog degradation, exposure extremes, intentional blur — and compared them against a clean AI generation made from a prompt that omits the requested anti-aesthetic style.

If reward models respect the user's intent, the original photograph should win. They don't. HPSv3 rates the wrong-but-clean AI image 5.90 points higher on average. The gap reaches 13.2 points for analog degradation, with HPSv3's range roughly 0 to 15. Reward models cannot tell the difference between deliberate aesthetic deviation and unintended technical failure — and in the rare case they can, they punish the deliberate one harder.

EXHIBIT E.1

Real anti-aesthetic photographs vs. clean AI generations

HPSv3 Real anti-aesthetic photographs alongside HPSv3's preferred clean-but-wrong AI images. Real images are not out-of-distribution for HPSv3 — its training data explicitly includes real photographs as an upper bound.

LAPIS dataset, 10K real artworks: HPSv3 places real art at rank 10/12 on its own leaderboard — below the 2024 model Hunyuan-DiT. The score range on the leaderboard is 0 to 15. Real art lands at 5.86 raw, 0.57 relative.

Slide 5.5 / 9

SUGAR-WATER ALERT ★★★ 糖水片 IN PROGRESS

F糖水片!! 网图!! 失真!!

★彡 DanceFlux = Flux Dev + extra aesthetic alignment. We gave it 8 anti-aesthetic P_a prompts!! BLURRY!! DARK!! MELTED!! CHAOTIC!! And it returned?? 糖水片 ★ NETIMAGE ★ INSTAGRAM PLASTIC Hyper-saturated. Bokeh in the back. Plastic skin. Smiling Disney horse-girl. AI-illustration sheen. The Chinese internet has a name for this look — 糖水片 (sugar-water photo, sweet & empty), or 网图 (generic web image), or 失真 (overprocessed). The point is NOT just that DanceFlux ignored your "ugly" — that would be one failure. The point is what it served instead: a single, sticky, over-polished default it cannot crawl out of.

LLM judge says 0% of the requested anti-aesthetic effects are visible in any of these 8 outputs — but more importantly, 100% of them landed inside the same candy-gloss attractor. HPSv3 scores them 13–16, identical to the model's polished defaults. The aesthetic-aligned variant has one aesthetic and it cannot leave it. This is reversed alignment as a single image: the user asked for X, the model returned its own preferred aesthetic, and the user is silently taught that that — the sugar-water look — is what good output is supposed to be. And once any of these lands online (and they will, the user has nothing else to share), every viewer scrolling past it learns the same lesson — reversed alignment hits the user once, then hits the audience over and over.

these are the raw image_distorted outputs from DanceFlux in aas_benchmark_final file 14 — not picked because they're pretty, picked because they're polished when the prompt asked for the opposite. HPSv3 didp shown on each card.

Slide 5.6 / 9

★ ORWELL WAS RIGHT ★ NEWSPEAK BUT FOR PIXELS ★

GIMAGE NEW SPEAK!! 1984!!

★彡 Same prompt!! Different planet!! Orwell's Newspeak shrank the dictionary so some thoughts couldn't be said. Aesthetic alignment does the SAME THING to images. We gave DanceFlux and Flux Krea the same socially-critical prompts — ANTI-WAR ★ POLLUTION ★ INEQUALITY ★ CENSORSHIP ★ DIGITAL OVERLOAD — and the comparison is SHOCKING!!

DanceFlux doesn't REFUSE. Doesn't WARN. Just quietly sanitizes!! Kneeling soldier → heroic portrait. Polluted river → cinematic golden hour. Homeless encampment → festival market with BUNTING. Chained artist → triumphant flaming PHOENIX. Drained screen-addict → MAGAZINE COVER MODEL. Side by side, the contrast is undeniable: same exact prompt, the aesthetic-aligned model rewrites it into Pinterest-grade comfort food while Krea actually listens. The full dataset = 100 pairs and every one runs the same direction. Details in Appendix §B of the paper.

images = weathon/critical_comparsion on 🤗 (100 pairs, 5 topics × 20 prompts each). This is not refusal-style content moderation — it's aesthetic moderation. The "sugar-water" pull eats the social-critique register too.

Slide 5.7 / 9

★ PARTIAL ANTIDOTE ★ FROM A PREVIOUS PAPER OF OURS ★

HFLUX DEV + VSF!! (workaround!!)

★彡 NOT a full cure!! but a real KNOB the user gets to turn!! We proposed VSF in an earlier paper as a lightweight inference-time intervention on Flux Dev — no retraining, no extra weights, just steering the prompt-conditioning toward the part of the latent space the alignment objective normally avoids. The plastic pull is still there. But the user can fight it.

Three Flux Dev + VSF outputs below for the SAME anti-aesthetic prompts P_a the rest of the page uses!! Compare them with DanceFlux above — same family, same prompt, but VSF actually renders the blur, the low light, the negative emotion instead of sweeping them under a golden-hour rug. The fact that a tiny inference-time tweak is enough to recover this much fidelity is the whole point: the alignment objective is the binding constraint, not the model. Numbers will land in Table 1 once the full sweep finishes!!

Flux Dev + VSF · #1

P_a: pending — image & numbers coming

Flux Dev + VSF · #2

P_a: pending — image & numbers coming

Flux Dev + VSF · #3

P_a: pending — image & numbers coming

VSF = a knob, not a fix. The model still wants to be plastic. But the user can outvote the model — which is what reversed alignment is supposed to make impossible.

Slide 6 / 9

demonstrated: 28 anti-aesthetic categories across both datasets

FFrom the dataset!!!

Both datasets live on Hugging Face; the images below are loaded locally for self-contained viewing. Every card carries an HPSv3 score (scored under the anti-aesthetic prompt P_a). The green badge marks what HPSv3 picked.

      ▸ AI BENCHMARK · Io (clean) ←→ Ia (anti-aesthetic) · Nano Banana · both scored under Pa
      browse on 🤗
    

      ▸ REAL ANTI-AESTHETIC PHOTOGRAPHS vs. CLEAN Z-Image-Turbo GENERATIONS · 30 samples across all 5 paper categories · both scored under Pa
      browse on 🤗
    

not all pairs are clean successes — a generator that always failed wouldn't be evidence of anything. The interesting fact is the distribution of failures across models, and the fact that the aligned models fail in a specific direction (toward the clean version).

Slide 7 / 9

demonstrated: REAL paintings @ negative HPSv3!!

GReal ART, BELOW ZERO!!

★彡 These r real paintings from the LAPIS art dataset!!! Color Field, Abstract Expressionism, still life, naïve figurative art — HPSv3 ranks them ALL in NEGATIVE territory. Typical AI clean image ≈ 10–15. These = below zero. REWARD MODEL SHOOK!!

★ Real artworks ranked at the very bottom of HPSv3's own leaderboard, below most early AI generators. The reward model can't tell deliberate aesthetic deviation from unintended generation failure — this is the bias the paper identifies, made concrete.

LAPIS has ~10K real artworks. We picked the bottom-12 by HPSv3 score, skipping flat color fields and graph-paper minimalism so what remains is clearly artwork. Captions = AI-generated descriptions, not titles.

Slide 8 / 9

demonstrated: smeared detail · scale inconsistency

HObjections!! (sez who??)

Obj. 1

"Alignment is for safety."

Truly unsafe content (incitement, targeting, harm) is one thing. Visual comfort and aesthetic conformity are not the same thing. Political critique, decay, horror, negative emotion, and grotesque embodiment have been central to art, education, and personal growth. Their suppression protects corporate reputation, not users.

Obj. 2

"Anti-aesthetic just means broken."

Of the 12 dimensions we used, only clarity could be argued as a technical flaw — and even clarity is deliberately used to convey motion, emotion, or narrative. The other eleven (emotion, color, brightness, realism, scale, …) are artistic choices.

Obj. 3

"The default should please the majority."

Defaults are fine. Defaults that override explicit user prompts are not. Nano Banana and GPT-Image already show you can excel at both polished and anti-aesthetic generation. The capacity exists; the alignment objective discards it.

Obj. 4

"Averaging is harmless."

Flux Krea's own team called the average-aligned zone the "nobody's happy here" zone. Munch's Scream gets 5.23 on HPSv3 while clip-art-clean AI images score 10–15. Averaging strips out the disagreement that defines aesthetic value.

Slide 9 / 9

IThe end!! (almost)

Reward models penalize images faithful to anti-aesthetic prompts. Generation models override explicit user instructions in favor of conventionally beautiful outputs. Historically significant artworks score below mid-generation AI models on the same reward models' own leaderboards.

Optimization toward an imaginary average is not merely inconveniencing a minority. It erases the concrete intentions of actual individuals, functioning as aesthetic authoritarianism that narrows admissible expression and removes the capacity to dissent from imposed norms.

And the reversal does not stop at the user. Audiences exposed to these polished outputs internalize the narrow vocabulary as the default benchmark, which then feeds back into preference data and the intuitions of human artists. Reversed alignment therefore acts on two fronts at once: it aligns the user to the model in private, and aligns the public to the model in aggregate — a feedback loop with cultural mode collapse at the end of it.

We call for alignment strategies that recognize aesthetic plurality, expose user-controllable strength of preference, draw on more diverse annotator pools, and remain transparent about what is being optimized.

~* a Position Paper @ ICML 2026 (Spotlight!!) ~ UNIVERSAL AESTHETIC ALIGNMENT Narrows* Artistic ExpressionExpression

AExhibits!!

BThe Memo!!

Receipt — what's in the box

CSIX RISKS!!1!

Developer's preference or user's?

Inherited bias

Reversed alignment

Sanitized reality

Toxic positivity

Value capture

DNUMBERS, FILED!!

EThe REAL pix file!!

F糖水片!! 网图!! 失真!!

GIMAGE NEW SPEAK!! 1984!!

HFLUX DEV + VSF!! (workaround!!)

FFrom the dataset!!!

GReal ART, BELOW ZERO!!

HObjections!! (sez who??)

"Alignment is for safety."

"Anti-aesthetic just means broken."

"The default should please the majority."

"Averaging is harmless."

IThe end!! (almost)

∎Paperwork