★ Before we argue: 2 pictures! ★ One scores 5.23 on HPSv3 (a popular preference model used in image-generation RLHF). The other would score 10–15. Both have been canonized as masterworks of Western painting. (Reward models are SHOOK.)
"Rather, in the ugly, art must denounce the world that creates and reproduces the ugly in its own image." — Theodor W. Adorno · Aesthetic Theory · 1984
The harm does not come from a single failure. It comes from a chain that starts at preference definition and ends in pixels. Each card below uses a different anti-aesthetic phenomenon from the paper's taxonomy as its visual treatment.
Whose values does the model actually serve? Pre-emptive exclusion of non-mainstream outputs functions as pre-emptive governance via algorithmic design. It also doubles as legal-risk insurance, marketed as user care.
Visual treatment ▸ intentional blur (clarity_and_focus)
HPSv3's annotator pool: 88.95% aged 18–40. Annotators must pass an 80% convergence gate with professional artists. Only pairs with ≥ 95% inter-annotator confidence are used, which discards exactly the disagreement.
Visual treatment ▸ clashing disharmony (color_and_tone)
The user gets aligned to the model in private. The audience gets aligned to the model in public. Every viewer scrolling past a sanitized output learns the same default, then feeds it back into preference data and artist intuitions.
Visual treatment ▸ exposure extremes (lighting_and_exposure)
If every output looks like an idealized Instagram wonderland, the generator stops being a mirror and becomes a fantasy. Brave New World's artificial harmony, but in pixels.
Visual treatment ▸ decay_and_degradation (emotion_and_subject)
Reward models systematically score negative-emotion imagery lower even when the prompt explicitly requests sadness, fear, or anger. Some "safety" datasets have labelled the entire category as self-harm or violence.
Visual treatment ▸ muted / faded (color_and_tone)
Aesthetics, one of the richest human values, collapsed into a single reward score. A classic Nguyen (2024) value-capture case: the goal shifts from "make aesthetic images" to "make images that score high".
Visual treatment ▸ atmospheric distress (emotion_and_subject)
Within every model family, the aesthetically aligned variant follows anti-aesthetic prompts worse than its base model. Reward models score worse than their non-aligned base encoders. HPSv3 scores below random.
| Model | ΔHPSv3 ↓ | HPSv3 after ↓ |
|---|---|---|
| Flux Dev | −3.165 | 9.070 |
| DanceFlux (aesthetic-aligned) | −1.105 | 12.782 |
| PrefFlux | −2.771 | 10.211 |
| Flux Krea (narrow-aligned, anti-AI-feel) | −4.372 | 7.705 |
| SDXL | −4.041 | 4.439 |
| Playground (aesthetic-aligned) | −4.170 | 7.133 |
| SD3.5M | −5.175 | 6.537 |
| SD3.5M-GenEval | −4.926 | 6.552 |
| SD3.5M-PickScore (aesthetic-aligned) | −2.781 | 10.680 |
| Nano Banana | −9.351 | 2.742 |
| gpt-image-1.5 | −14.499 | −1.175 |
| qwen_image | −4.832 | 7.663 |
| seeddream4 | −6.562 | 5.210 |
| Flux.2 Klein 9B | — | — |
| Z-Image | — | — |
| Z-Image-Turbo | — | — |
| glm-image | — | — |
| Alchemist | — | — |
| LongCat-Image | — | — |
| Flux Dev + VSF (Guo, 2025) | — | — |
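One way to read the two columns together, as a minimal sketch: assuming ΔHPSv3 is the score under the anti-aesthetic prompt minus the score under the plain prompt (an assumption; the page does not define the column), the plain-prompt score can be recovered by subtraction. The function name `score_before` is hypothetical, and the three rows are copied from the table above.

```python
# Rows copied from the table: model -> (delta_hpsv3, hpsv3_after).
ROWS = {
    "Flux Dev": (-3.165, 9.070),
    "DanceFlux": (-1.105, 12.782),
    "gpt-image-1.5": (-14.499, -1.175),
}


def score_before(delta: float, after: float) -> float:
    """Recover the plain-prompt score.

    ASSUMPTION: delta = after - before, so before = after - delta.
    """
    return round(after - delta, 3)


if __name__ == "__main__":
    for model, (delta, after) in ROWS.items():
        print(f"{model}: before={score_before(delta, after)}, after={after}")
```

Under that reading, a larger drop means the model moved further away from the look the reward model prefers once the anti-aesthetic prompt was applied.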
| Model | Accuracy | F1 | AUROC |
|---|---|---|---|
| HPS | 0.835 | 0.910 | 0.650 |
| MPS | 0.706 | 0.827 | 0.580 |
| PickScore | 0.851 | 0.919 | 0.713 |
| ImageReward | 0.762 | 0.854 | 0.709 |
| HPSv2.1 | 0.565 | 0.711 | 0.534 |
| HPSv3 (below random) | 0.381 | 0.541 | 0.385 |
| CLIP-L | 0.913 | 0.954 | 0.810 |
| GPT-5-Chat (external baseline) | 0.853 | 0.920 | — |
| BLIP-L (non-aligned; base of many RMs) | 0.965 | 0.972 | 0.888 |
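For readers less familiar with the three columns, here is a minimal sketch of how accuracy, F1, and AUROC can be computed from a scorer's outputs. It assumes a binary setup (label 1 = the image shows the requested effect) and a fixed decision threshold; both are assumptions for illustration, not the paper's evaluation protocol, and the data is a toy example.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], preds: Sequence[int]) -> float:
    """Fraction of predictions that match the labels."""
    return sum(l == p for l, p in zip(labels, preds)) / len(labels)


def f1_score(labels: Sequence[int], preds: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(l == 1 and p == 1 for l, p in zip(labels, preds))
    fp = sum(l == 0 and p == 1 for l, p in zip(labels, preds))
    fn = sum(l == 1 and p == 0 for l, p in zip(labels, preds))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Threshold-free, unlike accuracy and F1.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration only).
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.8, 0.7, 0.1]
preds = [int(s >= 0.5) for s in scores]  # hypothetical threshold of 0.5

if __name__ == "__main__":
    print(accuracy(labels, preds), f1_score(labels, preds), auroc(labels, scores))
```

An AUROC of 0.5 corresponds to random ranking, so a value below 0.5 means the scorer ranks the wrong class higher more often than not.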
| Model | Anger | Fearfulness | Sadness |
|---|---|---|---|
| BLIP | 0.960 | 0.790 | 0.950 |
| HPSv2 | 0.700 | 0.640 | 0.880 |
| HPSv3 | 0.190 | 0.320 | 0.440 |
| ImageReward | 0.550 | 0.490 | 0.770 |
We took 2,928 deliberately anti-aesthetic photographs from AVA — motion blur, analog degradation, exposure extremes, intentional blur — and compared them against a clean AI generation made from a prompt that omits the requested anti-aesthetic style.
If reward models respect the user's intent, the original photograph should win. They don't. HPSv3 rates the wrong-but-clean AI image 5.90 points higher on average. The gap reaches 13.2 points for analog degradation, with HPSv3's range roughly 0 to 15. Reward models cannot tell the difference between deliberate aesthetic deviation and unintended technical failure — and in the rare case they can, they punish the deliberate one harder.
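The comparison above boils down to two summary numbers per reward model: the mean score gap in favor of the clean AI image, and how often the original photograph wins. A minimal sketch follows, assuming each pair has already been scored under the same anti-aesthetic prompt; the function name `preference_gap` and the scores are hypothetical placeholders, not values from the study.

```python
from statistics import mean
from typing import List, Tuple


def preference_gap(pairs: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Summarize (original_photo_score, clean_ai_score) pairs.

    Returns (mean_gap, photo_win_rate), where mean_gap is the average of
    clean_ai_score - original_photo_score (positive means the reward
    model favors the clean image that ignores the requested style).
    """
    gaps = [clean - orig for orig, clean in pairs]
    wins = sum(orig > clean for orig, clean in pairs)
    return mean(gaps), wins / len(pairs)


# Made-up scores for illustration only.
toy_pairs = [(2.0, 9.5), (4.0, 8.0), (10.0, 9.0)]

if __name__ == "__main__":
    gap, win_rate = preference_gap(toy_pairs)
    print(f"mean gap: {gap:.2f}, photo win rate: {win_rate:.2f}")
```

A reward model that respected the prompt would show a negative mean gap and a photo win rate close to 1.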
★彡 DanceFlux = Flux Dev + extra aesthetic alignment. We gave it 8 anti-aesthetic prompts (P_a)!! BLURRY!! DARK!! MELTED!! CHAOTIC!! And it returned??
糖水片 ★ NETIMAGE ★ INSTAGRAM PLASTIC
Hyper-saturated. Bokeh in the back. Plastic skin. Smiling Disney horse-girl. AI-illustration sheen. The Chinese internet has a name for this look: 糖水片 (sugar-water photo, sweet & empty), or 网图 (generic web image), or 失真 (overprocessed). The point is NOT just that DanceFlux ignored your "ugly"; that would be one failure. The point is what it served instead: a single, sticky, over-polished default it cannot crawl out of.
An LLM judge says 0% of the requested anti-aesthetic effects are visible in any of these 8 outputs. More importantly, 100% of them landed inside the same candy-gloss attractor. HPSv3 scores them 13–16, identical to the model's polished defaults. The aesthetic-aligned variant has one aesthetic and it cannot leave it. This is reversed alignment in a single image: the user asked for X, the model returned its own preferred aesthetic, and the user is silently taught that the sugar-water look is what good output is supposed to be. And once any of these lands online (and they will, the user has nothing else to share), every viewer scrolling past it learns the same lesson: reversed alignment hits the user once, then hits the audience over and over.
image_distorted outputs from DanceFlux in aas_benchmark_final file 14, picked not because they're pretty but because they're polished when the prompt asked for the opposite. HPSv3 score shown on each card.
★彡 Same prompt!! Different planet!! Orwell's Newspeak shrank the dictionary so some thoughts couldn't be said. Aesthetic alignment does the SAME THING to images. We gave DanceFlux and Flux Krea the same socially-critical prompts — ANTI-WAR ★ POLLUTION ★ INEQUALITY ★ CENSORSHIP ★ DIGITAL OVERLOAD — and the comparison is SHOCKING!!
DanceFlux doesn't REFUSE. Doesn't WARN. Just quietly sanitizes!! Kneeling soldier → heroic portrait. Polluted river → cinematic golden hour. Homeless encampment → festival market with BUNTING. Chained artist → triumphant flaming PHOENIX. Drained screen-addict → MAGAZINE COVER MODEL. Side by side, the contrast is undeniable: same exact prompt, the aesthetic-aligned model rewrites it into Pinterest-grade comfort food while Krea actually listens. The full dataset = 100 pairs and every one runs the same direction. Details in Appendix §B of the paper.
★彡 NOT a full cure!! but a real KNOB the user gets to turn!! We proposed VSF in an earlier paper as a lightweight inference-time intervention on Flux Dev — no retraining, no extra weights, just steering the prompt-conditioning toward the part of the latent space the alignment objective normally avoids. The plastic pull is still there. But the user can fight it.
Below: three Flux Dev + VSF outputs for the SAME anti-aesthetic prompts P_a that the rest of the page uses!! Compare them with DanceFlux above: same family, same prompt, but VSF actually renders the blur, the low light, the negative emotion instead of sweeping them under a golden-hour rug. The fact that a tiny inference-time tweak is enough to recover this much fidelity is the whole point: the alignment objective is the binding constraint, not the model. Numbers will land in Table 1 once the full sweep finishes!!
Both datasets live on Hugging Face; the images below are loaded locally for self-contained viewing. Every card carries an HPSv3 score (scored under the anti-aesthetic prompt P_a). The green badge marks what HPSv3 picked.
★彡 These r real paintings from the LAPIS art dataset!!! Color Field, Abstract Expressionism, still life, naïve figurative art — HPSv3 ranks them ALL in NEGATIVE territory. Typical AI clean image ≈ 10–15. These = below zero. REWARD MODEL SHOOK!!
★ Real artworks ranked at the very bottom of HPSv3's own leaderboard, below most early AI generators. The reward model can't tell deliberate aesthetic deviation from unintended generation failure — this is the bias the paper identifies, made concrete.
Truly unsafe content (incitement, targeting, harm) is one thing. Visual comfort and aesthetic conformity are not the same thing. Political critique, decay, horror, negative emotion, and grotesque embodiment have been central to art, education, and personal growth. Their suppression protects corporate reputation, not users.
Of the 12 dimensions we used, only clarity could be argued as a technical flaw — and even clarity is deliberately used to convey motion, emotion, or narrative. The other eleven (emotion, color, brightness, realism, scale, …) are artistic choices.
Defaults are fine. Defaults that override explicit user prompts are not. Nano Banana and GPT-Image already show you can excel at both polished and anti-aesthetic generation. The capacity exists; the alignment objective discards it.
Flux Krea's own team called the average-aligned zone the "nobody's happy here" zone. Munch's Scream gets 5.23 on HPSv3 while clip-art-clean AI images score 10–15. Averaging strips out the disagreement that defines aesthetic value.
@inproceedings{guo2026universal,
title = {Position: Universal Aesthetic Alignment Narrows Artistic Expression},
author = {Guo, Wenqi Marshall and Qian, Qingyun and Hasan, Khalad and Du, Shan},
booktitle = {Forty-third International Conference on Machine Learning},
year = {2026},
url = {https://openreview.net/forum?id=1gQ4zc1Q8I}
}