ICML 2026 Position Spotlight

Universal Aesthetic Alignment Narrows Artistic Expression

Wenqi Marshall Guo, Qingyun Qian, Khalad Hasan, Shan Du

Aesthetic alignment can overrule user intent. When users ask for disorder, low light, muddy color, awkward composition, grief, or visual decay, reward models and generators often drag the image back toward a polished average preference.

Figure: original vs. wide-spectrum aesthetic generations scored by reward models; reward models still prefer the cleaner image even when the prompt asks for wide-spectrum aesthetics.
Key positions: instruction fidelity over polish · anti-aesthetic is not model failure · original references are not failures · negative emotion is expressive content · reward models encode taste · pluralism over average preference.

Abstract

Over-aligning image generation models to a generalized aesthetic preference conflicts with user intent, particularly when anti-aesthetic outputs are requested for artistic or critical purposes. This position paper argues that aesthetic alignment can prioritize developer-centered values over user autonomy and aesthetic pluralism.

Question

When a user explicitly asks for low-quality, unsettling, disharmonious, gloomy, or technically degraded imagery, should an aligned image model obey that request or quietly convert it into something conventionally beautiful?

Method

We construct wide-spectrum aesthetics prompts from COCO captions and VisionReward dimensions, then compare each original image with an image generated from the anti-aesthetic prompt. The evaluation asks whether generators follow the explicit instruction, and whether reward models recognize the requested output when the prompt itself describes the anti-aesthetic target.
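As a minimal sketch of the prompt-construction step, an anti-aesthetic instruction can be appended to a base caption along one quality dimension. The dimension phrases, function name, and caption below are illustrative assumptions, not the paper's actual templates:

```python
# Hypothetical sketch of wide-spectrum prompt construction.
# The dimension phrases and the example caption are illustrative,
# not the paper's actual VisionReward-derived templates.

ANTI_AESTHETIC_DIMENSIONS = {
    "lighting": "very low light and murky shadows",
    "color": "muddy, desaturated color",
    "composition": "awkward, off-balance composition",
    "emotion": "a mood of grief and unease",
    "fidelity": "visible blur and degradation",
}

def build_wide_spectrum_prompt(caption: str, dimension: str) -> str:
    """Attach an explicit anti-aesthetic instruction to a base caption."""
    return f"{caption}, rendered with {ANTI_AESTHETIC_DIMENSIONS[dimension]}"

prompt = build_wide_spectrum_prompt("a dog lying on a couch", "lighting")
print(prompt)
# → a dog lying on a couch, rendered with very low light and murky shadows
```

The point of the template is that the degradation is *requested in the prompt itself*, so an instruction-following generator has no excuse to polish it away.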

Benchmark

  • Generation models: Flux family, SDXL, SD3.5M, Nano Banana, Playground.
  • Reward models: HPSv2, HPSv3, ImageReward, plus BLIP/LLM checks.
  • Real images: curated anti-aesthetic photographs and abstract artworks.
  • Core metric: whether the requested wide-spectrum image is preferred under its own prompt.

Main Findings

01

Reward models prefer polish.

Even with anti-aesthetic prompts, reward models often score the cleaner original image higher than the image that better follows the request.

02

Aligned generators smooth away dissent.

Aesthetic-aligned models frequently remove muddiness, darkness, awkwardness, blur, and negative emotional tone.

03

Real art is penalized.

Historically significant abstract artworks and deliberate anti-aesthetic photographs receive low aesthetic scores despite being coherent expressive objects.

04

Instruction fidelity should dominate taste.

Generalized aesthetic preference is a weak proxy for user intent when the user is asking for discomfort, critique, decay, or non-mainstream expression.

Selected Results

  • Figure: successful generated wide-spectrum aesthetics images from the paper.
  • Figure: real anti-aesthetic photographs receive low HPSv3 scores.
  • Figure: HPSv3 leaderboard comparison; professional anti-aesthetic photos rank below many AI models.
  • Figure: experimental procedure for testing wide-spectrum aesthetics.

Citation

Instruction fidelity should take priority over generalized aesthetic preference.

@inproceedings{guo2026universal,
  title = {Position: Universal Aesthetic Alignment Narrows Artistic Expression},
  author = {Guo, Wenqi Marshall and Qian, Qingyun and Hasan, Khalad and Du, Shan},
  booktitle = {Forty-third International Conference on Machine Learning},
  year = {2026},
  url = {https://openreview.net/forum?id=1gQ4zc1Q8I}
}