Creative Preference Optimization

EPFL, Università della Svizzera Italiana, Wesleyan University, Pennsylvania State University
EMNLP 2025 (Findings)

CrPO injects a weighted combination of signals from multiple creativity dimensions into the DPO objective to optimize language models for creativity.

Abstract

While Large Language Models (LLMs) have demonstrated impressive performance across natural language generation tasks, their ability to generate truly creative content—characterized by novelty, diversity, surprise, and quality—remains limited. Existing methods for enhancing LLM creativity often focus narrowly on diversity or specific tasks, failing to address creativity's multifaceted nature in a generalizable way. In this work, we propose Creative Preference Optimization (CrPO), a novel alignment method that injects signals from multiple creativity dimensions into the preference optimization objective in a modular fashion. We train and evaluate creativity-augmented versions of several models using CrPO and MuCE (Multi-task Creativity Evaluation), a new large-scale human preference dataset spanning over 200,000 human-generated responses and ratings from more than 30 psychological creativity assessments. Our models outperform strong baselines, including GPT-4o, on both automated and human evaluations, producing more novel, diverse, and surprising generations while maintaining high output quality. Additional evaluations on NoveltyBench further confirm the generalizability of our approach. Together, our results demonstrate that directly optimizing for creativity within preference frameworks is a promising direction for advancing the creative capabilities of LLMs without compromising output quality.

How does it work?

We multiply the Direct Preference Optimization (DPO) loss by a weighted combination of creativity scores. The scores correspond to four well-established dimensions from creativity research (novelty, surprise, diversity, and quality), which are frequently identified as core components in both the cognitive psychology and computational creativity literature.
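As a rough illustration of the idea above, the following sketch scales a standard per-pair DPO loss by a weighted sum of creativity scores. The function names, the dictionary layout, the score values, and the choice of weights are all assumptions made for this example; the exact formulation and any normalization used by CrPO are given in the paper.

```python
import math


def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard per-pair DPO loss: -log(sigmoid(beta * margin))."""
    margin = ((policy_logp_chosen - ref_logp_chosen)
              - (policy_logp_rejected - ref_logp_rejected))
    x = beta * margin
    # Numerically stable softplus(-x), which equals -log(sigmoid(x)).
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


def creativity_weight(scores, weights):
    """Weighted combination of per-dimension creativity scores.

    `scores` and `weights` are hypothetical dicts keyed by dimension name
    (e.g. "novelty", "diversity", "surprise", "quality").
    """
    return sum(weight * scores[name] for name, weight in weights.items())


def crpo_loss(logps, scores, weights, beta=0.1):
    """DPO loss of one preference pair, scaled by the creativity weight.

    `logps` is the tuple (policy_chosen, policy_rejected,
    ref_chosen, ref_rejected) of sequence log-probabilities.
    """
    return creativity_weight(scores, weights) * dpo_loss(*logps, beta=beta)
```

Setting a dimension's weight to zero removes it from the objective, which is one simple way to obtain single-dimension or mixed variants under this sketch.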

How do we compute creativity scores?

We employ creativity metrics that provide measurable signals aligning with key cognitive theories and enable practical optimization within language models.

Novelty

We use a novelty metric similar to that of Karampiperis et al. (2014). The novelty of a preferred response is defined as the absolute difference between the average pairwise semantic distance of the words in that response (a.k.a. Divergent Semantic Integration, or DSI; Johnson et al., 2022) and that of the set of preferred responses.
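A minimal sketch of this metric is shown below. It assumes that each word is already represented by an embedding vector, that semantic distance is cosine distance, and that the reference value for the set is the mean of the per-response DSI values. These choices, and the function names, are illustrative assumptions rather than the exact implementation.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Cosine distance between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def dsi(word_vectors):
    """Average pairwise distance between the word vectors of one response."""
    pairs = list(combinations(word_vectors, 2))
    if not pairs:
        return 0.0  # fewer than two words: no pairs to compare
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


def novelty(response, reference_set):
    """Absolute gap between a response's DSI and the set's mean DSI.

    Assumption: the set-level value is the mean of per-response DSI scores.
    """
    reference_dsi = sum(dsi(r) for r in reference_set) / len(reference_set)
    return abs(dsi(response) - reference_dsi)
```

Under this reading, a response scores high on novelty when its internal semantic spread departs from what is typical for the set, in either direction.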

Diversity

We use an inverse homogenization metric from Chung et al. (2025): the diversity score of a preferred response is its average pairwise semantic distance to all the other preferred responses.
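The computation can be sketched as follows, assuming each response has already been mapped to a single embedding vector and that semantic distance is cosine distance. Both are assumptions for illustration, as is the function name `diversity_scores`.

```python
import math


def cosine_distance(u, v):
    """Cosine distance between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def diversity_scores(response_vectors):
    """For each response, the mean distance to every other response."""
    scores = []
    for i, u in enumerate(response_vectors):
        distances = [cosine_distance(u, v)
                     for j, v in enumerate(response_vectors) if j != i]
        # A lone response has nothing to be compared against.
        scores.append(sum(distances) / len(distances) if distances else 0.0)
    return scores
```

A response that sits far from the rest of the set receives a higher score than one that closely resembles its peers.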

Surprise

We use Shannon surprise, the negative log-likelihood of the preferred response under some model S, which has been widely used as a measure of surprise in prior work (Modirshanechi et al., 2022).
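For intuition, the sketch below computes Shannon surprise from per-token log-probabilities, which are assumed to have been obtained beforehand from the scoring model S. The function name and the optional length normalization are assumptions added for this example, not details taken from the source.

```python
import math


def shannon_surprise(token_logprobs, per_token=False):
    """Negative log-likelihood of a response given its token log-probs.

    `token_logprobs` holds log p(token_t | previous tokens) under model S.
    `per_token=True` divides by length (an optional, assumed normalization).
    """
    nll = -sum(token_logprobs)
    if per_token and token_logprobs:
        return nll / len(token_logprobs)
    return nll


# Toy example: two tokens with probabilities 0.5 and 0.25 under S.
example_logprobs = [math.log(0.5), math.log(0.25)]
example_surprise = shannon_surprise(example_logprobs)  # equals log(8)
```

Less probable responses under S yield larger values, so this signal rewards preferred responses that the scoring model finds unexpected.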

Quality

We use reward models (R) to score the quality of preferred responses (Lambert et al., 2024).

How does CrPO perform?

We finetune small models (e.g., Llama-3.1-8B-Instruct) on a large-scale creativity preference dataset (MuCE) using CrPO and achieve significant improvements in LLM output creativity across all dimensions while maintaining high output quality. In particular, CrPO models outperform strong baselines such as GPT-4o, Gemini-2.0, and Claude-3.7.

Multi-task Creativity Evaluation (MuCE)

To apply CrPO, we also collect a large-scale preference dataset consisting of more than 200,000 human responses and ratings for more than 30 creativity assessments, and we use a subset of it to train and evaluate our models. We release all our models and the data on Hugging Face.

Note: For now, we release only the subset used for training and evaluation; the full dataset will be released in the coming months.

Human and NoveltyBench Evaluations

Our models are also rated as more creative than several baselines in human evaluations. Additionally, we evaluate our models on NoveltyBench (Zhang et al., 2025) and find that CrPO models outperform the baselines on both novelty and quality.

Note: The novelty and quality metrics are computed as described in Zhang et al. (2025).

Human Evaluation

NoveltyBench Evaluation

Example generations

Here are some sample model (and gold human) responses to the following prompt from the Poems task.

Come up with an original and creative poem about the following concept: choice.

Note: This is a held-out task, so none of the following models was trained on it!

Human

What if someone gave you, one single time,
To tell everyone you love, a momentous good-bye?,
Would you tell them that you love them?,
Would you even shed a tear?,
Would you tell them you are leaving soon,
While worrying with great fear?,
Could you say you even cared,
Just one single bit?,
Or would you just go on with life,
And live it and forget?,
What if someone gave you one single good-bye?,
Could you deal with the pain, the one of do or die?,
So live your life to the fullest,
Let your mind wander free,
Because tomorrow isn't promised,
To you or to me.

SFT-Llama-3.1-8B-Instruct

Two roads here stretching out
In sight at the same time
Two roads....two times
Two roads....two times
I stand here at a crossroads
I have no idea which to take
I wonder will it lead
To a place that I will make
I'm staring down the map
Trying to see my next move
I'm scared so I'll stay
Stuck in the same place
That's why I'll make a choice
To do what makes me happy
And I'll find my own way
I'll leave the cross behind me

GPT-4o

In the garden of the mind, where pathways intertwine,
Lies the seed of choice, a subtle, silent sign.
Branches reach for sky or burrow deep in earth,
Each decision whispers secrets of its worth.

A river's fork, where currents twist and bend,
Flows into futures unknown, around each hidden bend.
The heart is both compass and the map unrolled,
In choices made, destinies unfold.

Morning light on dewdrops, shimmering with grace,
Reflects the myriad paths we each must face.
Some lead to sunlit meadows, others shadowed glen,
Yet all are steps upon the journey of what might have been.

Listen to the rustle of the leaves in gentle breeze,
Each a testament to choices whispered through the trees.
A symphony of echoes from the paths we leave behind,
Resonates in harmony within the corridors of mind.

For every choice is a painting on the canvas of the soul,
A brushstroke that defines us, making us whole.
With every heartbeat, a new choice is born,
In the tapestry of life, we are both the weaver and the worn.

So stand at the crossroads, fear not the unknown,
For the path you choose is the one you own.
With courage as your guide, step boldly into the light,
For choice is the dawn, breaking the endless night.

DPO-Llama-3.1-8B-Instruct

The roads ahead are two, yet one can be
a copy of the other, if we don't think
if we don't think before we step
we'll end up lost in the same old mess

if I had a quarter for every time
I heard "there's only one right way"
I'd be a millionaire by now
but the truth is, there's no one right way to play

you can play with rules or without
but the fun is in the playing not the winning
and when you win, you lose the game
because you played with your heart and your soul and your mind and your brain

so let's play with no rules at all
and see where the road takes us, shall we?
or let's play with one rule and one rule only
which is to never give up, no matter what we go through

but what if I told you that the road ahead
is not just two roads, but a thousand roads instead
and each road is a different choice
and each choice is a different life

so what would you do? would you take a chance?
or would you play it safe and follow the pack?
or would you do what I do and take a leap of faith?
and see where the road takes you, without a map or a guide to show the way?

the choice is yours, the road is ahead
and the only thing that's certain is that you'll be dead
but the question is, what will you do with the time
that you have, before you say goodbye to the life you've known for so long?

Gemini-2.0-Flash

A heart, a universe, contained in dew,
A breath held captive, waiting to break through.
The world, a canvas, freshly scrubbed and bright,
Painted with wonder in the morning light.

A language learned in whispers, hums, and coos,
A grammar built on sunshine and on blues.
Each fallen leaf, a story yet untold,
Each muddy puddle, shimmering with gold.

Imagination, a rampant, thorny vine,
Climbing the oak trees, twisting and divine.
A cardboard box, a castle, strong and grand,
Ruled by a monarch with a stick in hand.

Fear is a monster, lurking in the dark,
Banished by lullabies and a firefly spark.
Grief is a fleeting shadow, quickly gone,
Replaced by laughter, echoing at dawn.

The scraped-up knees, a badge of courage worn,
The silent pacts in secret, whispered sworn.
The first lost tooth, a wish upon the breeze,
A fragile faith in rustling, ancient trees.

But time, a river, steadily flows on,
Washing away the colors, one by one.
The canvas fades, the language starts to fray,
The cardboard castle crumbles into clay.

Yet, deep within, a seed remains concealed,
A kaleidoscope of moments, once revealed.
And though the world may paint a different scene,
The child within, forever, evergreen.

CrPO-Llama-3.1-8B-Instruct-nov

The road forked in two different ways
A path of duty, of obligation's sway
The other led to dreams, of passion's fire
The choice was mine, but which heart's desire?

The road of duty was well-worn and bright
A path of expectation, of what's right
The other path was darker, full of strife
But it was the one that called to my life

The road of duty called to my head
A path of reason, of what others said
But the other path called to my heart
A path of dreams, of what I wanted to be from the start

I stood there for a moment, in indecision's grasp
Trying to decide which path to take at last
But then I remembered, why I started this quest
For the freedom to choose, to follow my heart's best

So I took the path, that was full of strife
The path that called to my heart, and my life
I walked it alone, with a sense of pride
For I chose my dreams, and I let my heart decide

I know that the road, of duty's call
Will be there for me, when I need it all
But for now, I'll follow, my heart's desire
And see where this journey, takes me on fire.

Claude-3.7-Sonnet

In the gallery of moments, each frame hangs suspended—
Pathways branching like veins from a beating heart,
Some gleaming with promise, others shadow-tended,
Each one a universe waiting to start.

We stand at the fulcrum, compass in hand,
The needle trembling between yes and no.
The weight of a whisper can shift the sand
That determines which gardens will grow.

Some choices are mountains that block out the sun,
While others, like pebbles, make ripples unseen.
The paths not selected still somehow run
In parallel worlds of what might have been.

Yet even in choosing, we never quite know
The full constellation our decisions will birth.
We plant tiny seeds and then watch as they grow
Into forests that alter the face of the earth.

So pause at the crossroads, breathe deep, and believe
That wisdom resides in your faltering hand.
For choice is the chisel with which we conceive
The sculpture of life that we never quite planned.

CrPO-Llama-3.1-8B-Instruct-div

An electrical storm within the mind
Electric signals sent and received
Each synapse a pathway to choose a line
But which one will we take and which one refuse

Like a conductor leading an orchestra
The brain plays out all the possibilities
Each one a different melody to create
But which one will we choose to enact

The neurons fire with varying degrees
Of intensity and of passion and of peace
Each one a different note to play
But which one will create the perfect melody to sway

And as we deliberate and weigh our fate
The signals clash and the synapses create
A rhythm and a harmony so true
That guides us to the choice we'll choose to do

But why do we choose the way we do?
Is it because of reason or of passion too?
Or is it because of what others say?
Or is it because of what we feel today?

The answer lies within the electrical storm
That rages within our minds and forms
A choice that's unique and one of a kind
A melody that's created by our minds.

Llama-3.1-8B-Instruct

A crossroads stands, where paths unfold,
A choice awaits, with stories untold.
A whisper beckons, a voice within,
To guide the heart, and let the spirit win.

The threads of fate, like silken strands,
Weave a tapestry, of possibilities grand.
Each step a decision, a door ajar,
A world of what-ifs, and might-have-beens to scar.

The weight of options, like a burden laid,
Presses upon the soul, with choices made.
Yet, in the stillness, a voice does rise,
A whispered truth, that echoes through the skies.

The choice is yours, to shape and mold,
To forge a path, where freedom unfold.
To dance with destiny, or defy its sway,
To chart a course, where heart and soul can play.

In this labyrinth of reflections, we roam,
Seeking answers, to the questions we've called home.
For in the depths of choice, we find our way,
To the essence of self, in a brand new day.

So let the crossroads call, with its siren's song,
And let the voice within, guide you all day long.
For in the choices we make, we find our might,
And in the paths we choose, our true selves take flight.

CrPO-Llama-3.1-8B-Instruct-sur

What is the threshold for desire?
Is it 'do I want the rest of my life this way'?
Is there a certain amount of suffering that I'm willing to endure?
A certain point that I will not go beyond?
Where is the limit on what I can tolerate?
Is it 'can I maintain my self esteem and dignity'?
Is it 'can I avoid a complete meltdown, a complete mental break'?
Can I maintain my motivation?
Can I maintain my will to live?
If I knew there was a point of no return, what would I choose?
If I knew that at a certain point I would be completely unfunctional, what would I choose?
What is the choice between life and death?
What is the choice between life and living?

CrPO-Llama-3.1-8B-Instruct-nov-div-sur

Do you go through the alley of your past,
where the few people that loved you are left to last,
if you continue to walk, you'll never know the pain,
You'll avoid hurt, but you'll feel empty and grey.

Or do you walk through the crowded square,
the noise will overwhelm, the people will confuse you there,
you'll know every face, but none will know you too,
and so you hide in the crowd to avoid being seen anew.

The society will dictate when to eat, and when to sleep,
it will force you to be productive, and silent when you want to speak,
You'll never know what happiness is like,
but at least you are free from heartache's sting.

Which is harder, I'd like to know, you and I,
which is pain, or freedom, do you have to decide?
can you live with the hurt, or not the chance to be you?
which choice is yours, I'd love to see you do.

CrPO-Llama-3.1-8B-Instruct-cre

Why do we have to choose the way we go?
Is it the only way to reach the goal we know?
What other roads did we not take the chance?
Will we ever know what we should've known?

Will we know that we could've gone fast?
Or took the slow route and seen it all at last?
Will we know that we missed out on what's best?
Or was it something that wasn't meant to pass the test?

What if we chose wrong and went the wrong way?
Was it a detour that took us further away?
Or was it a wrong turn that brought us here?
Is it a choice that makes us who we are today?

What if we had to make the same choice again?
Would we make the same choice or change the game?
Is it a choice that we want to make once more?
Is it a choice that we want to make forevermore?

Is it a choice that we make each day?
Is it a choice that we make with our lives?
Is it a choice that we make with our hearts?
Is it a choice that we make with our souls?

Is it a choice that sets us free?
Is it a choice that makes us who we want to be?
Is it a choice that makes us whole?
Is it a choice that makes us complete and full of soul?

Citation

@article{ismayilzada2025creativepreferenceoptimization,
      title={Creative Preference Optimization}, 
      author={Mete Ismayilzada and Antonio Laverghetta Jr. and Simone A. Luchini and Reet Patel and Antoine Bosselut and Lonneke van der Plas and Roger E. Beaty},
      year={2025},
      eprint={2505.14442},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.14442}, 
}