Five new things I’ve learned from comparing AI and human-written headlines
Using natural language processing, I have broken down thousands of headlines to see what words Large Language Models (LLMs) favor in headline writing.
As LLMs are used more to write, it is important to remember that these AI models have been trained on almost everything online written by humans. This practice of analyzing headlines is really just a way to learn how LLMs use human-made data to make decisions.
My focus over time has been using technology to help newsrooms in their workflow, in whatever way proves helpful. That’s why I created YESEO, a free SEO Slack tool designed for news publishers. Users of the tool have processed over 18,000 stories, creating thousands of headlines. YESEO is also the source of the data used in this analysis.
You may have seen me present similar information before with the Online News Association or the Lenfest Community of Practice, or at the University of Iowa. If you haven’t, here are the highlights from what I learned the last time I crunched headline data:
- “Rising” was the most used verb amongst all headlines generated by AI in YESEO. The verb “reveals” ranked second and — the word I thought I had seen too many times — “discover” ranked 11th.
- Less than 1% of all YESEO headlines use the word “says,” and 80% of the time, a comma comes before “says.”
- 8% of YESEO-generated headlines use one of these four words: local, high, new or latest.
- “Impact” was 130 times more likely to show up in an LLM-generated headline than “affect” or “effect.”
One crucial change I made for this new headline dataset analysis was in the non-AI sample data. I wanted a larger dataset of published, presumably human-written headlines to compare against the AI-written ones. So, this time around, I have 130,000 published headlines from various news websites across the U.S.
The words AI uses most
The first thing I did was dig into verbs and compare which are more likely to appear in an AI-generated headline in YESEO than in a published article on the internet. I sorted through words that showed up more than 10 times in both samples and then tested which were far more likely to come from LLMs.
Could you tell which of these are real published headlines and which were generated by an LLM as headline ideas? Try to decipher which is which:
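A comparison like this can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the analysis: the function names and the simple regex tokenizer are assumptions, and the sketch counts every word rather than tagging verbs, which would require a part-of-speech tagger.

```python
import re
from collections import Counter


def word_counts(headlines):
    """Count lowercase word tokens across a list of headlines."""
    counts = Counter()
    for headline in headlines:
        counts.update(re.findall(r"[a-z']+", headline.lower()))
    return counts


def usage_ratios(ai_headlines, published_headlines, min_count=10):
    """Return {word: ratio} comparing occurrences per headline, AI vs. published.

    Only words that appear more than `min_count` times in BOTH samples are kept.
    A ratio of 7.0 means the word turns up seven times as often, per headline,
    in the AI sample as in the published sample.
    """
    ai_counts = word_counts(ai_headlines)
    pub_counts = word_counts(published_headlines)
    ratios = {}
    for word, ai_n in ai_counts.items():
        pub_n = pub_counts.get(word, 0)
        if ai_n > min_count and pub_n > min_count:
            ai_rate = ai_n / len(ai_headlines)
            pub_rate = pub_n / len(published_headlines)
            ratios[word] = ai_rate / pub_rate
    return ratios
```

Sorting the result by ratio, highest first, would surface the words that lean most heavily toward the AI sample. Dividing by the number of headlines in each sample matters here because the two datasets are very different sizes.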
Which of these headlines were generated by an LLM? (Select all that apply.)
If you guessed that 1 and 3 were made by an LLM, then you are right. It turns out words like “navigating,” “empowering” and “experience” are over seven times more likely to be used in an AI headline suggestion. “Discover” also rears its ugly head again.
I also found it fascinating that LLMs generated the word “navigating” only at the start of a headline or right after a colon. As a result, the models almost always capitalized the word. For example:
- Navigating Trauma: How Journalists in Los Angeles Can Heal After Wildfires
- Navigating Health Care Amid Immigration Enforcement: Challenges for Providers and Patients
- Navigating Law Enforcement: Legal Aid Society Educates Bronx Youth
- The Art of Life After a Title: Navigating a New Chapter with Purpose
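A positional check like the one behind these examples could look something like the sketch below. The function names and the three position labels are hypothetical, chosen only for illustration, and the check looks at the first occurrence of the word in each headline.

```python
import re
from collections import Counter


def position_of(word, headline):
    """Classify where `word` first appears in a headline.

    Returns 'start', 'after_colon', 'elsewhere', or None if the word is absent.
    """
    match = re.search(rf"\b{re.escape(word)}\b", headline, flags=re.IGNORECASE)
    if match is None:
        return None
    # Text before the word, with trailing spaces removed.
    before = headline[:match.start()].rstrip()
    if before == "":
        return "start"
    if before.endswith(":"):
        return "after_colon"
    return "elsewhere"


def position_tally(word, headlines):
    """Tally the positions of `word` across all headlines that contain it."""
    positions = (position_of(word, h) for h in headlines)
    return Counter(p for p in positions if p is not None)


examples = [
    "Navigating Trauma: How Journalists in Los Angeles Can Heal After Wildfires",
    "Navigating Health Care Amid Immigration Enforcement: Challenges for Providers and Patients",
    "Navigating Law Enforcement: Legal Aid Society Educates Bronx Youth",
    "The Art of Life After a Title: Navigating a New Chapter with Purpose",
]
print(position_tally("navigating", examples))
```

Run on the four headlines above, the tally puts three in the "start" group and one in the "after_colon" group, with none elsewhere.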
Words that both AI and humans use a lot
Looking for the most common nouns in both LLM headlines and published stories, I found a range of words that both use often. All of the headlines below use words that are common to both LLM headlines and published headlines, making it a little harder to tell which one was generated by an LLM.
- Lawsuits allege violations of free speech by state superintendent and advisor
- Failed traffic stop escalates into high-speed pursuit, ends with arrest
- Comcast Explores Potential Offer for Part of Warner Bros Discovery
- As they wait and work for Everglades restoration, the swamp keeps dying
- Looking at how the economy impacts elections
Each of these headlines was generated by an LLM. All the underlined words above were commonly found in both the LLM dataset and the published articles. A few more shared words that are not in the example headlines above include “offender,” “individuals” and “adventure.”
AI really doesn’t like to use ‘effect’
“Impact” showed up an equal number of times in both AI and published headline data, but it turns out AI rarely uses the word “effect.”
“Effect” is 19 times more likely to be used in a published headline than in a headline created by an LLM. Only one of the headlines below was generated by a large language model. Do you know which one?
Which of the following headlines were generated by an LLM?
If you guessed two, you are right.
Punctuation matters some, but not always
If you have used any AI to write for you, you may have noticed an extra comma, or a capital letter or two, or all of them. You may have also noticed a period where you didn’t think to put one.
It turns out that LLMs write colons into their headlines far more often, by a large margin. Commas go the other way: looking at the percentage of headlines that contained each mark, published headlines were almost twice as likely as AI headlines to include a comma.
- Improve Your Sleep, Improve Your Heart Health
- Mussel shells are changing as the ocean warms, study finds
- First-time mothers are older than ever, CDC says
- Six schools, one team? Burgeoning of high school co-ops brings on extremes
All of these, with their many commas, were from published headlines — presumably written by humans.

If we assume humans wrote all the headlines in the published dataset, then humans and LLMs were about equally likely to put more than one comma in a headline: 16.63% of the AI-generated headline suggestions in YESEO had more than one comma, compared with 18.44% of the sample of human headlines.
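Percentages like these come from counting headlines, not punctuation marks. A minimal sketch of that calculation follows, assuming each sample is a plain list of headline strings; the function name and the dictionary keys are illustrative, not taken from the analysis itself.

```python
def punctuation_shares(headlines):
    """Return the percentage of headlines with a colon, any comma, and 2+ commas."""
    total = len(headlines)
    if total == 0:
        raise ValueError("need at least one headline")
    # Each headline counts once per category, no matter how many marks it has.
    with_colon = sum(":" in h for h in headlines)
    with_comma = sum("," in h for h in headlines)
    multiple_commas = sum(h.count(",") > 1 for h in headlines)
    return {
        "colon": 100 * with_colon / total,
        "comma": 100 * with_comma / total,
        "multiple_commas": 100 * multiple_commas / total,
    }


published = [
    "Improve Your Sleep, Improve Your Heart Health",
    "Mussel shells are changing as the ocean warms, study finds",
    "First-time mothers are older than ever, CDC says",
    "Six schools, one team? Burgeoning of high school co-ops brings on extremes",
]
print(punctuation_shares(published))
```

Running the same function on the AI sample and the published sample gives two sets of percentages that can be compared directly, even though the samples differ in size. One caveat of this simple approach: a comma inside a number such as 18,000 would be counted as a comma too.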
Words, AI and humans
I focused a lot on words in this analysis, because words are important. They have cultural significance, they can carry bias and their meaning can change, too. The words AI leans on are a result of human data; the models learned from us, after all. So the takeaway is not that humans should avoid using words that AI often generates.
With YESEO, humans seeking headline help are given five ideas written by AI, but it is always up to the journalist to use a suggestion. If there is a better word that could be substituted into a generated headline, a human editor can change it.
The point of the tool is to help, not to replace. YESEO is just the source of an idea that humans can use to optimize their work.
I was recently visiting West Virginia University as an Innovator in Residence, and one thing I impressed on students was to keep writing so they can find their written voice. Having a written voice in a world where LLMs begin to homogenize writing style will be more important than ever.
That is why AI tool makers like myself need to keep testing, iterating and building better systems — all while keeping humans in control of the final product.
AI transparency note: The interactive quizzes in this story were coded with assistance from Google Gemini.





