In 1985, Microsoft shipped Goal Seek in Excel. A few years later came Solver and Monte Carlo simulations. Instead of guessing outcomes, you could define success and let the software search for the variables that produced it.
I remember using these tools during my years in banking, often grinding the CPU to a halt. Solver, VBA, lookups and pivot tables gave ordinary users enormous leverage. At one point I automated almost all of my predecessor’s role into a handful of workflows that took 25 minutes a day to run.
The more I use Claude Autoresearch, Codex’s /goal functionality and similar optimisation frameworks, the more they remind me of Solver. The interface has changed, but the principle hasn’t: define success, explore possibilities, and let the machine discover what works.
We didn’t call it AI then. Yet here we are again.
Rather than talk about it abstractly, let’s use a real example.
Example: Perfecting Content at HM.com
I recently ran an Autoresearch optimisation exercise across Human Made’s content archive. After more than 300 experiments, it consistently converged on the same combination of variables:
- Headlines that take a clear position.
- Articles between roughly 800-1,000 words.
- Content published in the AI & Future category.

This combination produced the highest simulated engagement score across years of historical content.
The result is interesting, but so is the speed of getting there: it took me 37 minutes from start to finish to produce a full visual report.
To run the experiment, I combined three sources of data:
- Content (WordPress export/REST API)
- Analytics (GA, Posthog, etc.)
- Claude/Codex (using /goal, Autoresearch or other looping optimisation tools)
Everything was merged into a single dataset containing content, metadata and performance metrics. From there, Autoresearch ran hundreds of experiments, continuously adjusting variables such as topic cluster, headline format, category and word count in pursuit of one objective: maximising engagement.
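To make the shape of that loop concrete, here is a minimal sketch in Python. It is not how Autoresearch searches internally; it swaps in a plain exhaustive group-and-average over toy data. The field names (`slug`, `category`, `headline`, `words`), the length buckets and every number are assumptions made up for illustration.

```python
from statistics import mean

# Made-up toy data; real rows would come from a WordPress export and an analytics tool.
content = [
    {"slug": "a", "category": "AI & Future", "headline": "position", "words": 900},
    {"slug": "b", "category": "AI & Future", "headline": "position", "words": 950},
    {"slug": "c", "category": "Engineering", "headline": "how-to", "words": 1500},
    {"slug": "d", "category": "Engineering", "headline": "position", "words": 600},
    {"slug": "e", "category": "AI & Future", "headline": "how-to", "words": 850},
]
engagement_by_slug = {"a": 2.0, "b": 3.0, "c": 1.0, "d": 0.5}  # "e" has no analytics


def merge(posts, scores):
    """Join content metadata with engagement, dropping posts that have no score."""
    return [{**p, "engagement": scores[p["slug"]]} for p in posts if p["slug"] in scores]


def word_bucket(words):
    """Coarse length buckets; the boundaries are illustrative."""
    if words < 800:
        return "under 800"
    if words <= 1000:
        return "800-1,000"
    return "over 1,000"


def best_combination(rows):
    """Average engagement per (category, headline, length) combination; return the top one."""
    grouped = {}
    for row in rows:
        key = (row["category"], row["headline"], word_bucket(row["words"]))
        grouped.setdefault(key, []).append(row["engagement"])
    averages = {key: mean(values) for key, values in grouped.items()}
    winner = max(averages, key=averages.get)
    return winner, averages[winner]


dataset = merge(content, engagement_by_slug)
winner, score = best_combination(dataset)
print(winner, score)
```

The objective here is a simple mean per combination. A looping tool would typically propose and score many more variations, but the "define success, then search" structure is the same.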
Other insights that shook out of this analysis:
The top posts share one thing: they answer a question someone was already asking. The #1 post took a clear position on a topic with several competing takes. The #2 all-time post solved a specific, painful admin UI problem. No post in the top 10 is vague; all of them snipe into a defined subject with tactical instructions or advice.
2023 was the highest-output year with 44 posts, but it also had the worst average engagement of any year with real volume (0.13). 2026 has 11 posts and a 1.83 average (over 10x vs 2023). The data doesn’t suggest publishing more. It suggests picking winning ideas when you have them.
AI & Future posts average 2.3× the engagement of everything else combined. Beyond the “hype”, it’s really what the audience is clicking on, reading, and spending time with.
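The year-on-year comparison is simple arithmetic, sketched below in Python. The two averages are the figures quoted above; the `yearly_summary` helper, its field names and the toy rows are hypothetical, and it assumes a plain arithmetic mean, which may not be exactly how the engagement score was aggregated.

```python
from collections import defaultdict
from statistics import mean


def yearly_summary(rows):
    """Return {year: (post_count, mean_engagement)} for rows with 'year' and 'engagement'."""
    by_year = defaultdict(list)
    for row in rows:
        by_year[row["year"]].append(row["engagement"])
    return {year: (len(scores), mean(scores)) for year, scores in by_year.items()}


# The two yearly averages quoted in the article.
AVG_2023 = 0.13
AVG_2026 = 1.83
uplift = AVG_2026 / AVG_2023  # roughly 14, consistent with "over 10x"

# Made-up rows, only to show the shape of the input.
toy_rows = [
    {"year": 2023, "engagement": 0.1},
    {"year": 2023, "engagement": 0.2},
    {"year": 2026, "engagement": 1.5},
]
summary = yearly_summary(toy_rows)
print(f"uplift: {uplift:.1f}x", summary)
```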
As this shows, a goal-directed research loop can uncover evidence-backed optimisations quickly and with very little effort, and that’s incredibly powerful. In many ways it is yesterday’s Solver, only for content teams and some 30 years later.
Another insightful chart: the data landscape feeding into Autoresearch can also be viewed in terms of weights, taxonomies, strengths and so on. Here we see that some pages, such as “Home” or “Career”, need to be excluded from the training data, because their target audience and purpose differ so much that they would vastly skew the results. We therefore ran the backtesting on eligible content only:
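A minimal sketch of that kind of eligibility filter, in Python. The flattened record shape (`title`, `type`) and the rule itself are assumptions for illustration; only the two excluded page names come from the text above.

```python
EXCLUDED_TITLES = {"Home", "Career"}  # the two pages named in the article


def is_eligible(item, excluded_titles=EXCLUDED_TITLES):
    """Keep editorial posts; drop structural pages. Field names are hypothetical."""
    return item.get("type") == "post" and item.get("title") not in excluded_titles


def eligible_content(items):
    """Return only the items that should take part in backtesting."""
    return [item for item in items if is_eligible(item)]


# Made-up records, only to show the filter in action.
items = [
    {"title": "Home", "type": "page"},
    {"title": "Career", "type": "page"},
    {"title": "An opinionated take", "type": "post"},
]
eligible = eligible_content(items)
print([item["title"] for item in eligible])
```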

Word count was one of the Autoresearch variables, and it had a decent impact. It doesn’t move the objective function as much as topic cluster, headline format and others, but it is still worth tracking. The chart below shows engagement vs word count.
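One crude way to compare how much each variable "moves" the objective is the spread between its best and worst group averages. The Python sketch below uses that measure on made-up rows; the `spread` helper, the field names and the numbers are all hypothetical, and this is not necessarily the measure Autoresearch reports.

```python
from collections import defaultdict
from statistics import mean


def spread(rows, key):
    """Range of mean engagement across the groups defined by `key` (a crude impact measure)."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row["engagement"])
    means = [mean(values) for values in groups.values()]
    return max(means) - min(means)


# Made-up rows, only to illustrate the comparison.
rows = [
    {"topic": "ai", "words": 900, "engagement": 2.0},
    {"topic": "ai", "words": 1500, "engagement": 1.6},
    {"topic": "ops", "words": 950, "engagement": 0.6},
    {"topic": "ops", "words": 1400, "engagement": 0.4},
]

topic_spread = spread(rows, lambda r: r["topic"])
length_spread = spread(rows, lambda r: "800-1,000" if 800 <= r["words"] <= 1000 else "other")
print(topic_spread, length_spread)
```

In this toy data the topic spread is larger than the length spread, which mirrors the pattern described above: word count matters, but less than topic.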

From here, you can bring in many other sources, such as search/SEO data and product analytics, along with further variables. The opportunities are endless, especially once you mix in financial attribution/touchpoints.
It turns out we’re once again at an inflection point, with that same feeling Excel gave us when it magically automated a huge parcel of work from one day to the next. We don’t need expensive BI tools or long integration projects to drive high-leverage plays on a day-to-day basis. We can do everything from our own computers with existing tools, and within minutes once automated.
I’ll be at WordCamp Europe this week in Kraków, and hosting an AI-focused webinar online next week. Grab me for a chat or get in touch if you’re experimenting with anything similar.
