2026-05-01 / 5 min read | AIContentRAG

How to Automate Content Without It Sounding Like AI Wrote It

Most AI content is obvious. Em dashes everywhere, Oxford commas, the same generic structure. Here is how to actually automate content that sounds like you.

So you can always tell when someone's used ChatGPT or Perplexity to write their content, right? It's obvious. The em dashes everywhere, the Oxford commas in every list, the same "In today's rapidly evolving landscape" opener, the "Let's dive in" transition, the neat little summary at the end that nobody asked for. It's AI slop and people can spot it immediately.

And the thing is I'm not against automation. I automate basically everything. But there's a massive difference between automation that provides value and automation that just fills space with generic rubbish. Most people using AI for content are doing the second one because they haven't put any effort into making it sound like them.

The problem with default AI

When you open ChatGPT and say "write me a blog post about X" you get something that sounds like every other ChatGPT blog post. Because the model's default voice is this polished, slightly formal, endlessly agreeable tone that nobody actually talks like. It uses em dashes constantly, it puts Oxford commas in every list, it hedges everything with "it's worth noting" and "it's important to remember" and it wraps up with a neat conclusion that sounds like a school essay.

People read that and they know. They might not be able to articulate exactly what gives it away but they feel it. It reads like nobody wrote it. And once you've lost that trust you've lost the reader.

What actually works is a personal RAG database

So the approach I've been building and using for clients is completely different. Instead of letting the AI write in its default voice you build a corpus of how the person actually talks and thinks. Then you use RAG (retrieval-augmented generation) to pull from that corpus when generating content.

I built this system for a client called Antonio. We scraped his Facebook posts, his YouTube transcripts, his assessment content, his workshop materials. Hundreds of pieces of content that he actually wrote or said in his own words. All of that goes into a knowledge base using LightRAG which builds both vector embeddings and a knowledge graph of the concepts and relationships in his content.
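If you want to see the shape of the vector half of that without installing anything, here's a toy sketch in Python. The bag-of-words "embedding" is a stand-in for a real embedding model and the sample corpus is tiny and illustrative. LightRAG does all of this (plus the knowledge graph) for you:

```python
from collections import Counter
import math

def embed(text):
    # Toy bag-of-words "embedding". A real setup uses a sentence
    # embedding model; this just makes the retrieval idea concrete.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class Corpus:
    def __init__(self):
        self.docs = []

    def insert(self, text, source):
        # Each piece of authentic content goes in tagged with where it came from.
        self.docs.append({"text": text, "source": source, "vec": embed(text)})

    def query(self, question, top_k=2):
        # Rank the corpus by similarity to the question and return the top hits.
        q = embed(question)
        ranked = sorted(self.docs, key=lambda d: cosine(q, d["vec"]), reverse=True)
        return [d["text"] for d in ranked[:top_k]]

corpus = Corpus()
corpus.insert("If you're not keeping score you're just practising", "facebook")
corpus.insert("Pressure is a privilege, chase the big points", "youtube")
corpus.insert("My favourite pasta recipe uses too much butter", "facebook")

print(corpus.query("how do you practise keeping score", top_k=1))
# The keeping-score phrase surfaces because it's the closest match
```

The point is that generation pulls real phrases back out, not paraphrases of them.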

So when we generate content for him the AI isn't inventing things in a generic voice. It's pulling his actual phrases, his specific frameworks, his exact way of explaining concepts. If he says "if you're not keeping score you're just practising" that exact phrase shows up when it's relevant. Not some AI paraphrase of it. His actual words.

The technical setup

The stack is pretty straightforward. I built a knowledge MCP server that wraps LightRAG and exposes it as tools that Claude can query. The knowledge bases support multiple query modes: vector similarity for finding semantically related content, entity-centric search for finding specific concepts and relationship-focused queries for understanding how ideas connect to each other.
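Here's a rough sketch of what those three modes look like against a toy knowledge base. The entities and relations below are made up for illustration; LightRAG extracts the real graph automatically from the corpus:

```python
# Toy knowledge base: a few entities and typed relationships, the sort
# of structure LightRAG builds for you. Contents are illustrative only.
ENTITIES = {
    "keeping score": "a framework for deliberate practice",
    "pressure": "how competing under stress gets discussed",
}
RELATIONS = [
    ("keeping score", "builds", "pressure tolerance"),
    ("pressure", "reveals", "keeping score habits"),
]

def query_kb(q, mode):
    # Dispatch roughly mirrors the three query modes described above.
    q = q.lower()
    if mode == "entity":
        # Entity-centric: look up concepts named in the question.
        return {e: d for e, d in ENTITIES.items() if e in q}
    if mode == "relationship":
        # Relationship-focused: edges touching anything the question mentions.
        return [r for r in RELATIONS if any(term in q for term in (r[0], r[2]))]
    if mode == "vector":
        # Stand-in for similarity search; a real system embeds and ranks.
        return sorted(ENTITIES, key=lambda e: -sum(w in q for w in e.split()))
    raise ValueError(f"unknown mode: {mode}")

print(query_kb("what does keeping score mean", "entity"))
```

An MCP server just exposes each mode as a named tool so Claude can pick the right one per question.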

The corpus gets populated from multiple sources. Social media posts, video transcripts, written articles, email newsletters, voice recordings that get transcribed. Basically anything the person has said or written that represents their authentic voice. The more diverse the source material the better the voice matching gets.

And critically the corpus needs to be from before the AI era. If someone's been using ChatGPT to write their LinkedIn posts for the last two years and you scrape those as training data you're just going to get the AI voice back. You need the authentic stuff from when they were actually writing it themselves.
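A date cutoff is the simplest way to enforce that. This sketch assumes each post carries a written date; the cutoff I've used here is ChatGPT's public launch, but pick whatever fits the person:

```python
from datetime import date

# ChatGPT launched publicly on 2022-11-30; a reasonable default cutoff.
AI_ERA_CUTOFF = date(2022, 11, 30)

def pre_ai_only(posts):
    # Keep only content written before the cutoff, so the corpus
    # reflects the person's own voice rather than model output.
    return [p for p in posts if p["written"] < AI_ERA_CUTOFF]

posts = [
    {"text": "Old newsletter rant", "written": date(2019, 6, 1)},
    {"text": "Suspiciously polished LinkedIn post", "written": date(2024, 3, 12)},
]
print(pre_ai_only(posts))
# Only the 2019 post survives the filter
```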

Sanitising the output

Even with a good RAG database the AI will still try to inject its own patterns. So you need explicit rules. No em dashes. No Oxford commas. No "it's worth noting" or "let's dive in" or "in conclusion". No hedging phrases. No generic openers. You build a voice profile that captures how the person actually writes and you enforce it.

For my own content I've got a voice profile that captures things like how I use "right" as a filler, how I connect ideas with "and" and "but" and "so", how I tend to be practical before philosophical, how I never use formal academic language. Every piece of content gets checked against that profile and anything that sounds like default AI gets rewritten.
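The enforcement side can literally be a list of banned patterns checked with regex. The rules below come straight from this section; the profile format itself is just one way to do it, and the Oxford comma check is deliberately crude:

```python
import re

# A voice profile is just data: patterns that must never appear.
# A fuller version also lists patterns that should appear.
BANNED = [
    r"\u2014",                          # em dash
    r"(?i)\bit'?s worth noting\b",
    r"(?i)\blet'?s dive in\b",
    r"(?i)\bin conclusion\b",
    r"(?i)\bin today'?s rapidly evolving\b",
    r",\s+and\b",                       # crude Oxford comma catch
]

def violations(text):
    # Return every banned pattern the draft trips, so a rewrite pass
    # (or a human) knows exactly what to fix.
    return [p for p in BANNED if re.search(p, text)]

draft = "In today's rapidly evolving landscape \u2014 let's dive in."
print(violations(draft))
# Flags the em dash, the "let's dive in" and the generic opener
```

Anything flagged goes back through a rewrite step before it's allowed out the door.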

Turning it into a workflow

Once you've got the corpus and the voice profile set up content generation becomes something you can genuinely automate. Not in a "press a button and get slop" way but in a way that actually provides value.

I can record myself talking about a topic for ten minutes, get it transcribed, feed it through the RAG system with my voice profile and get back a blog post that sounds like me because it's built from how I actually talk. The blog posts on this site were written exactly this way. Voice transcripts processed through a system that knows my patterns.
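The whole pipeline fits in a page of glue code. Everything below is a stub standing in for a real component (transcription model, RAG query, LLM call), so the function names are mine, not a real API. The shape is the point:

```python
def transcribe(audio_path):
    # Stub: a real version calls a speech-to-text service.
    return "ten minutes of me talking about automating content"

def retrieve_voice_context(transcript):
    # Stub: a real version queries the RAG knowledge base.
    return ["if you're not keeping score you're just practising"]

def draft_post(transcript, context, voice_profile):
    # Stub: a real version prompts an LLM with the transcript,
    # the retrieved phrases and the voice profile rules.
    return f"Draft built from: {transcript!r} + {len(context)} retrieved phrases"

def sanitise(text, voice_profile):
    # Stub: a real version runs the banned-pattern checks and rewrites.
    return text.replace("\u2014", ",")

def voice_pipeline(audio_path, voice_profile):
    transcript = transcribe(audio_path)
    context = retrieve_voice_context(transcript)
    return sanitise(draft_post(transcript, context, voice_profile), voice_profile)

print(voice_pipeline("rant.m4a", voice_profile={"no_em_dashes": True}))
```

Swap each stub for the real service and you've got the workflow end to end.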

You can hook this into n8n workflows or cron jobs that run on a schedule. Monitor your inbox for interesting topics, pull context from your knowledge base, generate a draft in your voice, queue it for review. Content stops being a chore and becomes a pipeline that runs in the background.
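On the cron side it can be as simple as one line in your crontab. The script name, paths and schedule here are all hypothetical placeholders for whatever wraps your pipeline:

```shell
# crontab -e
# Every Monday at 07:00: generate a draft in your voice and
# drop it into a review folder. generate_draft.py is a
# hypothetical wrapper around the pipeline above.
0 7 * * 1 /usr/bin/python3 /home/me/content/generate_draft.py --queue /home/me/content/review
```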

The key insight

The difference between AI slop and AI-assisted content is effort upfront. Building the corpus takes time. Creating the voice profile takes time. Setting up the sanitisation rules takes time. But once it's done you've got a system that can produce content indefinitely that actually sounds like you and actually says things worth reading.

Most people skip all of that and just prompt ChatGPT with no context and wonder why their content sounds generic. The AI is only as good as what you give it. Give it nothing and you get nothing. Give it your actual voice and your actual ideas and you get content that people can't tell apart from something you sat down and wrote yourself.

That's the goal, right? Not replacing yourself. Scaling yourself.