The AI Confidence Game

And why 'vibe checks' are the new technical debt.

Large language models are masters of plausible nonsense. That's becoming a huge problem.


LLMs Are Brainwashing Us Into Overconfidence

Why the 'can-do' attitude of your AI assistant is a dangerous trap.

There's a strange psychological effect to using large language models: their confident, plausible responses make us feel like anything is possible, a phenomenon one developer describes as being 'brainwashed' into overconfidence. It's a seductive illusion that makes incredibly hard problems seem trivial.

This isn't just about individual developers getting cocky; the real risk is systemic. We start trusting the confident output, building entire products on assumptions generated by a machine trained only to sound convincing. This normalises a 'vibe check' approach to engineering, where we stop rigorously questioning the foundations of what we're building and accept fluency as a proxy for accuracy.

This is where the big tech platforms benefit: they concentrate power by shipping tools that feel magical while obscuring the messy reality underneath. The smart move is to cultivate a deep-seated skepticism. Question the output, verify the claims, and never let an AI's confidence override your own critical judgment.

Read more →


Everyone is Building an AI Agent Builder

The new gold rush isn't just using AI; it's selling the shovels to build more AI agents.

Brick Coder AI: Your new drag-and-drop AI factory.

This turns AI agent creation into a visual puzzle, abstracting away the code entirely. It's ideal for rapid prototyping but raises questions about the maintainability of complex, visually programmed logic.

Heep: The AI receptionist that never sleeps.

Heep applies the agent model to a specific vertical, automating restaurant bookings. This is where agents win: solving a single, repetitive, high-value business problem without needing a developer.

Clips by xdge.ai: The AI that attends meetings for you.

This moves the agent from a builder's tool to a knowledge worker's proxy. The dream is to escape meetings, but the reality might be a new layer of AI-generated summaries we also have to manage.


Making the AI Sausage

While everyone's focused on the AI magic show, the real work is in the plumbing and the editing suite.

Stax: Google's toolkit for ditching the 'vibe check'.

This is Google Labs' attempt to bring rigour to LLM evaluation, moving beyond gut feelings. It’s a necessary, unglamorous tool that could finally help us build more reliable AI products.
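
To make that concrete, here's a minimal sketch of what evaluation beyond gut feelings looks like: a fixed test set, scored automatically, run on every change. To be clear, this is not Stax's actual API; call_model and the test cases below are placeholders for whatever model and domain you're working with.

# A toy evaluation harness (Python). Illustrative only: swap
# call_model for a real LLM call and the cases for your own domain.

from dataclasses import dataclass

@dataclass
class EvalCase:
    prompt: str
    expected: str  # the answer we accept as correct

def call_model(prompt: str) -> str:
    # Placeholder for a real LLM API call.
    return "42"

def run_eval(cases: list[EvalCase]) -> float:
    # Score the model against a fixed test set; return accuracy.
    passed = 0
    for case in cases:
        answer = call_model(case.prompt).strip()
        ok = answer == case.expected
        passed += ok
        print(f"{'PASS' if ok else 'FAIL'}: {case.prompt!r} -> {answer!r}")
    return passed / len(cases)

if __name__ == "__main__":
    suite = [
        EvalCase("What is 6 * 7?", "42"),
        EvalCase("Capital of France? One word.", "Paris"),
    ]
    print(f"accuracy: {run_eval(suite):.0%}")

Even a harness this crude beats a vibe check: the same cases run identically against every prompt tweak or model swap, and regressions show up as numbers instead of feelings.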

TextJam: The multiplayer editor where AI is a collaborator.

Instead of just generating text, TextJam puts the AI right in the document with you and your team. This shifts AI from a solo tool to a genuine participant in the creative process.


Quick hits

Katalog: The read-it-later app that talks back.
Your reading list just got a voice, letting you talk to your articles to take notes or ask questions while you listen.

Speech to Note: Your brain-to-note pipeline just got an upgrade.
An AI scribe that turns your voice memos into organised notes, making brain dumps frictionless and completely keyboard-free.


My takeaway

The gap between what AI can do and what it should do is where all the valuable work is happening now.

We're drowning in tools that make building AI agents feel trivial, yet we have very few that help us evaluate them critically. This creates a dangerous feedback loop: we trust the confident-sounding output without real validation. The result is a mountain of 'vibe-driven' products built on shaky, unexamined foundations.

The next wave of innovation won't come from building more; it will come from building smarter, with the right guardrails and evaluation in place. It's about developing taste and critical judgment in an age of automated plausibility. This is the moment to become an AI-augmented architect, not just a prompter.

What's one assumption an AI has given you that you later realised was completely, confidently wrong?

Drop me a reply. Till next time, this is Louis, and you are reading Louis.log().