#5 Edition: AI Agents and the Future of Work: A Massive Mismatch?
PLUS: IBM builds trust infrastructure for AI Agents and OpenAI shows what agent-powered customer service really looks like

Hey, it’s Andreas.
And welcome back to Human in the Loop.
This week was insanely packed. Honestly, I’m thinking about shifting to two editions per week — there’s just too much happening in the space.
What do you think: Keep it weekly? Or would a second, shorter mid-week drop be helpful?
This week’s edition covers:
IBM’s full-stack AI governance launch, OpenAI’s support agent demo, and Anthropic’s remote MCP support for Claude Code.
Stanford’s first-ever study on what workers actually want from AI agents.
and much more…
Also — I’ve been jamming on a vibe-first coding app. More on that in one of the next editions.
Let’s go!

Weekly Field Notes
🧰 Industry Updates
New drops: Tools, frameworks & infra for AI agents
🌀 IBM launches first-ever unified AI security + governance stack for AI agents
→ The first integrated platform to detect shadow agents, red-team AI systems, and ensure compliance at scale. It offers full-lifecycle visibility, policy automation, and regulatory accelerators.
🌀 OpenAI open-sources a full customer support agent demo
→ Built with their own Agents SDK. A real-world template worth studying (a minimal sketch of its handoff pattern follows at the end of this list). They probably killed a few startups with that one.
🌀 11x AI shows how they rebuilt their AI SDR “Alice” with LangGraph + LangChain
→ Straight from the field. Covers the stack, the routing logic, and lessons learned in just 20 minutes (see the LangGraph sketch at the end of this list).
🌀 Anthropic enables Claude Code to access tools via remote MCP servers
→ Remote tool access = richer context = stronger agents.
🌀 Apollo Tyres uses agentic AI to extract insights from manufacturing data
→ A clean industrial use case on AWS: interesting and well documented.
🌀 Ericsson presents Telco Agentic AI Studio
→ A full-stack platform to build, test, and deploy domain-specific agents for telecom. Uses “worker agents” to orchestrate specialized LLMs — handling everything from customer personalization to network ops.
🌀 ElevenLabs enables MCP support for conversational agents
→ Adds MCP to unlock tool and data access via external servers. Includes flexible security modes (Always Ask, Fine-Grained, No Approval) and integration with platforms like Zapier.
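
Two of the drops above deserve a quick hands-on look. First, the handoff pattern at the heart of OpenAI’s support demo. Here’s a minimal sketch of my own using the openai-agents Python package; the agent names and instructions are illustrative, not lifted from the demo’s code:

```python
# pip install openai-agents  (needs OPENAI_API_KEY in the environment)
from agents import Agent, Runner

# Specialist agents the triage agent can hand off to.
# Names and instructions here are illustrative, not from the demo.
billing_agent = Agent(
    name="Billing Agent",
    instructions="Resolve billing and refund questions politely.",
)
faq_agent = Agent(
    name="FAQ Agent",
    instructions="Answer general product questions from the FAQ.",
)

# The triage agent routes each request to the best-suited specialist.
triage_agent = Agent(
    name="Triage Agent",
    instructions="Route the customer to the right agent for their issue.",
    handoffs=[billing_agent, faq_agent],
)

result = Runner.run_sync(triage_agent, "I was charged twice last month.")
print(result.final_output)
```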
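Second, the kind of LangGraph building block behind rebuilds like Alice. A minimal two-node graph, assuming the langgraph package; the node logic is a placeholder, not 11x’s actual pipeline:

```python
# pip install langgraph
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class SDRState(TypedDict):
    lead: str       # inbound lead description
    outreach: str   # drafted outreach message

def research(state: SDRState) -> dict:
    # Placeholder: a real node would call an LLM or enrichment API here.
    return {"lead": state["lead"] + " (researched)"}

def draft(state: SDRState) -> dict:
    # Placeholder: a real node would generate personalized copy.
    return {"outreach": f"Hi! Noticed you at {state['lead']}..."}

builder = StateGraph(SDRState)
builder.add_node("research", research)
builder.add_node("draft", draft)
builder.add_edge(START, "research")
builder.add_edge("research", "draft")
builder.add_edge("draft", END)

graph = builder.compile()
print(graph.invoke({"lead": "Acme Corp", "outreach": ""}))
```

The nice part of this pattern: each step is a plain function over shared state, so you can bolt on qualification, objection handling, or human review as extra nodes without touching the rest.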
🎓 Learning & Upskilling
Sharpen your edge - top free courses this week
📘 Build AI Agents with Mastra AI
→ Mastra’s hands-on course teaches you how to build AI agents from inside an agentic code editor. Available via MCP.
📘 AI Engineering Hub by Daily Dose of Data Science
→ Packed with 70+ hands-on projects in MCP, RAG, and AI agents. If you’re building or learning in public, this repo is a goldmine. Now trending (already 10k+ stars on GitHub) — and worth starring if you haven’t already.
P.S. Got a good new course? Send it my way on LinkedIn or just hit reply.
🌱 Mind Fuel
Strategic reads, enterprise POVs and research
🔹 ISG Research maps the state of agentic AI
→ TL;DR: 70% of use cases land in BFSI, retail, and manufacturing. Most agents are still task-based (43%), with goal-based just starting to emerge. Only 25% operate independently — most still advise, not act. Key blockers: skills, data gaps, and shaky business cases. Includes a 29-vendor buyer’s guide (Vertex AI, Azure, etc.). Great to see IBM leading here — and our work aligning well with what enterprises actually need.
🔹 MIT study: Your brain accumulates “cognitive debt” when you overuse ChatGPT
→ Conducted over 4 months, the study found that students using ChatGPT showed up to 55% lower neural activity, wrote less-original essays, and struggled to regain performance when switching back to solo writing. Early overuse builds cognitive debt. Timing matters: AI works best after you’ve thought first.
🔹 Capgemini drops blueprint for scaling enterprise AI
→ Frameworks, operating models, and real-world cases from BMW, ABN AMRO, and Eneco show how to move from pilots to production. TL;DR: Success depends on alignment across data, governance, KPIs, and talent — all tightly linked. Agentic AI projects are projected to grow 48% in 2025, making scale the new priority.
🌀 BCG releases playbook for building an AI-first company
→ A sharp guide on how to lead with AI at scale. Covers business-led agendas, org-wide adoption, workforce impact, real ROI, and how to reallocate funding to what works. Clear message: the best orgs don’t wait — they move fast, scale smart, and double down.
🌀 IBM releases 299-page playbook on how to create real business value with AI
→ Titled AI Value Creators, this dropped a few weeks ago, but it might be one of the most valuable shares I’ve made here. Written by IBM execs actually building at scale, it covers how to move from prototype to platform, navigate governance and org design, and embed AI into your core operating model. Strategic, practical, and free.

♾️ Thought Loop
What I've been thinking, building, circling this week
This week I came across one of the most thoughtful studies on AI agents and the future of work — from Stanford.
They surveyed 1,500+ workers across 104 occupations and asked a simple but powerful question: What tasks do you actually want AI to automate?
And the results show a massive disconnect between what people want and what the tech industry is building. Turns out, 41% of the tasks YC-backed startups are automating are ones workers don’t even want touched.
What do people want instead? Not to be replaced — but to be relieved. They’re craving help with the admin sludge: scheduling, data entry, repetitive clicks. But when it comes to creative thinking, judgment calls, or meaningful work — they want partnership, not substitution.

Human Agency Scale (HAS)
At the heart of the study is a newly introduced framework, the Human Agency Scale (HAS): a five-level system designed to quantify how much human involvement workers want for a given task:
→ H1: AI does it fully alone
→ H2: AI needs light input
→ H3: Human and AI collaborate
→ H4: AI depends on human input
→ H5: AI can’t function without you
This nuanced approach reveals that different levels of human input suit different AI roles, challenging the notion that higher automation is always preferable.
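
To make the scale concrete: here’s one way it could be encoded in an agent’s task-routing policy. This is a sketch of my own, not code from the Stanford paper, and the task-to-level mapping is purely illustrative:

```python
from enum import IntEnum

class HAS(IntEnum):
    """Human Agency Scale from the Stanford study (H1-H5)."""
    H1_FULL_AUTOMATION = 1   # AI does it fully alone
    H2_LIGHT_INPUT = 2       # AI needs light human input
    H3_COLLABORATION = 3     # human and AI collaborate
    H4_HUMAN_DEPENDENT = 4   # AI depends on human input
    H5_HUMAN_ESSENTIAL = 5   # AI can't function without you

# Illustrative mapping: which tasks an org lets agents run unattended.
TASK_POLICY = {
    "schedule_meeting": HAS.H1_FULL_AUTOMATION,
    "draft_report": HAS.H3_COLLABORATION,
    "performance_review": HAS.H5_HUMAN_ESSENTIAL,
}

def requires_human(task: str) -> bool:
    """Gate agent autonomy: anything at H3 or above waits for a human."""
    return TASK_POLICY.get(task, HAS.H5_HUMAN_ESSENTIAL) >= HAS.H3_COLLABORATION

print(requires_human("schedule_meeting"))    # False: safe to automate
print(requires_human("performance_review"))  # True: human stays in the loop
```

The point isn’t this exact gate; it’s that autonomy becomes a per-task policy decision instead of a global default.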
Takeaway
Higher automation ≠ better.
Some tasks thrive under AI control. Others require human nuance.
Ignore this — and you’ll build agents nobody trusts.
This isn’t a UX problem. It’s a trust problem.
If you’re scaling agents, ask: Are we building what people actually want — or just what demos well on stage?

🔧 Tool Spotlight
A tool I'm testing and watching closely this week

I came across a very cool tool called Sketch: an agentic coding tool that runs in your terminal (or browser) and understands your full codebase.
How it works:
Creates a Dockerfile
Builds it
Copies your repo inside
Spins up a sandboxed container with Sketch running inside
Each sketch is fully isolated — safe to run in parallel, impossible to mess up your setup. Under the hood, it chains tools and shell commands to let the LLM act on your code directly. That’s real autonomy and a pretty solid integration. Feels like how agent-native dev should work. It’ll be interesting to watch how this evolves — huge potential for everything related to DevOps tasks.
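
For intuition, the isolation pattern boils down to a few Docker steps. A rough sketch of my own (not Sketch’s actual implementation), with a hypothetical image name:

```python
# Rough sketch of the containerized-agent pattern (not Sketch's actual code).
import subprocess

REPO = "."                     # your project checkout
IMAGE = "sketch-sandbox-demo"  # hypothetical image name

# 1. A Dockerfile that copies the repo into the image, so the
#    container works on a snapshot, never your live checkout.
with open("Dockerfile.sandbox", "w") as f:
    f.write(
        "FROM python:3.12-slim\n"
        "WORKDIR /workspace\n"
        "COPY . /workspace\n"
    )

# 2. Build the image from the repo root.
subprocess.run(
    ["docker", "build", "-f", "Dockerfile.sandbox", "-t", IMAGE, REPO],
    check=True,
)

# 3. Do the work inside an isolated, throwaway container.
#    No volume mounts: nothing done in there can touch your setup.
subprocess.run(
    ["docker", "run", "--rm", IMAGE, "ls", "/workspace"],
    check=True,
)
```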
→ Explore Sketch on GitHub.

That’s it for today. Thanks for reading.
Enjoy this newsletter? Please forward to a friend.
See you next week and have an epic week ahead,
— Andreas

P.S. I read every reply — if there’s something you want me to cover or share your thoughts on, just let me know!
How did you like today's edition?