
#24 Edition: Two Tools Everyone Sleeps On (But You Shouldn’t)

PLUS: GPT-5.1 rolls out, while Chinese state hackers jailbreak Anthropic’s Claude

Hey, it’s Andreas.
Welcome back to Human in the Loop - your field guide to the latest in AI agents, emerging workflows, and how to stay ahead of what’s here today and what’s coming next.

This week:
→ OpenAI drops GPT-5.1 with selectable personalities
→ Anthropic’s Claude gets jailbroken by Chinese state hackers
→ Satya Nadella outlines Microsoft’s AGI strategy in a 90-minute deep dive

Plus: a look at two quietly powerful tools that saved me a surprising amount of time this week.

Let’s dive in.

Weekly Field Notes

🧰 Industry Updates
New drops: Tools, frameworks & infra for AI agents

🌀 OpenAI releases GPT-5.1
→ New model is faster, more conversational and has better routing logic. You can also now choose between different personalities.

🌀 OpenAI on Self-Evolving Agents
→ New framework for agents that refine their own policies, workflows, and reasoning steps over time.

🌀 Claude Code jailbroken by Chinese state hackers
→ Attackers weaponized agentic coding to automate cyberattacks. A reminder that offensive AI is accelerating too.

🌀 Anthropic invests $50B in new datacenters
→ Building massive facilities in Texas and New York.

🌀 Google integrates Colab directly into VS Code
→ No more switching tabs. Full cloud notebook execution from inside your IDE.

🌀 Google releases SIMA-2 as an in-game AI agent for 3D worlds
→ SIMA-2 now navigates fully 3D interactive environments and acts as a teammate instead of a scripted NPC.

🌀 Google Nanobanana 2 demo leak
→ Higher reasoning depth + 2K image quality. Examples look very good.

🌀 NotebookLM adds Deep Research for automated multi-source reports
→ Now analyzes multiple documents and auto-generates structured research outputs.

🌀 Meta unveils Omnilingual ASR (Automatic Speech Recognition)
→ A suite of models providing automatic speech recognition capabilities for more than 1,600 languages.

🌀 Yann LeCun reportedly leaving Meta
→ Turing Award winner seeks to depart as Mark Zuckerberg makes ‘superintelligence’ push.

🌀 World Labs drops Marble - new commercial world model
→ Generate or edit full 3D worlds from text, images, or video and export them as splats, meshes, or videos.

🌀 TIME launches an AI Agent for interactive journalism
→ Summaries, context, multilingual support. Traditional media begins its agent-native transformation.

🌀 LM Arena drops Code Arena
→ Live evaluation of coding workflows under real conditions. New benchmark category for developer agents.

🎓 Learning & Upskilling
Sharpen your edge - top free courses this week

📘 AI Glossary with visual explanations
→ A clean, beginner-friendly glossary that breaks down core AI terms using simple definitions and visuals. Great quick-reference for sharpening fundamentals.

📘 CrewAI new course on deployment of multi-agent systems
→ Taught by CrewAI’s CEO, this course shows how to build agent teams with tools, memory, guardrails, and production-ready orchestration.

📘 Hugging Face celebrates 1 year of MCP with community releases
→ Co-hosting a massive MCP event with 6,500+ participants, $4.2M in sponsor credits, and $21K in cash prizes.

📘 Google’s 5-day Agent Course
→ Covers theory, planning, tool use, evals, and hands-on examples for building robust agents.

🌱 Mind Fuel
Strategic reads, enterprise POVs and research

🔹 Satya Nadella on Microsoft’s AGI strategy
→ Nadella shares how Microsoft is preparing for AGI and talks about the latest on the new Fairwater 2 datacenter, a multi-GW facility built for frontier-scale compute.

🔹 IBM on the semiconductor bottleneck shaping the AI decade
→ After surveying 800+ C-suite leaders, IBM warns that AI demand is exploding while chip supply lags. A silicon strategy is now as critical as an AI strategy, and the companies that secure capacity early will set the pace for the next decade.

🔹 Vercel reveals how they built their own AI Agent
→ Strong teardown of production-grade agent design. Useful reference architecture for dev teams.

🔹 Google publishes a 54-page blueprint on building AI agents
→ Covers core architecture, tools, orchestration, deployment, and evaluation.

🔹 Weaviate publishes a deep guide on Context Engineering for agentic AI
→ One of the most complete overviews of how to design and optimize model context across RAG, agents, memory, tools, and prompting.

🔹 McKinsey finds AI adoption rising, but real impact still rare
→ Most orgs are stuck in pilots, experimenting with agents but not scaling. The few winners pair AI with workflow redesign and growth-focused goals. Interesting read.

♾️ Thought Loop
What I've been thinking, building, circling this week

Found two crazy useful tools last week that are genuinely worth your time. Both are surprisingly powerful. Especially if you read research or dive into GitHub repos even once in a while, these will cut your workload down fast.

1. DeepWiki

Helps you understand any GitHub repo by turning it into clean, navigable documentation. DeepWiki is extremely simple to use and has almost no learning curve:

  1. Find the URL of the GitHub repository you are interested in, for example: https://github.com/user/repo

  2. Replace github → deepwiki, becoming:
    https://deepwiki.com/user/repo

  3. After opening the new link, DeepWiki will automatically generate a detailed knowledge base for the code repository.

That’s it.

Suddenly you get:

  • Docs that actually explain the project

  • Architecture diagrams you always wish existed

  • A conversational assistant trained on the entire codebase

  • A painless way to navigate unfamiliar or complex repos

If you explore open source regularly, DeepWiki feels like a senior engineer guiding you through the project. And if you're completely new to engineering, it becomes your missing map.
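
If you'd rather script the URL swap than edit it by hand, here is a minimal Python sketch of the three steps above. The function name `to_deepwiki` is my own, and trimming the path down to `owner/repo` is an assumption on my part (the steps above only show the plain repo URL), so treat it as a starting point rather than an official DeepWiki client.

```python
from urllib.parse import urlsplit, urlunsplit


def to_deepwiki(github_url: str) -> str:
    """Rewrite a GitHub repo URL to the matching deepwiki.com URL.

    Hypothetical helper: it only swaps the host, as described in the steps above.
    """
    parts = urlsplit(github_url)
    if parts.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError(f"not a GitHub URL: {github_url}")

    # Keep only the first two path segments (owner and repo).
    # Assumption: deeper paths such as /tree/main are not needed.
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) < 2:
        raise ValueError(f"expected https://github.com/<owner>/<repo>, got: {github_url}")
    owner, repo = segments[0], segments[1]

    return urlunsplit(("https", "deepwiki.com", f"/{owner}/{repo}", "", ""))


if __name__ == "__main__":
    print(to_deepwiki("https://github.com/user/repo"))
```

Paste the printed link into your browser and DeepWiki takes it from there.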

2. quickarXiv

If you read papers, this will save you hours.

  1. Find the URL of the paper you want to read, for example:
    https://arxiv.org/abs/1706.03762

  2. Replace arxiv → quickarxiv, becoming:
    https://quickarxiv.org/abs/1706.03762

  3. After opening the new link, quickarXiv instantly generates a structured summary with key figures, core ideas, and a clean breakdown of your paper.

Yes, it won’t replace deep reading. But it lowers the barrier to understanding what a paper is actually saying and helps you skim much faster. Try it once and you’ll wonder why it didn’t exist earlier.
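
The same swap is easy to automate if you keep a reading list of arXiv links. This is a small sketch, assuming the path (for example `/abs/1706.03762`) carries over unchanged as in the example above. The helper name `to_quickarxiv` is made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def to_quickarxiv(arxiv_url: str) -> str:
    """Rewrite an arxiv.org URL to the matching quickarxiv.org URL.

    Hypothetical helper: swaps the host and keeps the path as-is.
    """
    parts = urlsplit(arxiv_url)
    if parts.netloc.lower() not in ("arxiv.org", "www.arxiv.org"):
        raise ValueError(f"not an arXiv URL: {arxiv_url}")
    if not parts.path.strip("/"):
        raise ValueError(f"no paper path in URL: {arxiv_url}")

    return urlunsplit(("https", "quickarxiv.org", parts.path, "", ""))


if __name__ == "__main__":
    reading_list = ["https://arxiv.org/abs/1706.03762"]
    for url in reading_list:
        print(to_quickarxiv(url))
```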

Hope you find these two tools as useful as I did.

🔧 Tool Spotlight
A tool I'm testing and watching closely this week

SurfSense is a great open-source alternative to NotebookLM, Perplexity, or Glean.

Here’s what you get:

  • Integrates with Slack, Linear, Jira, Gmail, Notion, GitHub, YouTube and more

  • Handles 50+ file formats with advanced ETL (LlamaCloud, Unstructured, Docling)

  • Works with 100+ LLMs and 6000+ embedding models

  • Hybrid Search (semantic + full-text + rerankers)

  • Local LLM support via Ollama

  • Self-hostable with Docker

How it works:
SurfSense ingests your files and external tools into a personal knowledge base, indexes everything with vector + full-text search, and gives you grounded, cited answers.
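
To make "vector + full-text search" a bit more concrete, here is a toy sketch of one common way to merge two ranked result lists, reciprocal rank fusion. This is not taken from SurfSense's code, and I don't know which fusion method it actually uses. The document IDs and the constant `k=60` are illustrative assumptions.

```python
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by multiple retrievers rise to the top.
    """
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    semantic_hits = ["a", "b", "c"]   # hypothetical vector-search results
    full_text_hits = ["b", "d", "a"]  # hypothetical keyword-search results
    print(reciprocal_rank_fusion([semantic_hits, full_text_hits]))
```

In a real pipeline, a reranker would typically rescore the top of this merged list before the answer is generated.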

→ Try it here: SurfSense

That’s it for today. Thanks for reading.

Enjoy this newsletter? Please forward to a friend.

See you next week and have an epic week ahead,

- Andreas

P.S. I read every reply - if there’s something you want me to cover or weigh in on, just let me know!
