22 May 2025 11 min read ai

Daily Productive Sharing 1244 - AI Horseless Carriages

One helpful tip per day:)

Pete Koomen argues that many current AI features in software feel tacked on—unhelpful at best, counterproductive at worst—because they mimic old development paradigms and unnecessarily constrain what large language models (LLMs) can do:

When building software with AI, Pete feels like he can bring almost any idea to life in record time. It’s fun and incredibly empowering.
Gemini is a powerful model capable of writing great emails—but Gmail’s implementation actually gets in its way.
Anyone who’s used LLMs for writing has run into this problem. It's so common that many users subconsciously adapt their prompting to avoid it.
A simple solution, often overlooked by AI app developers: let users write their own System Prompt.
The key is to understand that all input and output is just text. The LLM interface is fundamentally a text box.
LLM providers such as OpenAI and Anthropic adopted a convention that splits the prompt into two parts: a System Prompt (typically preset by developers) and a User Prompt (entered by the user).
The System Prompt defines how the model performs a task; the User Prompt defines what task to perform.
Think of it like a function: System Prompt is the logic, User Prompt is the input, and the model’s output is the result.
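The function analogy above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `fake_model` is a hypothetical stand-in for a real LLM call (it just echoes its inputs so the example runs offline), and `make_agent` is an invented helper name, not part of any provider's API.

```python
from typing import Callable

# Hypothetical stand-in for a real LLM call. It only echoes what it was
# given, so the example runs without network access or API keys.
def fake_model(system_prompt: str, user_prompt: str) -> str:
    return f"[how: {system_prompt}] [what: {user_prompt}]"

def make_agent(
    system_prompt: str,
    model: Callable[[str, str], str] = fake_model,
) -> Callable[[str], str]:
    """Fix the System Prompt once and return a function of the User Prompt."""
    def agent(user_prompt: str) -> str:
        return model(system_prompt, user_prompt)
    return agent

# The System Prompt plays the role of the function body...
draft_email = make_agent("Write short, casual emails in the sender's voice.")
# ...and the User Prompt is the argument passed on each call.
result = draft_email("Tell my boss I'm out sick today.")
print(result)
```

In this framing, an app that hard-codes its System Prompt ships a single fixed function, while an app that exposes it lets the user redefine the function itself.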
The problem with Gmail isn’t just a bad System Prompt—it’s that the user can’t change it.
Teaching an LLM to solve a problem your way, and seeing it succeed, is magical. Surprisingly, it can even be easier than teaching a human, because an LLM gives immediate, honest feedback on whether your explanation was clear enough.
The first tools built on a new technology often fail because they imitate the old way of doing things.
The "horseless carriage" metaphor applies: early cars kept the carriage design and simply swapped the horse for an engine, without rethinking the structure for higher speeds.
Similarly, Gmail uses "old-world thinking": bolting AI onto an email client instead of asking, what would email look like if it were designed for AI from day one?
Traditional software assumes developers must mediate between users and computers.
Developers define general behavior; users supply specific inputs.
In the new paradigm, users don’t need a middleman—they can directly shape how software works by editing System Prompts, which isn't hard to do.
Pete’s core thesis: if an LLM agent acts on the user's behalf, the user should be able to teach it how—by modifying the System Prompt.
For domain-specific tasks, experts should write the prompts. Accountants and lawyers will want to write their own because their work is so context-dependent.
At every company Pete has worked at, finance teams rely on Excel for exactly this reason: it is a general-purpose tool that adapts to highly specific needs.
Most AI apps should be agent builders, not static agents.
Most users won’t want to write prompts from scratch. Good builders offer templates or prompt assistants to help them create custom agents quickly.
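One way a builder might offer such a template is sketched below, using Python's standard `string.Template`. The template wording, the field names (`name`, `role`, `tone`, `max_sentences`), and the helper `build_system_prompt` are all hypothetical and chosen only for illustration.

```python
from string import Template

# Hypothetical template a builder might ship; users fill in the blanks
# instead of writing a System Prompt from scratch.
EMAIL_ASSISTANT_TEMPLATE = Template(
    "You are an email assistant for $name, who works as $role.\n"
    "Write replies in a $tone tone and keep them under "
    "$max_sentences sentences."
)

def build_system_prompt(**fields: str) -> str:
    # substitute() raises KeyError when a field is missing, so a
    # half-filled template never reaches the model silently.
    return EMAIL_ASSISTANT_TEMPLATE.substitute(**fields)

system_prompt = build_system_prompt(
    name="Alex",
    role="an accountant",
    tone="friendly",
    max_sentences="3",
)
print(system_prompt)
```

The generated text is still an ordinary System Prompt, so a user who outgrows the template can edit the result directly.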
Users also need interfaces to see agent behavior and refine prompts—like a lightweight email assistant builder, which enables fast feedback loops.
Tools define safe boundaries for agents. What a model can do depends on the tools it can access, and tool boundaries are more reliably enforced in code than in text prompts.
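A minimal sketch of a tool boundary enforced in code, assuming an allowlist design: the tool names (`label_email`, `archive_email`) and the dispatcher `run_tool_call` are hypothetical, and a real agent framework would add argument validation and logging.

```python
from typing import Callable, Dict

def label_email(email_id: str, label: str) -> str:
    return f"labeled {email_id} as {label}"

def archive_email(email_id: str) -> str:
    return f"archived {email_id}"

# Only tools registered here can ever run. There is deliberately no
# "delete_email" or "send_email" entry, whatever the prompt says.
ALLOWED_TOOLS: Dict[str, Callable[..., str]] = {
    "label_email": label_email,
    "archive_email": archive_email,
}

def run_tool_call(name: str, **kwargs: str) -> str:
    """Run a tool the model requested, but only if it is on the allowlist."""
    tool = ALLOWED_TOOLS.get(name)
    if tool is None:
        raise PermissionError(f"tool {name!r} is not available to this agent")
    return tool(**kwargs)

ok = run_tool_call("archive_email", email_id="msg-1")
try:
    run_tool_call("delete_email", email_id="msg-1")
    blocked = False
except PermissionError:
    blocked = True
print(ok, blocked)
```

Because the check lives in the dispatcher, no wording in the System Prompt or User Prompt can grant the agent a tool that was never registered.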
What Pete wants from an AI-native email client is simple: automate the tedious stuff so he spends less time on email.
For many people, the killer AI app will be something that offloads the tasks they don’t want to do—so they have more time for what they do enjoy.
The heart of AI-native software should be: maximize user leverage within a domain.
- An AI-native email client should minimize time spent on email.
- An AI-native accounting tool should minimize the time an accountant spends managing books.



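Pete Koomen argues that many AI features in today's software feel bolted on: not just useless, but sometimes counterproductive. The reason is that they imitate old ways of building software and so place unnecessary limits on the AI models they use: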

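When he builds software with AI, he feels he can create almost anything he can imagine in very little time. AI is like a power tool, and it is great fun to use.
Gemini is an astonishingly powerful model that is fully capable of writing good emails. Unfortunately, the app the Gmail team designed around it gets in the way.
Everyone who has used a large language model (LLM) for writing has run into this problem. It is so common that many of us have unconsciously developed prompting strategies to work around it.
There is a simple fix that many AI app developers overlook: let users write their own System Prompt.
The key point is that all input and output is just text. An LLM's user interface is, at bottom, plain text.
LLM providers such as OpenAI and Anthropic have adopted a convention that splits the prompt into two parts: a System Prompt and a User Prompt. The reason is that in many API use cases the System Prompt is preset by the developer, while the User Prompt is entered by the user.
The System Prompt explains how the model should carry out a class of tasks and is reused across requests; the User Prompt describes the specific task to complete.
You can think of the System Prompt as a function, the User Prompt as its input, and the model's output as the result.
The problem is not only that the Gmail team wrote a poor System Prompt. The bigger problem is that the user cannot change it.
Teaching an LLM to solve a problem your way and watching it succeed is a magical experience. Surprisingly, it is even easier than teaching a person, because the LLM tells you immediately and honestly whether your explanation was clear enough.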
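Whenever a new technology is invented, the first tools built on it tend to fail because they imitate the old way of doing things too closely.
"Horseless carriage" refers to early car designs that kept the structure of a horse-drawn carriage and simply replaced the horse with an engine, without redesigning the whole vehicle for higher speeds.
Similarly, the Gmail team's "old-world thinking" was to add AI to an existing email client instead of starting from scratch and asking: if an email client were designed to be AI-native, what would it look like?
The modern software industry assumes that we need developers as intermediaries between us and computers.
The division of labor is clear: developers define how software behaves in the general case, and users supply input that determines its behavior in a specific case.
In the new world, users no longer need a middleman to tell the computer what to do. They only need to be able to write their own System Prompt, and writing a System Prompt is not hard!
His core argument in the essay: when an LLM agent acts on a user's behalf, the user should be able to teach it how to work by editing the System Prompt.
The System Prompt for a given task should be written by an expert in that domain. Most accountants and lawyers will also want to write their own System Prompts, because their expertise is highly context-dependent.
He has seen this on the finance team at every company he has worked at. It is also why so much finance work still relies on Excel: it is a general-purpose tool that can adapt to an unlimited range of specific situations.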
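Most AI apps should be "agent builders", not agents themselves.
Most people probably do not want to write every prompt from scratch, and a good agent builder will not force them to. Developers will provide templates or prompt assistants that help users set up their own agents quickly.
Users also need an interface for reviewing what the agent did and revising the prompt based on that feedback, like the small email assistant builder from his earlier example. It provides a fast feedback loop that helps users teach the agent to complete tasks reliably.
Tools are the key to giving agents safe boundaries. Whether an agent can perform a task depends on which tools it can access. Enforcing tool boundaries in code is far more controllable than writing them in the text of a System Prompt or User Prompt.
This is what he really wants from an "AI-native email client": automate the tedious work so he spends less time on email.
For many people, the "killer app" of AI will take this form: teach the computer to handle the things we do not enjoy, freeing up time for the things we do.
The core of AI-native software should be to maximize the user's leverage within a domain. An AI-native email client should minimize the time users spend on email; AI-native accounting software should minimize the time an accountant spends on the books.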


Dr Selfie
