13 May 2025

Daily Productive Sharing 1237 - Building Windsurf

One helpful tip per day:)

Windsurf was recently acquired by OpenAI for $3 billion, and its founder Varun Mohan shared key insights in an interview with The Pragmatic Engineer:

In AI product development, eval testing plays a role similar to unit or integration testing in traditional software—critical for reliability.
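To make the comparison with unit tests concrete, here is a minimal sketch of an eval harness. Everything in it is hypothetical and for illustration only: `EvalCase`, `run_evals`, and `fake_model` are made-up names, and the stand-in model is a plain function rather than a real LLM call. The one real difference from a unit test suite is that evals usually report a pass rate to track over time, not a strict all-or-nothing result.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    name: str
    prompt: str
    check: Callable[[str], bool]  # returns True if the model output is acceptable


def run_evals(model: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Run every case through the model and return the pass rate (0.0 to 1.0)."""
    passed = 0
    for case in cases:
        ok = case.check(model(case.prompt))
        passed += int(ok)
        print(f"{'PASS' if ok else 'FAIL'}  {case.name}")
    return passed / len(cases) if cases else 0.0


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; it only 'knows' how to add."""
    return "return a + b" if "add" in prompt else ""


CASES = [
    EvalCase("completes add body", "def add(a, b):\n    ", lambda out: "a + b" in out),
    EvalCase("non-empty for sub", "def sub(a, b):\n    ", lambda out: out.strip() != ""),
]

pass_rate = run_evals(fake_model, CASES)
print(f"pass rate: {pass_rate:.0%}")
```

In practice the checks would be richer (compiling the output, running tests against it, or scoring it with another model), but the shape stays the same: fixed inputs, automated checks, and a number you can watch for regressions.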
AI-powered integrated development environments (IDEs) are changing how engineers work, reducing mental load and boosting confidence:
- Engineers are now more willing to dive into unfamiliar codebases without waiting to consult someone familiar.
- Many developers turn to AI first for help, rather than interrupting teammates.
- Mental fatigue is reduced, as repetitive, tedious tasks can be offloaded to prompts or AI agents.
Varun emphasizes that tools like Windsurf don’t replace great engineers—they simply change the nature of their work and amplify productivity potential.
Much of the engineering work is invisible. When Windsurf forked VS Code, the fork could not use existing extensions such as the Python language server, Remote SSH, or dev containers. The team had to build its own versions from scratch, a massive effort that end users might never even notice.
For API providers, “time to first token” is crucial, but a 100 ms first token is not enough on its own. Windsurf's goal is a first token within a few hundred milliseconds combined with throughput of hundreds of tokens per second.
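The two numbers in that goal, time to first token and tokens per second, can be measured separately on any streaming response. The sketch below is only an illustration: `measure_stream` and `fake_stream` are hypothetical names, and the simulated stream uses `time.sleep` in place of a real model API.

```python
import time
from typing import Callable, Iterable, Iterator, Optional


def measure_stream(tokens: Iterable[str],
                   clock: Callable[[], float] = time.perf_counter) -> dict:
    """Consume a token stream and report time to first token and decode throughput."""
    start = clock()
    first_token_at: Optional[float] = None
    count = 0
    for _ in tokens:
        if first_token_at is None:
            first_token_at = clock()  # the moment the first token arrived
        count += 1
    end = clock()
    if first_token_at is None:
        return {"ttft_s": None, "tokens": 0, "tokens_per_s": 0.0}
    # Throughput is measured after the first token, so it is not skewed by TTFT.
    decode_time = end - first_token_at
    tps = (count - 1) / decode_time if count > 1 and decode_time > 0 else 0.0
    return {"ttft_s": first_token_at - start, "tokens": count, "tokens_per_s": tps}


def fake_stream(n: int = 50, first_delay: float = 0.05, gap: float = 0.002) -> Iterator[str]:
    """Hypothetical stream: a pause before the first token, then a steady trickle."""
    time.sleep(first_delay)
    for i in range(n):
        if i:
            time.sleep(gap)
        yield f"tok{i}"


stats = measure_stream(fake_stream())
print(stats)
```

Keeping the two measurements apart matters because they fail differently: a slow first token makes the editor feel unresponsive, while low throughput makes long completions drag even if they start quickly.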
Using the CPU vs GPU analogy: GPUs offer roughly 100x the compute power of CPUs (two orders of magnitude), but only about 10x the memory bandwidth.
That means if your task isn’t compute-heavy, it can be bottlenecked by memory bandwidth. To unlock a GPU’s full power, you need to run a lot of work in parallel. But if you have to wait for that parallel work to be ready, latency rises. It’s all about multi-dimensional trade-offs.
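A back-of-the-envelope roofline calculation shows why the 100x versus 10x gap forces this trade-off. The hardware figures below are round, made-up numbers chosen only to match those ratios, and `decode_intensity` rests on a simplifying assumption: during text generation each 2-byte weight is read once and used for one multiply-add (2 FLOPs) per sequence in the batch, ignoring activations and caches.

```python
def attainable_flops(peak_flops: float, bandwidth_bytes: float, intensity: float) -> float:
    """Roofline model: speed is capped by compute or by memory traffic, whichever is lower."""
    return min(peak_flops, bandwidth_bytes * intensity)


# Hypothetical round numbers: GPU has 100x the compute but only 10x the bandwidth.
CPU = {"peak": 1e12, "bw": 1e11}   # 1 TFLOP/s, 100 GB/s
GPU = {"peak": 1e14, "bw": 1e12}   # 100 TFLOP/s, 1 TB/s


def decode_intensity(batch_size: int, bytes_per_weight: int = 2) -> float:
    """FLOPs per byte of weights read, under the simplifying assumption above."""
    return 2 * batch_size / bytes_per_weight


# FLOPs per byte needed before the GPU stops waiting on memory.
gpu_ridge = GPU["peak"] / GPU["bw"]
print(f"GPU becomes compute-bound at about {gpu_ridge:.0f} FLOPs per byte")

for batch in (1, 8, 64, 128):
    achieved = attainable_flops(GPU["peak"], GPU["bw"], decode_intensity(batch))
    print(f"batch {batch:>3}: {achieved / GPU['peak']:.0%} of GPU peak")
```

Under these assumptions a single request uses about 1% of the GPU's compute, and it takes on the order of a hundred concurrent requests to saturate it. Collecting that many requests before running them is exactly what pushes latency up.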
Fill-in-the-middle completion is essential in code editing (unlike chat apps). For instance, if the user types “RETU”, the model should predict “RN” to complete “RETURN.” This seems minor, but it’s crucial for usability.
Fill-in-the-middle isn’t something you can just bolt on later—you need model training or pretraining built for it. For Windsurf, supporting it was a must-have feature, which is why they built their own models and training strategies early on.
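To see why this has to be part of training, here is a minimal sketch of how a fill-in-the-middle training example and the matching inference prompt might be built. It is not Windsurf's actual method: the sentinel strings and both function names are hypothetical, and the prefix-suffix-middle layout is just one common way to arrange the pieces.

```python
import random

# Hypothetical sentinel tokens; real models define their own special tokens.
PRE, SUF, MID = "<|fim_prefix|>", "<|fim_suffix|>", "<|fim_middle|>"


def make_fim_example(doc: str, rng: random.Random) -> str:
    """Cut a document at two random character offsets and move the middle to the end.

    Cutting at characters rather than whole words means the model also sees
    prefixes that stop mid-word, like the RETU case.
    """
    i, j = sorted(rng.randrange(len(doc) + 1) for _ in range(2))
    prefix, middle, suffix = doc[:i], doc[i:j], doc[j:]
    return f"{PRE}{prefix}{SUF}{suffix}{MID}{middle}"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """At inference time the editor supplies text before and after the cursor."""
    return f"{PRE}{prefix}{SUF}{suffix}{MID}"


doc = "def add(a, b):\n    return a + b\n"
example = make_fim_example(doc, random.Random(0))
print(repr(example))

# Cursor sits right after "retu"; a trained model would be expected to continue with "rn ...".
prompt = build_fim_prompt("def add(a, b):\n    retu", "\n")
print(repr(prompt))
```

A model that has only ever seen left-to-right text has never encountered this layout, so it has no reason to treat the text after the cursor as context. That is why the capability comes from training rather than from prompt formatting alone.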


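Windsurf was just acquired by OpenAI for $3 billion, and its founder Varun Mohan recently gave an interview to The Pragmatic Engineer:

In AI product development, "eval testing" plays the role that unit tests or integration tests play in traditional software development.
AI-powered integrated development environments (IDEs) make engineers more fearless and lighten their mental load.
1. Engineers are now more willing to jump into unfamiliar parts of a codebase; in the past they would usually wait until they had talked to someone who knew the code better.
2. More and more developers turn to AI first when they need help, rather than interrupting someone else.
3. Mental fatigue is also lower, because repetitive, tedious tasks can be handed off to prompts or AI agents.
Varun stresses that he does not think tools like Windsurf make great engineers unnecessary: they change the nature of the work and raise the potential output.
For example, the VS Code fork could not use extensions such as the Python language server, Remote SSH, and dev containers. The Windsurf team had to build these from scratch, which took a great deal of time, and users may never notice any difference.
For API providers, "time to first token" matters a lot. But a 100-millisecond first token does not by itself make a product fast enough. Our goal is to keep it under a few hundred milliseconds while also generating hundreds of tokens per second.
Compare CPUs and GPUs: a GPU has roughly two orders of magnitude more compute than a CPU, and the gap is even larger on the latest generation. But keep one thing in mind: a GPU's memory bandwidth is only about one order of magnitude higher than a CPU's.
This means that if your task is not compute-intensive, it is likely to be limited by memory bandwidth. To use a GPU's full compute potential, you have to process a large number of tasks in parallel. But if you have to wait for those parallel tasks to be ready, latency goes up. So we have to make trade-offs along several dimensions.
The idea behind "fill-in-the-middle" is that writing code is different from using a chat application. For example, when you type RETU, we need to predict "RN" (RETURN). It sounds like a trivial detail, but it is critical if you want to build a genuinely useful product.
Fill-in-the-middle is not a feature that can easily be added later. You have to train an existing model for it, or design the capability in at the pretraining stage. For us, being able to offer fill-in-the-middle to users was the baseline requirement. That forced us to build our own models early on and to develop the training strategies to go with them.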


Dr Selfie
