Anthropic CEO: AI will write 90% of code within 3-6 months - 北美华人网 archive, March 12, 2025

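tonner posted on 2025-03-12 23:24
I'm not a coder myself, but every so often I need code to validate an idea. I used to hire interns or borrow a programmer for that; now ChatGPT has completely replaced that kind of intern, and work that used to take three months takes a week or two. With an editor like Cursor, it basically comes down to having the AI write the code through conversation.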

What type of programs?

千渔千寻

5 months

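Replying to post #3 by zhangfei123
OpenAI researchers have already tested this: on larger freelance projects, today's most advanced large models can solve only a small share.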
SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering? https://arxiv.org/pdf/2502.12115
We introduce SWE-Lancer, a benchmark of over 1,400 freelance software engineering tasks from Upwork, valued at $1 million USD total in real-world payouts. SWE-Lancer encompasses both independent engineering tasks (ranging from $50 bug fixes to $32,000 feature implementations) and managerial tasks, where models choose between technical implementation proposals. Independent tasks are graded with end-to-end tests triple-verified by experienced software engineers, while managerial decisions are assessed against the choices of the original hired engineering managers. We evaluate model performance and find that frontier models are still unable to solve the majority of tasks.