接近奥数金牌水平!谷歌DeepMind的AI新突破,“构建AGI关键一步”

E
Eminemer
楼主 (北美华人网)
h
hankzhang
Chatgpt做amc8都胡扯。有的时候结果还是对的
b
badgerbadger
数学也要被AI攻克了吗?🤯
太可怕了,这还玩个锤子。以后是不是要和老墨争当contractor了。
X
Xiaguagua
回复 2楼hankzhang的帖子
这说明它没有逻辑最后把答案一放就是chatgpt 了
b
badgerbadger
Chatgpt做amc8都胡扯。有的时候结果还是对的
hankzhang 发表于 2024-01-17 23:55

这不是GPT,是AlphaGeometry,它的解答是verifiable的,所以只要有结果就不会出错。
这篇论文刚发到Science Trinh, T.H., Wu, Y., Le, Q.V. et al. Solving olympiad geometry without human demonstrations. Nature625, 476–482 (2024). https://doi.org/10.1038/s41586-023-06747-5
f
fibril
这不是GPT,是AlphaGeometry,它的解答是verifiable的,所以只要有结果就不会出错。
badgerbadger 发表于 2024-01-18 00:00

这个怎么验证的?
g
gongmaa
以后就像Matrix里一样,在脑袋里装个芯片,直接把需要的知识load进去,不需要卷藤校了
b
bhuahua
chatGPT连简单的四则运算都算不对
湫湫
搞点烧菜做饭,照顾老人的AI吧,别尽弄些数学画画写作虚头巴脑的
b
badgerbadger
这个怎么验证的?
fibril 发表于 2024-01-18 00:05

机器验证logical proof存在已久了吧
h
hoxu
尽 tmd 瞎掰
y
youdai
Deep Mind在这方面经营很多年了。ChatGPT表示对做数学作业不感兴趣。
b
badgerbadger
Deep Mind在这方面经营很多年了。ChatGPT表示对做数学作业不感兴趣。
youdai 发表于 2024-01-18 01:06

是这样的,Google还是爸爸。
真让人瑟瑟发抖。我们以为苦学数学就能跑赢AI,可能还是太天真了。
下棋,人类研究了上千年,结果现在世界冠军沦落到靠AI肛塞作弊的路数。难道数学也要这样了?
L
Leavesday
搞点烧菜做饭,照顾老人的AI吧,别尽弄些数学画画写作虚头巴脑的
湫湫 发表于 2024-01-18 00:35

鼠目寸光,人类的科技进步不是靠研究烧菜做饭照顾老人生活起居,自然科学的基础研究进步才会真正推动人类进步。烧菜做饭照顾老人有实际用途,但是没有科学家会去研究这个,你还是说给那些想赚钱的人听吧,他们会有兴趣。
h
hoxu
不久前 deep mind 不还号称解了数学难题。其实就是扯。从上到下不要脸
f
flaminglotus
没事,我还有做饭技能。以后可以给你们AI工程师开饭馆。
a
annabel1832
这下不用卷奥数了。 让ai总结规律,背背就能做奥数题了。
d
drowned123
以后都是AI和AI打架,和人没关系了
单身男
A.I.’s Latest Challenge: the Math Olympics Watch out, nerdy high schoolers, AlphaGeometry is coming for your mathematical lunch.

For four years, the computer scientist Trieu Trinh has been consumed with something of a meta-math problem: how to build an A.I. model that solves geometry problems from the International Mathematical Olympiad, the annual competition for the world’s most mathematically attuned high-school students. Last week Dr. Trinh successfully defended his doctoral dissertation on this topic at New York University; this week, he described the result of his labors in the journal Nature. Named AlphaGeometry, the system solves Olympiad geometry problems at nearly the level of a human gold medalist. While developing the project, Dr. Trinh pitched it to two research scientists at Google, and they brought him on as a resident from 2021 to 2023. AlphaGeometry joins Google DeepMind’s fleet of A.I. systems, which have become known for tackling grand challenges. Perhaps most famously, AlphaZero, a deep-learning algorithm, conquered chess in 2017. Math is a harder problem, as the number of possible paths toward a solution is sometimes infinite; chess is always finite. “I kept running into dead ends, going down the wrong path,” said Dr. Trinh, the lead author and driving force of the project.
The paper’s co-authors are Dr. Trinh’s doctoral adviser, He He, at New York University; Yuhuai Wu, known as Tony, a co-founder of xAI (formerly at Google) who in 2019 had independently started exploring a similar idea; Thang Luong, the principal investigator, and Quoc Le, both from Google DeepMind. Dr. Trinh’s perseverance paid off. “We’re not making incremental improvement,” he said. “We’re making a big jump, a big breakthrough in terms of the result.” “Just don’t overhype it,” he said.
The big jump Image
AlphaGeometry team members, from left, Yuhuai Wu, Trieu H. Trinh, Quoc V. Le and Thang Luong outside Google’s Gradient Canopy building in Mountain View, Calif., on Tuesday.Credit...
Aaron Cohen
Dr. Trinh presented the AlphaGeometry system with a test set of 30 Olympiad geometry problems drawn from 2000 to 2022. The system solved 25; historically, over that same period, the average human gold medalist solved 25.9. Dr. Trinh also gave the problems to a system developed in the 1970s that was known to be the strongest geometry theorem prover; it solved 10. Over the last few years, Google DeepMind has pursued a number of projects investigating the application of A.I. to mathematics. And more broadly in this research realm, Olympiad math problems have been adopted as a benchmark; OpenAI and Meta AI have achieved some results. For extra motivation, there’s the I.M.O. Grand Challenge, and a new challenge announced in November, the Artificial Intelligence Mathematical Olympiad Prize, with a $5 million pot going to the first A.I. that wins Olympiad gold.
The AlphaGeometry paper opens with the contention that proving Olympiad theorems “represents a notable milestone in human-level automated reasoning.” Michael Barany, a historian of mathematics and science at the University of Edinburgh, said he wondered whether that was a meaningful mathematical milestone. “What the I.M.O. is testing is very different from what creative mathematics looks like for the vast majority of mathematicians,” he said. Image
Visual representations of some of the synthetic proof data generated by AlphaGeometry.Credit...
Trinh et al., Nature 2024
Terence Tao, a mathematician at the University of California, Los Angeles — and the youngest-ever Olympiad gold medalist, when he was 12 — said he thought that AlphaGeometry was “nice work” and had achieved “surprisingly strong results.” Fine-tuning an A.I.-system to solve Olympiad problems might not improve its deep-research skills, he said, but in this case the journey may prove more valuable than the destination. As Dr. Trinh sees it, mathematical reasoning is just one type of reasoning, but it holds the advantage of being easily verified. “Math is the language of truth,” he said. “If you want to build an A.I., it’s important to build a truth-seeking, reliable A.I. that you can trust,” especially for “safety critical applications.” Proof of concept AlphaGeometry is a “neuro-symbolic” system. It pairs a neural net language model (good at artificial intuition, like ChatGPT but smaller) with a symbolic engine (good at artificial reasoning, like a logical calculator, of sorts).
And it is custom-made for geometry. “Euclidean geometry is a nice test bed for automatic reasoning, since it constitutes a self-contained domain with fixed rules,” said Heather Macbeth, a geometer at Fordham University and an expert in computer-verified reasoning. (As a teenager, Dr. Macbeth won two I.M.O. medals.) AlphaGeometry “seems to constitute good progress,” she said. The system has two especially novel features. First, the neural net is trained only on algorithmically generated data — a whopping 100 million geometric proofs — using no human examples. The use of synthetic data made from scratch overcame an obstacle in automated theorem-proving: the dearth of human-proof training data translated into a machine-readable language. “To be honest, initially I had some doubts about how this would succeed,” Dr. He said. Image
AlphaGeometry’s most complex synthetic proof, represented here as drawn in the paper, had an impressive length of 247 steps; its most trivial proof had one step.Credit...
Trinh et al., Nature 2024
Second, once AlphaGeometry was set loose on a problem, the symbolic engine started solving; if it got stuck, the neural net suggested ways to augment the proof argument. The loop continued until a solution materialized, or until time ran out (four and a half hours). In math lingo, this augmentation process is called “auxiliary construction.” Add a line, bisect an angle, draw a circle — this is how mathematicians, student or elite, tinker and try to gain purchase on a problem. In this system, the neural net learned to do auxiliary construction, and in a humanlike way. Dr. Trinh likened it to wrapping a rubber band around a stubborn jar lid in helping the hand get a better grip. “It’s a very interesting proof of concept,” said Christian Szegedy, a co-founder at xAI who was formerly at Google. But it “leaves a lot of questions open,” he said, and is not “easily generalizable to other domains and other areas of math.”
Dr. Trinh said he would attempt to generalize the system across mathematical fields and beyond. He said he wanted to step back and consider “the common underlying principle” of all types of reasoning. Stanislas Dehaene, a cognitive neuroscientist at the Collège de France who has a  research interest in foundational geometric knowledge, said he was impressed with AlphaGeometry’s performance. But he observed that “it does not ‘see’ anything about the problems that it solves” — rather, it only takes in logical and numerical encodings of pictures. (Drawings in the paper are for the benefit of the human reader.) “There is absolutely no spatial perception of the circles, lines and triangles that the system learns to manipulate,” Dr. Dehaene said. The researchers agreed that a visual component might be valuable; Dr. Luong said it could be added, perhaps within the year, using Google’s Gemini, a “multimodal” system that ingests both text and images. Soulful solutions Image
Thang Luong, right, discussing the I.M.O. 2015 Problem 3 that AlphaGeometry solved, with his Olympiad math teachers, Trinh Le, middle, and Dung Tran in Ho Chi Minh City, Vietnam, last month.Credit...
Wendy Nguyen
In early December, Dr. Luong visited his old high school in Ho Chi Minh City, Vietnam, and showed AlphaGeometry to his former teacher and I.M.O. coach, Le Ba Khanh Trinh. Dr. Lê was the top gold medalist at the 1979 Olympiad and won a special prize for his elegant geometry solution. Dr. Lê parsed one of AlphaGeometry’s proofs and found it remarkable yet unsatisfying, Dr. Luong recalled: “He found it mechanical, and said it lacks the soul, the beauty of a solution that he seeks.” Dr. Trinh had previously asked Evan Chen, a mathematics doctoral student at M.I.T. — and an I.M.O. coach and Olympiad gold medalist — to check some of AlphaGeometry’s work. It was correct, Mr. Chen said, and he added that he was intrigued by how the system had found the solutions. “I would like to know how the machine is coming up with this,” he said. “But, I mean, for that matter, I would like to know how humans come up with solutions, too.”
单身男
回复 6楼fibril的帖子
AlphaGeometry is a “neuro-symbolic” system. It pairs a neural net language model (good at artificial intuition, like ChatGPT but smaller) with a symbolic engine (good at artificial reasoning, like a logical calculator, of sorts).
And it is custom-made for geometry. “Euclidean geometry is a nice test bed for automatic reasoning, since it constitutes a self-contained domain with fixed rules,” said Heather Macbeth, a geometer at Fordham University and an expert in computer-verified reasoning. (As a teenager, Dr. Macbeth won two I.M.O. medals.) AlphaGeometry “seems to constitute good progress,” she said.
The system has two especially novel features. First, the neural net is trained only on algorithmically generated data — a whopping 100 million geometric proofs — using no human examples. The use of synthetic data made from scratch overcame an obstacle in automated theorem-proving: the dearth of human-proof training data translated into a machine-readable language. “To be honest, initially I had some doubts about how this would succeed,” Dr. He said. Image
Second, once AlphaGeometry was set loose on a problem, the symbolic engine started solving; if it got stuck, the neural net suggested ways to augment the proof argument. The loop continued until a solution materialized, or until time ran out (four and a half hours). In math lingo, this augmentation process is called “auxiliary construction.” Add a line, bisect an angle, draw a circle — this is how mathematicians, student or elite, tinker and try to gain purchase on a problem. In this system, the neural net learned to do auxiliary construction, and in a humanlike way. Dr. Trinh likened it to wrapping a rubber band around a stubborn jar lid in helping the hand get a better grip. “It’s a very interesting proof of concept,” said Christian Szegedy, a co-founder at xAI who was formerly at Google. But it “leaves a lot of questions open,” he said, and is not “easily generalizable to other domains and other areas of math.”
k
kongge
亚裔厉害。
也啰嗦几句
搞点烧菜做饭,照顾老人的AI吧,别尽弄些数学画画写作虚头巴脑的
湫湫 发表于 2024-01-18 00:35

烧菜做饭,照顾老人是机器人,不是AI. AI不做体力劳动。