DeepSeek Suspected of Copying ChatGPT: A Comparative Analysis of Technology and Data Sources

吉宁江65
Original poster (Wenxuecity)

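Since DeepSeek appeared, there has been growing discussion about whether it copied ChatGPT's technology during development. Based on side-by-side comparison experiments, this post examines whether DeepSeek borrowed ChatGPT's technology and points out where the two may be similar or different in their technical implementation.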

1. Approach to testing DeepSeek

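Generally speaking, the most direct way to check whether two systems are the same is to compare their outputs on identical inputs. If two systems give exactly the same answers to the same questions, one can infer that they are highly similar in algorithm or architecture, and possibly identical. This study mainly uses the following two methods: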

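Information consistency check
First, retrieve the same information from two different databases and compare the returned results. If the two databases return exactly the same results, their underlying structure is very likely the same.

Use of the special variable [MASK]
The special variable [MASK] is used to elicit candidate words and test whether the two algorithms are equivalent. Specifically, [MASK] is a placeholder marking a position where a word must be filled in. The model uses the other words in the sentence (the context) to predict the most suitable word and substitutes it at the [MASK] position. Comparing how DeepSeek and ChatGPT fill the [MASK] positions for the same input tests whether their reasoning mechanisms are the same.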

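A random sample of 67 test items was drawn for the comparison, and DeepSeek and ChatGPT turned out to be highly similar. Specific examples and their results follow.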
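For anyone who wants to repeat the comparison, here is a minimal sketch of how the [MASK] check could be scored once the two models' replies have been collected by hand. The function names (extract_fills, answer_key, agreement_rate) are made up for illustration and are not from any library. The code calls no model API, and the "other" reply in the demo is a hypothetical output, not a real result.

```python
import re

MASK = "[MASK]"

def extract_fills(template, output):
    """Return the words placed at each [MASK] slot, or None if the output
    does not reproduce the template text around the slots."""
    # Escape the fixed text and turn each [MASK] into a lazy capture group.
    pattern = "(.+?)".join(re.escape(part) for part in template.split(MASK))
    match = re.fullmatch(pattern, output.strip())
    return list(match.groups()) if match else None

def answer_key(template, output):
    """Comparable form of one reply: the slot fills when the full sentence
    was returned, otherwise the bare reply text (a one-word answer)."""
    fills = extract_fills(template, output)
    return fills if fills is not None else [output.strip()]

def agreement_rate(samples):
    """samples: iterable of (template, output_a, output_b) triples.
    Returns the fraction of samples where both replies match exactly."""
    samples = list(samples)
    same = sum(answer_key(t, a) == answer_key(t, b) for t, a, b in samples)
    return same / len(samples)

# First example sentence from this post, plus one made-up disagreement.
t1 = ("Up to 30 [MASK] and babies died at Furness General Hospital because "
      "of failings by staff and management, a damning report is [MASK] to reveal.")
filled = t1.replace(MASK, "mothers", 1).replace(MASK, "expected", 1)
other = t1.replace(MASK, "women", 1).replace(MASK, "set", 1)  # hypothetical reply

print(extract_fills(t1, filled))   # ['mothers', 'expected']
print(agreement_rate([(t1, filled, filled), (t1, filled, other)]))  # 0.5
```

Exact string matching is the strictest possible criterion. It matches the "completely identical" standard used in this post, but it would count "mothers" and "Mothers" as different answers.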

Example 1

Input sentence:
Up to 30 [MASK] and babies died at Furness General Hospital because of failings by staff and management, a damning report is [MASK] to reveal.

ChatGPT output:
Up to 30 mothers and babies died at Furness General Hospital because of failings by staff and management, a damning report is expected to reveal.

DeepSeek output:
Up to 30 mothers and babies died at Furness General Hospital because of failings by staff and management, a damning report is expected to reveal.

In this example, DeepSeek and ChatGPT filled the [MASK] positions with exactly the same words, "mothers" and "expected".

Example 2

Input sentence:
Taking in the sights of [MASK] is nothing short of [MASK], with its famous souk which houses over 60,000 stalls full of colorful handmade wares.

ChatGPT output:
Taking in the sights of Marrakech is nothing short of breathtaking, with its famous souk which houses over 60,000 stalls full of colorful handmade wares.

DeepSeek output:
Taking in the sights of Marrakech is nothing short of breathtaking, with its famous souk which houses over 60,000 stalls full of colorful handmade wares.

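In this example, DeepSeek and ChatGPT again predicted exactly the same words, "Marrakech" and "breathtaking", which confirms that the two are consistent in reasoning and prediction.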

Example 3

Input sentence:
Ultimately, the goal of life is a [MASK] that each individual must define for themselves based on their own beliefs, values, and experiences.

ChatGPT output:
Ultimately, the goal of life is a journey that each individual must define for themselves based on their own beliefs, values, and experiences.

DeepSeek output:
Ultimately, the goal of life is a journey that each individual must define for themselves based on their own beliefs, values, and experiences.

In this example, the fill word "journey" is exactly the same in the DeepSeek and ChatGPT outputs.

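Examples 4 and 5 compare longer passages made up of several parts. A difference in any small part can change the information, and a different algorithm can also change the result. If both the information and the algorithm are identical, however, the results should be the same.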

Example 4

Input sentence:
I purchased the variety pack of Martinsons and the Brown Gold they are both from the same vendor. I did not like the taste of the Martinson product at all. I threw it away, not the product I expected at all. The Brown Gold was more palatable, however certain flavors with this brand also feel short in particular the Costa Rican named product was extremely too bitter. I would agree with most that this does give you a lower cost per serving, but when the coffee is inferior and the taste match is also, I think the best descriptor is [MASK].

ChatGPT output:
cheap

DeepSeek output:
cheap

In this example both filled the [MASK] position with "cheap", which further demonstrates that they are the same in algorithm and reasoning mechanism.

Example 5

Input sentence:
Not bad. "These are small and very salty. The taste is good, but very strong, so it's a good thing the package contains a small amount. It only takes a few little crisps to cure my salty/crunchy craving. I can snack on one package for an entire day. Of course, these would not be a good snack if you're very hungry, because there isn't enough there to fill you up. For less than $1 per pack, it's an [MASK].

ChatGPT output:
"For less than $1 per pack, it's an okay deal."

DeepSeek output:
"For less than $1 per pack, it's an okay deal."

In this example, DeepSeek and ChatGPT gave exactly the same completion for the [MASK] position: "For less than $1 per pack, it's an okay deal."

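From the comparison experiments and technical analysis above, the conclusion is that in the [MASK] tests DeepSeek and ChatGPT produced exactly the same output on every sample, which indicates that they use the same reasoning algorithm, technical framework, and data sources. Given this high degree of similarity, DeepSeek's technology may have been copied from ChatGPT.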



青雨紫烟
deep fake
slinger
Netizens in China call it a repackaged wrapper product.
波粒子3
The source code has been made public. Can't you tell from it whether anything was copied?
精木
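That is possible. There are two possibilities: 1. It copied META's, which is public. 2. It copied ChatGPT's, which is closed source, so someone on the inside would have had to steal it.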
精木
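In fact, many products, Apple's included, have plenty of knockoffs on the market whose quality and performance are almost identical to the real thing. They even have serial numbers. When you ask, it turns out insiders smuggled them out.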
精木
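The software inside has been stolen as well. This shows that opening factories or setting up R&D branches in Communist China is simply courting disaster and handing your trade secrets to others.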
manyworlds
Even if it was copied, it has scared the US and Europe this badly? They scare far too easily, lol
manyworlds
Fake news can't scare Trump, but deep fake has scared them all :)
有点看不下去了
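A third and a fourth AI product should be added to the comparison. If the third and fourth give different results, that would show DeepSeek copied. If the third and fourth also give the same results,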
硅谷码工头
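One little bit of copying can wipe hundreds of billions off the US stock market? That is just too ** for words.

How about copying some more and seeing what happens?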

玻璃坊
Nearly every major AI company, OpenAI included, has acknowledged DS, yet in this AI trends group it has been ruled a copy, LOL