A dialogue with ChatGPT about DeepSeek.

方外居士
Original poster (文学城)

QQ: Who are you?

 

ChatGPT: I'm ChatGPT, an AI designed to assist with answering questions, brainstorming ideas, and having conversations. What's on your mind?

 

QQ: What do you think about DeepSeek? Does it copy you in any sense?


ChatGPT: DeepSeek is another AI model developed independently, and while it may share similarities with me in terms of being a large language model, it isn't a direct copy. Many modern AI models are built using similar underlying techniques, such as transformer architectures and large-scale training on diverse datasets, but the implementation details, training data, and fine-tuning approaches can vary significantly between models.

 

ChatGPT: Do you have any specific concerns or comparisons in mind?

 

QQ: I'd like to know some technical details comparing DeepSeek and you.

 

ChatGPT: Certainly! Let's delve into a technical comparison between DeepSeek and ChatGPT, focusing on their architectures, training methodologies, and performance characteristics.

 

Architecture and Model Design

ChatGPT: Developed by OpenAI, ChatGPT is based on the GPT (Generative Pre-trained Transformer) architecture. It utilizes a dense transformer model where all parameters are active during inference, leading to substantial computational resource consumption.

DeepSeek: DeepSeek employs a "Mixture-of-Experts" (MoE) architecture, which activates only a subset of its parameters during inference. This design enhances computational efficiency by engaging only the necessary "experts" for a given task, reducing resource usage and improving inference speed.
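The MoE routing idea above can be sketched as a toy layer: a small gating network scores the experts, and only the top-k run per token. All sizes and weights here are invented for illustration; they are not DeepSeek's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only -- real models are vastly larger.
n_experts, top_k, d_model = 8, 2, 16

# Each "expert" is a simple linear map with random weights.
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts))  # gating network

def moe_forward(x):
    """x: (d_model,) token embedding -> (d_model,) output."""
    logits = x @ router
    top = np.argsort(logits)[-top_k:]          # pick the top-k experts
    w = np.exp(logits[top] - logits[top].max())
    w /= w.sum()                               # softmax over chosen experts
    # Only top_k of n_experts actually run, so compute scales with k, not n.
    return sum(wi * (x @ experts[i]) for wi, i in zip(w, top))

y = moe_forward(rng.standard_normal(d_model))
print(y.shape)  # (16,)
```

The design choice the transcript describes is exactly this: total parameter count can grow with `n_experts` while per-token compute stays proportional to `top_k`.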

 

Training Methodology

ChatGPT: The training process involves large-scale datasets comprising internet text, followed by supervised fine-tuning and reinforcement learning from human feedback to align the model's outputs with human preferences.

DeepSeek: DeepSeek's training strategy includes pretraining on a diverse multilingual corpus, with a higher emphasis on mathematical and programming data. The model undergoes supervised fine-tuning using both human-generated and model-generated data, particularly in reasoning domains. Additionally, DeepSeek employs reinforcement learning techniques to further refine its performance.
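The two-stage alignment recipe described above (supervised fine-tuning, then reinforcement learning) can be illustrated on a toy 3-action problem in place of a language model. Everything here, including the reward values, is a made-up stand-in; real systems train a learned reward model over text and use far more elaborate RL algorithms.

```python
import numpy as np

rng = np.random.default_rng(0)

n_actions = 3
logits = np.zeros(n_actions)          # the "policy" being trained

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Stage 1: supervised fine-tuning -- push probability toward a
# demonstrated "good" action (here, action 2).
demo = 2
for _ in range(100):
    p = softmax(logits)
    grad = -p
    grad[demo] += 1.0                 # gradient of log p[demo]
    logits += 0.1 * grad

# Stage 2: reinforcement learning -- sample actions and reinforce
# those that a (pretend) human-preference reward scores highly.
reward = np.array([0.0, 0.2, 1.0])    # hypothetical preference scores
for _ in range(500):
    p = softmax(logits)
    a = rng.choice(n_actions, p=p)
    grad = -p
    grad[a] += 1.0
    logits += 0.05 * reward[a] * grad  # REINFORCE-style update

print(softmax(logits).argmax())        # the preferred action dominates
```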

 

Performance and Efficiency

ChatGPT: While delivering high-quality responses, ChatGPT's dense architecture results in higher computational costs during inference, which can impact scalability and energy efficiency.

DeepSeek: The MoE architecture of DeepSeek allows for significant computational savings, as only relevant experts are activated per task. This leads to reduced energy consumption and lower operational costs. Notably, DeepSeek has achieved performance on par with leading models like OpenAI's GPT series, but with a fraction of the training and inference costs.
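A back-of-the-envelope sketch of why activating only a few experts cuts inference cost: a dense model uses all of its parameters for every token, while an MoE model uses only its shared layers plus the routed experts. All parameter counts below are hypothetical, not the real figures for either model.

```python
# Hypothetical figures for illustration only.
dense_params = 70e9                  # dense model: all active every token

moe_total = 236e9                    # MoE: total parameters stored
moe_shared = 10e9                    # shared layers (embeddings, attention)
n_experts, top_k = 64, 6
expert_params = (moe_total - moe_shared) / n_experts
moe_active = moe_shared + top_k * expert_params   # active per token

print(f"dense active: {dense_params / 1e9:.0f}B")
print(f"MoE active:   {moe_active / 1e9:.1f}B of {moe_total / 1e9:.0f}B total")
```

With these invented numbers, the MoE model stores over three times as many parameters as the dense one yet touches fewer than half as many per token, which is the efficiency claim made above.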

 

Open-Source Availability

ChatGPT: OpenAI's models, including ChatGPT, are proprietary, with access provided through APIs and specific platforms.

DeepSeek: In contrast, DeepSeek has embraced an open-source approach, releasing models like DeepSeek-R1 to the public. This openness fosters collaborative innovation and allows for broader adoption and adaptation within the AI community.

 

Conclusion

While both DeepSeek and ChatGPT are advanced language models excelling in natural language understanding and generation, they differ in architectural design and training methodologies.

DeepSeek's MoE architecture offers a more resource-efficient alternative to the dense transformer model employed by ChatGPT, potentially democratizing access to advanced AI capabilities through its open-source model.

 

DeepSeek's Impact on the AI Industry

wired.com: "How Chinese AI Startup DeepSeek Made a Model that Rivals OpenAI" (Jan 25, 2025)

theverge.com: "Why everyone is freaking out about DeepSeek" (Jan 28, 2025)

theguardian.com: "Tuesday briefing: How an unknown Chinese startup wiped $593bn from the value of an AI giant" (Jan 27, 2025)

最西边的岛上
Good morning, Brother Fang (方兄), and happy New Year! I haven't used DS yet, but the DS news seems to change daily.
最西边的岛上
You've probably seen the discussions in 投坛 & 几坛. To me, DS's bias is a killer, if what others found is true. Not sure about software piracy now.
方外居士
Microsoft makes Deepseek’s R1 model available on Azure AI a
最西边的岛上
Just checked: MSFT stock has dropped 6+% so far today. Wow!
方外居士
Microsoft and Meta Have a DeepSeek Strategy: Copy and Surpass
暖冬cool夏
potentially democratizing access to advanced AI capabilities
暖冬cool夏
Well said! I downloaded DeepSeek last night too!
妖妖灵
DS wiped out $593bn. As the saying goes, "subdue the enemy without fighting."
妖妖灵
I haven't downloaded it yet. When would I use it?
暖冬cool夏
You can use it anytime: when you have a question, or when you're unsure about something :) Just use it the way you'd use ChatGPT, I think. Happy New Year, 妖mm!
方外居士
If that's really the case, this bubble is quite big.
暮子默
I downloaded DS, but lately it's been pretty slow.

I heard it was targeted by malicious cyberattacks. 

BTW, interesting findings! Thank you for sharing:-)