DeepLearning.AI’s Post

How DeepSeek uses a Mixture of Experts architecture and other training techniques to outperform more expensive models 👇 https://hubs.la/Q0347b7q0

DeepSeek-V3 Redefines LLM Performance and Cost Efficiency

deeplearning.ai

But it has copied, or maybe been trained on, ChatGPT data. I am not sure whether this is a bad copy/paste by a DeepSeek engineer or just an anomaly from shared training data, but it actually provides references to OpenAI/ChatGPT documentation for something it should just answer itself or refer to its own documentation for. I asked it to provide instructions for downloading the chat conversation, and one of the options literally has ChatGPT written in it and provides a link to OpenAI documentation. So I am a bit skeptical of DeepSeek.

DeepSeek is really pushing the boundaries with Mixture of Experts! Exciting to see how these techniques make models more efficient. 

Lindsay Richman

Founder, Innerverse AI | McKinsey Alum | Google for Startups | VentureBeat Top Woman in AI

1d

DeepLearning.AI I'm sorry, but there's no way that something like this will replace tool use. Why is that? Because with tool use, especially with memory management in the loop, you can extend the initial training data significantly. I have no idea how you would deal with the complexity of non-linearity otherwise. This is absolutely relevant in engineering, and I've actually advocated for DeepSeek to have tool use, because it will often say that it needs to research things and cannot do so without access to search and scrape or extract capabilities. You can literally give this model echolocation-type abilities with something like Firecrawl. This also calls into question your interpretation of open versus closed models. I realize that you mean open as in open source, but it might actually be more useful to think of open as describing something that has tools and can access data beyond its initial training set in real time.

R O.

Lead Full Stack Software Developer | Cloud Computing | AWS Certified Solutions Architect in Progress | Innovating with Quantum Computing & AI | Creator of the New Balance Shoe Finder

2d
Himanshu Sharma

AI Evangelist | Digital Transformation Architect | Product Leader

1d

Very informative

David K.

🚀 LLMs & NLP Innovator | AI & Big Data Engineering Leader | Python Back-end Expert | 15+ Years in Tech | Speaker & Mentor

2d

Love this
