Shares of DeepSeek's rival doubled in their debut as Chinese AI companies rush to list on the stock market, reflecting strong investor interest in the sector.
A travel writer tested five AI chatbots to plan a family trip to South Dakota, finding DeepSeek the most effective for itinerary planning, while Google Gemini excelled at mapping, and others like ChatGPT and Microsoft Copilot were less practical.
Nvidia's Jensen Huang criticizes OpenAI's deal with AMD, highlighting an industry shift toward diversified AI hardware supply chains. Reflection AI raises $2 billion to develop AI training software, challenging the prevailing bet on massive infrastructure spending. Meanwhile, Amazon's Prime Day savings prove minimal, and AI chatbot usage among children continues to grow. Tesla unveils lower-cost models with feature downgrades, disappointing some analysts as the EV market grapples with pricing and feature trade-offs.
DeepSeek's reported $294,000 training cost is misleading; the actual cost to train their base model was around $5.87 million, with the lower figure referring only to a specific reinforcement learning phase, not the entire training process. The article clarifies misconceptions about the expenses involved in developing large AI models and compares DeepSeek's efforts to Western counterparts like Meta's Llama 4.
DeepSeek, a Chinese AI developer, revealed in a peer-reviewed article that it spent only $294,000 to train its R1 model using 512 Nvidia H800 chips, a significantly lower cost than US rivals, sparking renewed debate over China's role in the AI industry and raising questions about the technology and costs involved in AI development.
DeepSeek's R1 AI model, designed for reasoning tasks like math and coding, was trained for roughly $294,000 using reinforcement learning, without relying on outputs copied from other LLMs, and has undergone peer review, setting a new standard for transparency and innovation in AI development.
The article discusses how a Chinese kidney transplant patient, overwhelmed by the healthcare system, relies on an AI chatbot called DeepSeek for medical advice, highlighting both its benefits in providing accessible, empathetic support and the risks of inaccuracies and over-reliance on AI in medical care.
OpenAI's recent decision to release open-source versions of its models marks a significant shift in U.S. AI strategy, driven by China's rapid open-source AI development and competition. Chinese companies like DeepSeek, Baidu, and Tencent are embracing open-source to foster innovation and demonstrate technological prowess, challenging the traditional proprietary approach of U.S. firms. This move reflects a broader geopolitical and economic contest, with the U.S. potentially falling behind in AI leadership as open-source models gain prominence globally.
China showcased its AI ambitions at the World Artificial Intelligence Conference in Shanghai, emphasizing open-source models and international cooperation, contrasting with the US's focus on dominance and regulation, while highlighting advancements in robotics and AI safety concerns.
DeepSeek's use of knowledge distillation, a widely used AI technique that involves training smaller models using the outputs of larger ones, has sparked controversy but is a common practice in AI development. Originally developed in 2015 at Google to make ensemble models more efficient, distillation helps create smaller, cheaper, and faster AI models by transferring 'dark knowledge' from a teacher to a student model. It has become a fundamental tool in AI, enabling companies like Google, OpenAI, and Amazon to deploy powerful models more efficiently, and continues to be an active area of research and application.
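The core of the distillation technique described above can be sketched in a few lines: the teacher's logits are softened with a temperature so that its relative rankings over wrong answers (the "dark knowledge") carry signal, and the student is penalized for diverging from that softened distribution. This is a minimal, dependency-free illustration of the loss from Hinton et al.'s 2015 formulation, not DeepSeek's actual training code; the logit values below are made up for demonstration.

```python
import math

def softmax(logits, temperature=1.0):
    # Higher temperature spreads probability mass across classes,
    # exposing the teacher's relative ranking of wrong answers.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=4.0):
    # KL divergence between the teacher's softened distribution and the
    # student's, scaled by T^2 so gradients stay comparable across T.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return (temperature ** 2) * kl

# Hypothetical logits: a teacher confident in class 0 but still
# ranking the remaining classes informatively.
teacher = [9.0, 3.0, 1.0]
close_student = [8.5, 3.2, 1.1]   # mimics the teacher's ranking
far_student = [1.0, 1.0, 9.0]     # ignores the teacher entirely
```

In practice this term is usually mixed with an ordinary cross-entropy loss on the hard labels; a student whose softened distribution tracks the teacher's (like `close_student`) incurs a much smaller distillation loss than one that inverts the ranking.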
DeepSeek, an AI-focused offshoot of High-Flyer Capital Management, has launched the R1-Lite-Preview, a reasoning-focused large language model that rivals OpenAI's o1-preview in performance. Available through DeepSeek Chat, the model excels in logical inference and mathematical reasoning, offering transparency in its thought process. While it has not yet been released for independent analysis or API access, DeepSeek plans to make open-source versions available, continuing its tradition of supporting the open-source AI community.
DeepSeek, a Chinese AI research company, has released DeepSeek-R1, a reasoning AI model designed to rival OpenAI's o1. This model, which can self-fact-check by taking more time to process queries, performs comparably to o1 on benchmarks like AIME and MATH. However, it struggles with certain logic problems and can be easily jailbroken. DeepSeek-R1 also avoids politically sensitive topics, likely due to Chinese government regulations. The release highlights a shift in AI development towards reasoning models as traditional scaling methods face scrutiny.