oLLM is a lightweight Python library that enables large-context LLM inference on consumer GPUs by offloading model weights and the KV cache to SSD. It maintains full precision without quantization and supports models such as Qwen3-Next-80B, GPT-OSS-20B, and Llama-3, making it feasible to run large models on 8 GB GPUs for offline tasks, at the cost of lower throughput and substantial SSD storage.
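The core trick — spilling the KV cache to disk instead of holding it in GPU or CPU RAM — can be sketched with a memory-mapped file. This is a conceptual illustration only, not oLLM's actual API; the `DiskKVCache` class and all its names are hypothetical:

```python
import os
import tempfile

import numpy as np


class DiskKVCache:
    """Conceptual sketch: keep per-layer key/value tensors in an
    SSD-backed memory-mapped file, so GPU memory holds only the
    tokens currently being attended to.
    (Hypothetical helper -- not oLLM's actual API.)"""

    def __init__(self, n_layers, max_tokens, n_heads, head_dim, cache_dir=None):
        # Layout: (layer, key-or-value, token position, head, head_dim)
        self.shape = (n_layers, 2, max_tokens, n_heads, head_dim)
        self.path = os.path.join(cache_dir or tempfile.mkdtemp(), "kv_cache.bin")
        # float16 keeps the cache at model precision -- no quantization
        self.buf = np.memmap(self.path, dtype=np.float16, mode="w+", shape=self.shape)
        self.length = 0  # tokens written so far

    def append(self, layer, k, v):
        """Write one token's key/value for a layer; the OS pages it to disk."""
        self.buf[layer, 0, self.length] = k
        self.buf[layer, 1, self.length] = v

    def step(self):
        """Advance the token position after all layers have written."""
        self.length += 1

    def read(self, layer):
        """Stream back all cached keys/values for a layer during attention."""
        return self.buf[layer, 0, : self.length], self.buf[layer, 1, : self.length]
```

The throughput cost the summary mentions follows directly from this design: every attention step re-reads the cache from SSD, trading speed for the ability to hold contexts far larger than VRAM.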
OpenAI is reportedly close to releasing GPT-5 as early as August, after delays and development challenges, with the new model expected to unify various capabilities and advance towards artificial general intelligence.
A team of researchers from MIT, Cornell, and other institutions successfully trained a large language model using only ethically-sourced, publicly licensed data, challenging the industry belief that such development is impossible without vast resources. They created the Common Pile dataset, manually curated over eight terabytes of data, and trained a seven billion-parameter AI that rivals older industry models, highlighting ethical concerns around data use and copyright in AI development.
Microsoft is offering GPT-4-Turbo, the most powerful large language model from OpenAI, for free on its Copilot platform, replacing the previous GPT-4 version. This move aims to enhance user experience and keep Copilot integral to Microsoft's products and services. GPT-4-Turbo boasts a larger context window, up-to-date knowledge cut-off, and multimodal capabilities, making it a significant upgrade. The availability of this advanced AI tool for free may also hint at the imminent release of the next generation large language model from OpenAI.
Amazon is reportedly investing millions in training an ambitious large language model (LLM) codenamed "Olympus" with 2 trillion parameters, potentially making it one of the largest models being trained. Led by former head of Alexa, Rohit Prasad, the team aims to develop a model that can rival top models from OpenAI and Alphabet. Amazon believes that having homegrown models could enhance its offerings on Amazon Web Services (AWS) and attract enterprise clients seeking top-performing models. While there is no specific timeline for the release of the new model, Amazon has already trained smaller models and partnered with AI startups.
Amazon's generative AI leader, Rohit Prasad, describes the new large language model (LLM) powering Alexa as a "super agent" that is now integrated with thousands of devices and services. Prasad emphasizes that Alexa is not just a chatbot but a utility that performs useful tasks in the real world. He refutes criticisms that Alexa is "dumb" and highlights the personal context integration that makes the LLM smarter. Prasad also addresses concerns about data privacy, stating that transparency and customer permission are paramount. While excited about Alexa's capabilities, he reminds users not to forget that Alexa is, ultimately, an AI.
Consulting firm EY has completed a $1.4 billion investment in artificial intelligence, unveiling EY.ai, a platform built around its own large language model, EYQ. The company plans to train its 400,000-person workforce on AI and will focus future spending on refining the new platform.
Reliance Industries' Jio Platforms has partnered with Nvidia to develop a large language model trained on India's diverse languages. The collaboration aims to build an AI infrastructure significantly more powerful than India's fastest supercomputer, providing accelerated computing access to researchers, developers, startups, and AI experts in India. Nvidia will equip Jio with AI supercomputer solutions, while Jio will manage the AI cloud infrastructure. Despite its large population and deep engineering talent, India has yet to make a significant impact in the global AI arena, and this partnership seeks to change that. Reliance Industries, known for its oil business, has been diversifying into sectors including telecom and video streaming, and Jio Platforms is positioning itself as a technology distribution partner for global giants.
Open source developers may win the generative AI market battle, as recent progress in the community has put generative AI within reach of any AI-savvy developer. The leak of Meta's LLaMA (Large Language Model Meta AI) spurred an avalanche of innovation from the open source community, with enhancements such as instruction tuning, quantization, and quality improvements developed in quick succession. A cheap fine-tuning mechanism known as low-rank adaptation (LoRA) has significantly reduced the barrier to entry for training and experimentation, enabling individuals to personalize a language model in a few hours on consumer hardware. This rapid innovation, combined with the lack of usage restrictions, makes open source AI models an attractive alternative for many users.
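Why LoRA is so cheap follows from its math: instead of updating a full weight matrix W, it trains a low-rank correction ΔW = (α/r)·B·A, where A and B have only r rows/columns each. A minimal NumPy sketch of the idea (shapes and hyperparameters here are illustrative, not from any particular implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 64, 64, 8, 16  # r << d is the low-rank bottleneck

W = rng.standard_normal((d_out, d_in))     # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))                   # trainable up-projection, init to zero


def lora_forward(x):
    """y = W x + (alpha/r) * B (A x); only A and B receive gradients."""
    return W @ x + (alpha / r) * (B @ (A @ x))


x = rng.standard_normal(d_in)
# With B initialized to zero, the adapted model starts out identical to the base.
assert np.allclose(lora_forward(x), W @ x)

# Trainable parameters: r*(d_in + d_out) for LoRA vs d_in*d_out for full fine-tuning.
full_params = d_in * d_out          # 4096
lora_params = r * (d_in + d_out)    # 1024 -- a quarter of the full count here
```

At realistic model sizes the ratio is far more dramatic — with d in the thousands and r around 8–64, the trainable parameter count drops by orders of magnitude, which is what lets consumer hardware handle the job.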
The open-source AI debate is heating up in Big Tech, with recent headlines from Google and Meta. Google's newest large language model (LLM) PaLM 2 uses nearly five times more text data for training than its predecessor, but the company has been unwilling to publish the size or other details of its training data. Meanwhile, Meta's chief AI scientist Yann LeCun argues that the growing secrecy at Google and OpenAI is a "huge mistake" and a "really bad take on what is happening." However, Meta also believes that some levels of openness go too far, and accountability and transparency in AI models are essential.
OpenAI and Google may face a threat from rapidly multiplying open source projects that push the state of the art and leave the deep-pocketed but unwieldy corporations in their dust. The head start they’ve gained with funding and infrastructure is looking slimmer by the day. The business paradigm being pursued by OpenAI and others right now is a direct descendant of the SaaS model. Google should establish itself as a leader in the open source community, taking the lead by cooperating with, rather than ignoring, the broader conversation.
OpenAI has released GPT-4, an upgrade to its large language model technology that can generate longer strings of text and respond to images. GPT-4 is more reliable and creative than its predecessor, GPT-3.5, better at handling nuanced instructions, and better at avoiding AI pitfalls such as hallucinations. It can also accept input that combines text and photos. Microsoft is using GPT-4 for its Bing search engine, posing a major search threat to Google, which has its own large language model technology.