Amazon is developing smart glasses for delivery drivers that use AI and computer vision to provide real-time navigation, hazard detection, and proof of delivery, aiming to make last-mile deliveries safer and more efficient while offering a hands-free experience.
Apple is close to acquiring Prompt AI, a startup specializing in computer vision, taking on its team and technology, including its flagship product Seemour, a security-camera AI. The deal is structured as an acqui-hire: Prompt's employees have been informed and offered roles at Apple, while the Seemour app will be discontinued. The move fits a broader trend of tech giants absorbing AI talent and technology to expand their capabilities while sidestepping regulatory scrutiny.
A researcher has developed a low-cost air quality estimation system using a microcontroller with a camera and a trained AI model to analyze sky images, offering a novel approach that complements traditional sensors, though with some limitations in accuracy.
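For a sense of what on-device inference in such a setup might look like, here is a minimal sketch using TensorFlow Lite; the model file sky_aqi.tflite, the AQI band labels, and the input scaling are assumptions for illustration, not the researcher's actual artifacts.

```python
# Hypothetical sketch: estimate an air-quality band from a sky photo with a
# small TFLite model. Model file and labels are assumptions, not the
# researcher's actual artifacts.
import numpy as np
from PIL import Image
from tflite_runtime.interpreter import Interpreter

LABELS = ["good", "moderate", "unhealthy"]  # assumed AQI bands

interpreter = Interpreter(model_path="sky_aqi.tflite")  # assumed model file
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Resize the sky crop to the model's input size and scale to [0, 1].
h, w = inp["shape"][1], inp["shape"][2]
img = Image.open("sky.jpg").resize((w, h))
x = np.expand_dims(np.asarray(img, dtype=np.float32) / 255.0, axis=0)

interpreter.set_tensor(inp["index"], x)
interpreter.invoke()
probs = interpreter.get_tensor(out["index"])[0]
print(LABELS[int(np.argmax(probs))], float(np.max(probs)))
```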
Velo AI has developed Copilot, a Raspberry Pi-powered bike light equipped with AI and machine learning to detect cars, alert cyclists about approaching vehicles, issue warnings to drivers, and record incidents. Priced at $400, the device aims to enhance cyclist safety and is being used in a partnership with Pittsburgh to analyze data for potential road safety improvements.
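Velo AI has not published its pipeline, but a rough sketch of a detection loop on a Raspberry Pi could use OpenCV's DNN module with an off-the-shelf MobileNet-SSD; the model files, classes of interest, and threshold below are assumptions.

```python
# Minimal sketch of on-device vehicle detection for a rear-facing bike
# camera, using OpenCV's DNN module with a MobileNet-SSD model (VOC
# classes). Velo AI's actual pipeline is not public.
import cv2

VOC = ["background", "aeroplane", "bicycle", "bird", "boat", "bottle",
       "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse",
       "motorbike", "person", "pottedplant", "sheep", "sofa", "train",
       "tvmonitor"]

net = cv2.dnn.readNetFromCaffe("MobileNetSSD.prototxt", "MobileNetSSD.caffemodel")
cap = cv2.VideoCapture(0)  # rear-facing camera

while True:
    ok, frame = cap.read()
    if not ok:
        break
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)), 0.007843,
                                 (300, 300), 127.5)
    net.setInput(blob)
    detections = net.forward()
    for i in range(detections.shape[2]):
        conf = detections[0, 0, i, 2]
        label = VOC[int(detections[0, 0, i, 1])]
        if conf > 0.5 and label in ("car", "bus", "motorbike"):
            # The real device would modulate its light and sound an alert.
            print(f"approaching vehicle: {label} ({conf:.2f})")
```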
Researchers have created the largest-ever dataset of biological images, containing over 10 million images of plants, animals, and fungi, and trained a new vision-based AI tool called BioCLIP to learn from it. The model, which outperformed existing approaches, can classify images and distinguish species across the tree of life, and the fine-grained image representations it learns make it a valuable tool for a wide range of biological studies.
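As a rough illustration of how a CLIP-style model like BioCLIP is used for zero-shot species classification: the hf-hub model id below is taken from the public release, but treat it and the candidate labels as assumptions.

```python
# Sketch of zero-shot species classification with a CLIP-style model, in
# the spirit of BioCLIP.
import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "hf-hub:imageomics/bioclip")  # public release id; verify before use
tokenizer = open_clip.get_tokenizer("hf-hub:imageomics/bioclip")

image = preprocess(Image.open("specimen.jpg")).unsqueeze(0)
labels = ["a photo of Quercus robur", "a photo of Amanita muscaria",
          "a photo of Danaus plexippus"]
text = tokenizer(labels)

with torch.no_grad():
    img_f = model.encode_image(image)
    txt_f = model.encode_text(text)
    img_f /= img_f.norm(dim=-1, keepdim=True)
    txt_f /= txt_f.norm(dim=-1, keepdim=True)
    probs = (100.0 * img_f @ txt_f.T).softmax(dim=-1)

print(labels[int(probs.argmax())])
```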
A study by MIT found that artificial intelligence is currently too expensive to effectively replace the majority of jobs, with only 23% of tasks being cost-effective to automate using AI-assisted visual recognition. The study, funded by the MIT-IBM Watson AI Lab, examined over 1,000 visually-assisted tasks across 800 occupations and concluded that while the adoption of AI in industries like retail and healthcare is feasible, it is less so in areas like construction and real estate. The researchers suggested that the cost-benefit ratio of AI could improve by 2030 if data costs fall and accuracy improves, but concerns about AI's impact on jobs persist, with industry leaders cautioning against a recklessly fast AI rollout.
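The study's core test reduces to comparing the annualized cost of a vision system against the wages of the task it would displace; a back-of-envelope version, with invented numbers rather than figures from the MIT paper:

```python
# Back-of-envelope version of the cost test: automate only when the
# annualized cost of a vision system falls below the wages it displaces.
# All numbers are invented for illustration.
def worth_automating(system_cost, lifetime_years, annual_upkeep,
                     worker_wage, task_share_of_job):
    annual_ai_cost = system_cost / lifetime_years + annual_upkeep
    displaced_wages = worker_wage * task_share_of_job
    return annual_ai_cost < displaced_wages

# Example: a $165k system over 5 years with $10k/yr upkeep, replacing a
# visual-inspection task that is 20% of a $45k/yr job.
print(worth_automating(165_000, 5, 10_000, 45_000, 0.20))  # False
```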
Apple has released a research paper detailing Human Gaussian Splats (HUGS), a generative AI technique that can create a digital human avatar from a short video in about 30 minutes. HUGS uses 3D Gaussian Splatting to disentangle the static scene from the person and produce a fully animatable avatar, filling in unobserved elements like cloth and hair. The process is about 100 times faster than comparable methods and was developed in collaboration with the Max Planck Institute for Intelligent Systems. Apple has been exploring digital avatars for various applications, including FaceTime conversations and representing users in virtual environments.
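HUGS builds on 3D Gaussian Splatting, in which a scene or person is represented as a cloud of anisotropic 3D Gaussians; a generic sketch of the per-Gaussian parameterization (the standard formulation, not Apple's implementation) looks like this:

```python
# Each splatting primitive stores a position, a rotation, per-axis scales,
# opacity, and color; its covariance is composed as R @ S @ S.T @ R.T.
import numpy as np

def quat_to_rot(q):
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def covariance(quat, scales):
    R = quat_to_rot(np.asarray(quat, dtype=float))
    S = np.diag(scales)
    return R @ S @ S.T @ R.T

# One Gaussian: position, orientation, anisotropic scale, opacity, color.
gaussian = {
    "mean": np.array([0.0, 1.6, 0.2]),
    "cov": covariance([1.0, 0.0, 0.0, 0.0], [0.02, 0.05, 0.02]),
    "opacity": 0.9,
    "color": np.array([0.8, 0.6, 0.5]),
}
print(gaussian["cov"])
```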
Alexei Efros, a computer scientist at the Berkeley Artificial Intelligence Research Lab, discusses the history of computer vision for AI, his breakthroughs in the field, and the impact of his work on everyday life. He explains how his poor eyesight proved advantageous and emphasizes the importance of addressing human bias in contemporary AI.
Former Google Nest engineers have developed a new robot vacuum called Matic, which uses visual processing instead of spatial mapping to navigate. Equipped with five RGB cameras, Matic can see in real-time and avoid common obstacles like high-pile rugs and cables. It operates locally without a cloud component, ensuring data privacy. Matic can vacuum and mop, and its computer vision allows it to identify different floor types and switch between cleaning modes. It can respond to gesture commands and autonomously seek out dirty areas. The vacuum is designed to be quiet, has a one-liter capacity bin, and can handle wet spills. Matic is available for pre-order at a discounted price of $1,495 and is set to be delivered in March 2024.
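Matic's stack is proprietary, but the floor-type-to-cleaning-mode idea can be sketched generically; the ONNX classifier, labels, and mode table below are stand-ins, not Matic's models.

```python
# Hypothetical sketch of vision-driven mode switching: classify the floor
# type visible to a camera, then pick a cleaning mode.
import cv2
import numpy as np

FLOOR_LABELS = ["hardwood", "tile", "low_pile_rug", "high_pile_rug"]  # assumed
MODES = {"hardwood": "mop", "tile": "mop",
         "low_pile_rug": "vacuum", "high_pile_rug": "vacuum_high"}

net = cv2.dnn.readNetFromONNX("floor_classifier.onnx")  # assumed model

def floor_mode(frame):
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (224, 224)),
                                 scalefactor=1 / 255.0, size=(224, 224))
    net.setInput(blob)
    label = FLOOR_LABELS[int(np.argmax(net.forward()))]
    return label, MODES[label]

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
if ok:
    print(floor_mode(frame))
```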
MIT engineers have developed a method that uses computer vision and machine learning to remotely evaluate the motor function of patients, specifically those with cerebral palsy. By analyzing videos of patients in real time and detecting patterns of poses, the method assigns a clinical score of motor function. Tested on videos of more than 1,000 children with cerebral palsy, its scores agreed with clinicians' in-person assessments more than 70% of the time. The team envisions patients recording videos of themselves at home on their mobile devices, which can then be analyzed and sent to a doctor for review. The method is also being adapted to evaluate other neurological disorders.
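The general recipe, pose keypoints extracted from video and fed to a scoring model, can be sketched with MediaPipe Pose as a stand-in estimator; the scoring step at the end is a placeholder, not MIT's trained network.

```python
# Sketch: extract a per-frame pose-keypoint sequence from a patient video.
# A trained sequence model would then map this tensor to a clinical score.
import cv2
import mediapipe as mp
import numpy as np

pose = mp.solutions.pose.Pose(static_image_mode=False)
cap = cv2.VideoCapture("patient_video.mp4")

frames = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    res = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if res.pose_landmarks:
        frames.append([(lm.x, lm.y, lm.z)
                       for lm in res.pose_landmarks.landmark])

keypoints = np.array(frames)  # shape: (num_frames, 33 landmarks, 3)
print(keypoints.shape)  # this is the input a scoring model would consume
```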
Researchers have used machine-learning-based computer vision to analyze nanoscale X-ray movies of lithium-ion battery electrodes and extract information pixel by pixel. This method has revealed physical and chemical details of battery cycling that were previously unseen. The study has already identified a way to improve the efficiency of lithium-ion batteries by controlling the thickness of the carbon coating on electrode particles. The findings have implications for battery design and could lead to the development of better batteries faster.
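One simple, hypothetical version of pixel-by-pixel analysis is fitting each pixel's intensity trace across the movie to map local reaction rates; the data file and the linear-fit choice below are assumptions, not the authors' actual pipeline.

```python
# Illustrative sketch: fit each pixel's intensity over time across an
# X-ray image stack to produce a per-pixel rate-of-change map.
import numpy as np

movie = np.load("electrode_movie.npy")  # assumed shape: (time, height, width)
t = np.arange(movie.shape[0])

# Flatten space, then fit intensity ~ a*t + b for every pixel in one
# least-squares call (lstsq supports multiple right-hand sides).
flat = movie.reshape(movie.shape[0], -1)
A = np.vstack([t, np.ones_like(t)]).T
coeffs, *_ = np.linalg.lstsq(A, flat, rcond=None)
rate_map = coeffs[0].reshape(movie.shape[1:])  # per-pixel slope

print("fastest-reacting pixel:",
      np.unravel_index(np.argmax(rate_map), rate_map.shape))
```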
Hoani Bryson has created LazerPaw, an autonomous laser turret for cats to chase. The device uses computer vision and a Raspberry Pi to detect cats and move the laser away from them. It also includes an infrared camera for use in the dark, NeoPixels for added aesthetics, and a physical start button for offline use. The processed images are sent to a website for remote cat playtime.
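A plausible sketch of the core loop, assuming OpenCV's bundled cat-face Haar cascade and a pan servo driven through gpiozero; Bryson's actual code may differ.

```python
# Detect a cat in the camera frame and steer the laser away from it, so
# the dot is something to chase rather than shining at the cat.
import cv2
from gpiozero import AngularServo

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalcatface.xml")
pan = AngularServo(17, min_angle=-90, max_angle=90)  # assumed GPIO pin
cap = cv2.VideoCapture(0)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    cats = cascade.detectMultiScale(gray, 1.1, 3)
    for (x, y, w, h) in cats:
        cat_center = x + w / 2
        frame_center = frame.shape[1] / 2
        # Pan toward whichever side of the frame the cat is NOT on.
        pan.angle = 45 if cat_center < frame_center else -45
```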
Researchers from Chongqing University have developed a new system that uses deep learning models to design nanohole arrays that produce specific structural colors. The models, CSC and CSS, accurately predicted the colors of the arrays, and the predictions were realized experimentally. The scalability of the method shows promise for handling larger datasets and implementing complex structures in different materials. This research has implications for applications such as high-density storage and plasmonics.
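A generic sketch of such a forward "structure to color" model is a small MLP from nanohole geometry to an RGB color; the architecture and input features below are illustrative, not the paper's CSC/CSS networks.

```python
# Illustrative forward model: map nanohole geometry to a predicted color.
import torch
import torch.nn as nn

class StructureToColor(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),    # inputs: diameter, pitch, depth
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 3), nn.Sigmoid()  # outputs: RGB in [0, 1]
        )

    def forward(self, geometry):
        return self.net(geometry)

model = StructureToColor()
geometry = torch.tensor([[120.0, 300.0, 80.0]]) / 1000.0  # nm, normalized
print(model(geometry))  # predicted color (untrained, for illustration)
```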
Researchers have developed SDS-Complete, a point cloud completion technique that leverages pre-trained text-to-image diffusion models to fill in missing parts. Traditional methods struggle to complete point clouds of objects not seen in the training set, but SDS-Complete combines prior knowledge from diffusion models with the observed partial point cloud to generate accurate, realistic 3D shapes. The method uses a Score Distillation Sampling (SDS) loss and a Signed Distance Function (SDF) surface representation to stay consistent with the input points and preserve existing 3D content captured by different depth sensors.
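A condensed sketch of the SDF side of such an objective: observed points are pinned to the zero level set and an eikonal term keeps the field a valid distance function; the SDS term from the diffusion prior, which supplies gradients for the unseen regions, is elided here.

```python
# SDF consistency losses for point cloud completion (illustrative only;
# the paper's SDS loss from a pretrained diffusion model is omitted).
import torch
import torch.nn as nn

sdf = nn.Sequential(nn.Linear(3, 256), nn.Softplus(),
                    nn.Linear(256, 256), nn.Softplus(),
                    nn.Linear(256, 1))

observed = torch.rand(1024, 3, requires_grad=True)  # partial cloud (stand-in)

# Data term: observed surface points should lie on the SDF zero level set.
loss_surface = sdf(observed).abs().mean()

# Eikonal term: |grad SDF| should be ~1 for a valid distance field.
grad = torch.autograd.grad(sdf(observed).sum(), observed, create_graph=True)[0]
loss_eikonal = ((grad.norm(dim=-1) - 1.0) ** 2).mean()

loss = loss_surface + 0.1 * loss_eikonal
# + an SDS loss on renderings of the SDF would be added here to
#   hallucinate plausible geometry for the missing parts.
loss.backward()
print(float(loss))
```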