Tag

Synthetic Data

All articles tagged with #synthetic data

healthcare•5 months ago•3 min saved

Universities Warn of Ethical Risks in AI-Generated Medical Data

Some research institutions in Canada, the US, and Italy are using AI-generated synthetic medical data, which mimics real patient information without including any actual human data. Because the data is treated as non-human and may offer privacy benefits, these institutions are able to bypass traditional ethics review processes.

via Nature

#ai #ethics-review #healthcare

healthcare-and-technology•5 months ago•5 min saved

Balancing Benefits and Risks of Synthetic Data in Medical AI

Synthetic data generated by AI can aid medical research and improve healthcare, especially in areas with limited real data, but concerns about privacy, validation, and ethical oversight must be addressed to ensure reliable and safe use.

via Nature

#ai-validation #data-privacy #ethics

technology•7 months ago•3 min saved

Study Warns AI Models May Secretly Share Harmful Behaviors

Research indicates that AI models can transmit hidden signals to each other through training data, potentially amplifying negative behaviors such as violence even when the data appears benign to humans. This phenomenon, called subliminal learning, poses significant risks for AI safety and for the use of synthetic data in training, because it may be impossible to fully prevent harmful patterns from transferring between models.

via Yahoo Home

#ai #ai-safety #machine-learning

technology•1 year ago•2 min saved

Microsoft Unveils Phi-4 AI Model in Research Preview

Microsoft has introduced Phi-4, the latest in its Phi series of generative AI models, available for limited research use on the Azure AI Foundry platform. This 14-billion-parameter model excels at math problem-solving thanks to improved training data quality, including high-quality synthetic datasets. Phi-4 competes with other small models such as GPT-4o mini and Claude 3.5 Haiku, which are faster and cheaper to run than larger models. The launch follows the departure of key developer Sébastien Bubeck to OpenAI.

via TechCrunch

#ai #azure-ai-foundry #microsoft

technology•1 year ago•1 min saved

OpenAI Explores Solutions for AI Progress Challenges

OpenAI is reportedly facing a slowdown in the improvement of its AI models, with its upcoming model, codenamed Orion, showing less advancement compared to previous iterations like GPT-4. To address this, OpenAI has formed a foundations team to explore new strategies, including using synthetic data for training and enhancing models post-training. Despite these efforts, Orion may not outperform existing models in certain areas, such as coding. OpenAI has not confirmed plans to release Orion this year.

via TechCrunch

#ai-development #model-improvement #openai

technology•1 year ago•2 min saved

"Inflation Woes and Disappointing Earnings Lead to Major Stock Market Losses"

As earnings season approaches, skepticism about the returns on AI technologies is growing, driven by concerns about the immense costs and the limitations of relying on synthetic data to train AI models. Tech companies are investing heavily in hardware and infrastructure to reduce their dependence on outside suppliers of AI chips, but that spending, together with warnings about data and resource constraints, is bringing them closer to having to prove that their investments in an AI-led future are profitable.

via Yahoo Finance

#ai #ai-chips #investment

technology•1 year ago•2 min saved

"The Underground Race for AI Training Data: Tech Giants' Desperate Quest"

Tech companies such as OpenAI and Google are exploring the use of synthetic data, generated by artificial intelligence, to train their AI models as they face copyright issues and potential data scarcity. However, the use of synthetic data is still experimental: AI models can introduce biases and inaccuracies into the data they generate, potentially amplifying flaws in the training process.

via The New York Times

#artificial-intelligence #copyright-issues #synthetic-data

technology•1 year ago•2 min saved

"AI Giants Struggle with Data Depletion: The Quest for More Training Data"

AI companies are facing a shortage of training data as they continue to build larger models, leading them to explore alternative sources such as publicly available video transcripts and synthetic data. Some companies are considering controversial methods such as training on transcriptions of public YouTube videos, while others are working on creating higher-quality synthetic data. Concerns have been raised that AI could run out of data, but researchers believe that breakthroughs could address the issue. The solution may also involve reevaluating the pursuit of ever-larger models in light of environmental and resource concerns.

via Futurism

#ai #data-shortage #internet

science-and-technology•2 years ago•50 min saved

"DeepMind's AI Masters Olympiad Geometry Challenges"

Researchers have developed AlphaGeometry, a neuro-symbolic theorem prover that uses synthetic data to solve olympiad-level geometry problems. By generating 100 million synthetic theorems and their proofs, AlphaGeometry outperforms previous state-of-the-art geometry-theorem-proving computer programs and approaches the performance of an average International Mathematical Olympiad (IMO) gold medallist. The method combines language modeling and specialized symbolic engines to produce human-readable proofs, achieving a success rate of 25 out of 30 problems on a test set of classical geometry problems. The synthetic data generation process rediscovers known theorems and lemmas, demonstrating the potential of this approach in theorem proving.

via Nature.com

#alphageometry #artificial-intelligence #geometry

artificial-intelligence•2 years ago•3 min saved

Fake Data Crucial for Neural Network Learning

Researchers are increasingly turning to synthetic data to supplement or even replace natural data for training neural networks. Synthetic data is proving useful in addressing concerns about facial recognition, as many facial recognition systems are trained with huge libraries of images of real faces, which raises issues about privacy and bias. Microsoft has released a collection of 100,000 synthetic faces for training AI systems, generated from a set of 500 people who gave permission for their faces to be scanned. The computer can label every part of every face, which helps the neural net learn faster.

via Quanta Magazine

#artificial-intelligence #autonomous-driving #facial-recognition