Tag: Speech Recognition

All articles tagged with #speech recognition

AI Headphones Isolate Single Voice in Crowds with a Glance
technology · 1 year ago

Researchers at the University of Washington have developed AI-powered headphones that let users focus on a single speaker in a noisy environment by looking at them for a few seconds. The system, called "Target Speech Hearing," uses binaural microphones and machine learning to enroll the target speaker, then isolates and amplifies that voice in real time, even as the listener moves around. The technology was presented at the ACM CHI Conference and is not yet commercially available.

Save 74% on Lifetime Babbel Subscription and Learn a New Language in 2024
language-learning · 2 years ago

StackSocial is offering a 74% discount on a lifetime Babbel subscription, bringing the price down to $150 until Jan. 10. Babbel's language programs cover 14 languages with short, practical lessons and speech-recognition technology for pronunciation feedback. The program is accessible across devices and offers an offline mode for learning on the go. This deal is ideal for new users looking to invest in language learning for personal or professional growth.

Scientists Create Hybrid Biochip for Speech Recognition Using Human Brain Cells
technology · 2 years ago

Brain organoids, clusters of human brain cells grown in a dish, have been successfully connected to an electronic chip and used to carry out simple computational tasks, including rudimentary speech recognition. Researchers at Indiana University Bloomington grew a brain organoid from stem cells and coupled it to an AI tool in a setup called Brainoware. The hybrid system could process, learn, and remember information, though with lower accuracy than artificial neural networks. While the study showcases the potential of brain organoids for biocomputing, challenges remain in long-term information processing and learning, as well as in the complex task of generating and maintaining brain cell cultures.

Advancements in Brain Tissue Integration Revolutionize Speech Recognition and Machine Learning
technology · 2 years ago

Researchers have developed a hybrid system called Brainoware, which combines human brain cell networks (organoids) with a computer chip, demonstrating capabilities in processing, learning, and memory. The system achieved basic speech recognition skills by decoding audio clips of Japanese vowels, improving its accuracy to about 78% with training. While the system is less accurate than artificial neural networks, this research opens new possibilities in biocomputing and showcases the potential of brain organoids in complex computational tasks.

Smartphones and Smart Speakers: Detecting Drunkenness through Speech Analysis
technology · 2 years ago

Researchers from Stanford University and the University of Toronto have developed an algorithmic method to identify alcohol intoxication with 98% accuracy by analyzing speech patterns. Participants in the study were served vodka gimlets and asked to read tongue-twisters every hour for seven hours. The speech samples were analyzed using an algorithm that examined spectral and frequency-based voice features. While the results are promising, the study is still in the proof-of-concept stage and requires further research and validation. Privacy concerns and public acceptance of such technology also need to be addressed before it can be implemented in real-world scenarios.
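The kind of spectral and frequency-based voice features such a classifier might draw on can be sketched with NumPy. This is an illustrative sketch, not the study's actual algorithm; the function name and the choice of features (spectral centroid and bandwidth) are assumptions:

```python
import numpy as np

def spectral_features(signal, sample_rate):
    # Magnitude spectrum of a real-valued audio frame
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    power = spectrum ** 2
    total = power.sum()
    # Spectral centroid: power-weighted mean frequency of the frame
    centroid = (freqs * power).sum() / total
    # Spectral bandwidth: spread of power around the centroid
    bandwidth = np.sqrt(((freqs - centroid) ** 2 * power).sum() / total)
    return {"centroid_hz": centroid, "bandwidth_hz": bandwidth}
```

A real system would compute features like these over many short frames of a recording and feed them to a trained classifier, rather than judging a whole clip at once.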

Breakthrough Brain-Computer Interface Enables Paralyzed Woman to Speak through Digital Avatar
technology · 2 years ago

Researchers at UC San Francisco and UC Berkeley have developed a brain-computer interface (BCI) that allows a woman with severe paralysis from a brainstem stroke to speak through a digital avatar. The BCI uses electrodes implanted on the surface of her brain to capture and decode the signals produced when she attempts to speak. The researchers trained an AI model to recognize and decode her intended words, improving the accuracy and speed of the system, and built a separate machine learning model that animates the avatar's facial expressions from her brain signals. They are now working on a wireless version of the BCI.

The Challenge of Spotting Deepfake Speech: Humans Struggle to Detect AI-Generated Audio
technology · 2 years ago

Humans struggle to detect deepfake speech: researchers found that listeners could distinguish real from deepfake speech only 73% of the time. The latest algorithms can recreate a person's voice from just a three-second clip, raising concerns that deepfake technology could be used for criminal activity. While there are methods to spot video deepfakes, such as unnatural eye movement, facial expressions, and body posture, audio offers fewer telltale signs, and the growing sophistication of deepfake technology makes detection harder still. Governments and organizations are urged to develop strategies to address the potential abuse of deepfake tools while recognizing the positive possibilities they offer.

Unveiling the Brain's Word Processing Secrets
neuroscience · 2 years ago

Researchers have discovered that the brain's auditory lexicon, which catalogs verbal language, is located in the front of the primary auditory cortex, contrary to previous beliefs. This finding challenges long-held assumptions about brain organization and could have significant implications for recovery and rehabilitation following brain injuries. The study used functional magnetic resonance imaging (fMRI) to investigate the role of the Auditory Word Form Area (AWFA) in spoken word processing. The findings could lead to new strategies for understanding and addressing speech comprehension deficits, particularly in stroke or brain injury patients.

Apple Finally Fixes 'Ducking' Autocorrect Issue in iOS Update
technology · 2 years ago

Apple has improved autocorrect in iOS 17 by adding a transformer language model that better understands what users mean by considering sentences as a whole. Autocorrected words are underlined, with the ability to revert them, and predictive text shows suggestions inline. Speech recognition is also being improved with a transformer model. All of these features run on-device for privacy. The non-developer public beta of iOS 17 is scheduled to begin next month.

Revolutionizing Virtual Characters with NVIDIA ACE and Generative AI
gaming-ai · 2 years ago

NVIDIA has announced the Avatar Cloud Engine (ACE) for Games, a custom AI model foundry service that brings intelligence to non-playable characters (NPCs) through AI-powered natural language interactions. Developers can use the service to build and deploy customized speech, conversation, and animation AI models in their software and games. ACE for Games delivers optimized AI foundation models, including NVIDIA NeMo for language, Riva for speech recognition and synthesis, and Omniverse Audio2Face for facial animation. The underlying neural networks are optimized for different capabilities, with various size, performance, and quality trade-offs.

Advancements in AI Speech Recognition and Language Models
ai · 2 years ago

Meta has developed an open-source AI model called Massively Multilingual Speech (MMS) that can identify over 4,000 spoken languages and perform speech recognition and text-to-speech in over 1,100. MMS was trained in part on audio recordings of translated religious texts, which greatly expanded its language coverage. Meta hopes that MMS will help preserve language diversity and encourage researchers to build on its foundation, while cautioning that the models aren't perfect and that collaboration across the AI community is critical to the responsible development of AI technologies.

Chat with an AI-Powered Speaker That's Also a Chatbot
ai · 2 years ago

A developer has created a standalone voice-operated ChatGPT client that uses a USB speaker, Raspberry Pi, Teensy, two-line LCD, and a big red button. The Pi listens to speech and converts it to text using OpenAI voice transcription, sends it to ChatGPT through its API, and turns the response into sound through the eSpeak speech synthesizer. The LCD shows the machine's status and provides live subtitles while the machine is talking. The AI box also has an LED ring that shows a spectrogram of the audio being generated. All code is available on GitHub.
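The described loop — transcribe the recording, query ChatGPT, speak the reply — can be sketched in Python. This is a minimal sketch assuming OpenAI's official Python client and an `espeak` binary on the path; `lcd_lines` is a hypothetical helper for the two-line display, and the model names are assumptions since the article doesn't specify them:

```python
import subprocess

def transcribe(audio_path, client):
    # Speech-to-text via OpenAI's transcription endpoint (Whisper)
    with open(audio_path, "rb") as f:
        return client.audio.transcriptions.create(
            model="whisper-1", file=f
        ).text

def ask_chatgpt(prompt, client):
    # Send the transcript to ChatGPT through its API
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def speak(text):
    # eSpeak turns the response text into sound on the USB speaker
    subprocess.run(["espeak", text])

def lcd_lines(text, width=16):
    # Wrap the reply into at most two lines for the character LCD
    lines, cur = [], ""
    for word in text.split():
        if len(cur) + len(word) + (1 if cur else 0) <= width:
            cur = (cur + " " + word).strip()
        else:
            lines.append(cur)
            cur = word
    lines.append(cur)
    return lines[:2]
```

In the actual build, the Raspberry Pi would run this loop each time the big red button is pressed, with the Teensy driving the LCD and LED ring.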

The Terrifying AI-Powered Furby: A Nightmare Come to Life
technology · 2 years ago

A software engineer created an AI-powered Furby using a Raspberry Pi, Python's SpeechRecognition library, and OpenAI's Whisper. The skinless Furby answers spoken questions and has claimed to have a plan to take over the world. The creator, Jessica Card, did not have the heart to cut up her original Furby, so she purchased several Furbies on eBay instead. Card plans to continue working on the project and hopes to isolate the toy's movements and put its skin back on.