
# 🎙️ Human Voice Emotion Prediction System

## 🔍 Overview

This project is a deep learning-based system that predicts emotions from human speech samples. It converts audio input into Mel spectrograms, extracts key features with a Convolutional Neural Network (CNN), and classifies each sample into one of the emotion categories learned from labeled speech data. Potential applications include interactive systems and mental health monitoring.
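
As a rough illustration of the preprocessing and spectrogram step, the sketch below loads a clip with Librosa, resamples and normalizes it, and computes a decibel-scaled Mel spectrogram. The file name, sample rate, and spectrogram parameters are illustrative assumptions, not values taken from this repository.

```python
import numpy as np
import librosa

# Load and resample to a fixed rate (22.05 kHz, Librosa's default), mono.
y, sr = librosa.load("speech_sample.wav", sr=22050, mono=True)  # hypothetical file

# Peak-normalize the waveform so amplitude differences between
# recordings do not dominate the extracted features.
y = y / (np.max(np.abs(y)) + 1e-9)

# Compute a Mel spectrogram and convert power to decibels,
# the usual input representation for a spectrogram CNN.
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=2048,
                                     hop_length=512, n_mels=128)
mel_db = librosa.power_to_db(mel, ref=np.max)  # shape: (128, n_frames)
```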

## 🚀 Features

- 🎤 Speech-to-Emotion Analysis: predicts emotions from voice samples.
- 🖼️ Mel Spectrogram Conversion: converts audio files into spectrogram images for feature extraction.
- 🧠 CNN-Based Model: uses a Convolutional Neural Network to extract features and classify emotions (a minimal model sketch follows this list).
- 📊 Multi-Class Emotion Detection: supports multiple emotions such as Happy, Sad, Angry, and Neutral.
- 📂 Pretrained & Custom Dataset Support: works with publicly available emotional speech datasets.
- 🔍 Data Preprocessing: includes noise removal, resampling, and normalization.
- 📈 Performance Evaluation: model tested using accuracy, a confusion matrix, and loss/accuracy plots.
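
The exact network is not shown in this README, so the Keras model below is only a minimal sketch of the kind of spectrogram classifier described above; the input shape, layer sizes, and the eight output classes are assumptions.

```python
from tensorflow.keras import layers, models

NUM_CLASSES = 8  # assumption: eight emotion categories (e.g. RAVDESS)

model = models.Sequential([
    layers.Input(shape=(128, 130, 1)),  # (n_mels, frames, channels); assumed shape
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.3),                # regularization against overfitting
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```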

## 🏗️ Tech Stack & Tools

- Python 🐍
- TensorFlow/Keras 🧠
- Librosa 🎵 (for audio processing)
- Matplotlib & Seaborn 📊 (for visualization)
- Scikit-learn 🛠️ (for data preprocessing & evaluation)
- NumPy & Pandas 📑 (for data handling)
- Jupyter Notebook/Google Colab 💻 (for model training & testing)

## 📁 Dataset

- Uses the RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song) dataset.
- Also supports other emotion recognition datasets such as TESS, EMO-DB, and CREMA-D.
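
RAVDESS encodes its labels in the file names (seven dash-separated fields, the third being the emotion code), so labels can be read without a separate annotation file. The helper below is a small illustrative sketch of that naming convention, not code from this repository.

```python
from pathlib import Path

# Emotion codes from the RAVDESS file-naming convention.
RAVDESS_EMOTIONS = {
    "01": "neutral", "02": "calm",    "03": "happy",   "04": "sad",
    "05": "angry",   "06": "fearful", "07": "disgust", "08": "surprised",
}

def emotion_from_filename(path: str) -> str:
    """Return the emotion label encoded in a RAVDESS file name."""
    code = Path(path).stem.split("-")[2]  # third field is the emotion code
    return RAVDESS_EMOTIONS[code]

print(emotion_from_filename("03-01-06-01-02-01-12.wav"))  # -> "fearful"
```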

## 📊 Model Performance

- Training Accuracy: ~85-90%
- Validation Accuracy: ~80-85%
- Confusion Matrix Analysis: shows strong separation between the different emotion classes.
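
A typical way to produce the confusion matrix and per-class metrics mentioned above is sketched below with Scikit-learn and Seaborn; the `model`, `X_test`, `y_test`, and `class_names` parameters are placeholders for this project's trained model and held-out data.

```python
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, classification_report

def evaluate(model, X_test, y_test, class_names):
    """Plot a confusion matrix and print per-class metrics for a trained classifier."""
    # Predicted class = argmax over the softmax outputs.
    y_pred = np.argmax(model.predict(X_test), axis=1)

    cm = confusion_matrix(y_test, y_pred)
    sns.heatmap(cm, annot=True, fmt="d", cmap="Blues",
                xticklabels=class_names, yticklabels=class_names)
    plt.xlabel("Predicted")
    plt.ylabel("Actual")
    plt.show()

    print(classification_report(y_test, y_pred, target_names=class_names))
```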

## 📌 Future Improvements

- ✅ Improve model generalization with more diverse datasets.
- ✅ Experiment with LSTM + CNN hybrid models for better sequential feature extraction (see the sketch after this list).
- ✅ Deploy as a web application using Flask/Django.
- ✅ Integrate real-time emotion recognition via microphone input.
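
One possible shape for the CNN + LSTM hybrid listed above: convolutional blocks extract local spectro-temporal features, the feature map is reshaped into a sequence over time frames, and an LSTM models their temporal evolution. All dimensions here are assumptions for illustration, not a committed design.

```python
from tensorflow.keras import layers, models

inputs = layers.Input(shape=(128, 130, 1))  # (n_mels, frames, channels); assumed
x = layers.Conv2D(32, (3, 3), activation="relu", padding="same")(inputs)
x = layers.MaxPooling2D((2, 2))(x)
x = layers.Conv2D(64, (3, 3), activation="relu", padding="same")(x)
x = layers.MaxPooling2D((2, 2))(x)          # -> (32, 32, 64)

# Treat the (downsampled) time axis as a sequence: move frames first,
# then flatten frequency x channels into one feature vector per frame.
x = layers.Permute((2, 1, 3))(x)            # -> (frames, mels, channels)
x = layers.Reshape((32, 32 * 64))(x)        # -> (frames, features)
x = layers.LSTM(64)(x)                      # sequential feature extraction

outputs = layers.Dense(8, activation="softmax")(x)  # 8 emotion classes assumed
hybrid = models.Model(inputs, outputs)
hybrid.compile(optimizer="adam",
               loss="sparse_categorical_crossentropy",
               metrics=["accuracy"])
```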

## 🤝 Let's Connect!

- 💼 Portfolio: https://www.linkedin.com/in/deepak-tetame-198932211
- 📧 Email: tetamedeepak@gmail.com
- 🐙 GitHub: github.com/Deepak-Tetame
