Voice Emotion Recognition (VER) is a dynamic field with implications for a wide range of research areas. Computer-based voice emotion recognition studies the voice signal of a speaker, a signal that is shaped by the speaker's inner emotional state. Implementing this effectively and innovatively is vital for the Human-Machine Interface (HMI). To develop new recognition methods, this paper evaluates basic human emotions. Accurately detected emotional states can further serve as a machine learning database for interdisciplinary experiments. The proposed system is an algorithmic method that first captures the audio signal from a microphone, preprocesses it, and then evaluates parameters derived from various signal characteristics. The model is trained on Mel-Frequency Cepstral Coefficients (MFCC) and coefficients from PRAAT (a speech-analysis tool for phonetics). From these a feature map is created, from which a Convolutional Neural Network (CNN) learns and classifies the attributes of the perceived signal into basic emotions: sadness, surprise, happiness, anger, fear, neutral, and disgust. The proposed method achieves a good recognition rate. © 2021 IEEE.
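To make the described pipeline concrete, the following is a minimal sketch assuming librosa for MFCC extraction and PyTorch for the CNN classifier. The layer sizes, the 13×128 feature-map shape, and the file path are illustrative assumptions, not the paper's actual architecture, and the PRAAT-derived coefficients are omitted for brevity.

```python
# Illustrative sketch only: MFCC feature map + small CNN over 7 emotion classes.
# Assumes librosa and PyTorch; the architecture is NOT the paper's reported model.
import librosa
import numpy as np
import torch
import torch.nn as nn

EMOTIONS = ["sadness", "surprise", "happiness", "anger", "fear", "neutral", "disgust"]

def mfcc_features(path, sr=16000, n_mfcc=13, max_frames=128):
    """Load an utterance and return a fixed-size MFCC feature map."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # shape: (n_mfcc, frames)
    # Pad or truncate along the time axis so every clip yields the same shape.
    if mfcc.shape[1] < max_frames:
        mfcc = np.pad(mfcc, ((0, 0), (0, max_frames - mfcc.shape[1])))
    else:
        mfcc = mfcc[:, :max_frames]
    return torch.tensor(mfcc, dtype=torch.float32).unsqueeze(0)  # (1, n_mfcc, frames)

class EmotionCNN(nn.Module):
    """Small 2-D CNN that classifies an MFCC feature map into one of 7 emotions."""
    def __init__(self, n_classes=len(EMOTIONS)):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 3 * 32, n_classes),  # a 13x128 input pools down to 3x32
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# Usage: classify one clip ("clip.wav" is a placeholder path).
model = EmotionCNN()
logits = model(mfcc_features("clip.wav").unsqueeze(0))  # add a batch dimension
print(EMOTIONS[logits.argmax(dim=1).item()])
```

In practice the network would be trained with a standard cross-entropy loss over labeled utterances; the sketch shows only the feature-map construction and classification steps named in the abstract.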