Text Search on Speech and Sentiment Analysis on Transcript

Bhishan Bhandari
2017
BSc.CSIT
Semester 7

This document discusses text search on audio, together with sentiment analysis of the audio's transcript. The audio is first sampled at 44,100 Hz. The sampled audio is then transcribed using the IBM Watson API. Audio timestamps are mapped to the transcribed text, which enables text search to be performed on the audio. Additionally, sentiment analysis of the transcript is performed using the Random Forest algorithm. The model was built from a dataset of 514,999 classified tweets, each labeled 1 for positive sentiment or 0 for negative sentiment. Training on 70% of the data and testing on the remaining 30% yielded an accuracy of 69%.
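The timestamp-to-text mapping described above can be illustrated with a minimal sketch. It assumes the transcription step yields word-level timestamps as (word, start, end) tuples with times in seconds. The function name find_phrase and the sample data are hypothetical, chosen for illustration, and are not taken from the project itself.

```python
from typing import List, Tuple

# Assumed shape of one transcript entry: (word, start_seconds, end_seconds).
WordStamp = Tuple[str, float, float]


def find_phrase(stamps: List[WordStamp], phrase: str) -> List[Tuple[float, float]]:
    """Return (start, end) audio times for every occurrence of phrase.

    Matching is case-insensitive and works on whole words in sequence.
    """
    target = [w.lower() for w in phrase.split()]
    if not target:
        return []
    words = [w.lower() for w, _, _ in stamps]
    hits = []
    # Slide a window of len(target) words across the transcript.
    for i in range(len(words) - len(target) + 1):
        if words[i:i + len(target)] == target:
            start = stamps[i][1]                    # start time of first matched word
            end = stamps[i + len(target) - 1][2]    # end time of last matched word
            hits.append((start, end))
    return hits


# Hypothetical transcript fragment, for illustration only.
sample = [
    ("the", 0.0, 0.2), ("quick", 0.2, 0.5), ("brown", 0.5, 0.9),
    ("fox", 0.9, 1.3), ("the", 1.3, 1.5), ("fox", 1.5, 1.9),
]
print(find_phrase(sample, "the fox"))  # [(1.3, 1.9)]
```

Each returned pair gives the position in the audio where the searched phrase is spoken, so a player could seek directly to that offset.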

Random Forest
Bag of Words
Sentiment Analysis
Text Search
