
I am working on speech sentiment analysis of customer care data. I have an audio file in which the customer care official asks a question and the customer gives a review.

I need to split this audio and keep only the customer's review, so that I can run sentiment analysis on it and determine whether the customer is happy, sad, or neutral.

Please let me know how to split the audio file to get only the customer's audio. The audio is in ".aac" format.

So far, this is what I have done:

from os import path
AUDIO_FILE = path.join(path.dirname(path.realpath('C:\\Users\\anagha\\Documents\\Python Scripts')),"Python Scripts\\audioa.aac")

halfway_point = len(AUDIO_FILE) / 2
Anagha
  • If you just want to split based on size or silence, you can look at http://stackoverflow.com/questions/37725416/pydub-combine-split-on-silence-with-minimum-length-file-size. However, I believe you first need to decide how you will differentiate between customer and client audio. Perhaps speech recognition tools will help. – Anil_M Apr 04 '17 at 14:30
  • Thanks. Any suggestions on how to differentiate between customer and client audio? – Anagha Apr 06 '17 at 16:04

2 Answers


Since you used the pydub tag, here's how to do it with pydub:

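from pydub import AudioSegment

# AUDIO_FILE is the path to the .aac file from the question
sound = AudioSegment.from_file(AUDIO_FILE)

# len() of an AudioSegment is its duration in milliseconds
halfway_point = len(sound) // 2
first_half = sound[:halfway_point]

# create a new file "first_half.mp3":
first_half.export("/path/to/first_half.mp3", format="mp3")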
Jiaaro
  • Thanks. But how do I get the output and check whether it has been cut? Or how do I export the output? – Anagha Apr 06 '17 at 16:04
  • I think the question is asking how to get all the clips from one speaker, rather than the first or second half? – LYu Oct 23 '18 at 17:04

It is probably too late to answer the original question, but someone stumbling upon it might find this procedure useful:

-> Use a tool to diarize the data. I have used LIUM (http://www-lium.univ-lemans.fr/diarization/doku.php).

-> Interpret the output based on this beautifully simple SO post (Parsing LIUM Speaker Diarization Output).

-> Finally, use the timings obtained above to splice the audio file.

Converting the speech to text, though, is a totally different challenge and will need either a deep learning approach (with huge amounts of data) or reliance on an API provider (like Google).
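The parsing and splicing steps could be sketched roughly as follows. This is only an illustration under some assumptions: that each non-comment line of the LIUM .seg output has the layout "show channel start length gender band environment speaker", that start and length are counted in 10 ms frames, and that lines beginning with ";;" are comments. The names parse_lium_seg, extract_speaker, and export_speaker are hypothetical helpers, not part of LIUM or pydub.

```python
from collections import defaultdict
from functools import reduce
import operator

FRAME_MS = 10  # assumption: LIUM counts time in 10 ms feature frames


def parse_lium_seg(text):
    """Map each speaker label to a sorted list of (start_ms, end_ms) spans."""
    spans = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the ';;' cluster/comment lines.
        if not line or line.startswith(";;"):
            continue
        fields = line.split()
        # Assumed layout: show channel start length gender band env speaker
        start, length, speaker = int(fields[2]), int(fields[3]), fields[7]
        spans[speaker].append((start * FRAME_MS, (start + length) * FRAME_MS))
    return {spk: sorted(s) for spk, s in spans.items()}


def extract_speaker(sound, spans):
    """Concatenate the slices of `sound` covered by `spans` (given in ms)."""
    pieces = [sound[start:end] for start, end in spans]
    if not pieces:
        raise ValueError("no segments for this speaker")
    return reduce(operator.add, pieces)


def export_speaker(audio_path, seg_path, speaker, out_path):
    """Hypothetical helper: write one speaker's audio to a WAV file."""
    from pydub import AudioSegment  # needs pydub and ffmpeg installed

    with open(seg_path) as f:
        spans = parse_lium_seg(f.read())[speaker]
    sound = AudioSegment.from_file(audio_path)
    extract_speaker(sound, spans).export(out_path, format="wav")
```

Speaker labels such as S0 or S1 are arbitrary cluster names, so deciding which one is the customer still has to be done separately, for example by listening to a short clip of each. extract_speaker relies only on slicing and "+", which is how a pydub AudioSegment is cut by milliseconds and concatenated.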

Vikram Murthy