
For my work I need to analyze large .wav files (>208 MB) with the R packages seewave and tuneR. I read each file into R in 30 s chunks, using the readWave function as follows:

tr1_1 = readWave("TR1_edit.WAV", from = 0, to = 0.5, units = "minutes")
tr1_2 = readWave("TR1_edit.WAV", from = 0.5, to = 1, units = "minutes")
tr1_3 = readWave("TR1_edit.WAV", from = 1, to = 1.5, units = "minutes")
tr1_4 = readWave("TR1_edit.WAV", from = 1.5, to = 2, units = "minutes")
tr1_5 = readWave("TR1_edit.WAV", from = 2, to = 2.5, units = "minutes")

and so on. This method works, but it is neither efficient nor pretty. Is there a way to import and split up a large .wav file more efficiently?

dooogan

2 Answers


If you're loading all of these into memory at the same time, you should use a list rather than sequentially named variables.

library(tuneR)

tr1 = list()
duration = 0.5                           # chunk length in minutes
start_times = seq(0, 2, by = duration)   # start of each chunk: 0, 0.5, ..., 2

for (i in seq_along(start_times)) {
    tr1[[i]] = readWave('TR1_edit.WAV',
                        from = start_times[i],
                        to = start_times[i] + duration,
                        units = 'minutes')
}

This is the same principle as why you should use a list of data frames rather than sequentially named data frames.

You could easily wrap this in a function that takes the name of a WAV file as input, gets its length from the metadata, imports it in segments of 30 seconds (or a length passed as an argument), and returns the list of segments.
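As a rough sketch of such a wrapper: tuneR's `readWave()` has a `header = TRUE` option that returns only the file's metadata, which should be enough to work out the chunk boundaries before any audio is loaded. The names `chunk_bounds` and `read_wave_chunks` are hypothetical, and the sketch assumes that `from`/`to` with `units = "samples"` are inclusive, 1-based sample indices and that the header's `samples` field counts samples per channel.

```r
# Hypothetical helper: first and last sample index of each consecutive chunk.
# The last chunk is shorter when n_samples is not a multiple of the chunk size.
chunk_bounds <- function(n_samples, samp_rate, chunk_sec = 30) {
  stopifnot(n_samples >= 1, samp_rate > 0, chunk_sec > 0)
  chunk_len <- as.integer(round(chunk_sec * samp_rate))
  from <- seq(1, n_samples, by = chunk_len)
  to <- pmin(from + chunk_len - 1, n_samples)
  data.frame(from = from, to = to)
}

# Hypothetical wrapper: read a WAV file as a list of Wave objects,
# one per chunk, without loading the whole file at once.
read_wave_chunks <- function(file, chunk_sec = 30) {
  # header = TRUE returns metadata only (assumed fields: samples, sample.rate)
  hdr <- tuneR::readWave(file, header = TRUE)
  b <- chunk_bounds(hdr$samples, hdr$sample.rate, chunk_sec)
  Map(function(from, to) {
    tuneR::readWave(file, from = from, to = to, units = "samples")
  }, b$from, b$to)
}
```

Something like `tr1 <- read_wave_chunks("TR1_edit.WAV")` would then give the list of segments. Because the boundaries come from the file's own length, files of different sizes should not produce empty list elements.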

Gregor Thomas
  • Thank you, that is a great solution. How would you go about getting the length of the file from its metadata? – dooogan Aug 20 '16 at 00:02
  • I don't know - never worked with WAV files -- but I assume there is an easy way. I did a quick search for "R WAV file length" [and got this which looks very helpful](http://stackoverflow.com/q/23415036/903061). You could always ask a new question about how to find the length of a WAV file without reading it in. – Gregor Thomas Aug 21 '16 at 04:43
  • @SeanHardison The duration of the file is simply the number of samples divided by the sample rate. If you read the WAV in its entirety first you can do: `duration – AkselA Aug 21 '16 at 12:39

@Gregor and @AkselA thanks for your input. The biggest issue with the for loop solution was that the wave files I'm working with vary in size, so I would end up with empty elements in the resulting lists. My current solution imports the entire file, then breaks it up into 30 s pieces from there:

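# chunk length in samples: 30 s * 48000 samples/s = 1.44e6
duration = 1.44e6

# as written this reads the first minute; omit from/to to read the whole file
tr1 <- readWave("TR1_edit.wav", from = 0, to = 1, units = "minutes")
tr1 <- as.matrix(tr1@left)
# second column labels each sample with its chunk number
tr1 <- cbind(tr1, (rep(1:(length(tr1)/duration), each = duration)))
# split the samples by chunk number into one-column matrices
tr1 <- lapply(split(tr1[,1],tr1[,2]),matrix, ncol = 1)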

From there I can use mapply to convert the vectors back to Wave objects:

w <- function(s){
  Wave(s, right = numeric(0), samp.rate = 48000, bit = 16, pcm = TRUE)
}

tr1 <- mapply(w, tr1)
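If a file's length is not an exact multiple of the chunk size, the `rep()` grouping above will not line up with the number of samples. One possible alternative, sketched here on a plain numeric vector, builds the chunk index with `ceiling()` so that the final chunk simply comes out shorter. `split_samples` is a hypothetical name, not part of tuneR or seewave.

```r
# Hypothetical helper: split a vector of samples into consecutive chunks
# of chunk_len samples; the last chunk keeps whatever remains.
split_samples <- function(x, chunk_len) {
  grp <- ceiling(seq_along(x) / chunk_len)  # 1,1,...,2,2,... chunk number
  unname(split(x, grp))
}

# Toy example: 10 samples in chunks of 4 gives chunks of length 4, 4 and 2
chunks <- split_samples(1:10, chunk_len = 4)
lengths(chunks)
```

For a mono Wave object `wav`, the same idea would presumably apply as `split_samples(wav@left, 30 * wav@samp.rate)`, with each piece passed back through `Wave()` as in the function `w` above.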
dooogan
  • Jota's solution from [here](http://stackoverflow.com/questions/20696681/split-an-audio-file-into-pieces-of-an-arbitrary-size) also seems to work well. – AkselA Aug 21 '16 at 18:50