Analysing a wav (audio) file?


Apr 19, 2012 at 6:35am

Hi. I'm using Win 7. Is there a way to get the waveform of a .wav file?
For example, if the .wav file consists of a beep of frequency 1600 Hz, then the waveform should be represented by y = A*sin(1600*2*pi*t).
I'm not concerned about plotting it, though.

I need the y coordinates (time-varying) of the waveform.

Last edited on Apr 19, 2012 at 11:11am

Apr 19, 2012 at 8:11am

TheDestroyer (441)

Are you asking about how to read a wave file or about how to graph on a window?

You have to learn how to decompose the problem you're dealing with into parts. Those two things are completely different.

For reading wav files, there are dozens of libraries for that, just google it. If you want to do it yourself, you have to be familiar with Fourier transforms and sampling data. And you have to be familiar with reading binary files.

For plotting, you may either export the data and plot it with some mathematical scripting program like Mathematica or Matlab or Maple;

OR,

you may design your own GUI, which is 10 times harder, and you have to be good at C++. At the very least you have to be good at class inheritance.

You may use Qt, www.qt-project.org, which is the easiest library for GUI stuff with native C++. It's free and easy. And for plotting, you may use the library called Qwt, which offers many facilities for 2D Cartesian plotting.

Apr 19, 2012 at 11:12am

sanyam (38)

I'm not concerned about plotting the waveform.
I want to decode the wav file into a time-varying amplitude.
I want to read a wav file.

Apr 19, 2012 at 12:07pm

TheDestroyer (441)

You can't do that, unfortunately, because each time sample in the wav file doesn't contain a single frequency, but rather an amplitude for that single frequency.

The sound is created by combining all the frequency multiples, n*omega, with different amplitudes. That's Fourier's theorem. You can choose a single frequency and get its amplitude.

The stuff you see in the movies and games about sound isn't completely representative. It probably plots just the average amplitude as a function of time.

http://en.wikipedia.org/wiki/Fourier_series

I hope you're good with Maths to grasp the concept correctly :-)

Last edited on Apr 19, 2012 at 12:07pm

Apr 19, 2012 at 12:30pm

sanyam (38)

I know only a little about the wav format. Please correct me if I am wrong.
The wav does not contain a frequency, but records the amplitude of the combination of all frequencies, sampled at regular intervals.
All I need is that amplitude (time-varying).
After getting the amplitude I can do all sorts of splitting by Fourier transform.

Apr 19, 2012 at 2:43pm

Disch (13742)

I'm not entirely sure what you're asking, myself.

The wave data is the 'y' coordinate, or the amplitude. The 'x' coordinate is just time, so it's implied by the index of the sample.

So if you have a simple (and very quiet) square wave, like: 0000999900009999

The wave data is going to look like this (assuming 16-bit samples):


00 00 00 00 00 00 00 00  09 00 09 00 09 00 09 00
00 00 00 00 00 00 00 00  09 00 09 00 09 00 09 00
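
To make the byte layout above concrete, here is a minimal sketch of turning such bytes back into sample values. It assumes 16-bit little-endian PCM (low byte first, as in the dump); the function name decodeSamples16LE is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Combine pairs of bytes (low byte first) into signed 16-bit samples.
// Assumption: the data is 16-bit little-endian PCM.
std::vector<int16_t> decodeSamples16LE(const std::vector<uint8_t>& bytes) {
    assert(bytes.size() % 2 == 0);  // each 16-bit sample occupies two bytes
    std::vector<int16_t> samples;
    samples.reserve(bytes.size() / 2);
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        // low byte comes first, high byte second
        const uint16_t raw = static_cast<uint16_t>(bytes[i] | (bytes[i + 1] << 8));
        // reinterpret the 16 bits as a signed (two's complement) value
        samples.push_back(static_cast<int16_t>(raw));
    }
    return samples;
}
```

Feeding it one row of the dump gives four samples of 0 followed by four samples of 9, which is the square wave again.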

Last edited on Apr 19, 2012 at 2:44pm

Apr 19, 2012 at 3:16pm

sanyam (38)

You got it.
I want to understand the notation
0000999900009999 (or whatever it is).

I analysed an actual wav file (hex numbers):
52 49 46 46(RIFF) 24 08 00 00 57 41 56 45 66 6d 74 20 10 00 00 00 01 00 02 00 22 56 00 00 88 58 01 00 04 00 10 00 64 61 74 61(data) 00 08 00 00 00 00 00 00 24 17 1e f3 3c 13 3c 14 16 f9 18 f9 34 e7 23 a6 3c f2 24 f2 11 ce 1a 0d .........
The first 4 bytes are "RIFF", a format.
Now what am I supposed to understand from the data
00 08 00 00 00 00 00 00 24 17 1e f3 3c 13 3c 14 16 f9 18 f9 34 e7 23 a6 3c f2 24 f2 11 ce 1a 0d .........

What are the units? Pressure? Pascal? atm?

Apr 19, 2012 at 4:19pm

Disch (13742)

So you want a description of the wave file format?

https://ccrma.stanford.edu/courses/422/projects/WaveFormat/

The "samples" are the Y coordinates. The X coordinate is the sample index. If you plot the samples that way, it produces the sound wave.
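
As a rough illustration of that page's layout, the sketch below pulls the header fields out of a byte buffer. It assumes the simplest "canonical" file: a 44-byte header with the "fmt " chunk directly followed by the "data" chunk and no extra chunks in between, which is not true of every wav file. WavHeader, readU16, readU32 and parseCanonicalWavHeader are made-up names, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct WavHeader {            // hypothetical struct, for illustration only
    uint16_t audioFormat;     // 1 means uncompressed PCM
    uint16_t numChannels;     // 1 = mono, 2 = stereo
    uint32_t sampleRate;      // samples per second, per channel
    uint32_t byteRate;        // sampleRate * numChannels * bitsPerSample / 8
    uint16_t blockAlign;      // bytes per frame (one sample for every channel)
    uint16_t bitsPerSample;
    uint32_t dataSize;        // number of bytes of sample data that follow
};

// All multi-byte numbers in a RIFF/WAVE file are little-endian.
static uint16_t readU16(const std::vector<uint8_t>& b, std::size_t off) {
    assert(off + 2 <= b.size());
    return static_cast<uint16_t>(b[off] | (b[off + 1] << 8));
}

static uint32_t readU32(const std::vector<uint8_t>& b, std::size_t off) {
    assert(off + 4 <= b.size());
    return static_cast<uint32_t>(b[off]) |
           (static_cast<uint32_t>(b[off + 1]) << 8) |
           (static_cast<uint32_t>(b[off + 2]) << 16) |
           (static_cast<uint32_t>(b[off + 3]) << 24);
}

// Returns false if the buffer does not look like a canonical 44-byte header.
bool parseCanonicalWavHeader(const std::vector<uint8_t>& b, WavHeader& h) {
    if (b.size() < 44) return false;
    if (std::memcmp(&b[0], "RIFF", 4) != 0) return false;
    if (std::memcmp(&b[8], "WAVE", 4) != 0) return false;
    if (std::memcmp(&b[12], "fmt ", 4) != 0) return false;
    if (std::memcmp(&b[36], "data", 4) != 0) return false;
    h.audioFormat   = readU16(b, 20);
    h.numChannels   = readU16(b, 22);
    h.sampleRate    = readU32(b, 24);
    h.byteRate      = readU32(b, 28);
    h.blockAlign    = readU16(b, 32);
    h.bitsPerSample = readU16(b, 34);
    h.dataSize      = readU32(b, 40);
    return true;
}
```

Run on the bytes posted above, this reads as PCM, 2 channels, 22050 Hz, 16 bits per sample, with 2048 bytes of data. The samples then follow as left/right pairs: 00 00 00 00 is (0, 0) and 24 17 1e f3 is (5924, -3298). As for units: the numbers are plain integers relative to the format's full scale (-32768 to 32767 for 16 bits). The file does not say what pressure they correspond to.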

Apr 19, 2012 at 4:58pm

sanyam (38)

Thank you very much ....

Apr 19, 2012 at 7:10pm

htirwin (1208)

#include <sndfile.h>
#include <vector>

int main(){

    SF_INFO SoundFileInfo = {};
    //the format field must be zero before opening a file for reading

    SNDFILE *SoundFile = sf_open("path to an audio file", SFM_READ, &SoundFileInfo);
    //open a file and put its info into the struct "SoundFileInfo"
    if (!SoundFile) return 1;   //the file could not be opened

    std::vector<double> Samples(SoundFileInfo.channels * SoundFileInfo.frames);
    //allocate an array to hold the samples

    sf_readf_double(SoundFile, Samples.data(), SoundFileInfo.frames);
    //fill the array with sample values; a frame equals one sample per channel
    sf_close(SoundFile);

    //...

    /*Take note that the left and right values are interleaved. So Samples[0]
    is the first sample for the left channel and Samples[1] is the first sample
    for the right; together they make one frame. This is why you need to allocate
    number of frames times channels.

    You can also read sample values as short, int, or float. If you read as
    short, the range will be -32768 to 32767, which is the range of the
    type short, and as int, the values will span the range of an int.
    Using sf_readf_double or sf_readf_float, your values will be between -1 and 1.

    libsndfile can also write an array of sample values to a new wav
    file. To do this, you first have to create an SF_INFO for the output
    and fill in the information manually.
    */

    SF_INFO SoundFileInfoOut = {};
    SoundFileInfoOut.channels = SoundFileInfo.channels;
    SoundFileInfoOut.samplerate = SoundFileInfo.samplerate;
    SoundFileInfoOut.format = SoundFileInfo.format;
    //samplerate, channels and format are the members needed for writing

    /*Then you can use the sf write function with the struct containing
    the info, the array of sample values, and the path to write to.

    Because the samples are interleaved, you might want to deinterleave
    so that you have two arrays, each of length SoundFileInfo.frames. You
    can use a loop to put Samples[i] for 0 and all even i in one, and all
    odd i in another, or something to that effect.

    Before writing to a file, you need to reinterleave the samples.
    ...
    */

    SNDFILE *OutFile = sf_open("path to the output file", SFM_WRITE, &SoundFileInfoOut);
    if (!OutFile) return 1;
    sf_writef_double(OutFile, Samples.data(), SoundFileInfo.frames);
    sf_close(OutFile);
}
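
The deinterleaving mentioned in the comments above could look roughly like this. It is only a sketch using plain std::vector, independent of libsndfile; the function names deinterleave and interleave are made up, and it works for any channel count, not just stereo.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Split interleaved samples (L R L R ...) into one vector per channel.
std::vector<std::vector<double>> deinterleave(const std::vector<double>& interleaved,
                                              int channels) {
    assert(channels > 0);
    const std::size_t ch = static_cast<std::size_t>(channels);
    assert(interleaved.size() % ch == 0);  // whole frames only
    const std::size_t frames = interleaved.size() / ch;
    std::vector<std::vector<double>> planar(ch, std::vector<double>(frames));
    for (std::size_t f = 0; f < frames; ++f)
        for (std::size_t c = 0; c < ch; ++c)
            planar[c][f] = interleaved[f * ch + c];
    return planar;
}

// Inverse operation: merge per-channel vectors back into frame order.
std::vector<double> interleave(const std::vector<std::vector<double>>& planar) {
    assert(!planar.empty());
    const std::size_t ch = planar.size();
    const std::size_t frames = planar[0].size();
    std::vector<double> interleaved(ch * frames);
    for (std::size_t c = 0; c < ch; ++c) {
        assert(planar[c].size() == frames);  // all channels must be equally long
        for (std::size_t f = 0; f < frames; ++f)
            interleaved[f * ch + c] = planar[c][f];
    }
    return interleaved;
}
```

For a stereo file, deinterleave(Samples, 2) would give the left channel at index 0 and the right channel at index 1, and interleave puts them back in the order the write function expects.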


To use libsndfile, you need to get the library, link to it, and include a header, sndfile.h.

http://www.mega-nerd.com/libsndfile/#Download

The sample rate gives you the time: if the sample rate is 44100, then there are 44100 samples per second.
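
In other words, sample number n sits at time n / samplerate seconds. A small sketch of that relationship, plus what the ideal tone from the first post, y = A*sin(2*pi*f*t), would be at a given sample index (both function names are made up):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Time in seconds at which the sample with the given index occurs.
double sampleTimeSeconds(std::size_t index, int sampleRate) {
    assert(sampleRate > 0);
    return static_cast<double>(index) / sampleRate;
}

// Value of an ideal sine tone y = A*sin(2*pi*f*t) at the given sample index.
double idealToneSample(double amplitude, double frequencyHz,
                       std::size_t index, int sampleRate) {
    const double pi = std::acos(-1.0);
    const double t = sampleTimeSeconds(index, sampleRate);
    return amplitude * std::sin(2.0 * pi * frequencyHz * t);
}
```

A real recording of a 1600 Hz beep will only approximate these values, since the file stores whatever was actually captured.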

What makes up a sound wave is actually pretty simple.

If you want to mix two sound files together, all you have to do is add up their sample values.
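
A minimal sketch of that, assuming both files were read as doubles in the range -1 to 1, have the same sample rate and channel count, and are the same length. The clamp to [-1, 1] is my own addition so the sum cannot exceed full scale; mix is a made-up name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Mix two signals by adding them sample by sample.
std::vector<double> mix(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());  // assumption: equal length, same sample rate
    std::vector<double> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double sum = a[i] + b[i];
        // keep the result inside the -1..1 range used by sf_readf_double
        out[i] = std::max(-1.0, std::min(1.0, sum));
    }
    return out;
}
```

Hard clamping distorts loud passages, so scaling both inputs down before adding is a common alternative.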

Sound waves are just sine waves of a given range of frequencies added together.

The value of a given sample represents the amplitude of the wave. If you want the amplitude of a given frequency, you need to do an FFT.

I'm using fftreal, which is really simple, and does the FFT for you. You input your sample values in one array, and another array of all 0's. What you get back is an array of real numbers and an array of imaginary numbers. Imaginary[i] and Real[i] together represent a bin.

You must choose an FFT size first, which is basically the resolution you want. So if you choose 1024, then you get an output of 1024 bins, but you only keep the first half. The magnitude of each bin is sqrt(imaginary^2+real^2), and the frequency it belongs to is (i*samplerate)/1024. To get the dB from the magnitude, I think you do 20*log10(magnitude). For dBFS, where the max is 0, the magnitude has to be scaled first: with samples between -1 and 1, a full-scale sine that falls exactly on a bin has an unscaled magnitude of 1024/2.

So if you want to know the dB of a frequency, you can find it like this:

frequency = i*samplerate/1024, solve for i: i = frequency*1024/samplerate.

Say i=300; dB = 20*log10(sqrt(imaginary[300]^2+real[300]^2))

I'm new to this and I think I'm correct about this, but I may have something off.
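
To check the formulas above without any FFT library, here is a sketch that computes a single bin with a plain (slow) DFT sum. This is not fftreal and not an FFT, just the definition written out. The names Bin, dftBin, binFrequencyHz and binDbfs are made up, and the division by n/2 is my assumption about how to scale so that a full-scale sine sitting exactly on a bin reads 0 dBFS.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Bin { double real; double imaginary; };  // hypothetical helper type

// Bin k of the DFT of `samples`, computed directly from the definition.
Bin dftBin(const std::vector<double>& samples, std::size_t k) {
    const std::size_t n = samples.size();
    assert(n > 0 && k < n);
    const double pi = std::acos(-1.0);
    Bin bin = {0.0, 0.0};
    for (std::size_t t = 0; t < n; ++t) {
        const double angle = 2.0 * pi * static_cast<double>(k) *
                             static_cast<double>(t) / static_cast<double>(n);
        bin.real      += samples[t] * std::cos(angle);
        bin.imaginary -= samples[t] * std::sin(angle);
    }
    return bin;
}

// Centre frequency of bin k: k * samplerate / n.
double binFrequencyHz(std::size_t k, double sampleRate, std::size_t n) {
    assert(n > 0);
    return static_cast<double>(k) * sampleRate / static_cast<double>(n);
}

// Level of a bin in dB relative to a full-scale sine (assumes samples in -1..1).
// A bin with zero magnitude yields negative infinity.
double binDbfs(const Bin& bin, std::size_t n) {
    assert(n > 0);
    const double magnitude =
        std::sqrt(bin.real * bin.real + bin.imaginary * bin.imaginary);
    return 20.0 * std::log10(magnitude / (static_cast<double>(n) / 2.0));
}
```

With 1024 samples at 44100 Hz, bin 300 corresponds to about 12919.9 Hz. A sine of amplitude 1 at exactly that frequency comes out at roughly 0 dBFS, and one of amplitude 0.5 at roughly -6 dBFS.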

Last edited on Apr 20, 2012 at 1:45am
