A simple way to read and write audio and video files in C using FFmpeg (part 1: audio)

C is my favourite programming language and the one I use most often. However, I have tended to shy away from using it for quick one-off signal processing tasks where I needed to read or write audio or video files because it always seemed like a lot of hassle figuring out which library to use and what functions to call to actually get access to the raw data. Recently though, I’ve found a new way of dealing with audio and video files that is relatively painless, so I think it’s worth sharing.

I first learned of this approach from an article on Zulko’s fascinating Python-oriented blog on GitHub.

The article describes using FFmpeg to read and write video frames in Python and there’s a link to a second article showing the same thing for audio.

The basic idea is to launch FFmpeg via a pipe, which then converts raw samples to the required format (for writing) or decodes the file into raw samples (for reading). It may seem like a bit of a hack, but it’s surprisingly effective and, provided that you can figure out the correct FFmpeg command line, extremely adaptable.

FFmpeg is described on its project web page as “[a] complete, cross-platform solution to record, convert and stream audio and video.” That’s a pretty accurate description. So far, I haven’t needed to read or write any file format that it couldn’t handle, although in some cases it did involve some googling to figure out the required command line arguments.

FFmpeg is free software and cross-platform, so you should be able to install it on Windows and Mac without too much difficulty, but I haven’t actually tried that myself. I’m working on Xubuntu Linux, so to install FFmpeg I just did this:

sudo apt-get install ffmpeg
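
Before going any further, it can be handy to confirm from within a C program that the ffmpeg executable can actually be launched. The following is only a minimal sketch, assuming a POSIX system. The helper name command_succeeds is made up for illustration; it is not part of FFmpeg or the C library.

```c
#include <stdlib.h>
#include <sys/wait.h>

// Hypothetical helper: returns 1 if the shell command ran and
// exited with status 0, otherwise returns 0.
int command_succeeds(const char *cmdline)
{
    int status = system(cmdline);
    if (status == -1) return 0;  // the shell could not be started
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

For example, command_succeeds("ffmpeg -version > /dev/null 2>&1") should return 1 when FFmpeg is installed and on the PATH.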

Ok, time for some practical examples…

Writing a WAV audio file (or MP3, FLAC, or whatever format you like)

The following example creates a 1-second long WAV audio file (PCM, 16-bit signed integer samples, 44.1 kHz sampling frequency). The sound is just a loud 1 kHz sine wave.

//
// writewav.c - Create a wav file by piping raw samples to ffmpeg
// Written by Ted Burke - last updated 10-2-2017
//
// To compile:
//
//    gcc writewav.c -o writewav -lm
//

#include <stdio.h>
#include <stdint.h>
#include <math.h>

#define N 44100

int main(void)
{
    // Create audio buffer
    int16_t buf[N] = {0}; // buffer
    int n;                // buffer index
    double Fs = 44100.0;  // sampling frequency
    
    // Generate 1 second of audio data - it's just a 1 kHz sine wave
    for (n=0 ; n<N ; ++n) buf[n] = (int16_t)(16383.0 * sin(n*1000.0*2.0*M_PI/Fs));
    
    // Pipe the audio data to ffmpeg, which writes it to a wav file
    FILE *pipeout;
    pipeout = popen("ffmpeg -y -f s16le -ar 44100 -ac 1 -i - beep.wav", "w");
    if (pipeout == NULL) return 1; // ffmpeg could not be launched
    fwrite(buf, sizeof(int16_t), N, pipeout);
    pclose(pipeout);
    return 0;
}

General comments:

  • I used the fixed-length integer data type int16_t for the audio buffer just to make absolutely sure I got 16-bit signed integer samples. That type is defined in the stdint.h header file. You can use a different data type, provided it matches the format you’re writing.
  • In the single-line for loop where I generate the sine wave, I used M_PI, which is just the value of π (the mathematical constant). M_PI is defined in the math.h header file by gcc and most other compilers, although it isn’t part of the C standard itself. Because I called sin(), I had to link to the math library, which is why the suggested compiler command line includes the -lm option.
  • The popen() function launches a separate program which is accessible via a “pipe” using C’s standard file I/O functions. The first argument to popen is the command line for the program being launched (in this case FFmpeg). The second argument to popen specifies whether we’ll be reading data from that program through the pipe or vice versa. In this case, the second argument is "w" which means that we’ll be sending data to FFmpeg through the pipe. The return value from popen is a FILE pointer (a standard I/O stream) for the pipe, or NULL if the program could not be launched.
  • In this example, the fwrite() function is used to send the entire contents of the audio buffer into the pipe.
  • Finally, the pclose() function is used to close the pipe.

The FFmpeg command line:

The first argument to the popen() function is a string containing the full FFmpeg command line. The specified options included in that command line control how FFmpeg will interpret the raw sample data it receives through the pipe and the type of file it will write.

  • The "-y" option tells FFmpeg that it can overwrite the specified output file if it already exists.
  • The "-f s16le" option tells FFmpeg that the format of the audio data it reads (from its standard input, which means via the pipe from our program) is raw PCM, signed integer, 16-bit and little-endian.
  • The "-ar 44100" option tells FFmpeg that the sampling rate (i.e. sampling frequency) of the audio data it reads is 44.1 kHz.
  • The "-ac 1" option tells FFmpeg that the number of channels in the audio data it reads is 1.
  • The "-i -" option tells FFmpeg to read its input from standard input, which in this case means from the pipe, since FFmpeg was launched by popen.
  • Finally, beep.wav is the output filename FFmpeg will use.
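
If the sampling rate, channel count or output filename need to vary, the command line can be assembled at run time instead of being hard-coded. Here is a rough sketch of one way to do that using snprintf. The function name build_write_cmd and its parameters are invented for this illustration, and it assumes the filename contains no spaces or other characters the shell treats specially.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper: format an FFmpeg command line for writing raw
// 16-bit little-endian samples to 'outfile'. Returns 0 on success, or
// -1 if the command does not fit in 'dst'.
// Assumption: 'outfile' contains no spaces or shell metacharacters.
int build_write_cmd(char *dst, size_t size, int rate, int channels,
                    const char *outfile)
{
    int len = snprintf(dst, size,
                       "ffmpeg -y -f s16le -ar %d -ac %d -i - %s",
                       rate, channels, outfile);
    if (len < 0 || (size_t)len >= size) return -1;
    return 0;
}
```

The resulting string can be passed straight to popen. Note that if the channel count is 2, the raw samples sent through the pipe need to be interleaved (left, right, left, right and so on).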

If you want to hear the output file, you can download it here: beep.wav

Reading a WAV audio file (or MP3, FLAC or whatever format you like)

The following example reads 20 ms of samples from the beginning of a WAV file called whistle.wav (click it to download the file), then prints the sample values to a CSV file. The WAV file format is 16-bit signed integer samples, mono, with a sampling frequency of 44.1 kHz.

//
// readwav.c - Read samples from a WAV file using FFmpeg via a pipe
// Written by Ted Burke - last updated 10-2-2017
//
// To compile:
//
//    gcc readwav.c -o readwav
//

#include <stdio.h>
#include <stdint.h>

#define N 882

int main(void)
{
    // Create a 20 ms audio buffer (assuming Fs = 44.1 kHz)
    int16_t buf[N] = {0}; // buffer
    int n;                // buffer index
    
    // Open WAV file with FFmpeg and read raw samples via the pipe.
    FILE *pipein;
    pipein = popen("ffmpeg -i whistle.wav -f s16le -ac 1 -", "r");
    if (pipein == NULL) return 1; // FFmpeg could not be launched
    fread(buf, sizeof(int16_t), N, pipein);
    pclose(pipein);
    
    // Print the sample values in the buffer to a CSV file
    FILE *csvfile;
    csvfile = fopen("samples.csv", "w");
    if (csvfile == NULL) return 1; // output file could not be created
    for (n=0 ; n<N ; ++n) fprintf(csvfile, "%d\n", buf[n]);
    fclose(csvfile);
    return 0;
}

A screenshot of the resulting CSV file, samples.csv, open for viewing in my text editor is shown below.

[Screenshot: samples.csv open in the Geany text editor]

This example program shares many elements with the previous one, but there are a couple of noteworthy differences:

  • The FFmpeg command line specified in the first argument of the popen() function tells FFmpeg to read its input from the file whistle.wav (the exact format will be detected automatically) and write its output to stdout as raw samples (16-bit signed integers, little-endian).
  • The second argument to popen is "r", which means that our program will read FFmpeg’s standard output via the pipe (it was the other way around in the previous example).

To ensure that the raw samples coming from FFmpeg are the correct size for my audio buffer, I specified "-f s16le" as a command line argument to FFmpeg. The WAV file I happened to be using was already in that sample format, so no conversion actually took place in my case. The option is still needed, though: when the output goes to a pipe, there is no file extension from which FFmpeg can infer the output format. Specifying it also means FFmpeg will convert to 16-bit samples whenever the input file is in some other format.
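
The example above only reads the first 882 samples (20 ms × 44.1 kHz). To read a whole file of unknown length, one option is to keep reading from the pipe into a buffer that grows as required. The following is a sketch of that idea rather than a definitive implementation. The function name read_all_samples is hypothetical, and the starting capacity of 4096 samples is an arbitrary choice.

```c
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

// Hypothetical helper: read 16-bit samples from 'in' until end of
// stream. Returns a malloc'd array (caller frees it) and stores the
// number of samples in *count. Returns NULL if memory runs out.
int16_t *read_all_samples(FILE *in, size_t *count)
{
    size_t capacity = 4096, used = 0;
    int16_t *data = malloc(capacity * sizeof(int16_t));
    if (data == NULL) return NULL;

    for (;;)
    {
        // Fill whatever space is left in the array
        size_t got = fread(data + used, sizeof(int16_t), capacity - used, in);
        used += got;
        if (used < capacity) break;  // short read: end of stream or error

        // Array is full, so double its size and keep reading
        capacity *= 2;
        int16_t *bigger = realloc(data, capacity * sizeof(int16_t));
        if (bigger == NULL) { free(data); return NULL; }
        data = bigger;
    }

    *count = used;
    return data;
}
```

Because the function takes an ordinary FILE pointer, it should work equally well with the pipe returned by popen("ffmpeg -i whistle.wav -f s16le -ac 1 -", "r") or with a file of raw samples opened using fopen.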

Modifying a WAV audio file

This example reads, modifies and writes a WAV audio file. It actually launches two instances of FFmpeg. One opens the original WAV file ("12345678.wav") and passes raw 16-bit signed integer samples into this program via a pipe. The other receives modified samples from this program via a second pipe and writes them to another WAV file ("out.wav"). The modification that is applied to the samples is a simple tremolo effect (modulation of the signal amplitude).

//
// modifywav.c - Modify a WAV file using FFmpeg via pipes
// This example adds a crude tremolo effect to the audio
// Written by Ted Burke - last updated 10-2-2017
//
// To compile:
//
//    gcc modifywav.c -o modifywav -lm
//

#include <stdio.h>
#include <stdint.h>
#include <math.h>

int main(void)
{
    // Launch two instances of FFmpeg, one to read the original WAV
    // file and another to write the modified WAV file. In each case,
    // data passes between this program and FFmpeg through a pipe.
    FILE *pipein;
    FILE *pipeout;
    pipein  = popen("ffmpeg -i 12345678.wav -f s16le -ac 1 -", "r");
    pipeout = popen("ffmpeg -y -f s16le -ar 44100 -ac 1 -i - out.wav", "w");
    if (pipein == NULL || pipeout == NULL) return 1; // FFmpeg could not be launched
    
    // Read, modify and write one sample at a time
    int16_t sample;
    int count, n=0;
    while(1)
    {
        count = fread(&sample, sizeof(sample), 1, pipein); // read one 2-byte sample
        if (count != 1) break;
        ++n;
        sample = sample * sin(n * 5.0 * 2*M_PI / 44100.0);
        fwrite(&sample, sizeof(sample), 1, pipeout);
    }
    
    // Close input and output pipes
    pclose(pipein);    
    pclose(pipeout);
    return 0;
}

In case you want to hear the original and modified WAV audio files, here they are:

Click here to continue to part 2, where I show how to apply the same approach to video processing.
