Video shuttle - Part two: Supporting functions for input and output
Material covered in this part
Supporting functions
Reading the source video, audio and configuration files
Writing the output video and audio
Supporting functions
Now that we've worked out, at least in principle, the core formula that we need for our simulation, let's take a step back and write some of the supporting functions that we'll also need. After all, there is a lot to do other than just calculate where the noise bars should be.
First we need functions to read and parse the headers from the input files, as the information there will be used to calculate how much memory we need to allocate. We'll start by including the few system libraries that we'll need and defining some global constants for file paths, the global scale constant that we mentioned earlier, and some default values for image dimensions, in case we can't read useful data from the headers of the input files. Finally, we'll allocate a small scratch buffer, which will be used by various functions:
/* Video shuttle simulation, Exotic Silicon programming project 3 */
/*
Copyright 2021, Exotic Silicon, all rights reserved.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
1. This software is licensed exclusively under this specific license text. The
license text may not be changed, and the software including modified versions
may not be re-licensed under any other license text.
2. Redistributions of source code must retain the above copyright notice, this
list of conditions, and the following disclaimer.
3. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions, and the following disclaimer in the documentation
and/or other materials provided with the distribution.
4. All advertising materials mentioning features or use of this software must
display the following acknowledgement: This product includes software
developed by Exotic Silicon.
5. The name of Exotic Silicon must not be used to endorse or promote products
derived from this software without specific prior written permission.
6. Redistributions of modified versions of the source code must be clearly
identified as having been modified from the original.
7. Redistributions in binary form that have been created from modified versions
of the source code must clearly state in the documentation and/or other
materials provided with the distribution that the source code has been
modified from the original.
THIS SOFTWARE IS PROVIDED 'AS IS' AND ANY EXPRESS OR IMPLIED WARRANTIES,
INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
EXOTIC SILICON BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT
OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY
OF SUCH DAMAGE.
*/
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
#define INPATH "/input/"
#define OUTPATH "/output/"
/* Compile time defaults for file dimensions */
#define DEFAULT_WIDTH 352
#define DEFAULT_HEIGHT 288
#define SCALE 10
int main(int argc, char * argv[])
{
/* Allocate a 4k scratch buffer for general use */
scratch_buffer=malloc(4096);
}
With those defined, we can move on to our first real function, to read the header information from the first input frame.
The filenaming convention that we'll use for the input frames is frame_0000.ppm - frame_9999.ppm. Eventually, we will be able to configure the starting frame in a configuration file, however, we'll set the initial parameters from frame_0001, as it's quite possible that an image sequence output from other software would be numbered starting with 1 rather than 0, in which case frame_0000 will not exist. In any case, all of the images should have the same dimensions.
/* Extract header information for width, height and depth from the ppm file with the supplied frame number. */
/* Returns the offset to the pixel data, having filled values for framewidth and frameheight in the int pointers supplied. */
/* Used to read the initial parameters from frame 0001, to allocate sufficient memory. */
int read_input_frame_header (int * frame_width, int * frame_height, int * frame_depth, unsigned char * scratch_buffer, int frame)
As you can see, this function returns several values by storing them in the integer pointers it's called with. The actual return value is the offset to the pixel data in the input file as supplied by the next function that we're going to look at:
/* Parse a ppm header, and return the offset to the start of pixel data. */
/* Additionally, store values for width, height, and depth in the int pointers supplied. */
/* If no valid ppm header is found, return 0, set width and height to DEFAULT_WIDTH and DEFAULT_HEIGHT, and set depth to 255. */
/* This is intended to allow processing of raw, headerless 8-bit RGB pixel data. */
int parse_ppm_header (unsigned char * scratch_buffer, int * width, int * height, int * depth)
{
int n;
int m;
int commentflag;
static int flag_depth_unsupported_shown=0;
*width=0;
*height=0;
*depth=0;
commentflag=0;
if (*scratch_buffer!='P' || *(scratch_buffer+1)!='6') {
printf ("Invalid header, doesn't begin with P6. Assuming raw pixel data!\n");
This is basically an enhanced version of the code that we used to parse the ppm headers in the previous project. Here, however, we are not just checking for a specific set of dimensions but reading the values and passing them back to the calling function. We do some basic sanity checking on the header and return an offset of zero if it doesn't seem to contain valid data. The important things to sanity check here are the frame width and frame height, as these will be used in a call to malloc later on. As they are signed integers, if an excessively large value was supplied they could overflow to negative, and that could easily cause a segmentation fault much later on. We don't really need to do as much checking of the depth value, as it's only used as a boolean indicator of being less than 256 or not.
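As a rough illustration of the kind of sanity check described above, the following sketch reads one decimal field from a header and rejects anything that could overflow later arithmetic. The function name parse_dimension and the limit MAX_DIMENSION are assumptions made for this example, and are not taken from the program itself.

```c
#include <stddef.h>

/* Hypothetical upper bound for width or height, chosen for this sketch only. */
/* 3*16384*16384 still fits comfortably in a signed 32-bit int. */
#define MAX_DIMENSION 16384

/* Parse an unsigned decimal field starting at buf[*pos], stopping at the first non-digit. */
/* Returns the value, or -1 if no digits are present or the value exceeds MAX_DIMENSION. */
/* On success, *pos is advanced past the digits. */
int parse_dimension (const unsigned char * buf, size_t len, size_t * pos)
{
	int value;
	size_t i;
	value=0;
	i=*pos;
	if (i>=len || buf[i]<'0' || buf[i]>'9') { return (-1); }
	while (i<len && buf[i]>='0' && buf[i]<='9') {
		value=value*10+(buf[i]-'0');
		/* Give up long before the accumulator can approach INT_MAX. */
		if (value>MAX_DIMENSION) { return (-1); }
		i++;
	}
	*pos=i;
	return (value);
}
```

Because the running total is checked after every digit, a header containing a very long string of digits is rejected without the accumulator ever wrapping around to a negative value.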
If we detect a 16-bit ppm file, we advise the user that it will be downconverted to 8-bit, but we still accept it as input. To ensure that this warning message is only displayed once, and avoid flooding the console with the same message if a whole series of 16-bit files is supplied, we use a static variable as a flag. This variable will be local to this function, but will retain its value between calls, so it's ideal for use in this application.
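The warn-once pattern can be shown in isolation. This is a minimal sketch with a hypothetical function name, warn_depth_once, and an invented message text. In the real program the flag lives inside parse_ppm_header.

```c
#include <stdio.h>

/* Print a one-off warning about 16-bit input. */
/* Returns 1 if the warning was printed by this call, or 0 if it had already been shown. */
int warn_depth_once (void)
{
	/* Initialized once at program start, and preserved between calls. */
	static int flag_shown=0;
	if (flag_shown) { return (0); }
	flag_shown=1;
	printf ("Input is 16-bit, and will be downconverted to 8-bit.\n");
	return (1);
}
```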
Now that we have these two functions, we can start writing the code for the main function, which will start by calling them:
unsigned char * framebuffer_in;
unsigned char * framebuffer_out;
unsigned char * framebuffer_bars;
unsigned char * scratch_buffer;
int frame_width;
int frame_height;
int frame_depth;
/* Read frame dimensions from first ppm file, and allocate memory for the framebuffers */
For brevity, I haven't included code to check that the various malloc calls succeed, but obviously for a program that is intended for widespread use you should do.
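For reference, a checked allocation might look something like the following sketch. The helper name alloc_framebuffer is an assumption for this example, and the calling code would still need to decide how to react to a null return.

```c
#include <stdio.h>
#include <stdlib.h>

/* Allocate a buffer for one frame of 8-bit RGB pixel data. */
/* Returns NULL if either dimension is not positive, or if the allocation fails. */
unsigned char * alloc_framebuffer (int frame_width, int frame_height)
{
	unsigned char * buffer;
	if (frame_width<=0 || frame_height<=0) { return (NULL); }
	/* Do the multiplication in size_t rather than int. */
	buffer=malloc((size_t)3*frame_width*frame_height);
	if (buffer==NULL) { printf ("Unable to allocate framebuffer.\n"); }
	return (buffer);
}
```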
You'll notice that we've allocated a third framebuffer, in addition to the input and output framebuffers, and that I've added a call to a function generate_color_bars. The idea of this is that if for some reason some of the input frames are missing, for example if the tape is wound past the end of the input, then we can generate useful placeholder content for the output framebuffer. It was also useful during development, when I didn't want to keep reading and writing large ppm files. As you'll see shortly, we can also just substitute frame 0001 for any missing frames, but it's nice to cover all bases and give ourselves a choice.
Let's look at the generate_color_bars function now:
/* Fill the framebuffer with the specified parameters with RGB values representing color bars. */
/* This function is slow, so should be called once to set up this framebuffer, which should then be re-used when required for best performance. */
void generate_color_bars (unsigned char *framebuffer_bars, int frame_width, int frame_height)
{
int n,m;
for (n=0; n<3*frame_width*frame_height; n++) {
m=(n/3)%frame_width;
m=m/(frame_width/8);
if (n % 3==0) { *(framebuffer_bars+n)=32+192*((m & 2)==0); }
if (n % 3==1) { *(framebuffer_bars+n)=32+192*((m & 4)==0); }
if (n % 3==2) { *(framebuffer_bars+n)=32+192*((m & 1)==0); }
}
return ;
}
It's quite simple. The variable n iterates over the entire framebuffer, and the value of m is calculated to cycle between 0 and 7 on each line. The RGB values for each pixel are then determined from the lower three bits of m, with bit 0 controlling blue, bit 1 controlling red, and bit 2 controlling green. A channel is set to 224 when its bit is clear and to 32 when its bit is set, which gives the familiar sequence of white, yellow, cyan, green, magenta, red, blue and black bars.
You might wonder why we even need to pre-calculate the colorbars framebuffer, and why we couldn't just run this loop across the normal output framebuffer every time that we needed to. During development I actually tried that, but the code ran noticeably slower and performance was quite bad overall. Pre-calculating and storing the RGB pixel data for the color bars is much more efficient.
The next thing to do is to read our configuration file. We'll need to add some more variable definitions to the main function first, though, and we'll also add a printf call to display some diagnostic information:
unsigned char * guide;
int total_output_frames;
int fps;
int tape_velocity;
int tape_position;
int flag_stutter_on_pause;
int flag_random_noise_bars;
printf ("Video output: %d x %d, 8-bit RGB. %s\n", frame_width, frame_height, (frame_depth>255 ? "Note: input is 16-bit." : ""));
These new variables will all be initialized with values read from the configuration file. The format of the configuration file is quite simple, as the following example shows:
Config file
0090
30
+010
0001
.*
..............................>>>>>>>>>>
The first line is the total number of frames to output. The second line is the frames per second for the output. The third line is the initial tape speed, defined in terms of the global scale factor, so if SCALE is defined as 10, then 0 would mean paused, 10 would mean normal running speed, 20 would mean 2× playback, -5 would be half speed reverse playback, and so on. The fourth line is the starting point on the tape in frames.
To make the parsing of the configuration file easier, these fields are fixed in width. The total number of frames and the starting point should be zero padded if the values are less than 1000. The field for frames per second is only two digits wide, as we are unlikely to require frame rates above 99. The initial tape speed is three digits, preceded by a sign to indicate forward or reverse. Using a scale of 10, this allows us to start at up to 99.9 times normal playback speed, and even using a scale factor of 60, it's sufficient to specify up to about 16 × normal speed.
Obviously these constraints only concern setting the initial values, and during execution of the program the values can happily exceed them. If the need arose to support larger initial values, the format of the configuration file could easily be changed to something more flexible.
The fifth line of the configuration file sets or unsets two flags. The flags control the behaviour of the audio when the tape is paused, and also whether the noise bars are rendered as solid bars or as random noise. An asterisk in the configuration file at this position sets the flag, and any other character leaves it unset.
The last line of the configuration file is of variable length. It's a representation of the change in velocity of the tape over time, one character for each output frame. A dot signifies no change in velocity, so the first 30 output frames will maintain the initial velocity set on line three, in other words normal speed playback. For the next ten frames, the tape is accelerated by one unit for each frame. Assuming that we have SCALE set to ten, this means that over the course of ten frames, the tape will accelerate to 2× normal playback speed. As the configuration file ends here, no further changes in velocity will be made, and our simulated tape will continue to play at 2× normal speed for the rest of the 90 frames that we configured in line one.
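To make the example concrete, the following sketch applies a guide string to an initial velocity. Only the meanings of '.' and '>' come from the description above. Treating '<' as one unit slower is an assumption, and the function name velocity_after is hypothetical.

```c
/* Apply a string of speed-change characters to an initial velocity. */
/* '.' means no change and '>' means one unit faster, as described in the text. */
/* Treating '<' as one unit slower is an assumption made for this sketch. */
/* Frames beyond the end of the guide string leave the velocity unchanged. */
int velocity_after (const char * guide, int initial_velocity, int frames)
{
	int velocity;
	int n;
	velocity=initial_velocity;
	for (n=0; n<frames && guide[n]!=0; n++) {
		if (guide[n]=='>') { velocity++; }
		if (guide[n]=='<') { velocity--; }
	}
	return (velocity);
}
```

With the example configuration file and SCALE set to 10, the velocity is still 10 after 30 frames, reaches 20 after 40 frames, and remains at 20 through to frame 90.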
This choice of format for the configuration file is fairly arbitrary, and was chosen over a more complicated free-form style mainly to reduce the complexity of the code needed to parse it during the development phase.
Here is the code for the read_guide_file function:
/* Read tape speed guide. */
/* Four bytes total frames in ASCII, followed by 0x0A. */
/* Two bytes frames per second in ASCII, followed by 0x0A. */
/* Four bytes initial speed -/+ and three ASCII digits, followed by 0x0A. */
/* Four bytes initial position, four ASCII digits, followed by 0x0A. */
/* Two bytes that set flags: stutter on pause, and render noise bars as noise */
/* If set to * the flag is enabled. If set to any other character, it is disabled. Flags are followed by 0x0A. */
/* The remainder of the file should consist of <, >, [, ], {, }, or space characters to control the tape speed. */
void read_guide_file (unsigned char ** guide, int * total_output_frames, int * fps, int * tape_velocity, int * tape_position, int * flag_stutter_on_pause,
int * flag_random_noise_bars, int frame_width, int frame_height)
{
int infd;
int i;
struct stat filestat;
*total_output_frames=300;
*fps=30;
*tape_velocity=0;
*tape_position=0;
*flag_stutter_on_pause=1;
*flag_random_noise_bars=0;
infd=open (INPATH "guide", O_RDONLY);
if (infd==-1) {
printf ("Unable to open guide file. Setting default parameters.\n");
*guide=malloc(21+*total_output_frames);
for (i=0; i<21+*total_output_frames; i++) { *(*guide+i)='>'; }
return ;
}
fstat (infd, &filestat);
if (filestat.st_size<20) {
printf ("Guide file is invalid. Setting default parameters.\n");
*guide=malloc(21+*total_output_frames);
for (i=0; i<21+*total_output_frames; i++) { *(*guide+i)='>'; }
printf ("Guide only contains directions for %lld frames.\n", filestat.st_size-21);
*guide=realloc(*guide, *total_output_frames+21);
}
return ;
}
Note the use of double indirection passing the pointer to the memory allocation for the configuration file between the main function and read_guide_file. The amount of memory to be allocated depends on two things, firstly the filesize of the configuration file itself, and secondly the total number of frames indicated in its first value field. If the total number of frames is greater than the number of tape speed indications in the configuration file, we allocate extra space to hold those unspecified values. This isn't strictly necessary, as we could just generate the required parameters within the function that is going to parse that data later on, but that would mean storing another value to indicate where the end of the supplied data was and would complicate that other function, which otherwise simply needs to read a single byte from memory for each frame.
Since the memory allocation is being done within read_guide_file, we define an unsigned char pointer in main, and pass a pointer to that pointer to read_guide_file, so that its value, (which at this point is essentially random, a wild pointer), can be updated to reflect the actual address of the memory allocation. This address can then be used as normal once we are back in the main function.
This does lead to some unusual looking constructs such as *guide+1. This syntax is very often a mistake, when the programmer intended to use *(guide+1). The first construct evaluates to the value stored at the location that guide points to, incremented by one, as indirection has a higher precedence than addition. The second construct evaluates to the value stored at the location immediately after the one that guide points to. In our case though, within read_guide_file, guide is a pointer to a pointer, and what we want to do is to dereference the memory location after the one pointed to by that second inner pointer. That memory location will be *guide+1, which could also be written as (*guide)+1, for clarity, and we dereference it with another indirection operator, giving *(*guide+1), so we are effectively saying, "the contents of the memory location one after the location pointed to by the pointer that guide points to".
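The same pattern can be seen in a stripped-down form. The following sketch uses a hypothetical helper, alloc_and_fill, which is not part of the program, to show the callee allocating memory and writing through the double pointer.

```c
#include <stdlib.h>

/* Allocate a buffer on behalf of the caller and fill it with a default character. */
/* The caller passes the address of its own pointer, which is updated here. */
/* Assumes length is positive. Returns 0 on success, 1 if the allocation fails. */
int alloc_and_fill (unsigned char ** guide, int length)
{
	int i;
	*guide=malloc(length);
	if (*guide==NULL) { return (1); }
	/* *guide is the start of the new buffer, so *(*guide+i) is byte i within it. */
	for (i=0; i<length; i++) { *(*guide+i)='>'; }
	return (0);
}
```

The caller would declare unsigned char * guide, call alloc_and_fill (&guide, 8), and afterwards use guide as an ordinary pointer to eight bytes.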
Style notes
In case you're wondering why I refer to the configuration file as 'guide' throughout the source: during early development this file only contained the indicators for the tape speed, and the other parameters were hard-coded. It was literally just a guide to the tape movement. Eventually I moved the other configuration parameters into the same file, but the name was convenient and so for this reason I kept it.
The simple method shown of parsing ASCII digits by subtracting the base value for the ASCII code of '0', or 48 decimal, is convenient for development purposes, but should obviously not be used in production code without further checking that the characters are indeed digits. By filling the first four bytes of the configuration file with 0xFF bytes, the code above would store 207×1000+207×100+207×10+207 = 229977 as the value for total_output_frames, which is larger than the maximum that we might expect of 9999. However, storing the bytes 0x00, 0xFF, 0xFF, 0xFF in the configuration file will cause the code to evaluate -48×1000+207×100+207×10+207 = -25023. This could cause serious problems if we later pass this negative value unchecked to malloc.
The correct way to deal with all of these issues is to write a full and comprehensive parser for the configuration file, (and by extension, any user-supplied input), as we did for the ppm files.
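A small step in that direction, short of a full parser, would be to reject any field that contains a non-digit. The following is a minimal sketch using a hypothetical helper, parse_fixed_digits, which is not part of the program.

```c
/* Parse a fixed-width field of ASCII digits. */
/* Returns the value, or -1 if any character in the field is not a digit. */
/* Intended for short fields of up to four digits, so the result cannot overflow an int. */
int parse_fixed_digits (const unsigned char * field, int width)
{
	int value;
	int i;
	value=0;
	for (i=0; i<width; i++) {
		if (field[i]<'0' || field[i]>'9') { return (-1); }
		value=value*10+(field[i]-'0');
	}
	return (value);
}
```

Both of the problem inputs described above, four 0xFF bytes and 0x00 followed by three 0xFF bytes, now return -1 instead of 229977 and -25023.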
Since this is code in development, for the time being we can just add some range checks for the final values:
This still allows non-digit characters in the configuration file to be parsed as digits and set unusually large values, but those values will then be clamped to the expected range before they are used for anything later in the code.
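One way to express this kind of clamping is sketched below. The helper name clamp_int is hypothetical, and the limits used in the usage note are chosen only for illustration.

```c
/* Clamp value to the inclusive range low to high. */
int clamp_int (int value, int low, int high)
{
	if (value<low) { return (low); }
	if (value>high) { return (high); }
	return (value);
}
```

For example, clamp_int (229977, 1, 9999) returns 9999 and clamp_int (-25023, 1, 9999) returns 1, so neither of the malformed values discussed earlier would reach malloc unchanged.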
Let's print out some debugging information informing us of the values that we've just read from the configuration file for good measure:
printf ("From guide file: total output frames: %d, frames per second: %d, initial tape speed: %d, initial tape position in frames: %d.\n",
We'll read each of the input video frames that we need to process into memory as we need them. One frame of uncompressed RGB pixel data at 1920 × 1080 pixels is about 6 MB (1920 × 1080 × 3 = 6,220,800 bytes), which works out to about 11.2 GB, or almost 10.5 GiB, of video input for 60 seconds of material at 30 fps. Once the program is working satisfactorily we could certainly implement some kind of data caching algorithm to keep recently used frames in memory and avoid the need to re-read them from disk if they were needed again. However, during development of the code, this additional code complexity is best avoided.
The following function, read_input_frame, reads the requested frame into the supplied pre-allocated framebuffer, optionally substitutes a different frame if the requested one was not readable, checks that the dimensions and color depth are as expected, and downsamples 16-bit images to 8-bit:
/* Read the specified frame into the framebuffer at the address supplied. */
/* If the file does not exist or cannot be opened, optionally substitute frame 0001 for the missing frame. */
/* This behaviour is controlled by the substitute_flag parameter. */
/* If the flag is unset, or frame 0001 itself cannot be loaded, then return a value of 1, indicating failure. */
/* If the file can be read, but its header information indicates that its parameters do not match those supplied, then also return 1, indicating failure. */
/* The calling code can then use this return value of 1 as a signal to use the data in the other framebuffer that contains RGB pixel data for color bars. */
/* Returns 0 on success, 1 on failure. */
int read_input_frame (unsigned char * framebuffer_in, int frame_width, int frame_height, int frame_depth, unsigned char * scratch_buffer,
If the requested frame can't be loaded, and the substitute flag is either not set or frame 1 also can't be loaded, then the function returns 1 to indicate that the framebuffer doesn't contain valid data and that the alternative input framebuffer containing the color bars pixel data should be used instead. This value is also returned if the requested image can be read but the dimensions indicated in the header are different to those expected. In this case, too, the frame will be substituted with color bars.
As is, the code above checks the color depth of the image just read, and will error out in the same way if it doesn't match the expected value. We could remove this check, changing the if line to just:
if (frame_width!=file_width || frame_height!=file_height) {
This might be useful if, for example, you had a sequence of 16-bit images, and had edited one of them in some way with a program that only supported 8-bit images.
Next we need to load the input audio data, so in the next section, we'll see the functions for parsing the audio file header.
Reading and processing au file headers
First some more variable declarations in main():
unsigned char * audiobuffer_in;
unsigned char * audiobuffer_out;
unsigned int sample_rate;
unsigned int channels;
unsigned int audio_format;
int bytes_per_sample; /* The total number of bytes per sample, considering all channels, I.E. 16-bit stereo would be 4 bytes per sample */
int audio_in_total_samples;
Note the use of unsigned types for sample_rate, channels and audio_format. These values will be read from 32-bit words in the header, and although it makes little sense to have anywhere near 2^31 channels or a sample rate of 2^31 Hz, such values would be valid. By contrast, it makes no sense at all to interpret the number of channels or sample rate as negative.
Next we have the code in main that calls the new function:
/* Read audio file header, set parameters from header data, allocate sufficient memory, and read the audio data. */
printf ("Audio output: %u Hz, %u bit, %u channel, samples per video frame: %d, bytes per audio sample: %d.\n", sample_rate, 8*(audio_format-1), channels, samples_per_frame, bytes_per_sample);
Just like the ppm header reading function parse_ppm_header, the new function read_input_audio will also return various parameters from the header by storing them at the addresses referenced by the supplied pointers. We don't need any double indirection here, as the pointer to the audio buffer is returned as the actual return value of the function. If an error occurs and we can't read the header information, read_input_audio will return a null pointer. We check for this in the main function, and set some sensible default values of 48 kHz, 8-bit, single channel audio in that case.
Since at this point we now know the output video framerate and total number of frames, (because that information was supplied in the configuration file), we can easily calculate how long the output audio is expected to be. The output audio format will be the same as the input audio format, unless there is no input audio in which case it will be the default settings that I mentioned above. Knowing this, we can calculate how much memory to allocate for the final audio output.
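The calculation might be sketched as follows. This assumes that the number of samples per video frame is the sample rate divided by the frame rate with any remainder discarded, which may differ from what the program actually does. The function name audio_out_bytes is hypothetical.

```c
#include <stddef.h>

/* Estimate the number of bytes needed for the output audio buffer. */
/* Assumes a whole number of samples per video frame, truncating any remainder. */
/* Returns 0 if fps, total_output_frames or bytes_per_sample is not positive. */
size_t audio_out_bytes (unsigned int sample_rate, int fps, int total_output_frames, int bytes_per_sample)
{
	size_t samples_per_frame;
	if (fps<=0 || total_output_frames<=0 || bytes_per_sample<=0) { return (0); }
	samples_per_frame=sample_rate/(unsigned int)fps;
	return (samples_per_frame*(size_t)total_output_frames*(size_t)bytes_per_sample);
}
```

For 90 frames at 30 fps of 48 kHz 16-bit stereo audio, this gives 1600 samples per frame and 1600 × 90 × 4 = 576000 bytes in total.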
Again, our attention should be drawn to the malloc call that allocates audiobuffer_out, as it multiplies various user-provided values. This should be safe, as we range checked total_output_frames and fps earlier, and will do similar sanity checking of the values we return in read_input_audio:
/* Read the input audio file, parse the header, and return a pointer to the buffer containing the actual audio data. */
/* Call with pointers to ints to hold returned values for sample_rate, bytes per sample, number of channels, sample format and total number of samples. */
/* If the input audio file cannot be opened, or has invalid or unsupported header information, return a null pointer. */
unsigned char * read_input_audio(unsigned int * sample_rate, int *bytes_per_sample, unsigned int *channels, unsigned int *sample_format, int *audio_in_total_samples,
unsigned char * scratch_buffer)
{
int infd;
int n;
unsigned int data_start;
unsigned int data_length;
unsigned char * audiobuffer_in;
infd=open(INPATH "audio.au", O_RDONLY);
if (infd==-1) { printf ("Error opening audio.\n"); return (NULL); }
if (read(infd, scratch_buffer, 24)!=24) { close (infd); printf ("Audio file doesn't contain a valid header\n"); return (NULL); }
for (n=4;n--;) { if (*(scratch_buffer+n)!=".snd"[n]) { close (infd); printf ("Audio file doesn't contain .snd file magic\n"); return (NULL); } }
We start by checking that we can open the input file and read at least 24 bytes from it. If not, we return a null pointer as a flag to indicate failure. If we do read 24 bytes from the file, we then check that the header begins with the correct file magic. Note the use of a simple loop here to check the four characters rather than using a library function such as strncmp. This is the only time that we will need a string function in this program, so it seems pointless to include the whole of string.h and call a library function just for this trivial case.
Also note the way that the multi-byte values are read one byte at a time and shifted into the correct location. This ensures the correct result on both big-endian and little-endian machines.
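The byte-at-a-time technique can be written as a small helper. This is a sketch rather than the code from the program, and the name read_be32 is an assumption. The fields of an au header are stored in big-endian order.

```c
#include <stdint.h>

/* Assemble a 32-bit big-endian value from four consecutive bytes. */
/* The result is the same regardless of the byte order of the host machine. */
uint32_t read_be32 (const unsigned char * p)
{
	return (((uint32_t)p[0]<<24) | ((uint32_t)p[1]<<16) | ((uint32_t)p[2]<<8) | (uint32_t)p[3]);
}
```

Applied to the bytes 0x00, 0x00, 0xBB, 0x80 this returns 48000. Casting a pointer to the buffer directly to a 32-bit integer type would only give that answer on a big-endian machine.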
The value we read from the header for data_length is passed directly to malloc, but this is fine because it can't be negative. Of course, the malloc could still fail, so we check for this and return a null pointer in that case.
If the audio encoding is something other than 8-bit, 16-bit, 24-bit or 32-bit linear PCM, we treat it as unknown. Identifiers for other encodings could easily be included, but these four are more than sufficient for testing purposes. The only processing that we will do with the audio data is to skip or duplicate samples, based on whether we are playing back at a faster or slower than normal speed. For this, the main parameter that we need to know is how many bytes are used to store each sample, which we can now calculate.
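In the au format, encoding values 2, 3, 4 and 5 identify 8-bit, 16-bit, 24-bit and 32-bit linear PCM respectively, so the bytes per sample can be derived directly from the encoding value. The following sketch uses a hypothetical function name and a hypothetical limit on the channel count.

```c
/* Hypothetical upper bound on the number of channels, chosen for this sketch only. */
#define MAX_CHANNELS 256

/* Return the total number of bytes per sample, across all channels, for au encodings 2 to 5. */
/* These are 8-bit, 16-bit, 24-bit and 32-bit linear PCM respectively. */
/* Returns 0 for any other encoding, or if channels is zero or above MAX_CHANNELS. */
int bytes_per_sample_for (unsigned int audio_format, unsigned int channels)
{
	if (audio_format<2 || audio_format>5) { return (0); }
	if (channels==0 || channels>MAX_CHANNELS) { return (0); }
	return ((int)((audio_format-1)*channels));
}
```

So 16-bit stereo, encoding 3 with two channels, gives 4 bytes per sample, matching the comment on bytes_per_sample in main.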
At this point, we have all of the functions that we need to get the necessary input into our program.
Video output
Once we have some data in our output framebuffer, we'll need to write it out to disk as a series of sequentially numbered ppm files. This is fairly straightforward and essentially no different to the code we've used in the previous two projects to do the same thing, apart from slightly more error checking:
/* Write a completed frame out to disk as a ppm file. */
int write_video_frame(unsigned char * framebuffer_out, int outframe, int frame_width, int frame_height)
if (write (outfd, headerout, headeroutsize)==-1) { free (headerout); close (outfd); return (1); }
free (headerout);
if (write (outfd, framebuffer_out,frame_width*frame_height*3)==-1) { close (outfd); return(1); }
close (outfd);
return (0);
}
We don't need to check any of the supplied parameters as they are all known to be within acceptable limits. The filename generation is done directly within this function, so we only need to supply the current output frame number as integer outframe.
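As an illustration of the header that precedes the pixel data, the following sketch formats a binary ppm header for an 8-bit image. The function name format_ppm_header is an assumption, and the program may well construct its header differently.

```c
#include <stdio.h>

/* Format a binary ppm (P6) header for an 8-bit image into the supplied buffer. */
/* Returns the header length in bytes, or -1 if the buffer is too small. */
int format_ppm_header (char * buffer, size_t size, int frame_width, int frame_height)
{
	int length;
	length=snprintf (buffer, size, "P6\n%d %d\n255\n", frame_width, frame_height);
	if (length<0 || (size_t)length>=size) { return (-1); }
	return (length);
}
```

For a 352 × 288 frame the header is 15 bytes long, and is followed immediately by 352 × 288 × 3 bytes of pixel data.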
Audio output
Writing the audio output buffer to disk as an au file is also very straightforward, as effectively all we need to do is prepend the simple header as we saw in the previous project:
int write_audio_buffer (unsigned int sample_rate, char * audiobuffer_out, int outframe, int fps, int bytes_per_sample, unsigned int audio_format,
The only real difference here is that whereas before the values for audio format, sample rate, and channels were fixed, now we use the values that we read from the header of the input file.
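A minimal au header is 24 bytes long: the .snd magic, followed by five big-endian 32-bit fields for the data offset, data length, encoding, sample rate and channel count. The following sketch builds one in memory. The function names are assumptions, and this is not the code from write_audio_buffer.

```c
#include <stdint.h>

/* Store a 32-bit value in big-endian byte order. */
static void put_be32 (unsigned char * p, uint32_t value)
{
	p[0]=(value>>24) & 0xff;
	p[1]=(value>>16) & 0xff;
	p[2]=(value>>8) & 0xff;
	p[3]=value & 0xff;
}

/* Fill a 24 byte buffer with a minimal au header. */
/* The data offset is fixed at 24, as no annotation field is included. */
void build_au_header (unsigned char * header, uint32_t data_length, uint32_t audio_format, uint32_t sample_rate, uint32_t channels)
{
	header[0]='.';
	header[1]='s';
	header[2]='n';
	header[3]='d';
	put_be32 (header+4, 24);
	put_be32 (header+8, data_length);
	put_be32 (header+12, audio_format);
	put_be32 (header+16, sample_rate);
	put_be32 (header+20, channels);
}
```

The header would then be written to the output file, followed by the contents of the audio buffer.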
Summary so far
In this part of the project, we saw how to write the necessary support functions to read and parse the input data. We noted some important sanity checks that need to be done on the data to avoid unexpected behaviour later on. We also saw how to write our video output to disk as a sequence of ppm files and our audio buffer as an au file.
In the next part, we'll move on to the code that performs the actual simulation and creates the noise bars.