EXOTIC SILICON
“Programming in pure ‘c’ until the sun goes down”
Loading bars simulator - Part two: Simulating the beam and generating waveforms
Material covered in this part
Working out the numbers
Time for some simple maths.
The eventual video output will be a series of PPM files at a resolution of 384 × 288, intended to be viewed at 50 frames per second. This represents the output that would be seen on a television adjusted to display as much of the picture as possible, right up to the edge of the active picture area but not including any of the blanking interval. In reality, most televisions cropped the picture to varying degrees, and not always equally on each side, which means that our output will have a larger border area than would typically have been seen on a domestic television. However, to produce this visible picture area accurately, we need to consider the dimensions of the full video signal including the blanking periods, as we need to calculate and then discard this part of the signal.
Considering the vertical blanking interval, things are fairly straightforward. Our 288 visible scanning lines simply come from the last 288 lines of each frame, so lines 0 - 23 are discarded, and lines 24 - 311 form our output. Horizontally, we need to consider the length of the horizontal blanking interval relative to the length of each full scanline. Since we are scanning 312 lines, 50 times per second, our line frequency is 15600 Hz. The correct frequency for interlaced broadcast television is 625 × 25 = 15625 Hz, which gives a nice round figure of 1/15625 of a second, or 64 microseconds, per line. If we calculate 1/15600, we get a recurring decimal fraction of approximately 64.1 microseconds, which is very close to 64. We'll use the round figure of 64 microseconds in our calculations for simplicity.
Timing accuracy, a pedantic sidenote...
(feel free to ignore this)
In fact, our round figure of 15625 Hz is actually correct, as the vertical refresh of the ZX Spectrum video signal isn't exactly 50 Hz anyway.
The reason for this should be obvious if we consider that the clock signal for the Z80 CPU is 3.5 MHz, which is ultimately derived by dividing the output of a 14 MHz crystal. The same 14 MHz clock signal is halved to produce the video pixel clock of 7 MHz. Since the video signal is 312 lines per frame, 50 frames per second would require 312 × 50 = 15600 lines per second, which doesn't divide cleanly into 7,000,000. The nearest we can get to 50 Hz with integer dividers would be either 7,000,000 / 312 / 448 = 50.080 or 7,000,000 / 312 / 449 = 49.968.
The ZX Spectrum video signal is actually nominally 50.080 Hz vertical refresh
If we calculate 312 × 50.080 Hz, or more accurately, 312 × ( 7,000,000 / 312 / 448 ), we get exactly 15625. This means that the inaccuracy of our timings really comes from the fact that we'll be treating the generated frames as 50 fps video, rather than as 50.080 fps video.
I'm mentioning this here mainly to avoid a deluge of email informing me that I'm wrong with my calculations. At the end of the day, any simulation or model has to balance accuracy with practicality, and the purpose of this code, apart from being a programming exercise, is to generate a useful video effect that could be used elsewhere. For that, I would prefer a round 50 Hz output anyway.
The horizontal blanking period of the signal is 12 microseconds, making the 'active' portion of the signal 64 - 12 = 52 microseconds. Since we want 384 'active' pixels across each line, we multiply 384 by 64 and divide by 52 to get 472. This is the total number of 'pixels' we have to consider when modelling the complete video signal including the blanking periods.
For reference, using the line frequency of 15600 Hz in place of 15625 Hz in the calculation would give us a value of 473, just one pixel more, so the difference really is small enough to ignore for all practical purposes.
In the same way that the vertical blanking interval covers the first lines of the frame, the horizontal blanking interval covers the beginning of each scan line. So to get our 384 pixels, we simply use pixels 88 - 471 of each line, and discard pixels 0 - 87.
So the scanning of our full video signal including blanking periods can be simulated simply by considering it as a framebuffer of 472 × 312 pixels, and taking the 384 × 288 pixels that we want from the bottom right hand corner of it.
Now that we have our 384 × 288 pixel framebuffer representing the visible part of the television display, we need to work out whereabouts within that framebuffer the active area of 256 × 192 pixels of the Spectrum's display falls, and where the border area falls.
Basically, we centre it. Each 256 pixel line of the Spectrum's display can be drawn on pixels 64-319, and we can start the 192 vertical lines at Y co-ordinate 48, measured from the top. In fact the Spectrum puts it at 40, but we'll use 48, because it looks better centered for the purposes of our video effect. Feel free to change this value to 40 if you so desire.
We'll start by setting the entire framebuffer to white. The 256 × 192 monochrome bitmap can be drawn in the central area, and the 256 × 192 24-bit RGB image can replace it afterwards. We'll store the 384 × 288 pixel framebuffer in the same 24-bit RGB format, with 8 bits per channel, that we used in the starfield simulator project. This allows us to write it out to disk as a ppm file simply by adding a short header to the pixel data.
Writing the ppm files
Let's start with a simple function to write the framebuffer to disk as a ppm file. We'll also define some constants to hold the input and output locations in the filesystem:
/* Tape loading raster bars simulation, Exotic Silicon programming project 2 */
/*
Copyright 2021, Exotic Silicon, all rights reserved.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
1. This software is licensed exclusively under this specific license text. The
license text may not be changed, and the software including modified versions
may not be re-licensed under any other license text.
2. Redistributions of source code must retain the above copyright notice, this
list of conditions, and the following disclaimer.
3. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions, and the following disclaimer in the documentation
and/or other materials provided with the distribution.
4. All advertising materials mentioning features or use of this software must
display the following acknowledgement: This product includes software
developed by Exotic Silicon.
5. The name of Exotic Silicon must not be used to endorse or promote products
derived from this software without specific prior written permission.
6. Redistributions of modified versions of the source code must be clearly
identified as having been modified from the original.
7. Redistributions in binary form that have been created from modified versions
of the source code must clearly state in the documentation and/or other
materials provided with the distribution that the source code has been
modified from the original.
THIS SOFTWARE IS PROVIDED 'AS IS' AND ANY EXPRESS OR IMPLIED WARRANTIES,
INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
EXOTIC SILICON BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT
OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY
OF SUCH DAMAGE.
*/
#define INPATH "/input/"
#define OUTPATH "/output/"
/* Write the RGB framebuffer to a sequentially numbered file, and increment the counter. */
int output_frame(unsigned char * framebuffer)
{
static int current_frame=-1;
char * filename;
int f;
asprintf (&filename, OUTPATH "frame_%04d.ppm", current_frame);
if (current_frame++!=-1) {
f=open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_EXLOCK, 0644);
write (f,"P6 384 288 255 ",15);
write (f,framebuffer,384*288*3);
close (f);
}
free (filename);
return (0);
}
Notes on coding style
I'm using a somewhat different style of coding for this project to that which I used for the starfield simulator.
In the case of the starfield simulator, it was quite likely that we might want to change or expand the algorithms in the future, so it made sense to write the source code in such a way that facilitated easy modification, even if it became more verbose as a result. In contrast, the specifications for this project are much more tightly defined.
Not only that, but as we will see shortly, in most cases this project is not going to be a piece of code where you can typically change or tweak one value and get a useful result from it. Supporting changes will likely need to be made in various places, so a full understanding of what the code is doing will be necessary for anybody working on it.
With this in mind, the logic for this project will be very much hard-wired, with plenty of comments explaining exactly what the magic numbers do. As you will hopefully see, the source code should be more readable overall with this approach rather than defining a lot of verbose constants.
This function output_frame should be fairly self-explanatory. We pass it a pointer to a framebuffer, and it writes that data to disk with a ppm header which specifies fixed dimensions of 384 × 288, in 24-bit color. Filenames will be numbered from 0000 to 9999, which is plenty as we will only be writing up to about 3000 frames in total.
We start current_frame counting at -1 because we want to discard the first frame written by output_frame. The reason for this will become obvious when we look at the next function. Note that we declare current_frame as a static variable. The value only ever needs to be used within this function, but we need it to be maintained between each call.
I've used a single character variable name 'f' for the filehandle in this function. Generally, apart from loop counters, single character variable names reduce code readability, and are tedious when searching for them throughout the source code. However, when they are only used in a single place, for an obvious and specific purpose such as a filehandle, it keeps the code compact without causing any confusion.
Simulating the beam
So now we finally get to what is probably the real heart, or at least a very important part, of this code, where a lot of the maths we discussed above is actually implemented:
#define beamy (beampos/472)
#define beamx (beampos%472)
#define framebuffer_y ((beampos/472)-24)
#define framebuffer_x ((beampos%472)-88)
/* Sets the pixel at the current "beam position" to the specified RGB color, if we are in the border area. */
/* If the beam is currently in the overscan area, or centre image area, nothing is done. */
int set_beam_pixel(unsigned char * framebuffer, int beampos, int red, int green, int blue)
{
if (beampos==0) { output_frame(framebuffer); }
/* Check whether we are outside the visible area, within the border area, or within the non-border image area. */
/* If we are in the overscan area, then do nothing and return. */
if ( beamy<24 || beamx<88 ) { return (1); }
/* If we are in the non-border image area, then do nothing and return. */
if (framebuffer_y>=48 && framebuffer_y<=239 && framebuffer_x>=64 && framebuffer_x<=319) { return (2); }
*(framebuffer+3*(framebuffer_y*384+framebuffer_x))=red;
*(framebuffer+1+3*(framebuffer_y*384+framebuffer_x))=green;
*(framebuffer+2+3*(framebuffer_y*384+framebuffer_x))=blue;
return (0);
}
This function is called with the address of the 384 × 288 output framebuffer, a 'beam position' value, and red, green, and blue pixel values to plot. The 'beam position' is simply a linear offset into the 472 × 312 overscanned area that we are simulating. We don't store framebuffer data for this larger area, but we have to take into account that the continually sweeping beam spends some of its time within it.
Immediately upon entering this function, we check to see if the beam is at the top-left, having wrapped around from the bottom right. If so, we have a completed 384 × 288 frame in the framebuffer that we need to write to disk before we start to overwrite its contents with new data. The output_frame function that we saw in the previous section is called to do this. Obviously, the very first call to set_beam_pixel will have the beam position set to zero, even though we haven't yet put any useful data into the framebuffer, which is precisely why we included the code to discard the first frame in output_frame.
The first two preprocessor macros convert the linear beam position into X and Y co-ordinates in the 472 × 312 space. These are checked to see whether we are in the overscan or blanking period. If we are, then we do nothing and return 1. The return value of 1 is not used by the calling function, but might be useful for debugging.
Next we check whether the beam position is within the border area, where the raster bars should be drawn, or whether it's in the central image area, in which case we again do nothing and just return. This function set_beam_pixel, only deals with drawing the raster bars, and leaves the task of drawing content in the central display area to another function.
The second two preprocessor macros convert the beam position into X and Y co-ordinates within the 384 × 288 output framebuffer, and we use those values both to do the position check, and to write the red, green and blue pixel values.
Setting up the main function
Now that we have got the basic border area plotting function in place, let's look at some code to initialize everything and start making use of it:
#include <stdio.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
int main()
{
int beampos;
int n;
unsigned char *framebuffer;
framebuffer=malloc(384*288*3);
if (framebuffer==NULL) { printf ("Unable to allocate memory for framebuffer.\n"); return(1); }
for (n=0; n<(384*288*3); n++) { *(framebuffer+n)=255; }
beampos=0;
}
That will allocate memory for the 384 × 288 framebuffer, set all of the pixels to white, and set the beam position to zero. I've included all of the include directives that we will need for the final program, (just four), to show just how few library functions we'll actually be using.
During development, I added code to remove any ppm files written from a previous run:
int unlink_result;
char * filename;
unlink_result=0;
for (n=0; unlink_result==0; n++) {
asprintf (&filename, OUTPATH "frame_%04d.ppm",n);
unlink_result=unlink(filename);
free (filename);
}
Note that in this case, simply overwriting any existing files on the next run isn't a good idea, because different screen images will result in tape waveforms of different lengths, depending on the number of binary 0s and binary 1s written. If any particular run of the program generates fewer output frames than the previous one, then some frames from the previous run will be left over, which is tedious if you're importing them into another program for display using shell wildcards.
The first real output
We need a function to generate pure white pixels in the border area, because there is a short pause between the header and data blocks. We can also use this to generate a few blank frames at the beginning of the sequence, so that our video effect doesn't start immediately.
/* Output a white border for the specified number of t-states. */
/* Each t-state represents (472*312*50)/3500000 pixels, or approximately 2.1 pixels per t-state. */
int video_silence(unsigned char * framebuffer, int beampos, long t)
{
int n;
for (n=0; n<(t*472*312*50/3500000); n++) {
set_beam_pixel(framebuffer,(beampos+n)%(472*312),255,255,255);
}
return ((beampos+n)%(472*312));
}
This is trivial, but it's worth taking a look at exactly what is going on inside this function, as the principles will be similar for the other two video generation functions that we're about to see.
We call video_silence with the address of the framebuffer, and the current beam position. We also specify the time to output a steady white border for, in T-states, the clock cycles of the Z80 CPU. Remember that the central screen area of 256 × 192 pixels need not be pure white; this function is only concerned with the generation of the border color. We'll also use this function at the end of the sequence to hold the central image on screen for viewing without any raster bars.
If more than one frame's worth of white border is requested, the output_frame function will be called from set_beam_pixel, we'll get a ppm file written to disk, and the frame counter incremented.
The return value of video_silence is the new beam position, so a typical invocation will look something like:
beampos=video_silence (framebuffer, beampos, 3500000);
This will output exactly 49 ppm files. It doesn't output 50 as you might expect, because the data for the last frame was written to the framebuffer, but output_frame was not called because we didn't try to write the first pixel of the next frame.
During early development, I added a final call to output_frame to flush this last frame to disk. However, the best solution is simply to call video_silence for a few extra frames at the end, because calling output_frame manually would duplicate video data from the previous frame on any lines below the current beam position, which is obviously wrong. You can test changing 3500000 to 3500001, and observe that the complete 50 frames are now created, because 3500001 T-states takes us to the first pixel of the 51st frame.
Generating the pilot tone waveform
Our first real output wasn't particularly exciting, given that it was just a series of pure white ppm files. However, now we have the framework necessary to start producing more interesting sequences.
Here is the code that we need to produce the waveform for the leader or pilot tone:
/* Each high/low pulse lasts for 2168 t-states. */
/* 2168*(472*312*50)/3500000=4560.9, so we use a figure of 4561 overscanned "pixels". */
/* 3500000/2168 gives 1614.39 pulses per second, so we use 1614 for a round number. */
int video_leader(unsigned char * framebuffer, int beampos, int duration_seconds)
{
int n,m,s;
m=0;
s=0;
for (n=0; n<(((duration_seconds*1614)|1)*4561); n++) {
if ((n % 4561)==0) { m=1-m; }
set_beam_pixel(framebuffer,(beampos+n)%(472*312),255*m,255*(1-m),255*(1-m));
}
/* The duration of the sync tone is 667 t-states for the first pulse, and 735 t-states for the second pulse. */
/* In "pixels" this works out as 1403 and 1546. A total of 2949, with the high/low switching point at 1403. */
for (s=0; s<2949; s++) {
if (s==0 || s==1403) { m=1-m; }
set_beam_pixel(framebuffer,(beampos+n+s)%(472*312),255*m,255*(1-m),255*(1-m));
}
return ((beampos+n+s)%(472*312));
}
Just like with video_silence, we call video_leader with the address of the framebuffer, the current beam position, and the duration of leader that we want. This time the duration is in seconds. This was an arbitrary choice, we could just as well have used T-states here, but when creating effects I tend to think in terms of seconds rather than T-states.
An important variable here, which is local to this function but will be used for the same purpose in several other functions, is 'm'. This holds a boolean value indicating the current state of the tape output port, basically whether the square wave is 'up' or 'down'. Remember that the exact phase doesn't matter at all; all that matters is where the 'edges' of the signal fall, the transitions from one state to the next. When m is zero, we output a cyan color to the border area, (green and blue on, red off), and when m is one, we output red.
As explained in the comments that start this block of code, each pulse lasts for 2168 T-states, which corresponds to 4560.9 pixels being passed by our virtual beam sweeping across the 472 × 312 framebuffer. We round this figure to 4561, and output 1614 pulses, or 807 complete cycles, for each second of leader requested.
The value of m is toggled within the loop by checking whether n % 4561 == 0. It's very important indeed that we pay attention to the loops that do this, to avoid several types of error being introduced.
Firstly, off-by-one errors. These could easily be introduced by toggling m immediately at the beginning when n=0, so that we start with the opposite sense to what we intended. In case you're looking at the code and thinking that I've done exactly this, because we start the first loop with n=0 and then immediately check for (n % 4561)==0, notice that the starting value for m is zero. It will be immediately toggled to 1, which is the mark state of the line and what we want the output to begin with, so in this case it was deliberate.
We also need to be sure that we are consistent if we are going to let the next part of the logic inherit the ending state of m. Either we toggle m after outputting the final value of the first step, such that the second step inherits its correct starting value, or alternatively we avoid toggling m after outputting the final value of the first step, and do it at the beginning of the next step instead. Doing both would doubly invert m, leaving it unchanged, and we would then have two consecutive pulses with the same sense, which is wrong.
In the case of an off-by-one error like this, the difference in the visual waveform would be almost impossible to notice, but the output would be wrong. More importantly, when we come to generate the audio waveform, the same error would likely introduce a brief click and an extra edge into the square wave. This would cause the waveform to fail loading on a real Spectrum.
For this reason, also, we ensure that we always create an odd number of pilot tone pulses. The construct (duration_seconds*1614)|1 achieves this by setting the least significant bit. This in turn ensures that we end with the last pulse of the pilot tone having been in the mark state, with m=1, and can generate the first pulse of the sync tone with m=0, which is faithful to the original. If we created an even number of pilot tone pulses, then we would end with m=0, and would either have to invert the sync pulses, or generate an inaccurate waveform.
Of course, none of this matters for the purposes of creating a video effect, but similar issues may well arise in other programming projects that you take on, so it's a good example to point out.
The sync tone is the only tone that has unequal pulses. This doesn't cause any difficulty, we just convert the timings in T-states to the corresponding number of pixels to scan, and output the pulses using a similar loop to the one we used to generate the pilot tone.
Generating the data bits waveform
Finally we have arrived at the point where we can start encoding actual data bits. The code is not particularly complicated:
/* Draw the waveforms in the border area for each of the eight bits in the supplied byte. */
/* The duration of a binary 1 is 1710 t-states per pulse, and of a binary 0 is 855 t-states. */
/* For a binary 1, 1710*(472*312*50)/3500000=3597.4, so we use 3597 overscanned "pixels". */
/* For a binary 0, 855*(472*312*50)/3500000=1798.7, so we use 1799 overscanned "pixels". */
int video_byte_out(unsigned char * framebuffer, int beampos, unsigned char byte)
{
int bit,pixcount;
int beamoffset=0,n=0,m=0;
for (bit=0 ; bit<8 ; bit++) {
if (byte & 128) { pixcount=3597; } else { pixcount=1799; }
for (n=0; n<(2*pixcount); n++) {
set_beam_pixel(framebuffer,(beampos+beamoffset)%(472*312),255*(1-m),255*(1-m),255*m);
if (((n+1) % (pixcount))==0) { m=1-m; }
beamoffset++;
}
byte=byte<<1;
}
return ((beampos+beamoffset)%(472*312));
}
We call this function with the usual framebuffer pointer and beam position, but instead of a duration we just supply a single byte to use as the data to create the waveform. We convert the T-state timing values into numbers of pixels using the same formula as before, and shift each bit sequentially into the most significant bit to read it out.
The border colors are different now, with yellow for m=0 when the signal is in a space condition, and blue when m=1 and the signal is at mark. As before, the function returns the new value for the beam position, and handles the writing of full frames to disk automatically as required.
Transforming a real 17 byte header into raster bar waveforms
At last we can finally get to see something that actually looks like the raster bar pattern. All we need to do is to add two new variable declarations to the beginning of the main function that I introduced above, and the following ten lines of code to the end:
unsigned char checksum;
unsigned char header_byte;
beampos=video_silence(framebuffer,beampos,362000);
beampos=video_leader(framebuffer,beampos,2);
checksum=0;
for (n=0 ; n<18 ; n++) {
header_byte="\x00\x03MYFILENAME\x00\x1B\x00\x40\x00\x80"[n];
beampos=video_byte_out(framebuffer,beampos,header_byte);
checksum=(checksum^header_byte);
}
beampos=video_byte_out(framebuffer,beampos,checksum);
beampos=video_silence(framebuffer,beampos,3500000);
When run, this will generate 159 ppm frames, starting briefly with a white screen, followed by two seconds of the pilot tone, a total of 19 bytes of data represented by the blue and yellow bars, and finally more white screen.
[ Frame 105, annotated: sync pulse, followed by data bits 0000000000000011 ]
Looking carefully at frame 105, we can actually see all of the waveforms at the same time. The top of the screen shows the end of the pilot tone, with its equally spaced red and cyan bars. Then we see the short unequal pulses of the sync, the first cyan pulse being slightly shorter than the second red one, and below that we can see the data bits. The first two bytes we encoded were 0 and 3, where 0 is the flag byte indicating that the data that follows is actually a header, and 3 is the first byte of the header itself. In binary, these bytes would be 00000000 00000011. Counting the pairs of yellow and blue bars down the screen, we can indeed see that there are 14 thin pairs, followed by two wider pairs. Those are our first two bytes successfully encoded!
Notice that the transitions from one color to another occur in arbitrary places across the screen. The bars don't match up with whole scanlines of the display. Obviously this is to be expected, given that the audio and video signals have no common clock, but it's an interesting detail that helps to give the display its characteristic look, and which would not be reproduced if we simply tried to create the effect by drawing random lines in the border area. Taking all of the video timings into account is the only way to produce a truly authentic simulation.
A closer look at the header data and checksum
The header is quite simple. As previously mentioned, the first byte is technically not part of the header, as a corresponding flag byte is also found before the block of user data, except that there it's 0xff rather than 0x00. The 0x03 byte indicates that the user data in the following data block is an arbitrary dump of memory contents, rather than, for example, a BASIC program. The following ten bytes are, unsurprisingly, the filename. After this we have three 16-bit values in little-endian format: the length of the data, the memory address to load it at, and finally an arbitrary 0x8000, (0x00 0x80 in little-endian format), as this third parameter is not used for blocks of code.
As we process the individual bytes that we are encoding, we also checksum them. The checksum is basically the xor of all of the bytes, and is encoded in the normal way immediately after the first 18 bytes.
Summary of part 2
In this part of the project, we've written functions to simulate the movement of the video beam across the screen, and output ppm files for each frame. We've also successfully encoded pilot tones and a small amount of binary data in the form of the header.
In part three, we'll see how to process image data that we read from a pbm file into the correct format to simulate the loading of a monochrome bitmap into the framebuffer memory of a ZX Spectrum.
It's all my own work!