EXOTIC SILICON
“Get it down on wax! Or paper. Or a PPM file.”
Coding new ioctls to produce screendumps from the console
Screenshots are easy if you're running the X Window system or perhaps working within a virtual machine, but what about those times when you need to capture some console output straight from real hardware?
That's exactly what Crystal needed to do during her recent development work on the wscons code, and today she is here again to explain the whole process of adding a new ioctl to the kernel to make screenshots of the console possible.
If you're not interested in learning about the programming, (shame on you!), then scroll to the bottom to download some ready-to-apply patches.
What we'll learn today:
Producing a screenshot, or as I prefer to call it a screendump given that we are dumping the 'raw' data from the display device, is at least in theory quite straightforward.
We just need to read the data from the framebuffer byte-by-byte, write it to a file, and then do any post-processing to make it usable.
The difficulty arises from the fact that on pretty much any modern desktop operating system, a user-mode program can't just directly access the hardware or kernel data structures in this way.
Certainly in the case of OpenBSD, we need to add some code to the kernel to make this all happen. And even if you're somewhat experienced with C programming it still might not be immediately obvious where exactly this code needs to be plugged in.
To make things worse, although we could in theory write a file containing our screendump data to the filesystem directly from within the kernel, this is generally considered a very bad idea. In fact, doing anything in the kernel that could be achieved just as well in userspace is usually a bad design choice.
So the correct way to solve this problem and add the missing feature is to write a small amount of kernel code to pass the required data to userspace, then code a separate regular userspace program to process it and finally write the data out to disk.
That might sound complicated, but fear not! It's actually fairly easy, so let's get straight to work implementing a new screendump facility.
Two types of screenshots
Since the console is character-based, there are actually two ways that we can record the information displayed at any moment:
Text-based screenshots
This involves copying the ASCII character at each location to a text file, and optionally preserving the attribute data, (such as underlining, bold text, and colors), in some way as well.
The advantages of this approach are that we end up with editable text, and that the actual amount of screendump data is quite small, typically around 4 - 8 kilobytes. It's also somewhat easier to implement, as we only need to deal with the wscons code.
The obvious disadvantage is that it doesn't preserve the exact 'look' of the screen, with the original font, exact original colors and way of rendering bold, italic, and so on.
Graphical screenshots
With a graphical screenshot, we'll be converting the actual pixel data from the framebuffer into a PPM graphics file.
This means that we can preserve the original font, as well as any rendering effects.
The files will be much larger, perhaps 4 - 8 megabytes, but the biggest disadvantage is the added complexity of actually implementing it, since we need to deal with both the wscons and rasops code.
In this article, we'll be looking at and implementing both of these approaches.
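To put rough numbers on the two approaches, the sizes can be worked out directly. This is only a small sketch: the helper names are made up for illustration, and the 1920 x 1080 display at 32 bits per pixel with no row padding is an assumed example rather than a figure from any particular machine.

```c
#include <stddef.h>

/* Hypothetical helper: bytes needed for a text screendump that stores
 * one unsigned char per character cell and no attribute data. */
size_t demo_text_dump_bytes(int rows, int cols)
{
	return (size_t)rows * (size_t)cols;
}

/* Hypothetical helper: bytes needed for a raw pixel dump, where
 * bytes_per_row already includes any padding at the end of each row. */
size_t demo_pixel_dump_bytes(int bytes_per_row, int height)
{
	return (size_t)bytes_per_row * (size_t)height;
}
```

For a 160 x 49 console, demo_text_dump_bytes(49, 160) gives 7840 bytes, whereas demo_pixel_dump_bytes(1920 * 4, 1080) gives 8294400 bytes, which is roughly 8 megabytes.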
Common concepts and background information
In both cases, we'll be adding a new ioctl to the wscons subsystem. This ioctl can then be called from a standard usermode C program, and the kernel will dutifully copy the requested data into a buffer that the usermode program has pre-allocated and then passed to it.
If you're not familiar with the concept of an ioctl, you might want to read the manual pages for ioctl and wsdisplay before continuing with this article.
The current list of ioctls for wscons can be found in sys/dev/wscons/wsconsio.h, and the code to handle our new ones will be added to wsdisplay.c.
Text-based screenshots
Kernel code
The format of the character cell data in the wscons subsystem is fairly typical of a character based terminal. Two values are stored for each character, one being the ASCII character value, and the other being an attribute value. Technically, the character value is a unicode value rather than ASCII, but to keep things simple we'll treat the values as unsigned chars in the 0 - 255 range.
Within the existing wscons code, these pairs of values are usually stored in a struct wsdisplay_charcell.
To begin with, we can simply ignore the attribute value and copy the raw character values out.
First of all, we need to add a new structure to wsconsio.h:
struct wsdisplay_screendump_text {
	int rows;
	int cols;
	unsigned char * textdata;
	unsigned int * attrdata;
};
The rows and cols values will be filled in by our new piece of kernel code, and returned to userland so that we know how many characters make up the entire screendump as well as where the line breaks should come. The pointers will be malloc'ed in userland before calling the kernel code.
We also need to add a define for our new ioctl to wsconsio.h:
#define WSDISPLAYIO_SCREENDUMP_TEXT _IOWR('W', 160, struct wsdisplay_screendump_text)
The parameters we are supplying are:
The range of ioctl numbers 160-255 is marked as reserved for future use. Since our new ioctl operates on the display, it would have made more sense to use the 64-95 range, but this is already full.
The _IOWR macro is defined in sys/sys/ioccom.h, along with other ioctl-related macros. The choice of _IOWR, _IOR, or _IOW, determines whether the values that are defined in our struct wsdisplay_screendump_text are copied from userland to the kernel, from the kernel back to userland, or both.
Since in this case we need to pass pointers from userland to the kernel and receive the values for rows and cols back, we need data to move in both directions, so we use _IOWR.
Note that the actual content of textdata and attrdata will be copied from kernel memory space to our userland memory allocation by our own code, (specifically a call to the copyout kernel function), and not by this _IOWR mechanism, which is only responsible for passing the values of the pointers.
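The way that a direction, a group letter, a number, and the structure size are packed into a single ioctl command value can be illustrated in plain userland C. The constants below are modelled on the traditional BSD layout and carry a DEMO_ prefix to make it clear that they are stand-ins written for this sketch. The authoritative definitions are the ones in sys/sys/ioccom.h, which should be checked before relying on any of these values.

```c
#include <stddef.h>

/* Stand-in for the structure added to wsconsio.h. */
struct demo_screendump_text {
	int rows;
	int cols;
	unsigned char *textdata;
	unsigned int *attrdata;
};

/* Illustrative constants, ASSUMED to mirror the BSD ioccom.h layout. */
#define DEMO_IOCPARM_MASK	0x1fffUL	/* parameter length field */
#define DEMO_IOC_OUT		0x40000000UL	/* copy kernel -> userland */
#define DEMO_IOC_IN		0x80000000UL	/* copy userland -> kernel */
#define DEMO_IOC_INOUT		(DEMO_IOC_IN | DEMO_IOC_OUT)

#define DEMO_IOC(inout, group, num, len) \
	((inout) | (((unsigned long)(len) & DEMO_IOCPARM_MASK) << 16) | \
	((unsigned long)(group) << 8) | (unsigned long)(num))
#define DEMO_IOWR(group, num, type) \
	DEMO_IOC(DEMO_IOC_INOUT, (group), (num), sizeof(type))

#define DEMO_SCREENDUMP_TEXT \
	DEMO_IOWR('W', 160, struct demo_screendump_text)

/* Decode the individual fields from a command value. */
unsigned long demo_ioc_group(unsigned long cmd) { return (cmd >> 8) & 0xff; }
unsigned long demo_ioc_number(unsigned long cmd) { return cmd & 0xff; }
unsigned long demo_ioc_len(unsigned long cmd)
{
	return (cmd >> 16) & DEMO_IOCPARM_MASK;
}
int demo_ioc_copies_in(unsigned long cmd) { return (cmd & DEMO_IOC_IN) != 0; }
int demo_ioc_copies_out(unsigned long cmd) { return (cmd & DEMO_IOC_OUT) != 0; }
```

Decoding DEMO_SCREENDUMP_TEXT gives back group 'W', number 160, and a length equal to sizeof(struct demo_screendump_text), with both direction bits set. Only the structure itself is covered by that length, which is why the buffers behind the two pointers have to be copied separately.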
Now that we have a mechanism in place to call our new ioctl, we can add the code to implement it to wsdisplay.c. The function wsdisplay_internal_ioctl mostly consists of a large switch statement, and it is here that we can add our new routine:
case WSDISPLAYIO_SCREENDUMP_TEXT:
{
	struct wsdisplay_charcell cell;
#define SCREENDUMP ((struct wsdisplay_screendump_text *)data)
#define DISP_SIZE (SCREENDUMP->rows * SCREENDUMP->cols)
	unsigned char * text_data;
	unsigned int * attr_data;
	int i;

	/* The values of the following assignments will be copied back to userspace upon exit. */
	SCREENDUMP->rows=N_ROWS(scr->scr_dconf);
	SCREENDUMP->cols=N_COLS(scr->scr_dconf);

	text_data=malloc(DISP_SIZE, M_IOCTLOPS, M_WAITOK);
	attr_data=malloc(DISP_SIZE*sizeof(unsigned int), M_IOCTLOPS, M_WAITOK);

	for (i=0; i<DISP_SIZE; i++) {
		GETCHAR(scr, i, &cell);
		*(text_data + i)=cell.uc;
		*(attr_data + i)=cell.attr;
	}

	/* Only need to use copyout for things passed as pointers */
	copyout (text_data, SCREENDUMP->textdata, DISP_SIZE);
	copyout (attr_data, SCREENDUMP->attrdata, DISP_SIZE*sizeof(unsigned int));

	free (text_data, M_IOCTLOPS, DISP_SIZE);
	free (attr_data, M_IOCTLOPS, DISP_SIZE*sizeof(unsigned int));
	return (0);
}
In terms of kernel code, this is basically all that we need!
Points to note:
The kernel versions of the malloc and free functions have different semantics to their userspace equivalents.
Malloc requires us to specify a type argument, which is basically used for accounting and diagnostic purposes. The code above would still work whichever type was used, but the most relevant one seems to be M_IOCTLOPS. More importantly, the third flags argument must include either M_WAITOK or M_NOWAIT, otherwise the malloc call will trigger an assertion and panic the kernel. Refer to the manual pages, malloc(9), and free(9) for further details.
The GETCHAR macro is actually defined in wsmoused.h.
This is likely because it's mainly used in conjunction with displaying the mouse cursor and implementing copy and paste using the mouse.
On a machine with a framebuffer console that is using the rasops code, it ultimately calls the function rasops_getchar. This reads from the rs_bs array, which is an array of struct wsdisplay_charcell. In the case of the vga text mode driver, the function called is pcdisplay_getchar, via vga_getchar.
The copyout kernel function is what does the heavy lifting
This is the function that actually writes the data to usermode-accessible memory, ready for a userspace program to access it.
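The core of the loop above, splitting each character cell into a separate character array and attribute array, can be tried out in userland without touching the kernel. The following is a minimal sketch only: struct demo_charcell is a simplified stand-in with an assumed layout, not the real struct wsdisplay_charcell, and demo_split_cells is a name invented for this example.

```c
#include <stddef.h>

/* Simplified stand-in for struct wsdisplay_charcell (ASSUMED layout). */
struct demo_charcell {
	unsigned int uc;	/* character value, potentially Unicode */
	unsigned int attr;	/* device-specific attribute value */
};

/* Split an array of cells into separate character and attribute arrays.
 * Each character value is truncated to the 0 - 255 range, just as it is
 * when assigned to an unsigned char in the kernel code. */
void demo_split_cells(const struct demo_charcell *cells, size_t count,
    unsigned char *text, unsigned int *attrs)
{
	size_t i;

	for (i = 0; i < count; i++) {
		text[i] = (unsigned char)cells[i].uc;
		attrs[i] = cells[i].attr;
	}
}
```

One consequence of the truncation is worth keeping in mind: a cell holding a value above 255, such as 0x263a, ends up as its low byte, 0x3a, which is an unrelated character.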
Updating /usr/include
Since we have changed a kernel header file which is also used by userland programs, we need to perform an additional step along with the normal kernel compile and installation.
The file /usr/src/sys/dev/wscons/wsconsio.h needs to be copied to /usr/include/dev/wscons/wsconsio.h:
# cp -p /usr/src/sys/dev/wscons/wsconsio.h /usr/include/dev/wscons/
Failure to perform this step will result in compilation errors for userland programs, as they will still be using the old version of wsconsio.h.
Userland code
At a minimum, all we need to do from userland is to call the new ioctl and write the data that it returns to a file:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <sys/time.h>
#include <dev/wscons/wsconsio.h>
#include <fcntl.h>
int main()
{
	int fd;
	int res;
	struct wsdisplay_screendump_text screendump;

	screendump.textdata=malloc(65536);
	screendump.attrdata=malloc(65536);
	res=ioctl (STDIN_FILENO, WSDISPLAYIO_SCREENDUMP_TEXT, &screendump);
	printf ("ioctl returned %d\nrows %d, cols %d\n", res, screendump.rows, screendump.cols);
	if (res == -1) {
		return (1);
	}
	fd=open ("screendump", O_WRONLY | O_CREAT | O_TRUNC, 0600);
	write (fd, screendump.textdata, screendump.rows * screendump.cols);
	close (fd);
	return (0);
}
Here we are using the struct wsdisplay_screendump_text that we added to wsconsio.h, and creating a variable 'screendump' of that type.
The two pointers to memory that will be used to hold the userland copy of the screendump characters and attributes need to have memory allocated. If we wanted to allocate exactly the right amount of memory then we could make a separate call to find out the actual screen dimensions, however since the memory required is small in this instance, we just allocate 64K which should be enough to cover any reasonably-sized text console.
Note that since we are not using the attribute data, we could also just set the attrdata pointer to NULL:
screendump.attrdata=NULL;
This would cause the corresponding call to copyout to fail and return EFAULT, which is fine because our kernel code doesn't care and will just continue regardless.
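As a rough check on the fixed 64K allocations, note that the kernel code writes one unsigned char per cell to textdata but sizeof(unsigned int) bytes per cell to attrdata, so the attribute buffer is the tighter constraint. The sketch below uses a made-up helper name and assumes that both buffers are the 65536 bytes allocated above.

```c
#include <stddef.h>

#define DEMO_BUFFER_BYTES 65536	/* the fixed allocation used above */

/* Returns 1 if a rows x cols screen fits in both fixed-size buffers,
 * given one unsigned char and one unsigned int per cell, otherwise 0. */
int demo_dump_fits(int rows, int cols)
{
	size_t cells;

	if (rows <= 0 || cols <= 0)
		return 0;
	cells = (size_t)rows * (size_t)cols;
	return cells <= DEMO_BUFFER_BYTES &&
	    cells <= DEMO_BUFFER_BYTES / sizeof(unsigned int);
}
```

With a four-byte unsigned int, this allows up to 16384 cells, so a 160 x 49 display fits comfortably, whereas a hypothetical 200 x 100 display would not.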
Compiling and running the above code will result in the creation of a text file containing each of the characters from the display and nothing else. So in the case of a 160 x 49 display, we would get a file of exactly 7840 bytes.
That's not particularly useful, because it doesn't even contain newline characters at the end of each line. In other words, all of the content is on one long line of 7840 characters. As a result, if we viewed it on a display of a different size then the lines would appear to wrap at the wrong points.
We can easily modify our program above to add newlines at the right places, based on the value of screendump.cols:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <sys/time.h>
#include <dev/wscons/wsconsio.h>
#include <fcntl.h>
int main()
{
	int fd;
	int res;
	int i;
	int j;
	unsigned char * buffer;
	struct wsdisplay_screendump_text screendump;

	screendump.textdata=malloc(65536);
	/* The attribute data is not used here, so no buffer is allocated for it. */
	screendump.attrdata=NULL;
	res=ioctl (STDIN_FILENO, WSDISPLAYIO_SCREENDUMP_TEXT, &screendump);
	printf ("ioctl returned %d\nrows %d, cols %d\n", res, screendump.rows, screendump.cols);
	if (res == -1) {
		return (1);
	}

	/* Write line-delimited output to a file */
	buffer=malloc(screendump.rows * (screendump.cols + 1));
	j=0;
	for (i=0 ; i < screendump.rows * screendump.cols ; i++) {
		*(buffer + j++)=*(screendump.textdata + i);
		if ((i + 1) % screendump.cols == 0) {
			*(buffer + j++)=10;
		}
	}
	fd=open ("screendump.asc", O_WRONLY | O_CREAT | O_TRUNC, 0600);
	write (fd, buffer, screendump.rows * (screendump.cols+1));
	close (fd);
	free (buffer);
	return (0);
}
With this new version of the code we get a text file that has the correct number of rows and columns to match the original display.
Of course, each line of the output is still padded with spaces to the full length of the line, but this can also easily be fixed with a slight change to the main routine:
	/* Write line-delimited output to a file, trimming trailing blanks from each line */
	buffer=malloc(screendump.rows * (screendump.cols + 1));
	j=0;
	for (i=0 ; i < screendump.rows * screendump.cols ; i++) {
		*(buffer + j++)=*(screendump.textdata + i);
		if ((i + 1) % screendump.cols == 0) {
			while (j>0 && *(buffer + j - 1)==32) { j--; }
			*(buffer + j++)=10;
		}
	}
	fd=open ("screendump.asc", O_WRONLY | O_CREAT | O_TRUNC, 0600);
	write (fd, buffer, j);
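The trimming logic is easier to test when it is separated from the ioctl call and the file handling. The function below is one possible way of doing that, written as a sketch with an invented name, demo_format_lines. It produces the same kind of output as the loop above, but works on any block of characters supplied by the caller.

```c
#include <stddef.h>

/* Copy a rows x cols block of characters into out, dropping trailing
 * spaces from each line and appending a newline to each.
 * Returns the number of bytes written.
 * out must have room for rows * (cols + 1) bytes. */
size_t demo_format_lines(const unsigned char *text, int rows, int cols,
    unsigned char *out)
{
	size_t n = 0;
	int r, c, end;

	for (r = 0; r < rows; r++) {
		const unsigned char *line = text + (size_t)r * (size_t)cols;

		/* Find the end of the line, ignoring trailing spaces. */
		end = cols;
		while (end > 0 && line[end - 1] == ' ')
			end--;

		for (c = 0; c < end; c++)
			out[n++] = line[c];
		out[n++] = '\n';
	}
	return n;
}
```

Because the search for the end of each line never looks beyond the start of that line, a completely blank row is reduced to a single newline character.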
Doing something with the attributes
The above works quite well for capturing text data that happens to be displayed on the console. But we're still missing the attributes, so we can't see text in different colors, and details such as bold and underlining are lost.
One solution is to parse the attribute data, and convert it into CSS that we can use together with an HTML encoded version of the text. This allows us to represent the whole display in a form that preserves virtually everything except the actual font.
Unfortunately, this process is complicated somewhat by the fact that the raw attribute data is device specific. The correct way to convert it back to device-independent values is to use the 'unpack attribute' functions, but doing so adds somewhat to the complexity of the kernel code. For this proof of concept we'll stick to assuming that the attributes are in the format used by the rasops code, which should cover the vast majority of users on the amd64 platform.
The following program produces a 'screendump.html' file in the current directory when run, which contains a fairly accurate rendering of the text currently on the console. It even includes support for 256 colors as implemented by some of the console enhancement patches I recently created.
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <sys/time.h>
#include <dev/wscons/wsconsio.h>
#include <fcntl.h>
#include <dev/wscons/wsdisplayvar.h>
#define BLUE(x) (48*((x-16)%6))
#define GREEN(x) (48*(((x-16)/6)%6))
#define RED(x) (48*(((x-16)/36)%6))
#define HEX(i) ((i & 0xf) <10 ? (i & 0xf)+'0' : (i & 0xf)-10+'a')
#define GREY(i) (1+((i-232)*11))

void color_html(int cindex, unsigned char * return_col)
{
	if (cindex < 16) {
		*(return_col+3)=0;
		*return_col=(cindex & 1 ? (cindex & 8 ? 'f' : '8') : '0');
		*(return_col+1)=(cindex & 2 ? (cindex & 8 ? 'f' : '8') : '0');
		*(return_col+2)=(cindex & 4 ? (cindex & 8 ? 'f' : '8') : '0');
		if (cindex==7) {
			*(return_col)='d';
			*(return_col+1)='d';
			*(return_col+2)='d';
		}
		if (cindex==8) {
			*(return_col)='8';
			*(return_col+1)='8';
			*(return_col+2)='8';
		}
		return;
	}

	*(return_col+6)=0;

	if (cindex < 232) {
		*(return_col)=HEX(RED(cindex) >> 4);
		*(return_col+1)=HEX(RED(cindex));
		*(return_col+2)=HEX(GREEN(cindex) >> 4);
		*(return_col+3)=HEX(GREEN(cindex));
		*(return_col+4)=HEX(BLUE(cindex) >> 4);
		*(return_col+5)=HEX(BLUE(cindex));
		return ;
	}

	*(return_col)=HEX(GREY(cindex) >> 4);
	*(return_col+1)=HEX(GREY(cindex));
	*(return_col+2)=HEX(GREY(cindex) >> 4);
	*(return_col+3)=HEX(GREY(cindex));
	*(return_col+4)=HEX(GREY(cindex) >> 4);
	*(return_col+5)=HEX(GREY(cindex));
	return ;
}

int main()
{
	int fd;
	int res;
	int i, j;
	int pos_in;
	int spanopen;
	unsigned int cattr;
	struct wsdisplay_screendump_text screendump;
	unsigned char * buffer;
	unsigned char * pos_out;
	unsigned char * return_col;

	screendump.textdata=malloc(65536);
	screendump.attrdata=malloc(65536);
	res=ioctl (STDIN_FILENO, WSDISPLAYIO_SCREENDUMP_TEXT, &screendump);
	if (res == -1) {
		printf ("Ioctl WSDISPLAYIO_SCREENDUMP_TEXT not supported for this device\n");
		return (1);
	}
	printf ("ioctl returned %d\nrows %d, cols %d\n", res, screendump.rows, screendump.cols);

	/* Write HTML output to a file */
	buffer=malloc(1024*1024);
	pos_out=buffer;
	spanopen=0;
#define LINEAR_POS (i*screendump.cols+j)

	/* Write HTML preamble */
	pos_out+=sprintf(pos_out, "<html><body><div style='font-family:monospace;white-space:pre;'>");
	return_col=malloc(1024);

	/* Process the output line-by-line */
	for (i=0; i<screendump.rows; i++) {
		pos_out+=sprintf(pos_out, "<div>");
		for (j=0; j<screendump.cols; j++) {
			if (j==0 || screendump.attrdata[LINEAR_POS]!=cattr) {
				cattr=screendump.attrdata[LINEAR_POS];
				if (j>0) {
					pos_out+=sprintf(pos_out, "</span>");
				}
				pos_out+=sprintf(pos_out, "%s", "<span style='");
				color_html((cattr >> 16) & 0xff, return_col);
				pos_out+=sprintf(pos_out, "background:#%s;", return_col);
				color_html((cattr >> 24) & 0xff, return_col);
				pos_out+=sprintf(pos_out, "color:#%s;", return_col);
#ifdef ENHANCED_CONSOLE
				pos_out+=sprintf(pos_out, "font-weight:%d;", (cattr & WSATTR_HILIT ? 800 : 400));
				pos_out+=sprintf(pos_out, "text-decoration:%s;", (cattr & WSATTR_UNDERLINE ? "underline" : "none"));
#else
				pos_out+=sprintf(pos_out, "text-decoration:%s;", (cattr & 1 ? "underline" : "none"));
#endif
				pos_out+=sprintf(pos_out, "%s", "'>");
			}
			switch (*(screendump.textdata+LINEAR_POS)) {
			case '>':
				pos_out+=sprintf(pos_out, "%s", "&gt;");
				break;
			case '<':
				pos_out+=sprintf(pos_out, "%s", "&lt;");
				break;
			case '&':
				pos_out+=sprintf(pos_out, "%s", "&amp;");
				break;
			default:
				*(pos_out++)=*(screendump.textdata+LINEAR_POS);
			}
		}
		pos_out+=sprintf(pos_out, "</span></div>");
	}

	/* Write HTML trailer */
	pos_out+=sprintf(pos_out, "</div></body></html>");

	fd=open ("screendump.html", O_WRONLY | O_CREAT | O_TRUNC, 0600);
	write (fd, buffer, pos_out-buffer);
	close (fd);
	free (buffer);
	free (return_col);
	free (screendump.textdata);
	free (screendump.attrdata);
	return (0);
}
The generated CSS is not particularly efficient, since it repeats all of the parameters every time there is any change in the attributes. Nevertheless, it works well as a proof of concept, and the results are certainly suitable for use on webpages.
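The arithmetic hidden inside the RED, GREEN, BLUE, and GREY macros is perhaps easier to follow when written out as an ordinary function that returns numeric values instead of hex digits. The sketch below does this for color indexes 16 to 255 only, using the same constants as the macros above. The function name is invented for this example, and the first 16 colors are deliberately left out, as the program handles those separately.

```c
/* Map a color index in the 16 - 255 range to 8-bit red, green and blue
 * values, using the same arithmetic as the macros in the program above. */
void demo_index_to_rgb(int idx, int *r, int *g, int *b)
{
	if (idx >= 232) {
		/* 24-step greyscale ramp, 11 units per step. */
		*r = *g = *b = 1 + (idx - 232) * 11;
		return;
	}

	/* 6 x 6 x 6 color cube, 48 units per step. */
	idx -= 16;
	*r = 48 * ((idx / 36) % 6);
	*g = 48 * ((idx / 6) % 6);
	*b = 48 * (idx % 6);
}
```

So index 196, for example, works out as (196 - 16) / 36 = 5 steps of red with no green or blue, giving 240, 0, 0, and the last greyscale entry, index 255, gives 254 for all three components.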
Important note
The code required to correctly interpret the attributes depends on the version of OpenBSD that you are using, as some other work that I have done on the console code which changes the way that the attributes are passed to the rasops code has now been committed to -current:
If you are testing this code on OpenBSD 7.2-release, (or earlier), then leave ENHANCED_CONSOLE undefined.
If you are testing this code on a system built from -current sources after 20230118, or a future release version of OpenBSD, then define ENHANCED_CONSOLE.
Additionally, bold text will currently only be rendered as true bold, (rather than just a brighter color), in the html output if you have applied one of my kernel patches which causes the WSATTR_HILIT attribute to be passed through to the rasops code.
Security consideration
Obviously, the ability to take screendumps has potential security implications.
In these examples we are performing the ioctl on standard input, which does not require any special permissions. However, if we are on a system which is using virtual consoles, then the code as written will dump the contents of the currently active console.
Consider the case where we invoke the screendump capturing code with a slight delay:
$ sleep 5 ; ./a.out
Then we can easily switch to another virtual terminal where somebody else might be logged in, and capture their current screen contents.
This can be done either with a command key combination, such as ctrl-alt-F1, or using wsconsctl:
$ wsconsctl -f /dev/stdout display.focus=0
If we have root access to the system, then we can open /dev/ttyC0, (or in fact any of the /dev/ttyC? devices), remotely via a network connection, and capture the contents of the currently selected virtual terminal. Note that ioctls on these devices act on the virtual terminal system as a whole, so for example, opening ttyC2 and performing a screendump won't automatically provide the contents of that virtual terminal.
Since both of these scenarios require either physical access to the console or remote root access to the system, they shouldn't really be considered exploits. The first example doesn't actually allow us to access any data that we can't already access, and whilst the second example does, it requires root access so its scope for abuse is severely limited. However, we should be aware of potential security issues if we decide to further expand the capabilities of the new ioctl.
Graphical screenshots
Whilst text-based screenshots are very useful, to capture the exact rendering of the screen we obviously need to produce a graphics file from a pixel-by-pixel readout of the framebuffer.
In principle, this isn't too much different from what we've already done above, except that we'll be reading RGB pixel data instead of ASCII character values and attribute bytes. However, we can't really do this directly from the wscons code because we need to access all sorts of data structures that are specific to the graphics hardware in use - there is no convenient equivalent to the GETCHAR macro to read individual pixel values.
In practice, for graphics hardware that is common on the amd64 platform this is all handled through yet another abstraction layer which is known as rasops, (raster operations). This effectively sits between the higher-level wscons code and the various low-level graphics drivers, and provides us with a unified way to access the framebuffer regardless of the actual hardware in use. The rasops code also implements a few other details such as the virtual terminals functionality, (in other words, if we're looking at virtual terminal 1 and a program outputs to virtual terminal 2, that output doesn't actually get painted to the screen), and display rotation, (which is very limited in scope and virtually unused on OpenBSD systems today).
So even though we only want to access some parameters to determine the display size and the address of the framebuffer, then copy the pixel data out, we need to write a new rasops function.
To do this we need to delve deep into the rasops code, and herein lie some interesting surprises.
Setting up the second new ioctl
Just like with the first new ioctl that performs text-based screenshots, we need to add some code to wsconsio.h:
struct wsdisplay_screendump_pixels {
	int width;
	int height;
	int stride;
	int depth;
	unsigned char * data;
};
#define WSDISPLAYIO_SCREENDUMP_PIXELS _IOWR('W', 161, struct wsdisplay_screendump_pixels)
This is all directly equivalent to the code we added before.
Instead of rows and columns, we have width and height in pixels, along with 'stride', which will be explained shortly but is basically the width in bytes rather than pixels, taking into account any padding and alignment. We also include the display bit depth so that we can eventually apply the correct post-processing to the pixel data to make it usable.
Since we're reading a single block of RGB pixel data, we only have one pointer this time rather than the two that we required before for the separate character and attribute values.
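To see why stride matters, consider how a userland program might turn such a block of data into the tightly packed red, green, blue triples that make up the body of a binary PPM file. The following is a sketch under stated assumptions rather than the final conversion code: it handles 32bpp data only, it ASSUMES that each pixel is stored as four bytes in the order blue, green, red, unused, and the function name is invented for this example.

```c
#include <stddef.h>

/* Convert a copy of a 32bpp framebuffer into packed RGB triples.
 * ASSUMPTION: each pixel is four bytes: blue, green, red, unused.
 * dst must have room for width * height * 3 bytes. */
void demo_pack_rgb(const unsigned char *src, int width, int height,
    int stride, unsigned char *dst)
{
	int x, y;

	for (y = 0; y < height; y++) {
		/* Rows start every 'stride' bytes, not every width * 4. */
		const unsigned char *row = src + (size_t)y * (size_t)stride;

		for (x = 0; x < width; x++) {
			*dst++ = row[x * 4 + 2];	/* red */
			*dst++ = row[x * 4 + 1];	/* green */
			*dst++ = row[x * 4 + 0];	/* blue */
		}
	}
}
```

If the rows were instead assumed to start every width * 4 bytes on a display where stride is larger than that, each successive row of the output would be shifted further sideways, producing a skewed image.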
Calling an existing rasops function
Before we write our own new rasops function, let's look at how to call an existing one.
To actually call the rasops routines from the wscons code, we use the WSEMULOP macro. This is defined in wsemulvar.h, and accepts the following arguments:
None of this is really documented, except for the source code itself, so expect to do a lot of reading if you want to do anything new with the rasops code.
A simple but practical example might make things clearer:
case WSDISPLAYIO_SCREENDUMP_PIXELS:
{
	int result;
	struct wsemul_abortstate abortstate;

	abortstate.skip=0;
	WSEMULOP(result, scr->scr_dconf, &abortstate, putchar, (scr->scr_dconf->emulcookie, 10, 10, '!', 0x01020000));
	return (0);
}
All this does is call the existing rasops putchar routine with some hard-coded arguments: row 10, column 10, character '!', and an attribute value that specifies red text on a green background.
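The attribute value 0x01020000 follows the same layout that the HTML conversion program earlier in this article assumes for the rasops code, with the foreground color index in the top byte and the background color index in the byte below it. A small sketch of packing and unpacking such values is shown here. The function names are invented for this example, and treating the low 16 bits as flags is an assumption made for illustration.

```c
/* Pack color indexes into an attribute value using the layout assumed
 * in this article for the rasops code: foreground in bits 24-31 and
 * background in bits 16-23. The low 16 bits are ASSUMED to hold flags. */
unsigned int demo_pack_attr(unsigned int fg, unsigned int bg,
    unsigned int flags)
{
	return ((fg & 0xff) << 24) | ((bg & 0xff) << 16) | (flags & 0xffff);
}

/* Extract the foreground color index. */
unsigned int demo_attr_fg(unsigned int attr)
{
	return (attr >> 24) & 0xff;
}

/* Extract the background color index. */
unsigned int demo_attr_bg(unsigned int attr)
{
	return (attr >> 16) & 0xff;
}
```

With color index 1 being red and index 2 being green, demo_pack_attr(1, 2, 0) yields 0x01020000.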
We can invoke this ioctl with a very simple userland program:
#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/time.h>
#include <dev/wscons/wsconsio.h>
int main()
{
	int res;
	struct wsdisplay_screendump_pixels screendump;

	res=ioctl (STDIN_FILENO, WSDISPLAYIO_SCREENDUMP_PIXELS, &screendump);
	printf ("Ioctl WSDISPLAYIO_SCREENDUMP_PIXELS returned %d\n", res);
}
Running the above program should result in a single exclamation mark being plotted at row 10, column 10, along with the expected output from the printf statement.
Where is the pixel data actually stored?
The main thing that we want to know is the kernel virtual memory address corresponding to the start of the framebuffer. This, along with other useful information such as the height and width of the display, is stored in a struct rasops_info, as defined in rasops.h. Here in this definition we can see the pointer ri_bits, along with ri_width, and ri_height.
What is not necessarily so obvious is how we actually get to the relevant rasops_info from what we are supplied in the wsdisplay_internal_ioctl function where our ioctl kernel code will be inserted.
We're supplied with a pointer to a struct wsscreen, scr. This contains an element scr_dconf, which is a struct wsscreen_internal. This struct wsscreen_internal has an element emulcookie, which is a void pointer but which will point to a struct rasops_screen. Finally, this struct rasops_screen has a pointer to a rasops_info structure.
So the whole chain looks something like this:
struct wsscreen scr -> struct wsscreen_internal * scr_dconf -> void * emulcookie -> struct rasops_info * rs_ri -> u_char * ri_bits
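The same chain can be written out as compilable C using cut-down stand-ins for the real structures. Everything below is a mock-up for illustration only: the demo_ structures contain just the members that form the chain, and the real kernel definitions contain a great deal more.

```c
#include <stddef.h>

/* Heavily simplified stand-ins for the real kernel structures. */
struct demo_rasops_info {
	unsigned char *ri_bits;		/* start of the framebuffer */
	int ri_width;
	int ri_height;
};

struct demo_rasops_screen {
	struct demo_rasops_info *rs_ri;
};

struct demo_wsscreen_internal {
	void *emulcookie;		/* really a rasops_screen here */
};

struct demo_wsscreen {
	struct demo_wsscreen_internal *scr_dconf;
};

/* Follow the chain from a screen to the start of its framebuffer. */
unsigned char *demo_framebuffer(struct demo_wsscreen *scr)
{
	struct demo_rasops_screen *rs;

	/* emulcookie is a void pointer, so it needs a cast before use. */
	rs = (struct demo_rasops_screen *)scr->scr_dconf->emulcookie;
	return rs->rs_ri->ri_bits;
}
```

The cast in the middle is the weak point of the chain, as nothing in the type system confirms that emulcookie really does point to the structure that we expect.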
Code quality
At this point, I'd like to take a moment out to say that the rasops code should absolutely not be used as a guide to good C programming practices.
As previously mentioned, there is no real documentation. Whereas the manual pages for wscons and wsdisplay do actually explain some details of those subsystems, the manual page for rasops gives an incomplete and therefore inaccurate description of just one of the many structures used, a summary of ri_flag, and a brief synopsis of two functions. This is woefully insufficient documentation for anybody who is new to the rasops codebase, and so the only practical way to get an understanding of it is to study the source directly.
Whilst it's fair to say that the source code is in many ways the ultimate documentation, learning anything from it is not made any easier when it is sparsely commented, and the names of structures and variables don't clearly explain their purpose. To add insult to injury, there are also cases of confusingly similarly named variables with different but related purposes, such as can be seen with the various function pointers in rasops_init. We have a set of pointers under ri->ri_*, and a corresponding set under ri->ri_ops.*. Where is the comment or documentation explaining the reasoning behind this?
The rasops code has clearly grown in complexity over time. Some of the advanced C programming techniques, such as function pointers buried deep within several layers of structures and the use of the C pre-processor to expand function names and share common code between similar functions (*), were probably more manageable without proper documentation when the codebase was smaller. However, at this point the lack of documentation really serves to waste developer time repeatedly re-reading existing code to fully understand its interactions with the rest of the system, and ultimately slows down the development process.
(*) This can be seen, for example, in rasops1.c and rasops4.c. Both files import rasops_bitops.h, which defines various functions but names them using the NAME macro. So when it's imported from rasops1.c, we get rasops1_copycols, rasops1_erasecols, etc. When it's imported from rasops4.c, we get almost identical code as rasops4_copycols, rasops4_erasecols, and so on. This is a valid technique, and to be fair it is mentioned in a comment at the bottom of both rasops1.c and rasops4.c, but for anybody who is not familiar with the codebase a comment at the top near the function prototypes would be more easily noticed.
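For readers who haven't met this pre-processor technique before, here is a minimal single-file illustration of the general idea. It is not taken from the rasops code: the real code includes one header from several source files with a different NAME macro each time, whereas this sketch uses a single invented macro, DEMO_DEFINE_OPS, to generate two similarly named families of functions.

```c
/* Generate a small family of functions whose names share a prefix.
 * The ## operator pastes the prefix onto the rest of each name. */
#define DEMO_DEFINE_OPS(prefix, depth)					\
	int prefix##_pixels_per_byte(void)				\
	{								\
		return 8 / (depth);					\
	}								\
	int prefix##_bytes_for(int pixels)				\
	{								\
		return (pixels * (depth) + 7) / 8;			\
	}

/* Defines demo1_pixels_per_byte and demo1_bytes_for. */
DEMO_DEFINE_OPS(demo1, 1)

/* Defines demo4_pixels_per_byte and demo4_bytes_for. */
DEMO_DEFINE_OPS(demo4, 4)
```

Searching the source for the text 'demo4_bytes_for' would find no function definition at all, only the callers, which is exactly the sort of surprise that a well-placed comment can prevent.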
For the task in hand, we have little choice but to work with what we're given, (and try to improve it along the way). But take the experience of working with the rasops code as a clear example of why new projects, (and new code added to existing codebases), should be written with care and attention to good organisation and documentation practices.
* Once code starts to grow in complexity, ensure that it is either self-documenting with functions and structures that have a clearly defined and unambiguous purpose, or write and maintain documentation for it, (preferably do both).
* If adding a new feature makes the existing code harder to follow, take the time to re-structure the existing code, (the rotation code is a good case in point here).
Writing our own rasops function
We'll call our new rasops function 'screendump'. The code to call it from the ioctl handler in wsdisplay.c is very simple and almost the same as what we have in the example above:
case WSDISPLAYIO_SCREENDUMP_PIXELS:
{
	int result;
	struct wsemul_abortstate abortstate;

	abortstate.skip=0;
	WSEMULOP(result, scr->scr_dconf, &abortstate, screendump, (scr->scr_dconf->emulcookie, ((struct wsdisplay_screendump_pixels *)data)));
	return (0);
}
The only parameters we need to pass are 'emulcookie', which represents the specific display device we want to talk to, and the whole wsdisplay_screendump_pixels structure that we will eventually pass back to userland.
Important note
The cookie 'emulcookie', is not the same cookie that is eventually passed to the low-level rasops routines. This is a good example of where using similar function names and not adequately documenting complex chains of function calls can easily cause confusion for somebody trying to understand the code simply by reading the source.
Recall our trivial example above, where we called the putchar routine:
WSEMULOP(result, scr->scr_dconf, &abortstate, putchar, (scr->scr_dconf->emulcookie, 10, 10, '!', 0x01020000));
Searching the source code, it's easy to find where the low-level putchar functions are defined. In fact, there are separate functions for each bit-depth, (which should not come as a surprise), so for example the code to implement putchar on 32bpp displays is in rasops32.c.
Reading the function definition for rasops32_putchar, we can see that it accepts a cookie, a row value, a column value, a character, and an attribute. These might seem like exactly the same values that we are passing to the WSEMULOP macro, but closer inspection shows us that the low-level putchar function is expecting the cookie to be a pointer to a struct rasops_info, whereas the cookie supplied to WSEMULOP varies depending on what emulation we are using. In most cases this will be the vt100 emulation code, so the cookie will be a struct wsemul_vt100_emuldata. As long as it has a struct wsdisplay_emulops * emulops, WSEMULOP doesn't care, but it's certainly never a struct rasops_info.
Since when we call from within the ioctl handler we don't have an emulation-specific structure to work with, we pass the struct wsscreen_internal that we have as scr_dconf.
Fun fact
The variable 'edp' is frequently used throughout the wscons code as a pointer to a structure that is specific to the particular emulation in use.
Reading the source code doesn't tell us what 'edp' actually stands for, since it's not documented anywhere.
We can only guess at its true meaning, but it seems reasonable to assume that it actually stands for Emulation Data Pointer.
Next, we need to add a function pointer for our new function to struct wsdisplay_emulops, defined in wsdisplayvar.h:
int (*screendump)(void *c, void *data);
Note: Although we could add this extra function pointer anywhere within the definition of struct wsdisplay_emulops, some of the device drivers, (such as the vga driver), have their wsdisplay_emulops structure hard-coded as a constant. As a result, any re-ordering of the existing entries would break those drivers. To avoid this, add the new entry to the end of the current list, after unpack_attr.
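To see why appending is safe, consider the following cut-down model. It is a sketch under stated assumptions rather than the real structure: struct display_ops, its three members, and the demo functions are all hypothetical. The point is that a table initialized positionally, and written before the new member existed, still lines up with the existing members and leaves the trailing one as a null pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down operations table. */
struct display_ops {
	int (*put_ch)(void *, int);
	int (*erase)(void *, int);
	int (*screendump)(void *, void *);	/* newly appended entry */
};

static int demo_put_ch(void *c, int ch) { (void)c; return (ch); }
static int demo_erase(void *c, int n) { (void)c; return (-n); }

/*
 * A driver table written before 'screendump' existed, using positional
 * initializers. The two existing entries still match their members, and
 * the member without an initializer is set to a null pointer.
 */
static const struct display_ops old_driver_ops = {
	demo_put_ch,
	demo_erase
};

/* Returns 1 if the table provides a screendump routine, otherwise 0. */
static int
ops_has_screendump(const struct display_ops *ops)
{
	assert(ops != NULL);
	return (ops->screendump != NULL);
}
```

Had the new member been inserted between put_ch and erase instead, demo_erase would have been assigned to the wrong slot, which the compiler would reject or warn about only because the signatures happen to differ.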
At this point we can finally move to the rasops code itself.
The generic rasops functions, in other words those which are not specific to a particular bit-depth, are in rasops.c. This is where we will put our screendump function, as although the format of the data will obviously be different depending on the bit-depth, actually reading out the entire content of the framebuffer can be done in exactly the same way regardless.
If we were going to do any post-processing of the pixel data within the new rasops screendump function itself, we might want to write separate functions for each bit-depth. However, this functionality is much better done in userspace - we want to do as little as possible within the kernel.
The following code is what we are going to add to rasops.c:
/*
* Screendump
*/
int
rasops_screendump(void * cookie, void * data)
{
#define screendump ((struct wsdisplay_screendump_pixels *)data)
#define FB_SIZE (ri->ri_stride * ri->ri_height)
struct rasops_info * ri;
unsigned char * rp;
ri=(struct rasops_info *) cookie;
rp=(unsigned char *)(ri->ri_bits);
/* Fill in various display parameters to pass back to wsdisplay, and eventually to userland. */
screendump->width=ri->ri_width;
screendump->height=ri->ri_height;
screendump->stride=ri->ri_stride;
screendump->depth=ri->ri_depth;
/* Copy the raw framebuffer data to userland, passing back any error from copyout. */
return (copyout (rp, screendump->data, FB_SIZE));
}
Obviously we should add a function prototype near the top of the file along with the existing ones, underneath the 'generic functions' comment:
int rasops_screendump(void *, void *);
The new function should be fairly self-explanatory. We cast the supplied cookie to a struct rasops_info, from which we can obtain the width, height, stride, and depth of the display. These values are assigned to the corresponding members of the struct wsdisplay_screendump_pixels that will soon be passed back to the calling program.
Finally, we use the kernel copyout function to copy the pixel data directly from the framebuffer to the userland memory address supplied in the screendump->data pointer.
Fun fact
Calculating the screen size
You might have expected that we'd calculate the size of the display in bytes by using width * height * depth, (obviously with the necessary conversion to ensure that depth is expressed in bytes rather than bits). Instead we're using this 'stride' parameter, so what exactly is 'stride', and how does it differ from the width?
For some display sizes, the stride will indeed be the same as width * depth. This is typically when the width in pixels is a whole multiple of 32 or some other value required for memory alignment. So a 1920 × 1080 display at 32bpp will usually have a stride value of 1920 × 4 = 7680, and each line of pixel data will immediately follow the previous one.
Not so for resolutions such as 1366 × 768. In this case, at 32bpp, a single line of pixels would be 5464 bytes. This doesn't align to a 128-byte boundary, (32 pixels × 4 bytes per pixel = 128 bytes), so we leave some unused bytes at the end of the line and start the next line of pixel data after, say, 5504 bytes. In effect, the screen has a width of 1376 pixels, (5504 bytes / 4 bytes per pixel = 1376 pixels), but only the first 1366 are displayed. Visually, if we create a screendump from the full dataset, we will see an unused band of 10 pixels down the right hand side. This might be filled with 0x00 bytes, or might contain random data depending on the behaviour of the underlying graphics hardware.
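The arithmetic can be checked with a few lines of C. This is just an illustration: aligned_stride is a hypothetical helper, and the 128-byte alignment is the example value used above rather than something the rasops code computes. In practice the kernel simply reads the stride that the graphics driver has already placed in ri_stride.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Round the length of one line of pixels, in bytes, up to the next
 * multiple of 'align'. 'align' must be non-zero.
 */
static size_t
aligned_stride(size_t width_pixels, size_t bytes_per_pixel, size_t align)
{
	size_t line_bytes = width_pixels * bytes_per_pixel;

	assert(align != 0);
	return ((line_bytes + align - 1) / align * align);
}
```

For example, aligned_stride(1366, 4, 128) gives 5504, whereas aligned_stride(1920, 4, 128) gives 7680, which is exactly 1920 × 4 with no padding.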
Our function dumps the whole contents of the framebuffer memory, and passes both the stride, (actual width), and width, (intended visual width), back to userland. The decision as to whether to crop any extraneous pixels from the final output can then be made from the calling program.
On a system where the graphics hardware operates in different modes, each with a different memory layout for the framebuffer, it's plausible that these extraneous pixels might contain old data preserved from the contents of the display when it was in a different mode in which those locations were actually active pixels. Potentially, on a multi-user system this could be considered a security issue since it could in theory leak private data.
If we wanted to avoid this, we could modify the new kernel code to first copy the framebuffer contents to a kernel-allocated memory buffer, then overwrite the non-active pixels with 0x00 bytes before copying that buffer back to userspace.
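The zeroing step itself is simple. The following is a userland sketch of the idea only: scrub_padding is a hypothetical helper, and the kernel version would additionally need to allocate and free the intermediate buffer with the kernel's own allocator before calling copyout, none of which is shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Overwrite the unused bytes at the end of every line with zeros.
 * 'buf' holds 'height' lines of 'stride' bytes each, of which only the
 * first 'visible' bytes of each line are active pixels.
 */
static void
scrub_padding(unsigned char *buf, size_t height, size_t stride,
    size_t visible)
{
	size_t y;

	assert(buf != NULL && visible <= stride);
	for (y = 0; y < height; y++)
		memset(buf + y * stride + visible, 0, stride - visible);
}
```

For a 32bpp display, 'visible' would be the width in pixels multiplied by four. The cost of this approach is a second copy of the whole framebuffer in kernel memory for the duration of the call.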
Setting the function pointers and additional vcons considerations
Now that we have our new function in place, we're almost ready to go.
The initial assignment of the function pointers is performed in rasops_reconfig, which is called from rasops_init.
It might seem that we just need to set the new function pointer to point to rasops_screendump, by placing this assignment underneath the 'Fill in defaults' comment:
ri->ri_ops.screendump = rasops_screendump;
But this is not enough. Now we have to deal with a strange complexity of the rasops code, which can be seen if we continue reading the source for rasops_init past its call to rasops_reconfig.
The pointers in ri->ri_ops are copied elsewhere, and then if the RI_VCONS flag is set then the original values are replaced with calls to similarly named functions, which differ by having 'vcons' in their names. So rasops_copycols becomes rasops_vcons_copycols, and so on. There is also a further check for the RI_WRONLY flag, so that further replacement routines can be substituted that avoid writing to the framebuffer.
Our screendump routine doesn't really need any additional code to be vcons-aware, and it obviously wouldn't make much sense to try to create a version that doesn't read from the framebuffer, so we don't need any of this added complexity for our function to work. However the easiest way to ensure that our new code works as expected is to provide a rasops_vcons_screendump wrapper function that just calls rasops_screendump, and fill in the extra function pointers as required.
The rasops_vcons_screendump function does nothing other than to call rasops_screendump via a new rs_ri->ri_screendump function pointer:
int
rasops_vcons_screendump(void * cookie, void * data)
{
struct rasops_screen *scr = cookie;
return scr->rs_ri->ri_screendump(scr->rs_ri, data);
}
Of course, we should add the function prototype as well, along with the existing ones for the other vcons functions:
int rasops_vcons_screendump(void *, void *);
We also need to add that new function pointer to the definition of struct rasops_info in rasops.h:
int (*ri_screendump)(void *, void *);
Next we modify rasops_init to copy the original pointer:
ri->ri_screendump = ri->ri_ops.screendump;
Finally, we also modify rasops_init to re-assign ri->ri_ops.screendump to the vcons version:
ri->ri_ops.screendump = rasops_vcons_screendump;
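The save-then-substitute pattern used in these last few steps can be modelled in isolation. The sketch below is a simplified userland imitation, not the rasops code: struct info, struct screen, init_info, and the use_vcons argument are hypothetical stand-ins for struct rasops_info, struct rasops_screen, rasops_init, and the RI_VCONS flag.

```c
#include <assert.h>
#include <stddef.h>

struct info;

/* Hypothetical stand-in for struct rasops_screen. */
struct screen {
	struct info *rs_ri;
};

/* Hypothetical stand-in for struct rasops_info. */
struct info {
	struct {
		int (*screendump)(void *, void *);
	} ops;					/* table seen by callers */
	int (*saved_screendump)(void *, void *);	/* original routine */
	int calls;
};

/* The real worker: expects a struct info as its cookie. */
static int
base_screendump(void *cookie, void *data)
{
	struct info *ri = cookie;

	ri->calls++;
	*(int *)data = 42;
	return (0);
}

/* The wrapper: expects a struct screen and forwards to the saved routine. */
static int
vcons_screendump(void *cookie, void *data)
{
	struct screen *scr = cookie;

	return (scr->rs_ri->saved_screendump(scr->rs_ri, data));
}

static void
init_info(struct info *ri, int use_vcons)
{
	ri->calls = 0;
	ri->ops.screendump = base_screendump;		/* fill in default */
	ri->saved_screendump = ri->ops.screendump;	/* copy the original */
	if (use_vcons)
		ri->ops.screendump = vcons_screendump;	/* substitute wrapper */
}
```

Callers always go through ops.screendump. When the wrapper is installed they must supply a struct screen as the cookie, and the wrapper swaps it for the struct info that the real routine expects.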
At this point, we've finished the kernel code. After re-compiling and re-booting into the kernel we can continue on to the userspace program that calls it.
Userland code
Here is a sample program that calls WSDISPLAYIO_SCREENDUMP_PIXELS, and converts the data it receives from the common 32bpp BGR0 format to a 24-bit RGB ppm file:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <sys/time.h>
#include <dev/wscons/wsconsio.h>
#include <fcntl.h>
int main()
{
int fd_out;
int header_len;
int pos_in;
int pos_out;
int res;
int x;
int y;
struct wsdisplay_screendump_pixels screendump;
struct wsdisplay_fbinfo dispinfo;
unsigned char * header;
unsigned char * file_out;
unsigned char * processed_pixel_data;
/* Call WSDISPLAYIO_GINFO to get the size of the framebuffer. */
res=ioctl (STDIN_FILENO, WSDISPLAYIO_GINFO, &dispinfo);
if (res == -1) {
printf ("Ioctl WSDISPLAYIO_GINFO not supported for this device\n");
return (1);
}
/* Allocate the correct amount of memory for the screendump. */
screendump.data=malloc(dispinfo.stride * dispinfo.height);
res=ioctl (STDIN_FILENO, WSDISPLAYIO_SCREENDUMP_PIXELS, &screendump);
if (res == -1) {
printf ("Ioctl WSDISPLAYIO_SCREENDUMP_PIXELS not supported for this device\n");
return (1);
}
printf ("Ioctl WSDISPLAYIO_GINFO returned %d\nwidth %d, height %d, stride %d, depth %d\n", res, dispinfo.width, dispinfo.height, dispinfo.stride, dispinfo.depth);
printf ("Ioctl WSDISPLAYIO_SCREENDUMP_PIXELS returned %d\nwidth %d, height %d, stride %d, depth %d\n", res, screendump.width, screendump.height, screendump.stride, screendump.depth);
if (screendump.depth != 32) {
printf ("Only 32bpp input data is supported.\n");
return (1);
}
/*
* Simple conversion to 24-bit RGB
*
* This assumes that the raw screendump data is in BGR0 format.
*/
processed_pixel_data=malloc(screendump.width * screendump.height * 4);
pos_out=0;
pos_in=0;
for (y=0; y<screendump.height; y++) {
for (x=0; x<screendump.stride/4; x++) {
if (x<(screendump.width)) {
*(processed_pixel_data+pos_out++)=*(screendump.data+pos_in+2);
*(processed_pixel_data+pos_out++)=*(screendump.data+pos_in+1);
*(processed_pixel_data+pos_out++)=*(screendump.data+pos_in);
}
pos_in+=4;
}
}
header=malloc(1024);
header_len=sprintf (header, "P6\n%d %d\n255\n", screendump.width, screendump.height);
/* O_TRUNC ensures that no stale data is left behind if a larger file already exists. */
fd_out=open ("rgb_screendump.ppm", O_CREAT | O_WRONLY | O_TRUNC, 0644);
write (fd_out, header, header_len);
write (fd_out, processed_pixel_data, pos_out);
close (fd_out);
return (0);
}
All we do here is to cycle over the raw framebuffer data one pixel at a time, and re-arrange each set of four bytes into a three byte RGB value.
For each line, once we reach the width as specified in screendump.width, further data is ignored until the end of the line as defined by the value of screendump.stride.
The resulting ppm file is written to rgb_screendump.ppm in the current directory, and can be viewed with any standard image viewer that supports the ppm format.
If the colors are inverted, or one of the red, green, or blue channels is missing altogether, then the input is not in BGR0 format. In this case, the code that re-arranges the bytes will need to be modified.
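One way to make that modification easier is to pull the byte shuffling out into a function that takes the channel positions as parameters. This is a possible refactoring rather than part of the program above: convert_line and its r_off, g_off, and b_off parameters are hypothetical names, and it still assumes four bytes per input pixel.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Convert one line of 4-byte pixels to packed 3-byte RGB.
 * r_off, g_off and b_off give the position of each channel within the
 * 4-byte input pixel. For BGR0 these are 2, 1 and 0 respectively.
 * Returns the number of bytes written to 'out'.
 */
static size_t
convert_line(const unsigned char *in, unsigned char *out, size_t width,
    int r_off, int g_off, int b_off)
{
	size_t x;
	size_t n = 0;

	assert(in != NULL && out != NULL);
	for (x = 0; x < width; x++) {
		out[n++] = in[x * 4 + r_off];
		out[n++] = in[x * 4 + g_off];
		out[n++] = in[x * 4 + b_off];
	}
	return (n);
}
```

Calling it once per line with in set to screendump.data + y * screendump.stride also takes care of skipping the padding bytes, since only the first 'width' pixels of each line are ever read. For input in RGB0 order the offsets would become 0, 1, 2.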
Compatibility with other graphics drivers
The new ioctl that we've added to wsdisplay.c can potentially be called even when we are not using graphics hardware that uses the rasops routines. A few different graphics drivers exist where this is the case, most of them for uncommon display devices, but one exception that we can easily use for testing purposes is the plain vga text mode driver.
With our code in its current form, if we try to access the WSDISPLAYIO_SCREENDUMP_PIXELS ioctl whilst running with the vga driver, we'll cause a kernel panic:
attempt to execute user address 0x0 in supervisor mode
kernel: page fault trap, code=0
This shows that we've tried to execute code at address 0x0. This shouldn't come as much of a surprise, since we added a new function pointer to struct wsdisplay_emulops and this hasn't been initialized by the code in vga.c, (which initializes vga_emulops as a constant with the function pointers hard-coded).
Since the vga driver declares this struct wsdisplay_emulops in global scope, (in other words, outside of any function), the C standard guarantees that any of its elements without an explicit initializer will be initialized to NULL.
However other device drivers don't necessarily do this, and instead declare their struct wsdisplay_emulops in a similar way to the code in rasops.c.
The upshot of this is that we can't simply change our ioctl handling code to check for a NULL function pointer:
case WSDISPLAYIO_SCREENDUMP_PIXELS:
{
int result;
struct wsemul_abortstate abortstate;
if (scr->scr_dconf->emulops->screendump == NULL) {
return (EINVAL);
}
abortstate.skip=0;
WSEMULOP(result, scr->scr_dconf, &abortstate, screendump, (scr->scr_dconf->emulcookie, ((struct wsdisplay_screendump_pixels *)data)));
return (0);
}
This would certainly work to catch the case of accessing the function whilst using the vga driver, but it's not a general solution.
The definitive and most compatible way to fix this, of course, is to add a screendump function of some sort to each of the display drivers, even if it just returns EINVAL without providing any framebuffer data.
We can easily do this for the vga driver:
int vga_screendump(void * cookie, void * unused)
{
return (EINVAL);
}
Adding this function, along with its function prototype and an entry for it in the declaration for vga_emulops, will cause our call to WSEMULOP in the ioctl handler to return EINVAL in the result variable.
Now we just need to change our code to return this value instead of always returning zero:
case WSDISPLAYIO_SCREENDUMP_PIXELS:
{
int result;
struct wsemul_abortstate abortstate;
abortstate.skip=0;
WSEMULOP(result, scr->scr_dconf, &abortstate, screendump, (scr->scr_dconf->emulcookie, ((struct wsdisplay_screendump_pixels *)data)));
return (result);
}
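The combined effect of the stub and the changed return value can be tried out in a small userland model. Everything here is a hypothetical stand-in: struct dump_ops, the two driver functions, and handle_screendump only imitate the shape of the real code, and the WSEMULOP macro is replaced by a direct call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical one-entry operations table. */
struct dump_ops {
	int (*screendump)(void *, void *);
};

/* Stub for a driver with no framebuffer to dump: always fails. */
static int
text_screendump(void *cookie, void *data)
{
	(void)cookie;
	(void)data;
	return (EINVAL);
}

/* Driver that can produce a dump: records that it filled in 'data'. */
static int
fb_screendump(void *cookie, void *data)
{
	(void)cookie;
	*(int *)data = 1;
	return (0);
}

/* Model of the ioctl handler: pass the driver's result straight back. */
static int
handle_screendump(const struct dump_ops *ops, void *cookie, void *data)
{
	assert(ops != NULL && ops->screendump != NULL);
	return (ops->screendump(cookie, data));
}
```

With every driver supplying some routine, the handler needs no special cases. A driver that cannot produce a dump reports EINVAL and the caller's buffer is left untouched.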
Of course, since our userland screen dumping program always checks for a successful call to WSDISPLAYIO_GINFO before calling WSDISPLAYIO_SCREENDUMP_PIXELS, running it unmodified even with the vga driver in use will not trigger a kernel panic anyway.
We could even write a routine to produce graphical output from the contents of the vga text display by rendering each character manually ourselves, even though the underlying hardware was not operating in a graphical mode.
Downloads
A tar archive containing the kernel patches and userland utilities can be downloaded here.
All of the files within the tar archives are signed using our signify key.
Assuming that you have the tar archive and our public key in /root, you can apply the kernel patch as follows:
# cd /root
# tar -xf console_screendumps.tar
# cd /usr/src/
# signify -Vep /root/exoticsilicon.pub -x /root/screendump_patches/screendump_kernel_patches.sig -m - | patch
At this point you can re-compile and re-install the kernel in the normal way.
You should also copy the modified wsconsio.h header file to the correct place in the /usr/include/ hierarchy:
# cp -p /usr/src/sys/dev/wscons/wsconsio.h /usr/include/dev/wscons/
If you don't have our signify key, (why not?), you can also apply the patch directly with the patch utility:
# patch < /root/screendump_patches/screendump_kernel_patches.sig
The userland utilities can be compiled with no special options, after verifying their checksums:
# cd /root/screendump_utilities/
# signify -Cp /root/exoticsilicon.pub -x checksums.sig
# cc -o screendump_graphical screendump_graphical.c
# cc -o screendump_text screendump_text.c
Note that the screendump utilities can be run as a regular user, and do not require root permissions to perform screendumps of the display associated with standard input. Performing screendumps against other devices may require additional access permissions.
Summary
Today we've seen how to implement two different types of screendumping facility in OpenBSD by adding new ioctls to the wscons subsystem along with supporting code in the kernel and userland programs.