ExoticSilicon.com - adding read-only volume support to softraid

Introduction

Removable storage with hardware write protection is a pretty useful safeguard against accidental overwrites and has saved me on several occasions.

At the same time, it's also fairly useful and indeed good industry practice to encrypt data that's being carried around on small devices that are easily lost.

Unfortunately, as of OpenBSD 7.7-release, trying to attach a softraid crypto volume that resides on a read-only device fails:

# bioctl -c C -l sd5d softraid0

softraid0: invalid metadata format

Unexpected and somewhat unhelpful error message

Being able to attach such read-only volumes would clearly be desirable. The error message is also unfortunate, as it implies to the user that the metadata on disk is corrupted when it actually isn't.

Locating the problem

To diagnose this issue, we need to look at sr_meta_probe(), which can be found in sys/dev/softraid.c.

Here we can see that the call to VOP_OPEN() always tries to open the vnode read/write.

On a physical device which reports itself as read-only, this call to VOP_OPEN() will return EACCES. However it's also possible to see EROFS if we try to access, for example, a vnd device which is backed by a file on a read-only filesystem.

The first step towards making softraid volumes work read-only is to patch sr_meta_probe() to catch these situations and re-try calling VOP_OPEN() with just FREAD permissions:

if (error == EACCES || error == EROFS) {
	printf ("retrying read-only\n");
	error = VOP_OPEN(vn, FREAD, NOCRED, curproc);
	printf ("got %d\n", error);
}

Re-trying the call to VOP_OPEN() with just FREAD, if the existing call fails.

Success!

This allows the volume to attach:

# bioctl -c C -l sd5d softraid0

softraid0: CRYPTO volume attached as sd6

Promising results, an initial success!

Except...

As might be expected, we see errors from the kernel when the softraid subsystem tries to update the metadata on the device:

sd5(umass0:1:0): Check Condition (error 0x70) on opcode 0x2a

SENSE KEY: Write Protected

ASC/ASCQ: Write Protected

softraid0: I/O error 30 on dev 0xx453 at block 16

softraid0: could not write metadata to sd5d

Metadata on the underlying device can't be updated, since it's read-only.

Nevertheless...

However, the inability to update the metadata doesn't stop us from making use of the softraid volume for read-only access.

# mount -oro /dev/sd6d /mnt # Succeeds

# umount /mnt # Succeeds

At least the essential functionality is already working.

Shortcomings and missing functionality

Unfortunately, since the attached softraid volume doesn't report its read-only status, the system allows attempts to mount it read-write.

This initially succeeds, but immediately causes a flood of I/O errors.

To make things worse, on the test machine the volume could not be unmounted even when using the -f argument to umount.

SCSI device read-only flag reporting mechanism

Devices attaching as sd devices can report themselves as read-only. The sd driver discovers this by reading one page of the device's mode page data, (which for those readers unfamiliar with the SCSI protocols, is basically a table of drive capabilities, some of which can be changed).

Interestingly, the read only flag isn't actually part of the mode page data itself, but is instead returned as a bit in the device specific byte of the header for the mode page data.

Fun fact!

Visualising the flag from userland

We can see this from userland with a slightly modified version of the /sbin/scsi utility.

The existing code doesn't display the mode page header at all, and in fact won't even open a read-only device that's specified with the -f argument.

To do this we change the call to scsi_open() to use O_RDONLY if O_RDWR fails, and add the following printf() call to mode_edit():

printf ("*** %02x %02x %02x(r/o) %02x\n", mh->mdl, mh->medium_type, mh->dev_spec_par, mh->bdl);

Ready-to-apply patches are linked at the end of this article for readers who don't want to write their own.

Now, querying the physical hardware device, we can see the top bit of the device specific header byte reflecting its read-only status:

With the device write enabled

# /usr/src/sbin/scsi/scsi -f /dev/rsd5c -m 0

*** 0b 00 00(r/o) 08

00 00 00 00 00 00 00 00

00 00 00

With the device write protected

# /usr/src/sbin/scsi/scsi -f /dev/rsd5c -m 0

*** 0b 00 80(r/o) 08

00 00 00 00 00 00 00 00

00 00 00

When the device is configured as read-only, bit 7 of the third header byte is set.
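The header bytes shown above can be picked apart with a few lines of C. This is only a minimal userland sketch, assuming the four-byte header layout of the 6-byte MODE SENSE reply; the names mode_hdr6, parse_mode_hdr6(), is_write_protected() and DEV_SPEC_WP are made up for illustration and do not come from the OpenBSD source:

```c
#include <stdint.h>

/*
 * Hypothetical userland mirror of the 4-byte header that precedes the
 * mode page data in a reply to the 6-byte MODE SENSE command.
 */
struct mode_hdr6 {
	uint8_t data_len;	/* number of bytes following this byte */
	uint8_t medium_type;
	uint8_t dev_spec;	/* device specific parameter */
	uint8_t blk_desc_len;	/* total length of the block descriptors */
};

/* Bit 7 of the device specific byte: write protected (assumed name). */
#define DEV_SPEC_WP	0x80

static struct mode_hdr6
parse_mode_hdr6(const uint8_t raw[4])
{
	struct mode_hdr6 h = { raw[0], raw[1], raw[2], raw[3] };

	return h;
}

static int
is_write_protected(const struct mode_hdr6 *h)
{
	return (h->dev_spec & DEV_SPEC_WP) != 0;
}
```

Feeding it the two headers from the output above, 0b 00 00 08 and 0b 00 80 08, only the second one is reported as write protected.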

Note: this simple demonstration with /sbin/scsi assumes that the device actually supports reading the mode page data in the first place, otherwise /sbin/scsi will just error out with a message similar to this:

SCIOCCOMMAND ioctl: Command accepted.

return status 4 (Unknown return status)Command out (6 of 6):

1a 00 00 00 ff 00

Data in (0 of 255):

No sense sent.

Querying a device that doesn't support returning mode page data

Fun fact!

Long unfixed, (presumably undiscovered), bugs

The /sbin/scsi utility has various bugs in its code to display the unformatted hex data.

In the examples above it displayed eleven bytes of data after the header, whereas the first header byte 0x0b actually indicates the amount of data to follow including the remaining three header bytes. Therefore, only eight bytes of the following data are relevant.

Additionally, although the code skips printing the block descriptors, (which, if present, immediately follow the header), the total count of bytes printed is not reduced by the number skipped.
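The length arithmetic behind these two observations can be sketched as follows. This is not the code from /sbin/scsi or from the patchset, just an illustration that assumes the 6-byte MODE SENSE header layout, (four header bytes, with the first byte counting everything that follows it). The names mode_page_bytes() and MODE_HDR6_LEN are hypothetical:

```c
#include <stdint.h>

#define MODE_HDR6_LEN	4	/* header size for 6-byte MODE SENSE */

/*
 * Return how many bytes of actual mode page data follow the header
 * and any block descriptors, given the first and fourth header bytes.
 */
static unsigned int
mode_page_bytes(uint8_t data_len, uint8_t blk_desc_len)
{
	/* data_len does not count the length byte itself. */
	unsigned int total = (unsigned int)data_len + 1;
	unsigned int overhead = MODE_HDR6_LEN + (unsigned int)blk_desc_len;

	/* Guard against inconsistent headers rather than underflowing. */
	return total > overhead ? total - overhead : 0;
}
```

Under this reading, a header of 0b 00 00 08 leaves eight bytes after the header, all of which are accounted for by the block descriptors, so the function returns zero.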

Interestingly, these bugs have been present in the code since the initial revision imported to the OpenBSD CVS tree in 1996.

Earlier versions can be found in the FreeBSD project's historical codebase, (the /sbin/scsi utility was removed from FreeBSD some time ago). The earliest version of the /sbin/scsi code that I was able to find which includes code to parse the -m option is revision 1.3 from 1995-04-17. This version correctly skips the block descriptors, and stores the correct length value read from the second byte of the mode page data itself in variable page_length.

Unfortunately, the value of this variable is never used, and instead the hex printing loop uses the value of mode_data_length as its loop test condition, so this version of the code already has the bugs described above.

Revision 1.6 from 1995-05-01 introduced the code as it is now, with the assignment to page_length removed, but with both the bugs described above still present.

Patches to address these issues are included in the patchset at the bottom of this article.

Understanding the sd driver

Knowing how the read-only status of a device is reported helps to demystify the relevant part of the source code for the sd driver, (and in fact the st driver which does almost the same thing). The code queries mode page zero, which is for non-standardised vendor specific use anyway, and promptly ignores the data returned if indeed the opcode even succeeds. Nevertheless, this process is enough to get the required header bytes.

Within the sd driver code, sdattach() calls sd_get_parms(), which calls through to scsi_do_mode_sense() in scsi_base.c, which in turn calls scsi_mode_sense() where ultimately the mode data is obtained by passing the opcode MODE_SENSE or MODE_SENSE_BIG through scsi_xs_sync() and scsi_xs_put().

Returning to sd_get_parms(), if the relevant bit is set then the SDEV_READONLY flag is also set in the corresponding link->flags.

Looking back at sdattach(), (also in sys/scsi/sd.c), we can see that readonly is then printed in the device attach console output, if link->flags has SDEV_READONLY set.

Implementation note:

For those readers who are familiar with SCSI opcodes, as might be expected, MODE_SENSE and MODE_SENSE_BIG translate to the 6-byte and 10-byte versions of the MODE SENSE opcode respectively.
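For readers who are less familiar with them, the 6-byte form of the command is small enough to build by hand. The following is a rough userland sketch rather than the kernel's own code, and the names build_mode_sense6() and OP_MODE_SENSE_6 are invented for this example:

```c
#include <stdint.h>
#include <string.h>

#define OP_MODE_SENSE_6	0x1a	/* 6-byte MODE SENSE opcode */

/*
 * Fill in a 6-byte MODE SENSE command descriptor block requesting
 * the given mode page, with room for alloc_len bytes of reply.
 */
static void
build_mode_sense6(uint8_t cdb[6], uint8_t page, uint8_t alloc_len)
{
	memset(cdb, 0, 6);
	cdb[0] = OP_MODE_SENSE_6;
	cdb[2] = page & 0x3f;	/* page code, page control bits left at 0 */
	cdb[4] = alloc_len;	/* allocation length */
}
```

Requesting page zero with an allocation length of 0xff produces the bytes 1a 00 00 00 ff 00, which matches the command dump printed by /sbin/scsi earlier in this article.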

Implementing the same functionality on the virtual device

Obviously the write-protected hardware device on which our softraid chunk resides already does this reporting. What we want to do now is arrange for our newly created virtual sd device to do the same thing.

To signal the read-only status of the virtual device, we need to make it supply an appropriate response to the mode sense command sent by the sd driver when it queries mode page zero. We don't actually need to return any values, but we do need to send back the header, (which will also indicate that there are zero data values in the reply).

SCSI opcodes sent to softraid devices are handled by sr_scsi_cmd(). To avoid confusion here, note that we are now looking at code which implements the SCSI target side of the communication, rather than the initiator part of the exchange, (in other words, this is the function that implements the part of the SCSI protocol that would traditionally be done by the firmware of a physical disk drive, providing the response data back to the host adaptor).

We can add stubs to the switch statement in sr_scsi_cmd() to visualise these opcodes being sent by sd_get_parms().

Note that scsi_do_mode_sense() always tries MODE_SENSE before trying MODE_SENSE_BIG, so as long as we return a valid response, (header), for MODE_SENSE, we can basically ignore MODE_SENSE_BIG.

case MODE_SENSE:
	printf ("Got MODE_SENSE opcode for page %d\n", (xs->cmd.bytes[1] & 0x3f));
	goto stuffup;
case MODE_SENSE_BIG:
	printf ("Got MODE_SENSE_BIG opcode for page %d\n", (xs->cmd.bytes[1] & 0x3f));
	goto stuffup;

Stub functions allow us to visualise the use of these SCSI opcodes.

Note that the mode page number being requested is in the lower six bits.
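To spell out the masking used in the stubs above, here is a small standalone sketch. It assumes that the opcode is held separately from the bytes array, so that bytes[1] corresponds to byte 2 of the command, and that the top two bits of that byte carry the page control field. The helper names page_code() and page_control() are hypothetical:

```c
#include <stdint.h>

/* Lower six bits of command byte 2: the mode page being requested. */
static uint8_t
page_code(uint8_t cdb_byte2)
{
	return cdb_byte2 & 0x3f;
}

/* Upper two bits of command byte 2: the page control field. */
static uint8_t
page_control(uint8_t cdb_byte2)
{
	return (uint8_t)(cdb_byte2 >> 6);
}
```

For example, a byte value of 0x48 decodes to page 0x08 with a page control value of 1, whilst 0x3f is the request for all pages.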

Before we add the code to the ioctl handler, though, we need to define a new capabilities flag in softraidvar.h:

#define SR_CAP_READONLY 0x00000020 /* Read-only flag */

The new flag defined above will be set in the following code if we open any of the underlying devices for the softraid volume read-only.

Of course, the choice of 0x00000020 as the bit value is arbitrary.

And we also need to add a line to our new code in sr_meta_probe() to set this flag, if we do indeed open the device as read-only:

if (error == EACCES || error == EROFS) {
	sd->sd_capabilities |= SR_CAP_READONLY;
	error = VOP_OPEN(vn, FREAD, NOCRED, curproc);
}

The diagnostic printf() calls added previously can also be removed at this point.

Now that we have SR_CAP_READONLY set appropriately, our ioctl handler for MODE_SENSE becomes fairly simple:

case MODE_SENSE:
	/*
	 * Ignore requests for mode pages other than page zero or 0x3f,
	 * (all pages).
	 */
	if ((xs->cmd.bytes[1] & 0x3f) != 0 &&
	    (xs->cmd.bytes[1] & 0x3f) != 0x3f) {
		goto stuffup;
	}
	bzero (&mode_sense_header, 4);
	/*
	 * Set the data length header field, which excludes the byte
	 * already sent.
	 */
	mode_sense_header[0] = 0x03;
	/*
	 * If the SR_CAP_READONLY flag was set then set bit 7 of the third
	 * header byte.
	 */
	if (sd->sd_capabilities & SR_CAP_READONLY) {
		mode_sense_header[2] = 0x80;
	}
	scsi_copy_internal_data(xs, &mode_sense_header, 4);
	/*
	 * Set the full overall header length, which is four bytes and not
	 * just three as set above.
	 */
	xs->datalen = 4;
	goto complete;

A new ioctl handler!

With mode_sense_header[] also defined at the start of sr_scsi_cmd() as a local array:

unsigned char mode_sense_header[4];

Function local array definition.

With these kernel modifications in place, our softraid sd device correctly reports itself as read-only and is usable in read-only mode.

Attempts to mount it read-write or perform any write operation directly to the device will be denied by the kernel.
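The header that the virtual device sends back can also be tried out in isolation from userland. This is a minimal sketch of the same idea and not the kernel code itself; build_mode_hdr6(), MODE_HDR6_LEN and DEV_SPEC_WP are names invented for the example:

```c
#include <stdint.h>
#include <string.h>

#define MODE_HDR6_LEN	4	/* header size for 6-byte MODE SENSE */
#define DEV_SPEC_WP	0x80	/* bit 7 of the device specific byte */

/*
 * Build a header-only MODE SENSE reply with no block descriptors and
 * no mode page data, optionally flagged as write protected.
 */
static void
build_mode_hdr6(uint8_t hdr[MODE_HDR6_LEN], int readonly)
{
	memset(hdr, 0, MODE_HDR6_LEN);
	/* The length field does not count the length byte itself. */
	hdr[0] = MODE_HDR6_LEN - 1;
	if (readonly)
		hdr[2] |= DEV_SPEC_WP;
}
```

A read-only volume would therefore answer with the bytes 03 00 80 00, and a writable one with 03 00 00 00.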

Top tip!

Read-only volumes in /etc/fstab

If a read-only volume, (any read-only volume, not just a softraid one), is mounted at boot time, then it needs to have the fs_passno field of fstab set to 0.

This is because the preen mode of fsck errors out when it detects a lack of write access to the device. Since a permanently read-only filesystem would be expected to be clean anyway, this shouldn't really pose a problem.

Userland utility support for read-only mode

Some changes are required to userland utilities in order to make full use of the new kernel features.

/sbin/bioctl fails to remove a read-only sd device:

The code attempts to open /dev/bio when the previous call to opendev() fails, regardless of the reason for the failure.

Since the opendev() call is hardcoded to open the device with O_RDWR, this obviously fails when it's read-only, and leads to the misleading error message ‘Can't locate sdx device via /dev/bio’.

# bioctl -d sd6

bioctl: Can't locate sd6 device via /dev/bio

Another somewhat misleading error message. In fact, the problem has nothing to do with /dev/bio.

The solution is just to check for func == BIOC_DELETERAID or func == BIOC_INQ and call opendev() with O_RDONLY instead of O_RDWR in this case.

(func == BIOC_DELETERAID || func == BIOC_INQ ? O_RDONLY : O_RDWR)

Some functions don't actually require write access anyway.
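The same decision can be written as a small helper function, which may be easier to extend if further functions turn out not to need write access. This is only a sketch of the idea and not the code from the patchset; the enum ex_func and its values are placeholders that stand in for the real function codes used by bioctl:

```c
#include <fcntl.h>

/* Placeholder function codes, invented for this example. */
enum ex_func {
	EX_FUNC_INQ,
	EX_FUNC_DELETERAID,
	EX_FUNC_CREATERAID
};

/*
 * Pick the open(2) access mode for a given function: inquiries and
 * volume removal only need read access, anything else gets O_RDWR.
 */
static int
open_mode_for(enum ex_func func)
{
	switch (func) {
	case EX_FUNC_INQ:
	case EX_FUNC_DELETERAID:
		return O_RDONLY;
	default:
		return O_RDWR;
	}
}
```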

/sbin/scsi won't open read-only devices

A similar issue exists in the /sbin/scsi utility, which returns the following error when trying to open a read-only device:

# scsi -f /dev/sd6c

scsi: unable to open device /dev/sd6c: Permission denied

The /sbin/scsi utility always tries to open the device specified with the -f option as read-write.

This time the solution is even easier. If the call to scsi_open() fails with O_RDWR, we immediately retry with O_RDONLY:

if ((fd = scsi_open(optarg, O_RDONLY)) < 0)

Just one additional conditional statement is required to re-try the operation as read-only.
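The general pattern, which is the userland equivalent of what we did with VOP_OPEN() in the kernel, looks something like the following. This sketch uses plain open(2) rather than scsi_open(), and the function name open_rw_or_ro() is made up for the example:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to open path read-write, falling back to read-only if the
 * failure looks like write protection.  On success, *readonly says
 * which mode was actually obtained.  Returns -1 on failure.
 */
static int
open_rw_or_ro(const char *path, int *readonly)
{
	int fd;

	*readonly = 0;
	fd = open(path, O_RDWR);
	if (fd < 0 && (errno == EACCES || errno == EROFS)) {
		fd = open(path, O_RDONLY);
		if (fd >= 0)
			*readonly = 1;
	}
	return fd;
}
```

Only EACCES and EROFS trigger the retry, so other failures, such as a non-existent device node, are still reported to the caller unchanged.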

Other RAID disciplines

So far we've only considered the softraid crypto discipline. However, the changes to the softraid code are to the common functions sr_meta_probe() and sr_scsi_cmd(), which are shared with the other disciplines, so the read-only functionality works there too.

In the case of RAID disciplines with more than one device, (such as a mirror set), if any one of the underlying devices is opened read-only then the assembled softraid sd device will be flagged as read-only, even if other devices in the array are writable.
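This behaviour falls out naturally from the flag being OR'ed in once per device, and it can be modelled in a few lines. The structure ex_chunk, the function volume_capabilities() and the name EX_CAP_READONLY are all hypothetical, and are only loosely modelled on the kernel code:

```c
#include <stddef.h>

#define EX_CAP_READONLY	0x00000020	/* same bit value as SR_CAP_READONLY */

/* Hypothetical per-device record: was this chunk opened read-only? */
struct ex_chunk {
	int opened_readonly;
};

/*
 * Return the volume capabilities after probing every chunk.  A single
 * read-only chunk is enough to flag the whole volume as read-only.
 */
static unsigned int
volume_capabilities(const struct ex_chunk *chunks, size_t n,
    unsigned int caps)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (chunks[i].opened_readonly)
			caps |= EX_CAP_READONLY;
	}
	return caps;
}
```

Existing capability bits passed in via caps are preserved, and the read-only bit is never cleared once it has been set.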

Downloads

A patchset is available for download which applies to OpenBSD 7.7-release.

The patchset is signed with our signify key, and includes the kernel patch, as well as patches for /sbin/scsi and /sbin/bioctl.

To apply the patchset, first place our signify key in /etc/signify if you don't already have it:

# ftp -o /etc/signify/exoticsilicon.pub https://research.exoticsilicon.com/local_patchsets/exoticsilicon.pub

Download our key

Then download the patchset:

# cd /root ; ftp https://research.exoticsilicon.com/downloads/softraid_read_only_patchset_7.7.sig

Download the patchset

Next, having ensured that you have the OpenBSD 7.7-release source tree in /usr/src, apply the patches to the base of the source tree:

# cd /usr/src/

# signify -Vep /etc/signify/exoticsilicon.pub -x /root/softraid_read_only_patchset_7.7.sig -m - | patch

Apply the patchset

Lastly, recompile the kernel and userland utilities:

# cd /usr/src/sys/arch/`machine`/conf/

# config GENERIC.MP

# cd /usr/src/sys/arch/`machine`/compile/GENERIC.MP/

# make clean && make && make install

# cd /usr/src/sbin/scsi

# make && make install

# cd /usr/src/sbin/bioctl

# make && make install

Re-compile kernel and userland

The kernel compilation instructions given above are sufficient for a typical generic system, but for a more complete guide to kernel compiling on OpenBSD please refer to our custom kernel compilation write-up.