“Building ports from source, tips, tricks, and techniques with dpb”
Jay walks us through setting up Distributed Ports Builder, (dpb), on OpenBSD
Building ports from source - introduction
In the first installment of our new series with Jay Eptinxa, we'll look at compiling software in the ports tree from source, and making our own binary packages.
After a basic introduction to getting dpb set up and working, Jay will show us some useful techniques to make our bulk building run as smoothly and efficiently as possible.
Many readers might be asking what if any advantages there are to building ports from source, when binary packages are available for the popular architectures.
Even the official OpenBSD documentation recommends the use of the pre-built binary packages over building from source.
Nevertheless, there are definite advantages to setting up and using dpb to build your own packages from source...
Reasons we might want to build ports from source
You want to edit the source code and make your own changes to it
You want to apply patches that are not already included in the OpenBSD port, or use different compiler options
You want to build packages for an architecture that doesn't have pre-built binary packages available
You want to learn and understand the ports building process
As well as these common reasons, there are also numerous other instances where building from source is useful.
Some ports have not been updated for a very long time, three examples being archivers/bzip2, sysutils/dvd+rw-tools, and sysutils/cdrtools. Using the pre-built binary packages, you'd have needed to download a new package for each successive OpenBSD release, whereas working with the source code, you would have already had many of the source files to hand.
Why use dpb?
But why dpb?
If you've set up the ports tree in /usr/ports, and have a working internet connection, then you can build any particular port by doing a simple make in its directory, something like this:
# cd /usr/ports/archivers/bzip2
# make install
It seems so easy this way, but...
Unfortunately, whilst this method is simple and might appeal to new users without much experience of OpenBSD, it does have some severe drawbacks:
Drawbacks of just running make in the ports tree
By default, the entire build process runs as root
Each port's dependencies will be built sequentially
Only one CPU core will be used
The sequential building of each package combined with the use of only one CPU core often results in very long build times for complex ports with a lot of dependencies, time during which the CPU remains mostly idle if the machine isn't busy with other tasks.
It is possible to set up this method of building ports such that it runs as an unprivileged user, however if you're going to go to the trouble of doing this, you might just as well set up dpb anyway.
An interesting anecdote for the curious: many years ago it was also possible to use systrace to reduce the risk of a malicious port doing something undesirable to your system. This greatly increased build times, which served as a fairly strong discouragement to its use. Systrace was finally removed from OpenBSD during the 6.0 development cycle.
A few ports might successfully build using multiple CPU cores if invoked with a suitable -j parameter. Many will not.
Obviously there has to be a better way to build software from source, and of course there is, dpb. With dpb we get the following benefits:
The wonders of dpb
Parallel building of multiple ports using several CPU cores
Building process automatically performed as an unprivileged user
Optional privilege separation in the form of a chroot
Build logs kept in a central location
It's easier to fetch all required source distfiles first, for a later off-line build
Whilst the name, distributed ports builder, might suggest that it's only intended or beneficial for those users who are running a compile farm with multiple machines on a LAN, in fact, dpb is a much better way to do just about any work involving the ports tree.
If you intend to do any serious work with ports, the initial investment in time and effort involved in learning to use dpb will almost certainly be outweighed by the added convenience and functionality gained.
So let's make a start setting it up!
Prerequisites for setting up dpb - extracting the ports tree
I'm going to assume that you already have the ports tree extracted into /usr/ports. If not, you probably want to do something like this:
# cd /usr
# tar -xvzf /install_path/ports.tar.gz
Extracting the ports tree
Of course, you should have also already checked the integrity of the source archive with signify before un-tarring it:
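A minimal sketch of such a check follows. It assumes that ports.tar.gz and the matching SHA256.sig are both in the current directory, and that the release's public key is installed under /etc/signify/ with the usual openbsd-NN-base.pub naming. The helper names signify_key and verify_ports are made up for this example.

```shell
# Hypothetical helper: map a release number such as 7.1 to the
# public key file for that release in /etc/signify/.
signify_key() {
    printf '/etc/signify/openbsd-%s-base.pub\n' "$(printf '%s' "$1" | tr -d .)"
}

# Hypothetical helper: verify ports.tar.gz against SHA256.sig.
# Assumes both files are in the current directory.
verify_ports() {
    signify -C -p "$(signify_key "$1")" -x SHA256.sig ports.tar.gz
}
```

On the machine that will extract the archive, this could be called as verify_ports "$(uname -r)", and the archive should only be extracted if the command reports success.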
Checking the integrity of the source archive using signify
Due to limits imposed by the version of tar in the base system on the length of file paths, some of the files will be missing if you use this method of populating /usr/ports. Unless you actually want to build the few affected ports, this probably doesn't matter. If you do want to be sure of having a full copy of the ports tree, you can use cvs to fetch the missing files. If you're going to do this, note that it's likely still quicker to obtain most of the files from the tar archive and then use cvs just to update those that are missing, rather than do a cvs checkout of the full ports tree.
Since we're going to set up a chroot environment for the actual building processes, /usr/ports is basically only going to act as an initial source for populating the chroot, (and re-populating it if we ever have problems and want to start over). For this reason, /usr/ports doesn't need to be on its own partition, and we don't need to extract the tar archive using the -p option to preserve the file attributes, (although doing so won't do any harm).
All of these instructions assume that you're using a -release version of OpenBSD. If you're running a -current system then you'll definitely want to keep your ports tree up to date using CVS and re-populate the chroot at regular intervals.
Prerequisites for setting up dpb - partitions for the chroot
Whilst /usr/ports doesn't need its own partition, it is beneficial to create one or more partitions for the chroot that dpb will run in.
Within the chroot environment, two important paths are /usr/ports/distfiles, which is where the actual source code archives will be stored, and /usr/ports/pobj, which is where they will be extracted and where the compilation will actually take place.
Since the /dev/ directory within the chroot needs to contain device special files, if you decide to create just a single partition for the chroot, it will need to be mounted with the dev option, (as opposed to the default nodev). It will also probably require the wxallowed flag, unless you happen to be compiling exclusively well-behaved ports which don't need it.
Alternatively, we can create separate partitions for the distfiles and pobj directories. Doing so has various benefits:
Advantages of separate partitions for distfiles/ and pobj/
We can mount the chrooted /usr/ports/distfiles and /usr/ports/pobj using the nodev flag, using dev only on the main chroot partition
We can restrict the use of the wxallowed flag to the pobj partition
We can optionally mount the pobj partition using the async flag for better performance
Alternatively, if we have sufficient physical RAM, we can put pobj on a ramdisk filesystem for even better performance
We can mount the distfiles partition read-only for times when we don't expect to, and don't want to, be downloading additional source files
We can re-format and re-populate the main chroot partition without having to backup and restore, or re-download the distfiles
It's easier to increase the size of the distfiles or pobj partitions at a later date, by deleting and re-creating them
The amount of disk space required for each partition obviously depends on exactly which ports you want to build. The amount of space required on pobj will also depend to a degree on how many ports you intend to build in parallel, since this requires more source files to be extracted at the same time.
Here is a table summarising the partitioning requirements, with some rough ballpark values for the sizes:

Mount point                 Size       Mount flags               Contents
/proot                      3 - 6 GB                             The main chroot partition
/proot/usr/ports/distfiles  2 - 25 GB                            The source distribution files
/proot/usr/ports/pobj       5 - 10 GB  wxallowed [1], async [2]  The actual build directory

[1] The wxallowed flag is not required for all ports, however some commonly used ones do require it. If you are only building a subset of ports which do not, then you can successfully mount pobj without the wxallowed flag.
[2] Since the pobj directory is effectively just a temporary working directory which can be easily re-populated if it's erased, the main risks of using the async flag, namely data corruption in the event of power-loss or a kernel panic, are somewhat mitigated.
Once again, your own requirements for the partition sizes might be wildly different to those noted in the table. Since the main /proot partition is the easiest to size, and the contents of the pobj directory can be erased between builds, by placing the partitions on the disk in the order that they appear in the table above, it remains quite easy to re-partition and increase the space allocated to the distfiles and pobj partitions at a later date if this ever becomes necessary.
If you have 32 GB or more physical RAM in the build machine, you might want to create an mfs-backed ramdisk partition for the pobj/ directory:
# mount_mfs -s 12g swap /proot/usr/ports/pobj
Mounting an mfs ramdisk on the pobj/ directory
In our tests, performance was noticeably better using a ram-based mfs filesystem, even when compared to a reasonably fast SSD. However, this is probably going to be highly dependent on your specific hardware, so if performance is important to you then doing your own benchmarks is key. Even if you don't have enough physical RAM to compile large ports in this way, it remains an option for lighter ports building sessions, so you might decide to create a large disk-based pobj partition, then mount a smaller mfs filesystem over it when you know that you are only going to be compiling a few small ports. Once again, experimentation is the key to getting best performance.
In the table above and in the following examples, we've used /proot as the location for the root of the chroot. However, you can choose just about any location for this.
With the relevant partitions created, newfs'ed, added to /etc/fstab with appropriate flags, and mounted, we're ready to move on to the next step.
Populating the chroot directory
In contrast to the normal approach of populating a chroot environment, in which you ideally want the fewest possible files present, for ports development we basically need to create a full copy of the base system. Thankfully, the ports tree includes a utility called 'proot', in /usr/ports/infrastructure/bin/, specifically to do this:
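As a rough sketch, and assuming the chroot lives in /proot as in the rest of this article, the invocation might be wrapped like this. The function name populate_chroot and the PROOT override variable are inventions for this example; only the -B option, which names the chroot directory, is taken from proot itself.

```shell
# Sketch: populate a ports chroot using proot from the ports tree.
# $1 is the chroot directory, defaulting to /proot.
# PROOT is a hypothetical override, mainly useful for dry runs.
populate_chroot() {
    "${PROOT:-/usr/ports/infrastructure/bin/proot}" -B "${1:-/proot}"
}
```

Run as root, populate_chroot /proot would then be equivalent to calling proot directly with -B /proot.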
Using proot to populate the ports chroot directory
The above command will populate the chroot directory with the files that dpb needs from the base installation, as well as the contents of the ports tree from /usr/ports. If it returns any errors, correct them before continuing.
Remember that any changes you make to configuration files in /etc, or anything else in the base system after populating the chroot directory, will not automatically be applied to the chroot environment. So, for example, if you update the termcap database and change your default termtype, you would likely want to manually copy the changed files to the corresponding locations in /proot/. Alternatively, you could erase the whole partition with newfs and re-run the proot command above without losing the distfiles, although you would lose any packages that you had already compiled, along with their compilation logfiles, unless you backed them up first or created a fourth partition for /proot/usr/ports/packages/.
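For the manual copying approach, a small helper along these lines might save some typing. The name sync_to_chroot is hypothetical, and the sketch assumes that the file should land at the same absolute path inside the chroot as it has on the live system.

```shell
# Hypothetical helper: copy one changed file from the live system to
# the same path inside the chroot, preserving mode and timestamps.
# $1 is the chroot root, $2 is an absolute path on the live system.
sync_to_chroot() {
    mkdir -p "$1$(dirname "$2")" && cp -p "$2" "$1$2"
}
```

For example, sync_to_chroot /proot /usr/share/misc/termcap would refresh the chroot's copy of the termcap file.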
At this point, dpb is basically ready to use.
The dpb utility resides in /usr/ports/infrastructure/bin/dpb, which isn't in the shell's search path by default. We could just modify the search path, but most of the other programs in that location will be rarely used for simple building of existing ports, so to avoid them being matched when using tab-completion of filenames I prefer to simply add a shell alias for dpb. This has the advantage that we can also automatically specify two useful command line options.
# alias dpb="/usr/ports/infrastructure/bin/dpb -B /proot -D BUILD_USER=_pbuild"
Setting an alias for dpb using the korn shell
The -B option simply enables the chroot and specifies its location. Invoked without this option, dpb would operate using the normal, non-chroot'ed, /usr/ports directory hierarchy. The -D command line argument allows us to specify various options, (detailed in the dpb manual page), but the one we are using here, BUILD_USER, ensures that as little as possible is actually run as root.
We can quickly test that dpb is working correctly by building a very small port that doesn't have any dependencies:
# dpb audio/metronome
Building a very small port to test dpb
If you don't already have the single distfile containing the source code for metronome in /proot/usr/ports/distfiles/, then dpb will download it. Next it will proceed to checksum, extract, configure, and build the port. In the case of audio/metronome, this process will be almost instantaneous on any vaguely modern machine.
Once dpb has finished running, you'll find various log files in an architecture-specific directory in /proot/usr/ports/logs/. For example, if you are using the amd64 architecture, then you'll find the following in /proot/usr/ports/logs/amd64/:
That's fourteen ‘global’ log files, two directories, (packages, and paths), containing logfiles specific to each port that has been built, and one directory, (locks), which contains lockfiles, to avoid multiple instances of dpb working on the same port simultaneously.
If dpb actually fetched the distfile from a remote server, because it wasn't already present in the distfiles/ directory, you'll have two additional entries in /proot/usr/ports/logs/amd64/. The fetch/ directory contains two global logs detailing successful and unsuccessful download attempts, (good.log, and bad.log). The dist/ directory contains individual logs for each port, with slightly more verbose information, such as the actual invocation of the ftp command used for the download attempt, and the error message, if any, that it returned. You can use these logs to diagnose connectivity issues to particular mirror sites.
The log files in the packages/ directory are actually just symbolic links to those in the paths/ directory, so you can easily find the correct log given either the original path of the port, (such as audio/metronome in this example), or the name of the resulting package, (metronome-2.log). If you experience a build failure for a specific port, these are probably the log files that you want to look at.
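To illustrate the two ways of locating a log, here is a small sketch. The function name find_log is made up, and the layout it assumes, paths/<category>/<port>.log and packages/<package>.log beneath the architecture-specific log directory, is inferred from the description above rather than taken from the dpb documentation.

```shell
# Hypothetical helper: print the log file for a port, given either its
# path (audio/metronome) or its package name (metronome-2).
# $1 is the log directory, for example /proot/usr/ports/logs/amd64.
# Assumed layout: paths/<category>/<port>.log and packages/<package>.log
find_log() {
    for f in "$1/paths/$2.log" "$1/packages/$2.log"; do
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}
```

Something like less "$(find_log /proot/usr/ports/logs/amd64 audio/metronome)" would then open the build log for our test port.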
Dealing with build failures
If the build of a particular port fails due to an error, it will usually leave behind a lock file in /proot/usr/ports/logs/amd64/locks/. In the case of a distfile being unavailable, the name of the lock file will be the same as that of the file that dpb was trying to download to, which in our example would be metronome-2.tar.gz.dist. If dpb is invoked again, this lock file will automatically be cleared and then re-created if the distfile remains unavailable. However, in the case of other errors, such as a required shared library being missing, the lock file that dpb leaves behind will be named according to the port path, with forward slashes replaced by periods. So, again, in the case of our example, we might see a file audio.metronome. This lock file will not be cleared automatically, and future invocations of dpb to compile the same port will error out almost immediately because of it.
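The naming rule for these port lock files is simple enough to express in a line of shell. The sketch below uses two made-up helper names, lock_name and has_lock, and only covers the port-path style of lock file, not the .dist locks used for downloads.

```shell
# Hypothetical helper: derive the lock file name for a port path by
# replacing forward slashes with periods.
lock_name() {
    printf '%s\n' "$1" | tr / .
}

# Hypothetical helper: succeed if a lock for the port exists.
# $1 is the locks directory, $2 is the port path.
has_lock() {
    [ -e "$1/$(lock_name "$2")" ]
}
```

So has_lock /proot/usr/ports/logs/amd64/locks audio/metronome would tell us whether a previous failed build of metronome is still holding its lock.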
Removing stale lock files is easy enough:
# rm -r /proot/usr/ports/logs/amd64/locks/*
Removing stale lockfiles
For interactive use of dpb on a single machine, you might even want to include this lock removal in the dpb alias we defined earlier:
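One possible form is sketched below, written as a shell function rather than an alias so that the paths can be overridden. The function name dpb_clean and the DPB and DPB_LOCKS override variables are inventions for this example. It should only be used when no other instance of dpb is running, since it removes every lock unconditionally.

```shell
# Sketch: clear any stale locks, then run dpb with our usual options.
# DPB_LOCKS and DPB are hypothetical overrides; the defaults match the
# amd64 paths used elsewhere in this article.
dpb_clean() {
    rm -rf "${DPB_LOCKS:-/proot/usr/ports/logs/amd64/locks}"/*
    "${DPB:-/usr/ports/infrastructure/bin/dpb}" -B /proot -D BUILD_USER=_pbuild "$@"
}
```

It takes the same arguments as the alias, for example dpb_clean audio/metronome.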
Updating our earlier shell alias to include removal of stale lockfiles before the invocation of dpb
Installing the compiled packages
Of course, compiling a port just creates a binary package, it doesn't automatically install it. Build dependencies of the port are installed automatically, but only within the chroot'ed environment, so don't be surprised when they are not listed if you run pkg_info normally, outside of the chroot.
The final binary packages are created in the /proot/usr/ports/packages/amd64/ directory hierarchy. In our case we will have four subdirectories, all/, ftp/, no-arch/, and tmp/. All of the packages can be found in the all/ subdirectory, whilst ftp/ and no-arch/ contain copies of some of these packages in the form of hard links.
Since the compiled packages are, by default, not cryptographically signed, we need to set the TRUSTED_PKG_PATH environment variable to enable pkg_add to install them:
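A minimal sketch, assuming the chroot is in /proot and the architecture is amd64 as in the earlier examples. The helper names pkg_repo and install_local are hypothetical.

```shell
# Hypothetical helper: print the directory holding every package
# built for an architecture. $1 is the chroot root, $2 the arch.
pkg_repo() {
    printf '%s/usr/ports/packages/%s/all\n' "$1" "$2"
}

# Hypothetical helper: install locally built, unsigned packages by
# pointing TRUSTED_PKG_PATH at the chroot's package directory.
install_local() {
    TRUSTED_PKG_PATH="$(pkg_repo /proot amd64)" pkg_add "$@"
}
```

Run as root, install_local metronome would then install the package we built earlier without a signature check.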
You might be wondering how the signature information is actually stored in the package file, as binary packages are simply tgz files. Luckily, the gzip format has an option to store a free-form, zero-terminated comment, and signify uses this to embed the signature. In this way, the tgz files can still conform to the gzip format, and can be decompressed manually with normal programs if required.
Be aware that the full path specified for the secret key in the -s argument to signify is embedded in the signed package's gzip header. This potentially leaks information about valid file paths on the signing machine, and possibly even who signed the package, to anybody who has access to a copy of it.
For example, observe the following hexdump showing the gzip header at the start of a signed binary package, and the command used to sign it. The hexdump clearly reveals that the directory /home/jay/ exists on the signing machine, and implies that a user named jay signed the package:
The value following 'key=' is not actually used or required by pkg_add, but it is part of the data that has been signed. If we modify it with a hex editor in an existing signed package, then the package will fail signature verification. However, if we really want to create signed packages without this field, we can simply modify the zsign function in /usr/src/usr.bin/signify/zsig.c, which is where the field is inserted. Alternatively, we could just use a relative path when specifying the signing key.
Notice that we also used the -n option in the example above, to set the timestamp in the signature to 1970-01-01, instead of using the current system time. The signature timestamp field can also be completely suppressed by editing the same zsign function in zsig.c.
Forcing a clean build
If you're modifying the original source of a port by applying your own patches to it, then you'll want to re-build it from scratch. To do this, you'll probably want to use pkg_delete to remove the installed binary package first:
# pkg_delete metronome
Un-installing a binary package before compiling a newly modified version of it
Next, we can delete the package from each directory in /proot/usr/ports/packages/amd64/:
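One way to do this in a single step is sketched below. The function name remove_pkg is made up, and the pattern assumes that the package file name is the stem followed by a hyphen and a version that starts with a digit, as with metronome-2.tgz.

```shell
# Hypothetical helper: remove every copy of a built package, including
# the hard links in ftp/ and no-arch/.
# $1 is the package directory, for example
# /proot/usr/ports/packages/amd64, and $2 is the package stem.
remove_pkg() {
    find "$1" -name "$2-[0-9]*.tgz" -exec rm -f {} +
}
```

For our example, this would be remove_pkg /proot/usr/ports/packages/amd64 metronome.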
After these steps, you can start a fresh compile with your modified makefile, additional local patches, or whatever other changes you've made to the port.
Advanced dpb techniques
Now that we've covered the basics of setting up dpb in a chroot environment and building a simple port, let's look at some tricks and techniques that might be useful for non-standard ports building configurations.
Pre-downloading distfiles for off-line building
Whilst it might seem convenient to be able to invoke dpb with a single command, and have it automatically fetch the source code distribution archives for several large ports along with perhaps tens or even hundreds of dependencies over the internet, this assumes that a couple of things are true.
Firstly, that the build machine is connected to a reliable, un-metered, and reasonably fast internet connection, and secondly that you have either already checked what the dependencies of each port are, or alternatively don't care what software is automatically installed on the target machines.
Whilst a cheap, fast internet connection can be taken for granted by many users in 2022, there are still many locations worldwide where internet connectivity is severely limited. In the case of a metered connection, such as a constrained cellular data plan, ensuring that you don't download un-needed distfiles and avoiding re-downloading distfiles that you already have can considerably reduce bandwidth usage.
In these and other cases, it's useful to be able to download the relevant files to prepare a suitably populated distfiles/ directory beforehand, then start the build on a machine with no access to the internet.
This also ensures that only the software you have already downloaded can be packaged and installed, so there will be no surprises with an updated port suddenly pulling in a multitude of dependencies that you're not familiar with, and might not be happy having installed on your machines.
To fetch the required files for a particular port or set of ports, we simply invoke dpb with the -F argument to indicate how many downloads can be started in parallel:
# dpb -F 5 archivers
Fetching the source for various ports, but not compiling them
This invocation of dpb would download all of the required distfiles to build every port in the archivers category. Note that this includes fetching distfiles for all of the different flavours of each port, and even ports which don't build on the architecture of the build machine. This is useful for building a local copy of all of the distfiles for the ports tree in any particular OpenBSD release. If this is not what you want, then you can be more specific in the path list supplied to dpb.
Preventing particular distfiles from being downloaded
If you want to prevent a particular distfile from being downloaded automatically, the easiest way is to create a directory with the same name as the distfile but with '.part' appended to it.
This can be useful to avoid attempts to download large distfiles which you know that you won't need.
An example of where this might happen is when a particular port uses different versions of the C compiler depending on which architecture it's being built on.
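The trick described above amounts to a single mkdir. In this sketch, block_distfile is a made-up helper name and example-compiler-1.0.tar.gz is a placeholder rather than a real distfile.

```shell
# Hypothetical helper: block the download of one distfile by creating
# a directory named after it with '.part' appended.
# $1 is the distfiles directory, $2 is the distfile name.
block_distfile() {
    mkdir -p "$1/$2.part"
}
```

For example, block_distfile /proot/usr/ports/distfiles example-compiler-1.0.tar.gz. Removing the directory again with rmdir should allow the download to go ahead.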
The files in distfiles/ should be owned by user and group _pfetch. Dpb will do this automatically for files that it downloads, but if you add files manually to the distfiles/ directory, you will probably need to chown them manually. The only exception to this is the build-stats/ directory, which should be owned by user and group _pbuild.
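A quick way to spot files that still need a chown is sketched here. The helper name wrong_owner is hypothetical, and the directory argument should be given without a trailing slash so that the build-stats/ exclusion matches.

```shell
# Hypothetical helper: list everything under the distfiles directory
# that is not owned by the expected user, skipping build-stats/.
# $1 is the distfiles directory, $2 the expected owner (name or uid).
wrong_owner() {
    find "$1" -path "$1/build-stats" -prune -o ! -user "$2" -print
}
```

With the layout used in this article, wrong_owner /proot/usr/ports/distfiles _pfetch should print nothing once all of the ownership is correct.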
Seeding distfiles/ with distfiles/ from a previous release
Assuming that you upgrade from one OpenBSD release to the next by re-installing, rather than trying to upgrade the system in-situ, you'll need to re-compile all of your ports after the upgrade. Since some, or even many, of the distfiles won't have changed between the two releases, you can save some time and bandwidth by initially copying the contents of the old distfiles directory to the new installation.
Then you can either allow dpb to fetch the remaining required distfiles automatically during the builds, or alternatively fetch them in advance by using the -F option. This will leave behind some old distfiles that were required for the previous release, but are no longer required for the new release, usually due to being replaced by newer versions.
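The initial copy could be done with a tar pipeline, which keeps any subdirectories and file attributes intact. This is only a sketch: seed_distfiles is a made-up name, and any file that already exists in the destination with the same name will be overwritten.

```shell
# Hypothetical helper: copy the contents of an old distfiles directory
# into a new one, preserving subdirectories and file attributes.
# $1 is the old distfiles directory, $2 is the new one.
seed_distfiles() {
    mkdir -p "$2" && (cd "$1" && tar -cf - .) | (cd "$2" && tar -xpf -)
}
```

After seeding, the ownership of the copied files may still need to be set to _pfetch as described earlier.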
A utility to identify and remove such un-needed distfiles is included in the ports tree as /usr/ports/infrastructure/bin/clean-old-distfiles. It relies on the history log created by dpb, so it's a good idea to make sure that the log is up to date first:
# dpb -D HISTORY_ONLY
Updating the history log file in preparation for running clean-old-distfiles
This is particularly important if you've been manually copying distfiles around, or otherwise making local changes to the ports tree.
Since the clean-old-distfiles utility doesn't automatically operate within the chroot, we can either invoke it as:
Running clean-old-distfiles outside the chroot environment, but operating on the chroot directory
We invoke clean-old-distfiles with the -n and -v flags to see what would actually be removed. It can then be re-run without the -n flag to actually remove those files.
Downloading distfiles for a new release before installation
Whilst many people may be satisfied with installing the base system first and then just letting dpb fetch whatever it likes, there are other approaches.
When upgrading by re-installation, it's useful to have all of the required distfiles for the new release to hand locally before starting the fresh install. This is especially true if you only have one local machine running OpenBSD, if your internet connection is unreliable, or if you want to minimise downtime and be back up and running as soon as possible. It's also a good idea if you've made local changes to any of the makefiles, as it gives you an opportunity to make and check corresponding changes to the makefiles in the new ports tree.
There are various ways to approach this.
Firstly, if you're building a very small number of ports, with few or no dependencies that are unlikely to have changed since the last release, you could simply look through the makefiles and manually download any distfiles that have changed. However, this is tedious and error-prone for more than a handful of ports, so not really recommended.
Next, although not guaranteed to work, it's often possible to parse the ports tree from one release on the previous release. Although many or most of the ports won't actually build correctly, due to mis-matched library versions and other changes, it's usually possible to check the required dependencies and fetch the required distfiles using dpb with the -F flag.
One way to set this up is to create another new partition which will eventually contain a chroot environment based on your current base system, with the ports tree from the next release. This can either be mounted on /proot, in place of the existing /proot partition, or alternatively mounted elsewhere, such as /newproot, and the -B parameter of dpb changed to match.
If you don't intend to do any more ports building on the existing system before you upgrade, you could simply erase and re-use the existing /proot partition, although remember to copy off any log files and modified makefiles that you want to preserve first. If you have made local changes to the makefiles, it will be useful to have both the original makefiles from your existing ports tree, and your custom versions to hand, so that you can see the changes that you made and forward-port them to the versions in the new release.
To populate the new chroot directory, we'll need to extract the new ports.tar.gz file somewhere. Since we only need it as a source for /usr/ports/infrastructure/bin/proot to copy the ports tree from, we can simply mount an mfs ramdisk filesystem over /usr/ports/ so that the files are in the expected location for copying, but the existing contents of /usr/ports/ are not lost.
A typical session might look something like this:
Archive the distfiles we used for the current release:
# tar -cvf distfiles_for_this_release.tar /proot/usr/ports/distfiles/
Unmount the existing partitions:
# umount /proot/usr/ports/distfiles
# umount /proot/usr/ports/pobj
# umount /proot
Create a new partition, which for this example we will assume is sd1p:
Now we can start making any desired local changes to the new makefiles, then hopefully eventually invoke dpb with the -F flag and fetch the required distfiles for the new ports tree to distfiles/. This should be everything we'll need after a fresh installation of the new release, to build and install the ports we were previously using.
The above method is quite useful when working on a single machine. However, if you have a spare machine to dedicate to the task, a spare disk that you can swap in to an existing machine, or the possibility to set up a VM, it might be quicker and easier to simply do a quick, default, fresh installation of the new release, quickly set up dpb just to fetch the distfiles, and then copy them to removable media such as a flash drive or optical disc. That installation can then be wiped, and the real installation performed, knowing that you already have a good set of distfiles to hand. This approach of using an actual installation of the new release for the fetching of the distfiles, has the advantage that you can reliably test that the ports actually compile with any local modifications, too.
Doing a temporary installation just to download the required distfiles with dpb might seem pointless, but it has some advantages once you start to make a lot of local changes to the ports tree. Here at Exotic Silicon, we do run fairly heavily customised installations of OpenBSD, and with each new release somebody has to forward-port all of our local changes. This can range from being trivial and almost no work at all, to a major re-factoring where changes to the base system or ports tree conflict with our own. Where updates to a port change its dependencies, we need to check that all of the newly required software meets our needs and expectations, and doesn't conflict with anything else. We also consider whether we're going to follow the change or replace the software entirely with something written in-house. This whole process can involve building and testing various pieces of software that we haven't used before, and testing various experimental configurations locally. After all of this, it's quicker and easier to copy the new distfiles and makefiles elsewhere and do a fresh installation of OpenBSD, than to go through every single change that we've made and check that we haven't inadvertently adjusted something that will cause problems later on.
This week, we've set up the Distributed Ports Builder, built a port from source, learnt how to sign our own packages, taken a glimpse at some possible information disclosure and how to avoid it, then finally looked at some ways to optimise our downloading and management of distfiles. What a start to our new series!
In a future series, we might see how to modify ports makefiles to make local customisations and reduce dependencies. Next week, however, we'll be looking at custom kernel configurations. See you there!