cross-building linux

A process for building a simple Linux installation for a machine architecture different from the one you start with. There's a blog, for those who are interested: http://crossbuiltlinux.blogspot.com/


CROSS-BUILDING LINUX: The Little Blue Linux build process

This project contains instructions for building a complete, minimal,
GNU/Linux system from source code. If you follow these instructions
precisely, what you'll wind up with is a Little Blue Linux system. If
you follow these instructions with modifications or alterations, you'll
wind up with a derivative of Little Blue Linux instead!

The easiest way to use Cross-Building Linux (generally abbreviated CBL)
is to process the files in this project with the `litbuild` program,
available as a Ruby Gem. This will produce a human-readable AsciiDoc
version of the instructions, which can then be further processed to
produce HTML or PDF or other formats if you wish. It will also produce
scripts that implement the full set of build instructions; if you run
those scripts, you'll wind up with a Little Blue Linux system with
little or no manual activity needed on your part.

The main point of this README is to walk you through the process of
using litbuild to do those things.

Alternatively, you can just read the files in this project. That's more
difficult, because they are in a format designed to be easy to write and
easy to process programmatically -- it's _not_ designed to be easy to
read! But you can do it if you want to. Start with `sections/cbl.txt`.


I don't currently publish the litbuild gem to rubygems.org, so you have
to build it yourself. The git repository is located at
http://git.freesa.org/freesa/litbuild and there is a tarfile at
https://repo.freesa.org/cbl/ -- check out the README in that project for
more details. Litbuild requires Ruby 2.5 or newer.

Once you've got litbuild installed, you can use the `lb` command-line
program to produce the human- and machine-readable versions of the CBL
process as described below.


Just run `lb cbl`. This will produce a file `cbl.adoc` in the directory
`/tmp/build/docs`. If you want to adjust the parameters used to produce
the document, or the location where it will be generated, you can set
environment variables as described in `narratives/configuration.txt`
before running `lb`.

AsciiDoc is a format, similar to Markdown, that is designed to be as
readable as plain text files. However, it can also be processed by a
Ruby program, `asciidoctor`, to produce a variety of output formats.
If you want an HTML version of the CBL book, just run:

  gem install asciidoctor # (if you don't have it installed already)
  cd /tmp/build/docs
  asciidoctor cbl.adoc

And voila, a `cbl.html` file will be produced.

Producing a PDF version of the book is a _little_ more involved:

  gem install asciidoctor # (if it's not already installed)
  gem install asciidoctor-pdf # (likewise)
  cd /tmp/build/docs
  asciidoctor-pdf cbl.adoc

Other formats are also possible but I'm not addressing them here.


The architecture we've found to work most conveniently with
QEMU-emulated target systems is ARM (it supports a "virt" machine type
that can be given a lot more memory than is usual for emulated
machines), so that's the suggested architecture to start with. If you
have an actual machine you're targeting with CBL, of course you can use
that one instead! But it might not work as smoothly.

1. Grab all the tarfiles and patch files you need to build the CBL
   system. For your convenience, you can grab `cbl.tar` from
   `http://repo.freesa.org/` to get everything at once, or you can get
   individual files from there if you prefer. You can also get source
   distribution files from the upstream project sites, as long as you
   read and understand the IMPORTANT SAFETY TIP below. Put the files in
   `/tmp/cbl-materials`, or in some other location if you prefer.

   IMPORTANT SAFETY TIP: Litbuild presumes that the filename for the
   source tarfiles is the package name, followed by a hyphen, followed by
   the version number, followed by `.tar` and optionally the compression
   suffix `.lz`. (Other compression programs besides lzip can be used,
   but the package-users build script currently only handles lzip.) The
   tarfiles must expand into a directory named package name, hyphen,
   version. This is the convention used by the GNU project, so most
   packages already conform to it. When the upstream project
   distribution has some other convention for the filename or unpacked
   path, you need to fix that before you can use it with litbuild! All
   of the source tarfiles on repo.freesa.org have already been
   rejiggered to conform to this convention.

2. Review the configuration section (`narratives/configuration.txt`) and
   see if the default parameter values are what you want. If they are
   not, override the parameters you want to change in

3. Get rid of the work directory (default `/tmp/build`), if it exists,
   and generate the CBL build scripts with `source
   configs/amd64-aarch64.sh; lb -c cbl`. (The `-c` argument tells
   litbuild to crash immediately if any of the source files it will need
   are not present.)

6. If there is a file `/tmp/build/scripts/lb-sudoers`, ensure that it
   is included in your sudoers configuration (perhaps by copying it to
   `/etc/sudoers.d`, owned by root:root, with a restrictive mode like
   0400).

7. Run the script `/tmp/build/scripts/00-cbl.sh` that was just produced
   by litbuild.

8. Wait a while -- if the host or target system is emulated, this could
   be a long while, perhaps a couple of weeks! (On my quad-core 64-bit
   Intel laptop, the host-side build takes about 20 hours and a
   target-side build on an Aarch64 "virt" emulated machine with four
   cores and eight GB of RAM takes around eight days) -- to see if you
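
The naming convention from the IMPORTANT SAFETY TIP in step 1 can be
checked mechanically before handing files to litbuild. The following is
a minimal sketch and is not part of litbuild: the function names
`expected_dir` and `conforms` are made up for illustration, and only the
`.tar` and `.tar.lz` suffixes described above are recognized. Listing a
`.tar.lz` file this way assumes GNU tar with `lzip` installed.

```shell
# Print the directory name a tarfile is expected to unpack into
# (PACKAGE-VERSION), or fail if the filename does not look like
# PACKAGE-VERSION.tar or PACKAGE-VERSION.tar.lz.
expected_dir() {
    base=$(basename "$1")
    case $base in
        *-*.tar.lz) printf '%s\n' "${base%.tar.lz}" ;;
        *-*.tar)    printf '%s\n' "${base%.tar}" ;;
        *)          return 1 ;;
    esac
}

# Succeed only if every member of the tarfile lives under the expected
# PACKAGE-VERSION directory. An empty or unreadable archive fails.
conforms() {
    dir=$(expected_dir "$1") || return 1
    tar -tf "$1" | awk -v d="$dir" '
        { sub(/^\.\//, "") }                       # tolerate a leading "./"
        $0 != d && index($0, d "/") != 1 { bad = 1 }
        END { exit (bad || NR == 0) }'
}
```

For example, `for f in /tmp/cbl-materials/*.tar*; do conforms "$f" ||
echo "needs fixing: $f"; done` would list the tarfiles that still have
to be repacked.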


If you are doing a build with the default configuration -- the host
system is an actual computer, the target system is a QEMU emulated
virtual machine, networking is enabled in the target system -- you'll
need to set up the host system as a DHCP server and gateway for the
final CBL system in order to make it network-accessible. (You can refer
to the `setup-networking` blueprint for a discussion of all this.)

Here's how to do this:

1. Get rid of `/tmp/build`, if it exists, and generate build and
   configure scripts with `source configs/amd64-aarch64.sh; lb -c

2. Run the script `/tmp/build/scripts/00-host-dhcp-server.sh` that was
   just produced by litbuild.

3. Wait a while to see if you get a HUGE SUCCESS.

4. If all went well, you can now set up networking suitable for CBL:
   before launching the new CBL system in QEMU, run `sudo
   HOST_TOOLS/bin/launch-target-network`. This will create a bridge
   interface that CBL target machines will use as a switch or hub and
   run a DHCP server listening on the bridge. The DHCP server will be
   running in the foreground, so leave that window alone!

5. Run `sudo HOST_TOOLS/bin/create-cbl-network N` where N is a number
   unique to each virtual machine you're going to be running. This
   will create a TAP interface, connected to the bridge, that will be
   used by a QEMU virtual machine.

6. Run QEMU for the target system with appropriate parameters to provide
   the TAP device to the VM -- this is similar to the QEMU command run
   by the `launch-qemu` blueprint, but with the init command as
   `/sbin/init` and the network options:

   -netdev tap,id=network0,ifname=tapcblN,script=no,downscript=no
   -device e1000,netdev=network0,mac=aa:bb:cc:dd:ee:00

   The `tapcblN` should use the same `N` you used in step 5. Depending
   on what drivers are supported by the virtual machine kernel, you may
   want to use a different device than `e1000` -- the best performance
   will probably be with `virtio-net-device`, if the CBL kernel supports
   it. You need to use a different MAC address for each virtual machine.

7. If you want the target system to be able to reach outside networks
   through the host system, you can also run
   `sudo HOST_TOOLS/bin/enable-target-internet`.
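
Since every virtual machine needs its own `N` and its own MAC address,
it can be convenient to generate the network options from step 6 rather
than typing them by hand. This is only a sketch: the function name
`cbl_net_args` is hypothetical, and deriving the last octet of the MAC
address from `N` is an assumption of this example, not something the
CBL scripts do.

```shell
# Print the QEMU network options from step 6 for virtual machine N.
# N must be a decimal number from 0 to 255 without leading zeros; it is
# used for the TAP interface name and (as an assumption of this sketch)
# for the last octet of the MAC address.
cbl_net_args() {
    n=$1
    case $n in
        ''|*[!0-9]*|0?*) return 1 ;;   # reject empty, non-numeric, leading zero
    esac
    [ "$n" -le 255 ] || return 1
    printf '%s ' "-netdev tap,id=network0,ifname=tapcbl${n},script=no,downscript=no"
    printf '%s%02x\n' "-device e1000,netdev=network0,mac=aa:bb:cc:dd:ee:" "$n"
}
```

The output of `cbl_net_args 3` can be appended to the rest of the QEMU
command line; replace `e1000` in the function if your kernel supports a
faster device such as `virtio-net-device`.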


(As these notes are incorporated into the CBL narrative and expanded
upon there, they'll be removed from this section.)

- Dependency tracking -- by observing what files were accessed during a
  build, it should be possible to infer what packages that build depends
  on (e.g., do a `touch /tmp/TIMESTAMP && sleep 60` before running a
  build, and `find / -type f -a -executable -a -anewer /tmp/TIMESTAMP`
  after it's complete). This will be most effective when using package
  users, since looking at the ownership of all the files produced by
  that command will reveal the package responsible for them. So the best
  place to add this feature might be in the package-users build script.

- Bootstrap from C -- You have to have _some_ binary code to start
  with, unless you want to do some hand-compilation and hand-assembly to
  get the initial compiler built. The absolute smallest programming
  language footprint you can start with is the language the operating
  system kernel is written in, so that's the best starting point. That
  means any _other_ programming languages and tools that will wind up on
  a CBL system (Go, Java, Docker, etc.) have to be bootstrapped directly
  or indirectly from C or C++. (Or copied in as binaries produced on
  some other system, but that's not something that we're going to
  endorse or support.)

  Since modern GCC is written in C++, this requires starting with GCC
  4.7.x, the last release series that can be built with only a C
  compiler. With that in mind, it might be worthwhile to start with an
  initial GCC 4.7.x in the host-prerequisites section -- unfortunately,
  this would require also using an older glibc (as of this writing,
  latest-stable glibc is 2.27, which requires GCC 4.9 or newer to build)
  and possibly older versions of other packages as well.

- Multiple tool sets. GCC is the default C/C++ toolchain, but LLVM works
  pretty well these days and it would be awesome to be able to use
  either of them as the initial cross-compiler, or as the initial
  host-side compiler. Similarly, glibc is the default C library, but
  musl and uClibc-ng are both perfectly good alternatives at this point.
  It would be a good idea to support multiple tools and libraries!

- Single-arch -- the multi-lib and multi-arch systems used by most
  distributions (and enforced by the current GNU toolchain) are messy
  and confusing, so CBL avoids them except where it is completely
  unavoidable. The multi-target, multi-ABI stuff for gcc results in
  horrible complexity and confusion. The specs file for the final system
  GCC should ideally only have the minimal set of options to support a
  single target and C library. Other (single-arch) toolchains can be set
  up to support other targets.

  An idea under consideration: move all libraries and toolchain-related
  files to subdirectories of a new top-level `/arch` directory
  (completely abandoning the `/lib` directory prescriptions of the FHS).
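
The dependency-tracking idea in the first note above could be sketched
roughly as follows. This is an illustration only, not an existing CBL
or litbuild feature: the function names are hypothetical, it relies on
GNU `find` (for `-executable`, `-anewer` and `-printf`), and it only
works on filesystems that actually record access times (a `noatime`
mount will report nothing, and `relatime` does not update the access
time on every read).

```shell
# Print "owner<TAB>path" for every executable regular file under DIR
# whose access time is newer than the modification time of STAMP.
accessed_since() {
    stamp=$1
    dir=$2
    find "$dir" -type f -executable -anewer "$stamp" -printf '%u\t%p\n'
}

# Summarize the same information as a count of accessed files per
# owner. With package users, each owner corresponds to one package.
owners_accessed_since() {
    accessed_since "$1" "$2" | cut -f1 | sort | uniq -c | sort -rn
}
```

A build wrapper would `touch` the stamp file, wait, run the build, and
then call `owners_accessed_since /tmp/TIMESTAMP /` to get a candidate
list of the packages the build depends on.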