Beautiful Builds with Bazel

Author: Nick Dumas
Date: 2023-08-25
Series: building-with-bazel
Tags: bazel, golang, devops

bzlmod makes bazel extremely appealing and isn't hard to grasp for anyone already familiar with go modules. My frustration with make for complex builds led me to bazel.

What am I Doing?

I write programs to solve problems. Most of the time these are pretty personal and only get used once or twice, never see the light of day again, and that's fine.

Lately, though, I've been working on tooling for my Obsidian notes and I want to make my tools as accessible as possible. This involves a couple steps that are particularly important, tedious, and error prone when done manually:

  • cross-compiling my binaries for a relatively long list of OS/architecture combinations
  • building a docker image
    • pushing that docker image to OCI image repositories
  • running tests
  • running benchmarks
  • caching builds effectively

I've started with a Makefile I stole from some gist I didn't save a link to. This makefile is kinda hefty so I'm gonna focus on compiling go binaries and preparing OCI images that contain those binaries.

This makefile's extremely opinionated, hyper-targeted at Go builds. It assumes that your binaries live in cmd/binaryName/.

# Parameters
PKG = code.ndumas.com/ndumas/obsidian-markdown
NAME = parse-wikilinks
DOC = README.md LICENSE

DISTDIR ?= $(WD)/dist
CMDS := $(shell find "$(CMDDIR)/" -mindepth 1 -maxdepth 1 -type d | sed 's/ /\\ /g' | xargs -n1 basename)
INSTALL_TARGETS := $(addprefix install-,$(CMDS))

VERSION ?= $(shell git -C "$(MD)" describe --tags --dirty=-dev)
COMMIT_ID := $(shell git -C "$(MD)" rev-parse HEAD | head -c8)
LDFLAGS = -X $(PKG).Version=$(VERSION) -X $(PKG).Build=$(COMMIT_ID)

GOCMD = go
GOINSTALL = $(GOCMD) install -a -tags "$(BUILD_TAGS)" -ldflags "$(LDFLAGS)"
GOBUILD = gox -osarch="!darwin/386" -rebuild -gocmd="$(GOCMD)" -arch="$(ARCHES)" -os="$(OSES)" -output="$(OUTTPL)" -tags "$(BUILD_TAGS)" -ldflags "$(LDFLAGS)"

GZCMD = tar -czf
SHACMD = sha256sum
ZIPCMD = zip


build: $(CMDS)
$(CMDS): setup-dirs dep
  $(GOBUILD) "$(CMDPKG)/$@" | tee "$(RPTDIR)/build-$@.out"
install: $(INSTALL_TARGETS)
$(INSTALL_TARGETS):
  $(GOINSTALL) "$(CMDPKG)/$(subst install-,,$@)"

dist: clean build
  for docfile in $(DOC); do \
    for dir in "$(DISTDIR)"/*; do \
      cp "$(PKGDIR)/$$docfile" "$$dir/"; \
    done; \
  done
  cd "$(DISTDIR)"; for dir in ./*linux*; do $(GZCMD) "$(basename "$$dir").tar.gz" "$$dir"; done
  cd "$(DISTDIR)"; for dir in ./*windows*; do $(ZIPCMD) "$(basename "$$dir").zip" "$$dir"; done
  cd "$(DISTDIR)"; for dir in ./*darwin*; do $(GZCMD) "$(basename "$$dir").tar.gz" "$$dir"; done
  cd "$(DISTDIR)"; find . -maxdepth 1 -type f -printf "$(SHACMD) %P | tee \"./%P.sha\"\n" | sh
  $(info "Built v$(VERSION), build $(COMMIT_ID)")

Because this isn't a makefile tutorial, I'm going to just hit the high notes and explain why this isn't working. Given the parameters at the top, it looks in cmd/ for directories and passes each one to gox (a cross-compiling wrapper around go build), with -ldflags thrown in to stamp the version and commit ID into the binary.
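For context on what those -ldflags are doing: the linker's -X flag overwrites a package-level string variable at link time. Here's a minimal sketch of the Go side, assuming the variables live in package main (the makefile above targets $(PKG).Version and $(PKG).Build instead). versionString is a hypothetical helper, not something from the real project.

```go
package main

import "fmt"

// Version and Build are placeholders that the linker can overwrite with
// -ldflags "-X main.Version=... -X main.Build=...". The -X flag only works
// on string variables that are uninitialized or set to a constant string.
var (
	Version = "dev"
	Build   = "unknown"
)

// versionString is a hypothetical helper that formats both values.
func versionString() string {
	return fmt.Sprintf("%s (build %s)", Version, Build)
}

func main() {
	fmt.Println(versionString())
}
```

Building this with go build -ldflags "-X main.Version=v1.2.3 -X main.Build=abcdef12" should swap the placeholders for real values; without the flags you just get the defaults.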

Here we have the machinery behind make bump, github link below. bump is a tool that'll automatically create semantic versioning tags in a git repo based on existing tags. You can bump {patch,minor,major} and it'll create the next tag in the versioning sequence for you.

setup-bump:
  go install github.com/guilhem/bump@latest

bump-major: setup-bump
  bump major

bump-minor: setup-bump
  bump minor

bump-patch: setup-bump
  bump patch
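
To make the bump idea concrete, here's a rough sketch of the version arithmetic in Go. This is not how guilhem/bump is implemented; nextVersion is a hypothetical function, and it assumes plain vMAJOR.MINOR.PATCH tags with no pre-release suffixes.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// nextVersion returns the tag that follows `tag` when bumping `part`,
// which must be "major", "minor", or "patch". Lower-order components
// reset to zero, as semantic versioning expects.
func nextVersion(tag, part string) (string, error) {
	fields := strings.Split(strings.TrimPrefix(tag, "v"), ".")
	if len(fields) != 3 {
		return "", fmt.Errorf("tag %q is not in vMAJOR.MINOR.PATCH form", tag)
	}
	var nums [3]int
	for i, f := range fields {
		n, err := strconv.Atoi(f)
		if err != nil || n < 0 {
			return "", fmt.Errorf("tag %q has an invalid component %q", tag, f)
		}
		nums[i] = n
	}
	switch part {
	case "major":
		nums[0], nums[1], nums[2] = nums[0]+1, 0, 0
	case "minor":
		nums[1], nums[2] = nums[1]+1, 0
	case "patch":
		nums[2]++
	default:
		return "", fmt.Errorf("unknown part %q", part)
	}
	return fmt.Sprintf("v%d.%d.%d", nums[0], nums[1], nums[2]), nil
}

func main() {
	for _, part := range []string{"patch", "minor", "major"} {
		next, err := nextVersion("v1.4.2", part)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("bump %s: v1.4.2 -> %s\n", part, next)
	}
}
```

The real tool also has to find the latest existing tag in the repo and create the new one; the sketch only covers the "what comes next" part.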

Why does it work?

Automation is a great thing. This makefile inspired me to start actually using semantic versioning diligently. It didn't hurt that I was working on a lot of Drone pipelines at the time and was starting to get incredibly frustrated debugging :latest images and never being certain what code was running.

Working with bash is never...pleasant, but it definitely gets the job done. I'm no stranger to shell scripts, and the minor modifications needed to get bump integrated, plus other miscellany I've legitimately fully forgotten by now (document your code for your own sake), posed no real burden.

This makefile helped me rapidly iterate on code and release it in a way that was easily consumable, including docker images pushed to my self-hosted registry on Gitea. The pipeline that handles this blog post is using a docker image tagged by the makefile components described above, in fact.

Why doesn't it work?

The real kink in the hose ended up being gox. Gox worked great until I tried to generate alpine builds. It was possible, but I'd have to start changing the makefile pretty significantly, write bash helper functions, and more. I decided pretty quickly that wasn't worth the maintenance overhead and started looking into alternatives.

The deeper problem is that the makefile isn't "smart": the solutions for cross-compilation ended up being clunky to compose with Docker builds.

What are the options?

The only real solution is a smarter build system. I had to choose between hand-rolling something with a bunch of switch statements in bash, or I could look into more modern toolkits. I looked into three:

  • meson
  • bazel
  • scons

The contenders

Bazel looked like it had the most to offer:

  • hermetic builds
  • reproducible builds
  • aggressive, fine-grained caching
  • extensible

All of these fit the bill for what I needed. In particular, it has pretty decent go support through rules_go and gazelle, which we'll look at in more depth later.

There's not a lot to say here. I knew nothing about any of the three candidates, and when I started I wasn't certain I'd stick with bazel all the way. Sometimes you just have to try stuff and see how it feels.

Caution

bazel seems to be going through an ecosystem shift from the WORKSPACE paradigm to bzlmod. Documentation does exist, but it might not be in the README yet. I've tested the code here and it works in this narrow case. Caveat emptor.

Getting Going with Gazelle

With that, here is how a modern bzlmod enabled go repo is born.

Building Go code

The first step is, in no particular order, init your git repository and init your go module. The former is helpful for keeping track of when you broke something and the latter is required for gazelle to do its job.

  • go mod init
  • git init

Write your go code. The simplest hello world will work for demonstration purposes.
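
If you don't have anything handy, something like this is plenty. It's a throwaway example, assuming you save it somewhere like cmd/hello/main.go (a hypothetical path; the post's own examples use cmd/echo). greeting is only split out so there's something to test later.

```go
package main

import "fmt"

// greeting returns the message the program prints.
func greeting() string {
	return "hello, bazel"
}

func main() {
	fmt.Println(greeting())
}
```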

Create your MODULE.bazel file.

module(
    name = "obsidian-markdown", # set this manually
    repo_name = "code.ndumas.com_ndumas_obsidian-markdown", # this is the name of your go module, with /'s replaced with _'s
)

bazel_dep(name = "gazelle", version = "0.32.0")
bazel_dep(name = "rules_go", version = "0.41.0")

go_deps = use_extension("@gazelle//:extensions.bzl", "go_deps")
go_deps.from_file(go_mod = "//:go.mod")

module() is how you declare a top-level bazel project. Everything is namespaced under this module.

bazel_dep tells bazel to retrieve modules from the bazel registry (by default, the Bazel Central Registry).

use_extension imports a module extension from a bazel module; here we're importing go_deps because it'll read our go.mod file and help bazel automatically calculate direct and transitive dependencies.
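
The repo_name convention from the comment in MODULE.bazel is just a string substitution on the go module path. A tiny sketch, where bazelRepoName is a hypothetical helper name of my own:

```go
package main

import (
	"fmt"
	"strings"
)

// bazelRepoName applies the convention described in the MODULE.bazel
// comment: take the go module path and replace every "/" with "_".
func bazelRepoName(modulePath string) string {
	return strings.ReplaceAll(modulePath, "/", "_")
}

func main() {
	fmt.Println(bazelRepoName("code.ndumas.com/ndumas/obsidian-markdown"))
}
```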

Then BUILD.bazel, in the root of the repo:

load("@gazelle//:def.bzl", "gazelle")

gazelle(name = "gazelle")

gazelle(
    name = "gazelle-update-repos",
    args = [
        "-from_file=go.mod",
        "-to_macro=deps.bzl%go_dependencies",
        "-prune",
    ],
    command = "update-repos",
)

This is straight from the gazelle README. You load() the gazelle module and declare two build targets: gazelle and gazelle-update-repos. After the rest of the setup, these targets are what will do the work of actually generating build/test targets for all your code.

Next, .bazelrc

common --experimental_enable_bzlmod

# Disable lockfiles until it works properly.
# https://github.com/bazelbuild/bazel/issues/19068
common --lockfile_mode=off

###############################
# Directory structure         #
###############################

# Artifacts are typically placed in a directory called "dist"
# Be aware that this setup will still create a bazel-out symlink in
# your project directory, which you must exclude from version control and your
# editor's search path.
build --symlink_prefix=dist/

###############################
# Output                      #
###############################

# A more useful default output mode for bazel query, which
# prints "ng_module rule //foo:bar" instead of just "//foo:bar".
query --output=label_kind

# By default, failing tests don't print any output, it's logged to a
# file instead.
test --test_output=errors

Only the first line is required; the rest are just conveniences. I do strongly recommend the query setting though, extremely nice for debugging.

Finally, a .gitignore to mask out generated artifacts.

dist/*
reports/*
bazel-*
*.bazel.lock

Run bazel run //:gazelle. This will auto-generate a lot of scaffolding, and probably emit a buildozer command that will modify something. This is the build system (specifically gazelle) automatically detecting dependencies that are declared in go.mod but not in your bazel code.

$ bazel run //:gazelle
WARNING: /home/ndumas/work/gomud/MODULE.bazel:8:24: The module extension go_deps defined in @gazelle//:extensions.bzl reported incorrect imports of repositories via use_repo():

Not imported, but reported as direct dependencies by the extension (may cause the build to fail):
    com_github_therealfakemoot_go_telnet

 ** You can use the following buildozer command(s) to fix these issues:

buildozer 'use_repo_add @gazelle//:extensions.bzl go_deps com_github_therealfakemoot_go_telnet' //MODULE.bazel:all
INFO: Analyzed target //:gazelle (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //:gazelle up-to-date:
  dist/bin/gazelle-runner.bash
  dist/bin/gazelle
INFO: Elapsed time: 0.473s, Critical Path: 0.01s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
INFO: Running command line: dist/bin/gazelle
$ git st
## dev
?? .bazelrc
?? .gitignore
?? BUILD
?? MODULE.bazel
?? cmd/BUILD.bazel
?? protocol/BUILD.bazel
$

Running the buildozer command and then git diff shows its work:

$ buildozer 'use_repo_add @gazelle//:extensions.bzl go_deps com_github_therealfakemoot_go_telnet' //MODULE.bazel:all
fixed /home/ndumas/work/gomud/MODULE.bazel
$ git st
## dev
 M MODULE.bazel
$ git diff
diff --git a/MODULE.bazel b/MODULE.bazel
index b482f31..8e82690 100644
--- a/MODULE.bazel
+++ b/MODULE.bazel
@@ -7,3 +7,4 @@ bazel_dep(name = "gazelle", version = "0.32.0")

 go_deps = use_extension("@gazelle//:extensions.bzl", "go_deps")
 go_deps.from_file(go_mod = "//:go.mod")
+use_repo(go_deps, "com_github_therealfakemoot_go_telnet")
$

This diff shows how bazel references external dependencies. gazelle's go_deps tool acts as a provider for these lookups and offers information bazel needs to verify its build graphs. Yours may look different depending on what you've imported, if anything.
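
The repository name in that use_repo() call looks odd at first, but it appears to follow a simple convention: reverse the host part of the import path, then flatten everything with underscores. Here's a sketch of that mapping. repoName is a hypothetical function that mirrors the convention as I understand it; gazelle's real implementation may differ in edge cases.

```go
package main

import (
	"fmt"
	"strings"
)

// repoName converts a Go import path into a bazel-style repository name:
// the host labels are reversed ("github.com" becomes "com", "github"),
// the path segments are appended, and "-" and "." become "_".
func repoName(importPath string) string {
	parts := strings.Split(strings.ToLower(importPath), "/")
	host := strings.Split(parts[0], ".")
	// Reverse the host labels in place.
	for i, j := 0, len(host)-1; i < j; i, j = i+1, j-1 {
		host[i], host[j] = host[j], host[i]
	}
	name := strings.Join(append(host, parts[1:]...), "_")
	return strings.NewReplacer("-", "_", ".", "_").Replace(name)
}

func main() {
	fmt.Println(repoName("github.com/therealfakemoot/go-telnet"))
}
```

Knowing the pattern makes it easier to guess which go.mod entry a given use_repo() name came from when the list gets long.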

Examining the produced BUILD.bazel file should yield something like this for a main package.

load("@rules_go//go:def.bzl", "go_binary", "go_library")

go_library(
    name = "echo_lib",
    srcs = ["server.go"],
    importpath = "code.ndumas.com/ndumas/gomud/cmd/echo",
    visibility = ["//visibility:private"],
)

go_binary(
    name = "echo",
    embed = [":echo_lib"],
    visibility = ["//visibility:public"],
)

If the package is importable, you'll see something like this:

load("@rules_go//go:def.bzl", "go_library")

go_library(
    name = "protocol",
    srcs = ["telnet.go"],
    importpath = "code.ndumas.com/ndumas/gomud/protocol",
    visibility = ["//visibility:public"],
)

These are examples of rules_go build targets. These do a bunch of magic to invoke Go toolchains and in theory let bazel cache builds at a pretty granular level. I'm hoping this is true, I've got a few pipelines that are starting to run way longer than I like.

OCI Images

For ease of use, I like to build docker images containing my packages. This is particularly important for Drone pipelines.

We're gonna amend our MODULE.bazel to add some new tools.

bazel_dep(name = "rules_oci", version = "1.3.1") # gives us ways to interact with OCI images and repositories
bazel_dep(name = "rules_pkg", version = "0.9.1") # exposes a way to tar our app, which is necessary for packing with rules_oci


oci = use_extension("@rules_oci//oci:extensions.bzl", "oci")
oci.pull(
    name = "distroless_base",
    image = "gcr.io/distroless/base",
    tag = "latest", # This is temporary. For reproducible builds, you'll want to use digest hashes.
)

use_repo(oci, "distroless_base")

pull() does more or less what it says: it creates a target that represents an OCI image pulled from a registry, and another use_repo() call tells bazel that we're using our image.

And add this to the BUILD.bazel file for the binary you want built into an OCI image:

load("@rules_pkg//:pkg.bzl", "pkg_tar")

pkg_tar(
    name = "tar",
    srcs = [":echo"],
)

load("@rules_oci//oci:defs.bzl", "oci_image")

oci_image(
    name = "image",
    base = "@distroless_base",
    entrypoint = ["/echo"],
    tars = [":tar"],
)

oci_image requires that whatever you package into the image it creates be contained in a tar file, which seems pretty reasonable. rules_pkg handles that for us.

Run bazel build //cmd/echo:image and you'll see another buildozer command and a lot of errors. This is to be expected: bazel wants builds to be reproducible, and because we've only specified a mutable tag instead of a digest hash, it can't guarantee that. The build itself fails because gcr.io/distroless/base is a multi-architecture image and we haven't said which platforms we want. bazel helpfully emits the buildozer command that'll set the proper digest hash and platforms it needs to resolve its builds.

bazel build //cmd/echo:image
WARNING: fetching from https://gcr.io/v2/distroless/base/manifests/latest without an integrity hash. The result will not be cached.
WARNING: for reproducible builds, a digest is recommended.
Either set 'reproducible = False' to silence this warning,
or run the following command to change oci.pull to use a digest:
(make sure you use a recent buildozer release with MODULE.bazel support)

buildozer 'set digest "sha256:73deaaf6a207c1a33850257ba74e0f196bc418636cada9943a03d7abea980d6d"' 'remove tag' 'remove platforms' 'add platforms "linux/amd64" "linux/arm64" "linux/arm" "linux/s390x" "linux/ppc64le"' MODULE.bazel:distroless_base

WARNING: fetching from https://gcr.io/v2/distroless/base/manifests/latest without an integrity hash. The result will not be cached.
INFO: Repository rules_oci~1.3.1~oci~distroless_base_single instantiated at:
  callstack not available
Repository rule oci_pull defined at:
  /home/ndumas/.cache/bazel/_bazel_ndumas/482ba52ed14b5c036eb1d379e90911a8/external/rules_oci~1.3.1/oci/private/pull.bzl:437:27: in <toplevel>
ERROR: An error occurred during the fetch of repository 'rules_oci~1.3.1~oci~distroless_base_single':
   Traceback (most recent call last):
        File "/home/ndumas/.cache/bazel/_bazel_ndumas/482ba52ed14b5c036eb1d379e90911a8/external/rules_oci~1.3.1/oci/private/pull.bzl", line 373, column 17, in _oci_pull_impl
                fail("{}/{} is a multi-architecture image, so attribute 'platforms' is required.".format(rctx.attr.registry, rctx.attr.repository))
Error in fail: gcr.io/distroless/base is a multi-architecture image, so attribute 'platforms' is required.
ERROR: <builtin>: fetching oci_pull rule //:rules_oci~1.3.1~oci~distroless_base_single: Traceback (most recent call last):
        File "/home/ndumas/.cache/bazel/_bazel_ndumas/482ba52ed14b5c036eb1d379e90911a8/external/rules_oci~1.3.1/oci/private/pull.bzl", line 373, column 17, in _oci_pull_impl
                fail("{}/{} is a multi-architecture image, so attribute 'platforms' is required.".format(rctx.attr.registry, rctx.attr.repository))
Error in fail: gcr.io/distroless/base is a multi-architecture image, so attribute 'platforms' is required.
ERROR: /home/ndumas/.cache/bazel/_bazel_ndumas/482ba52ed14b5c036eb1d379e90911a8/external/rules_oci~1.3.1~oci~distroless_base/BUILD.bazel:1:6: @rules_oci~1.3.1~oci~distroless_base//:distroless_base depends on @rules_oci~1.3.1~oci~distroless_base_single//:distroless_base_single in repository @rules_oci~1.3.1~oci~distroless_base_single which failed to fetch. no such package '@rules_oci~1.3.1~oci~distroless_base_single//': gcr.io/distroless/base is a multi-architecture image, so attribute 'platforms' is required.
ERROR: Analysis of target '//cmd/echo:image' failed; build aborted:
INFO: Elapsed time: 2.434s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (27 packages loaded, 341 targets configured)

A diff after running the buildozer command should show something like this:

git diff
diff --git a/MODULE.bazel b/MODULE.bazel
index 4d9ba08..682985b 100644
--- a/MODULE.bazel
+++ b/MODULE.bazel
@@ -15,8 +15,14 @@ use_repo(go_deps, "com_github_therealfakemoot_go_telnet")
 oci = use_extension("@rules_oci//oci:extensions.bzl", "oci")
 oci.pull(
     name = "distroless_base",
+    digest = "sha256:73deaaf6a207c1a33850257ba74e0f196bc418636cada9943a03d7abea980d6d",
     image = "gcr.io/distroless/base",
-    tag = "latest",
+    platforms = [
+        "linux/amd64",
+        "linux/arm",
+        "linux/arm64",
+        "linux/ppc64le",
+        "linux/s390x",
+    ],
 )
-
 use_repo(oci, "distroless_base")

And then re-running bazel build //cmd/echo:image should complete successfully:

bazel build //cmd/echo:image
INFO: Analyzed target //cmd/echo:image (22 packages loaded, 9284 targets configured).
INFO: Found 1 target...
Target //cmd/echo:image up-to-date:
  dist/bin/cmd/echo/image
INFO: Elapsed time: 5.799s, Critical Path: 0.85s
INFO: 17 processes: 12 internal, 2 linux-sandbox, 3 local.
INFO: Build completed successfully, 17 total actions
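
A quick aside on why pinning a digest makes the pull reproducible: a tag like latest can be repointed at any time, while a digest is derived from the content itself. Here's a minimal sketch of the idea, assuming the usual sha256:<hex> form. digestOf is a hypothetical helper, and the byte strings are stand-ins for a real image manifest.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// digestOf returns an OCI-style digest string for a blob: the algorithm
// name, a colon, and the hex-encoded SHA-256 of the raw bytes.
func digestOf(blob []byte) string {
	return fmt.Sprintf("sha256:%x", sha256.Sum256(blob))
}

func main() {
	// Stand-in bytes; a registry computes this over the manifest it serves.
	v1 := []byte(`{"example": "manifest v1"}`)
	v2 := []byte(`{"example": "manifest v2"}`)

	// Same bytes always give the same digest; any change gives a new one.
	fmt.Println(digestOf(v1))
	fmt.Println(digestOf(v2))
}
```

Since the digest changes whenever the bytes do, bazel can treat a pinned image as a stable, cacheable input, which is what the earlier "without an integrity hash" warning was complaining about.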

Pushing our image to a repository ends up being relatively simple after all the legwork. The diff below shows the full changes, but in summary:

  • change the load() call for rules_oci. It's variadic and takes an arbitrary number of arguments indicating names to import. Add oci_push to the list.
  • Use the imported oci_push rule to set tags and the destination registry
diff --git a/cmd/echo/BUILD.bazel b/cmd/echo/BUILD.bazel
index 4f52043..44d8a6c 100644
--- a/cmd/echo/BUILD.bazel
+++ b/cmd/echo/BUILD.bazel
@@ -20,7 +20,7 @@ pkg_tar(
     srcs = [":echo"],
 )

-load("@rules_oci//oci:defs.bzl", "oci_image")
+load("@rules_oci//oci:defs.bzl", "oci_image", "oci_push")

 oci_image(
     name = "image",
@@ -28,3 +28,10 @@ oci_image(
     entrypoint = ["/echo"],
     tars = [":tar"],
 )
+
+oci_push(
+    name = "registry",
+    image = ":image",
+    repository = "code.ndumas.com/ndumas/gomud",
+    remote_tags = ["latest"],
+)

Running bazel run //cmd/echo:registry will push your image, as long as you'd otherwise be able to use docker push or similar. You will need to inject authentication details into your build pipelines, etc.

$ bazel run //cmd/echo:registry
INFO: Analyzed target //cmd/echo:registry (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //cmd/echo:registry up-to-date:
  dist/bin/cmd/echo/push_registry.sh
INFO: Elapsed time: 0.330s, Critical Path: 0.01s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
INFO: Running command line: dist/bin/cmd/echo/push_registry.sh
2023/08/19 13:03:24 pushed blob: sha256:b02a7525f878e61fc1ef8a7405a2cc17f866e8de222c1c98fd6681aff6e509db
2023/08/19 13:03:24 pushed blob: sha256:f5a45b52c7f9934ccad7dce04c930af615270af739e172b7ff46c7b34689578c
2023/08/19 13:03:24 pushed blob: sha256:a7ca0d9ba68fdce7e15bc0952d3e898e970548ca24d57698725836c039086639
2023/08/19 13:03:24 pushed blob: sha256:fcb6f6d2c9986d9cd6a2ea3cc2936e5fc613e09f1af9042329011e43057f3265
2023/08/19 13:03:24 pushed blob: sha256:fe5ca62666f04366c8e7f605aa82997d71320183e99962fa76b3209fdfbb8b58
2023/08/19 13:03:24 pushed blob: sha256:e8c73c638ae9ec5ad70c49df7e484040d889cca6b4a9af056579c3d058ea93f0
2023/08/19 13:03:24 pushed blob: sha256:4aa0ea1413d37a58615488592a0b827ea4b2e48fa5a77cf707d0e35f025e613f
2023/08/19 13:03:24 pushed blob: sha256:1e3d9b7d145208fa8fa3ee1c9612d0adaac7255f1bbc9ddea7e461e0b317805c
2023/08/19 13:03:24 pushed blob: sha256:7c881f9ab25e0d86562a123b5fb56aebf8aa0ddd7d48ef602faf8d1e7cf43d8c
2023/08/19 13:03:24 pushed blob: sha256:5627a970d25e752d971a501ec7e35d0d6fdcd4a3ce9e958715a686853024794a
2023/08/19 13:03:24 pushed blob: sha256:08553ba93cfea7ad45b59911d8ed0a025489e7c3623920dfda331b9a49f1e8aa
2023/08/19 13:03:24 pushed blob: sha256:96266735468f361ae6828901a80fc15a7f75e26640351df9e0f0f9824f36cf92
2023/08/19 13:03:24 pushed blob: sha256:2758d0c31c8ca76c3379e7b1be20adc4144e9230873bb2c5bdb41f3691fa75bc
2023/08/19 13:03:24 pushed blob: sha256:fce64026d8c539f2a8cd7d81f173f94cffab1311a15d5578e451f66404b5a1eb
2023/08/19 13:03:24 code.ndumas.com/ndumas/gomud@sha256:eaf1ff753e1dca1a9dc20b635ff5276de5633824232d8bdd59555757c3ab024e: digest: sha256:eaf1ff753e1dca1a9dc20b635ff5276de5633824232d8bdd59555757c3ab024e size: 2275
2023/08/19 13:03:25 code.ndumas.com/ndumas/gomud:latest: digest: sha256:eaf1ff753e1dca1a9dc20b635ff5276de5633824232d8bdd59555757c3ab024e size: 2275
$

And with that, you've got an OCI image pushed to your repository of choice. Note that bazel relies on the environment to provide an OCI toolchain and the authorization. I've got my drone credentials in environment variables, but your setup may vary.

Success Story???

The next step forward is to take a step backwards: integrate bazel into a Makefile. make is actually pretty nice as a task-runner; now that bazel can handle the top-to-bottom process of builds, the makefile doesn't need much, if any, logic in it. All it'll have to do is serve as fancy aliases for bazel invocations.

I also haven't actually set up cross-compilation. Work for another day.

Useful Tips

Supported go compilation targets

I haven't used this one yet, but it's handy for manually cross-compiling.

bazel query 'kind(platform, @rules_go//go/toolchain:all)'