go-benchmarks/sweet
Michael Anthony Knyszek d78363a64b sweet: add GOMAXPROCS suffix to benchmark names
This change adds the GOMAXPROCS suffix to benchmark names the same way
that `go test` does.

Change-Id: I3ae168620de716c0aa8b642d099ba5308c54f48b
Reviewed-on: https://go-review.googlesource.com/c/benchmarks/+/580837
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2024-04-25 02:55:21 +00:00
benchmarks sweet: add GOMAXPROCS suffix to benchmark names 2024-04-25 02:55:21 +00:00
cli sweet: print additional details during a subprocess failure 2022-02-07 17:34:42 +00:00
cmd/sweet sweet: make tile38 and go-build with -short even shorter 2023-06-27 19:24:32 +00:00
common sweet/common/diagnostics: gofmt format 2023-08-18 22:04:20 +00:00
generators sweet: don't fetch large github repos for go-build benchmark if short 2023-05-31 15:34:51 +00:00
harnesses sweet: make tile38 and go-build with -short even shorter 2023-06-27 19:24:32 +00:00
source-assets/gvisor sweet: add sweet benchmarks 2021-12-01 23:08:58 +00:00
.gitignore sweet: ignore config file and results directory 2022-10-24 21:37:47 +00:00
README.md sweet: implement misc UX improvements 2022-11-10 21:36:48 +00:00
assets.hash sweet: reduce size of bleve index for bleve-query 2022-02-03 18:12:48 +00:00

README.md

Sweet: Benchmarking Suite for Go Implementations

Sweet is a set of benchmarks derived from the Go community which are intended to represent a breadth of real-world applications. The primary use-case of this suite is to perform an evaluation of the difference in CPU and memory performance between two Go implementations.

If you use this benchmarking suite for any measurements, please ensure you use a versioned release and note that version alongside your results.

Quickstart

Supported Platforms

  • linux/amd64

Dependencies

The sweet tool itself depends only on a stable version of Go and on git.

Some benchmarks, however, have additional build requirements. Notably:

  • make (tile38)
  • bash (tile38)
  • binutils (tile38)

Please ensure your system has these tools installed and available in your system's PATH.
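A quick preflight check along these lines can confirm the tools are on PATH before starting a long run. This is a minimal sketch and not part of sweet: missing_tools is a hypothetical helper, and binutils is left out because it provides several programs and this README does not say which one tile38 needs.

```shell
# missing_tools prints each named command that cannot be found on PATH.
# Hypothetical helper for illustration; sweet does not provide it.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# go and git are needed by sweet itself; make and bash by tile38.
missing=$(missing_tools go git make bash)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "missing tools:" $missing
else
  echo "all listed tools found"
fi
```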

Furthermore, some benchmarks are able to produce additional information on some platforms. For instance, running on platforms where systemd is available adds an average RSS measurement for the go-build benchmark.

gVisor

The gVisor benchmark has additional requirements:

  • The target platform must be linux/amd64. Nothing else is supported or ever will be.
  • The ptrace API must be enabled on your system. Set /proc/sys/kernel/yama/ptrace_scope appropriately (0 and 1 work, 2 might, 3 will not).
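The ptrace_scope guidance above can be checked with a small script. This is a sketch only: ptrace_scope_verdict is a hypothetical helper that encodes the four values listed above, and the file may be absent on kernels without Yama, in which case the script just says so.

```shell
# ptrace_scope_verdict classifies a Yama ptrace_scope value according to the
# guidance above: 0 and 1 work, 2 might, 3 will not.
# Hypothetical helper for illustration; sweet does not provide it.
ptrace_scope_verdict() {
  case "$1" in
    0|1) echo "ok" ;;
    2)   echo "maybe" ;;
    3)   echo "unsupported" ;;
    *)   echo "unknown" ;;
  esac
}

scope_file=/proc/sys/kernel/yama/ptrace_scope
if [ -r "$scope_file" ]; then
  echo "ptrace_scope: $(ptrace_scope_verdict "$(cat "$scope_file")")"
else
  echo "ptrace_scope: $scope_file is not readable on this system"
fi
```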

Build

$ go build ./cmd/sweet

Getting Assets

$ ./sweet get

Running the benchmarks

Create a configuration file called config.toml with the following contents:

[[config]]
  name = "myconfig"
  goroot = "<insert some GOROOT here>"

Run the benchmarks by running:

$ ./sweet run -shell config.toml

Benchmark results will appear in the results directory.

-shell will cause the tool to print each action it performs as a shell command. Note that while the shell commands are valid for many systems, they may depend on tools being available on your system that sweet does not require.

Note that by default sweet run expects to be executed in /path/to/x/benchmarks/sweet, that is, the root of the Sweet subdirectory in the x/benchmarks repository. To execute it from somewhere else, point -bench-dir at /path/to/x/benchmarks/sweet/benchmarks.
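As a rough illustration of running from another directory, the sketch below only prints the command line rather than executing it. sweet_run_cmd is a hypothetical helper, the paths are placeholders, and it assumes the sweet binary was built in the root of the Sweet subdirectory as in the Build step. If other paths need adjusting, ./sweet help run lists the available flags.

```shell
# sweet_run_cmd prints (does not execute) a sweet invocation that points
# -bench-dir at the benchmarks directory, so it can be launched from anywhere.
#   $1: root of the Sweet subdirectory (assumed to contain the built binary)
#   $2: path to the configuration file
# Hypothetical helper for illustration; sweet does not provide it.
sweet_run_cmd() {
  printf '%s/sweet run -bench-dir %s/benchmarks %s\n' "$1" "$1" "$2"
}

# Placeholder paths; substitute your own checkout and config locations.
sweet_run_cmd /path/to/x/benchmarks/sweet /path/to/config.toml
```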

Tips and Rules of Thumb

  • If you're not confident that your experimental Go toolchain will work with all the benchmarks, pass the -short flag to sweet run to get much faster feedback on whether each benchmark builds and runs.
  • You can expect the benchmarks to take a few hours to run with the default settings on a somewhat sizable Linux box.
  • If a benchmark fails to build, run with -shell and copy and re-run the last command to get full output. TODO(mknyszek): Dump the output to the terminal.
  • If a benchmark fails to run, its output should have been captured in the corresponding results file (e.g. if biogo-igor failed, check /path/to/results/biogo-igor/myconfig.results). That file is really just the stderr (and usually stdout too) of the benchmark. You can also try re-running the benchmark yourself using the output of -shell.
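To find which results files deserve a closer look after a run, something like the following may help. It is a heuristic sketch, not a sweet feature: list_suspect_results is a hypothetical helper, and it assumes that a results file with no line starting with "Benchmark" came from a run that failed before reporting anything.

```shell
# list_suspect_results RESULTS_DIR CONFIG_NAME
# Prints every RESULTS_DIR/<benchmark>/CONFIG_NAME.results file that contains
# no line starting with "Benchmark".
# Hypothetical helper for illustration; sweet does not provide it.
list_suspect_results() {
  for f in "$1"/*/"$2".results; do
    [ -e "$f" ] || continue            # glob matched nothing
    grep -q '^Benchmark' "$f" || echo "$f"
  done
}

# Prints nothing if the results directory is absent or every file looks fine.
list_suspect_results results myconfig
```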

Memory Requirements

These benchmarks generally try to stress the Go runtime in interesting ways, and some may end up with very large heaps. Therefore, it's recommended to run the suite on a system with at least 16 GiB of RAM available to minimize the chance that results are lost due to an out-of-memory error.
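On Linux, available memory can be checked against the 16 GiB recommendation before starting. This is a sketch under the assumption that /proc/meminfo exposes a MemAvailable line; mem_available_gib is a hypothetical helper, not part of sweet.

```shell
# mem_available_gib prints MemAvailable from a meminfo-style file, truncated
# to whole GiB. The file reports the value in units of 1024 bytes, so
# 1 GiB corresponds to 1048576 of them.
# Hypothetical helper for illustration; sweet does not provide it.
mem_available_gib() {
  awk '/^MemAvailable:/ { printf "%d\n", $2 / 1048576 }' "${1:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
  avail=$(mem_available_gib)
  if [ "${avail:-0}" -lt 16 ]; then
    echo "warning: only ${avail:-0} GiB available; at least 16 GiB is recommended"
  else
    echo "ok: ${avail} GiB available"
  fi
fi
```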

Configuration Format

The configuration is TOML-based and a more detailed description of fields may be found in the help docs for the run subcommand:

$ ./sweet help run
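For an A/B comparison, a file with two [[config]] tables lets a single sweet invocation cover both toolchains. The sketch below generates such a file using only the name and goroot fields shown in the Quickstart. emit_configs is a hypothetical helper and the GOROOT paths are placeholders.

```shell
# emit_configs prints one [[config]] table per NAME=GOROOT argument.
# Hypothetical helper for illustration; sweet does not provide it.
emit_configs() {
  for pair in "$@"; do
    name=${pair%%=*}     # text before the first '='
    goroot=${pair#*=}    # text after the first '='
    printf '[[config]]\n  name = "%s"\n  goroot = "%s"\n\n' "$name" "$goroot"
  done
}

# Placeholder GOROOTs; redirect the output to config.toml to use it.
emit_configs config1=/path/to/goroot-baseline config2=/path/to/goroot-experiment
```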

Results Format

Results are written to a single directory with one sub-directory per benchmark. Each sub-directory contains one file per configuration holding the stderr (and usually the combined stdout) of the benchmark run, which also doubles as the benchmark output format.

All results are reported in the standard Go testing package format, such that results may be compared using the benchstat tool.
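Because result lines follow the Go testing format, they start with "Benchmark" and can be picked out of the surrounding log output with standard tools. The sketch below lists the distinct benchmark names in a file. benchmark_names is a hypothetical helper, and the sample lines are made up for illustration; real sweet output reports its own names and metrics.

```shell
# benchmark_names prints the distinct benchmark names found in a results file.
# Hypothetical helper for illustration; sweet does not provide it.
benchmark_names() {
  awk '/^Benchmark/ { print $1 }' "$1" | sort -u
}

# Made-up sample: one log line followed by two runs of the same benchmark.
sample=$(mktemp)
cat > "$sample" <<'EOF'
some log output from the benchmark
BenchmarkExample-8 1 1500 ns/op
BenchmarkExample-8 1 1450 ns/op
EOF
benchmark_names "$sample"   # prints BenchmarkExample-8
rm -f "$sample"
```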

Results may also be concatenated for easy viewing. For example, if one runs sweet with two configurations named config1 and config2, run the following to quickly compare all results:

$ cat results/*/config1.results > config1.results
$ cat results/*/config2.results > config2.results
$ benchstat config1.results config2.results
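With more than two configurations, the same concatenation can be wrapped in a loop. This is a minimal sketch: combine_results is a hypothetical helper that mirrors the cat commands above and skips benchmarks that produced no file for a given configuration.

```shell
# combine_results RESULTS_DIR OUT_DIR CONFIG...
# For each configuration name, concatenates every benchmark's results file
# into OUT_DIR/CONFIG.results.
# Hypothetical helper for illustration; sweet does not provide it.
combine_results() {
  results_dir=$1
  out_dir=$2
  shift 2
  for cfg in "$@"; do
    : > "$out_dir/$cfg.results"                 # start from an empty file
    for f in "$results_dir"/*/"$cfg".results; do
      [ -e "$f" ] || continue                   # glob matched nothing
      cat "$f" >> "$out_dir/$cfg.results"
    done
  done
}

# Usage, mirroring the commands above:
#   combine_results results . config1 config2
#   benchstat config1.results config2.results
```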

Noise

This benchmark suite tries to keep noise low in measurements where possible.

  • Each measurement is taken against a fresh OS process.
  • Benchmarks have been modified to reduce noise from the input.
    • All inputs are deterministic, including implicit inputs, such as querying an existing database.
    • Inputs are loaded into memory when possible instead of streamed from disk.
  • The suite mitigates external effects we can control (e.g. the suite is aware of its co-tenancy with the benchmark and throttles itself when the benchmarks are running).

Tips for Reducing Noise

  • Sweet should be run on a dedicated machine where a perflock daemon is running (to avoid noise due to CPU throttling).
  • Avoid running these benchmarks in cloud environments if possible. Generally the noise inherent to those environments can skew A/B tests and hide small changes in performance. See this paper for more details. Try to use dedicated hardware instead.

Do not compare results produced by separate invocations of the sweet tool.