Gradle vs Bazel for JVM Projects

June 30, 2020

Introduction
Performance
Build authoring and maintenance
Extensibility
- Custom plugins / rules
- Build correctness
Dependency management
- Dynamic and rich version declarations
- Transitive dependency management
Other Features
Popularity
Summary

Introduction #

Gradle has emerged as the build tool of choice for projects within the JVM ecosystem, including Kotlin. It is the most popular build tool for open source JVM projects on GitHub. It is downloaded on average more than 15 million times per month and has been counted in the Top 20 Most Popular Open Source Projects for IT by TechCrunch. Many popular projects have migrated from Maven to Gradle, with Spring Boot being a prominent example.

Recently, we have received inquiries about the suitability of Google’s Bazel build tool for use in JVM environments. What follows is a detailed comparative performance analysis and an evaluation of key capabilities that arrives at three major conclusions:

- Despite Bazel’s strong and well-deserved reputation for performance and scalability, Gradle outperforms Bazel in almost every scenario we tested.
- Optimizing projects for Bazel comes at a significant cost for build authoring and maintenance.
- Gradle provides more compelling features and conveniences for common use cases in JVM projects.

In summary, the data and analysis clearly indicate that Gradle is a better choice than Bazel for most JVM projects. We will provide an equivalent comparison for Android projects in a follow-up article.

At the same time, recognizing that individual tools have unique strengths in addressing the needs and requirements of specific developer ecosystems and specific use cases, we expect that build tool specialization and fragmentation will continue to be the norm across and even within these ecosystems (e.g. Gradle and Maven).

In recognition of this, Gradle Enterprise provides analytics and acceleration for more than just Gradle. This allows users to benefit from faster and more reliable builds without migrating to any specific build tool. Today this covers Gradle and Maven; Bazel and other tools will be included in the future. To learn more about the acceleration technology supported in Gradle Enterprise for the Gradle, Maven, and soon Bazel build tools, watch the video entitled “Speed Up Maven Builds | Maven Build Cache Technology & Business Case Explained.”

Performance #

How well a build system performs for your project depends on a variety of factors. These include the size and structure of your project, the toolchain you are using, and your workflows. In this section we describe our approach to making performance comparisons and the results based on representative scenarios.

About the Performance Comparison #

Gradle and Bazel have different approaches and areas of focus with respect to performance optimization. It would be easy to create some test projects to show that either Gradle is dramatically faster than Bazel or vice versa for a particular performance scenario. The question is always how applicable the results of such test projects are for real life projects and, more importantly, for your project.

Our approach for this comparison is to model common Java project types. These range from large repositories with many millions of lines of code to small library/microservices projects. In this article we will not cover the advantages and disadvantages of the various ways to structure code, either within a single large source repository or across many smaller source repositories. Both approaches have a significant footprint in the JVM ecosystem and often also within a single organization. Switching from one approach to the other requires a serious migration effort and a change of workflows and culture. A build system for the JVM should support both approaches well.

Performance Measurement Scenarios #

The different scenarios are measured with the following project sizes and shapes:

Project name	No. of modules	No. of production classes	No. of test classes	LOC
Small multi-project	10	500	500	90k
Medium multi-project	100	10,000	10,000	1.8M
Large multi-project	500	50,000	50,000	9M
Large monolithic	1	50,000	50,000	7M

All of the test projects are open source, including the runners that were used to take the measurements. See the instructions for details on how to reproduce the measurements.

For each scenario below, Gradle is benchmarked using the latest performance optimizations of Gradle 6.5 including a configuration cache and file-system watching. Some of the optimizations used are experimental and were enabled by feature flags.

Scenario 1: Full Build #

This scenario is a full build with no existing output in the build output directory and no local or remote build cache available. The external dependencies are available in the local dependency cache. The build compiles all class files of the project, packages them as JARs, and runs all unit tests.

Both Gradle and Bazel users rarely run full builds locally. Even on CI, completely full builds can usually be avoided thanks to build caching (more on that later). Still, this is an interesting scenario because it approximates a change that affects a large portion of the codebase, like an API change in a library that is used by most of the codebase.

This scenario is also interesting for Maven users, who usually run clean builds for reliability reasons: it shows whether the performance advantages over Maven come from caching or from other factors.

Gradle and Bazel both have performance advantages over Maven even without build caching and optimizations for incremental builds. Gradle is the winner in this category.

Scenario 2: Remote Caching #

The most effective way to make builds and tests faster is by rebuilding only what has changed. There are multiple layers to this approach. One is a remote build cache. A remote build cache provides output across machines for build actions that have unchanged inputs.

There are two major workflows that get accelerated this way.

The first is the speed of your CI builds. CI builds run on many agents, often throwing away any previous state before they start. So they often build everything from scratch although the actual new commits they are building only require rebuilding a few areas of the codebase. With a remote build cache, build actions with inputs that are not affected by the new commits can retrieve the output from another, previous build on any other CI machine in your organization. This both dramatically reduces CI build time and saves significant compute resources.

A remote cache also helps speed up local builds whenever you pull the latest changes from version control and then run the build locally after a CI build has already populated the cache.
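The idea behind a remote build cache can be sketched in a few lines of Python. This is a toy model, not the implementation used by Gradle or Bazel: `cache_key`, `RemoteCache`, and the in-memory dictionary standing in for a cache server are all hypothetical names chosen for illustration. It shows only the core principle that the output of a build action is looked up by a key derived from the action and the content of its inputs.

```python
import hashlib


def cache_key(action_name, input_files):
    """Derive a key from the action name and the content of every input.

    input_files maps a path to its bytes; sorting the paths makes the key
    independent of dictionary ordering.
    """
    digest = hashlib.sha256(action_name.encode())
    for path in sorted(input_files):
        digest.update(path.encode())
        digest.update(hashlib.sha256(input_files[path]).digest())
    return digest.hexdigest()


class RemoteCache:
    """In-memory stand-in for a shared cache server (hypothetical)."""

    def __init__(self):
        self._store = {}
        self.hits = 0

    def run(self, action_name, input_files, action):
        """Return the cached output for these inputs, or execute and store it."""
        key = cache_key(action_name, input_files)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        output = action(input_files)
        self._store[key] = output
        return output
```

Two machines that call `run` with byte-identical inputs compute the same key, so the second one receives the stored output instead of executing the action. Any change to an input's content produces a different key and forces a real execution.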

Both Gradle and Bazel provide support for remote caching. Managed cache backends are available for both tools (e.g. Gradle Enterprise and Google Cloud Storage). Gradle Enterprise also provides remote build caching for Maven.

This scenario executes a full build just like in scenario 1, but this time leverages a remote build cache that has matching output for every build action. This approximates an ephemeral (i.e. no re-use of previous local state) CI build agent running a build for changes that only affect a small part of the codebase or a first build on a local machine.

Gradle also provides the best performance in this scenario.

Note that the remote caching scenarios are the hardest to compare as users get significantly different results depending on their network connection speed and their proximity to the remote cache node. For all tests, the remote cache was deployed to a geographically close data center and accessed via a fast and reliable home Internet connection.

The comparison does not include the “monolith” scenario where all the code is in a single module, because remote caching offers little benefit for such projects.

Scenario 3: Incremental builds #

One of the most important aspects of developer productivity is the feedback time for local incremental builds. It can be measured by how long a developer waits for build feedback after changing one or a few lines of code. Reusing local state from previous build runs on the same machine is key to performing well in these scenarios. As nothing needs to be sent over the wire, and very often nothing even needs to be copied, this is faster than using a remote cache.

In the next scenario, the project has been built successfully. Another build is run after a single line of code was changed. This is an approximation of a typical local build. There are two variations of this scenario—with and without a change to a public interface of a class/method. A change not affecting the API allows build tools to apply additional optimizations.

For the numbers in the first Maven column, we did not run clean builds. It is important to note that Maven is not reliable when run without clean, and even then it still executes most goals from scratch. You are therefore taking the risk of a corrupt build for only a moderate performance gain. If you want a faster yet reliable incremental build with Maven, you can use Gradle Enterprise, which provides a local build cache for Maven.

Both Gradle and Bazel build reliably without removing the output of a previous build (using clean). They both perform well in incremental scenarios with well modularized projects. Gradle has an edge over Bazel in this scenario.

One result worth elaborating on is that Gradle is 5-16 times faster than Bazel when building an incremental change to the monolith. This is because Gradle handles incremental changes at two levels, the first of which is very similar to Bazel’s. Both tools check whether a build action for a module needs to rerun at all, based on whether any of its inputs have changed. However, when Bazel needs to rerun a build action because of changed inputs, that action rebuilds everything for the module. Gradle, in contrast, has a second level: many build actions are deeply integrated with the toolchain, so only what is needed is rebuilt. For example, the Gradle compile action recompiles only the source classes within a module that are affected by a change, whereas Bazel recompiles all source classes. Gradle has many other tasks that rebuild incrementally, and any Gradle plugin can make use of this, such as the Gradle Android plugin.
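The difference between rebuilding a whole module and rebuilding only the affected classes can be illustrated with a small Python sketch. It is a simplified assumption of how class-level dependency analysis might work, not Gradle's actual algorithm (which also has to consider details such as whether a change affects a public API). The function name `affected_classes` and the sample class names are hypothetical.

```python
from collections import deque


def affected_classes(depends_on, changed):
    """Return the changed classes plus everything that transitively uses them.

    depends_on maps a class name to the set of classes it references.
    """
    # Invert the graph: for each class, record which classes reference it.
    used_by = {}
    for cls, deps in depends_on.items():
        for dep in deps:
            used_by.setdefault(dep, set()).add(cls)

    # Breadth-first walk from the changed classes along the inverted edges.
    result = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in used_by.get(current, ()):
            if dependent not in result:
                result.add(dependent)
                queue.append(dependent)
    return result
```

For a module where `App` uses `Service`, `Service` uses `Repo`, and `Report` uses `Util`, a change to `Util` yields `{"Util", "Report"}`: two classes to recompile instead of all five in the module.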

The Bazel answer to better incrementality is to break the project into smaller modules and possibly rework the dependency graph. There is an appealing simplicity to this approach from the perspective of build system implementation, but it comes with a price for end users in terms of maintainability and applicability. It also reflects a lack of modelling from our point of view. This is discussed in the build authoring and maintenance section.

Distributed Build & Test Support #

Bazel supports a protocol for distributing a single build across multiple machines. Gradle Enterprise offers distributed test execution for Gradle, with Maven compatibility to come. Some CI products also offer functionality to distribute some parts of the build (e.g. parallel stages in Jenkins). In contrast to the build cache, you often need to invest significantly into restructuring your code base to get substantial benefits from distributing your whole build. For most projects, test execution time is the dominant factor of overall build time. Gradle Enterprise test distribution accelerates existing test suites by distributing individual tests, without the need for significant restructuring.
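To make the idea of distributing individual tests concrete, here is a toy Python scheduler. It is purely an assumption for illustration and says nothing about how Gradle Enterprise or Bazel actually schedule work: it assumes that historical test durations are known and assigns tests, longest first, to whichever worker currently has the least load. The function name `distribute` and the test names are hypothetical.

```python
import heapq


def distribute(test_durations, workers):
    """Assign tests to workers, longest test first, least-loaded worker first.

    test_durations maps a test name to its expected duration in seconds.
    Returns one (total_load, [test names]) pair per worker.
    """
    # Each heap entry is (load, worker index, assigned tests); the unique
    # index breaks ties so the lists themselves are never compared.
    heap = [(0.0, index, []) for index in range(workers)]
    heapq.heapify(heap)
    longest_first = sorted(test_durations.items(), key=lambda item: (-item[1], item[0]))
    for name, seconds in longest_first:
        load, index, assigned = heapq.heappop(heap)
        assigned.append(name)
        heapq.heappush(heap, (load + seconds, index, assigned))
    ordered = sorted(heap, key=lambda entry: entry[1])
    return [(load, assigned) for load, index, assigned in ordered]
```

With four tests taking 8, 5, 4 and 3 seconds and two workers, the busiest worker ends up with 11 seconds of work, compared with 20 seconds when everything runs on one machine. No restructuring of the code base is needed for this kind of split, because the unit of distribution is the test, not the module.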

Details of distributed testing and its performance impact will be covered in a separate article.

Build authoring and maintenance #

The performance section above demonstrated the differences in performance between Gradle and Bazel. However, raw performance is only part of the story. Tuning Bazel builds for the best performance is a big topic of its own, with significant consequences for build authoring and maintainability. In this section, we’ll dig deeper into these key considerations.

Difference in approaches #

Gradle and Bazel help users to achieve similar goals but they are actually quite different tools in terms of their design and what capabilities they provide.

Gradle aims to provide a complete solution for building software. It is highly optimized for JVM and Android projects, and deeply integrates with the toolchain to provide both performance and convenience. For example, in order to compile Java code, Gradle achieves incremental compilation by determining which classes to recompile based on sophisticated built-in dependency analysis.

Bazel provides an execution engine that delegates to external tools to do the actual work. For example, in the case of incremental compilation, Bazel simply delegates to javac and relies on fine-grained and explicit build files to compile the whole project incrementally.

This fundamental difference in approach between Gradle and Bazel goes well beyond the issue of incremental compilation. It has significant consequences for build authoring and maintenance, and affects a number of issues including build performance, system architecture, the depth of JVM support, and the breadth of support for non-JVM programming languages.

The granularity of build files #

Gradle build files are typically defined on a project level, consisting of multiple packages of both production code and tests and related resources.

Bazel’s build files can be defined with different levels of granularity.

One can use the same approach as Gradle, with a single build file per project. This is the pattern we observed in a number of open source projects using Bazel. It is the most practical approach for most projects, as we will elaborate on shortly. For this reason, it is the granularity of Bazel build files that we selected for the performance comparisons.

The recommended approach by Bazel for Java projects is to specify build files for each Java package as illustrated below.

Figure: Gradle vs Bazel build files

When using this approach in Bazel, the dependencies between packages are declared explicitly in build files. Cyclic dependencies between packages are not allowed and transitive dependencies have to be explicitly declared. This results in not only a large number of build files, but also build files that are verbose and require significant effort to maintain as the codebase evolves.
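The constraint that package dependencies must not form a cycle can be checked mechanically. The following Python sketch is a generic depth-first search, offered as an illustration of the rule and not as Bazel's implementation; `find_cycle` and the package names in the usage note are hypothetical.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of packages, or None.

    deps maps a package name to the packages it depends on.
    """
    unvisited, in_progress, done = 0, 1, 2
    state = {pkg: unvisited for pkg in deps}
    stack = []

    def visit(pkg):
        state[pkg] = in_progress
        stack.append(pkg)
        for dep in deps.get(pkg, ()):
            dep_state = state.get(dep, unvisited)
            if dep_state == in_progress:
                # dep is already on the current path, so the path loops back.
                return stack[stack.index(dep):] + [dep]
            if dep_state == unvisited:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        state[pkg] = done
        return None

    for pkg in list(deps):
        if state[pkg] == unvisited:
            cycle = visit(pkg)
            if cycle:
                return cycle
    return None
```

If `com.acme.api` depends on `com.acme.model`, which depends on `com.acme.util`, the function returns `None`. As soon as `com.acme.util` also depends on `com.acme.api`, it returns the loop `api -> model -> util -> api`, which is the kind of structure that has to be refactored away before package-level build files can be introduced.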

This recommended granularity of build files has a number of consequences that are important to understand before attempting such a setup.

Performance implications #

Package-level granularity is recommended for Bazel for optimal performance. It enables Bazel to perform a number of optimizations including compilation avoidance, parallel execution, and if the required infrastructure is available, also performing the work on remote machines.

Gradle integrates with the Java toolchain much more deeply, allowing optimizations such as incremental compilation and annotation processing to be effective without extra ceremony or structural constraints.

This explains why Gradle has a significant performance advantage in a number of scenarios that we tested. We would likely see improved results with Bazel if we used one build file per Java package. However, such an approach is not feasible for most existing projects, as we will explain shortly.

Architectural implications #

As a side note, we consider avoiding cyclic dependencies between packages to be a positive side effect of the Bazel model from the architectural point of view. We enthusiastically encourage well-modularized code bases. This is also important for the remote cache effectiveness. Gradle users can use tools such as ArchUnit or the Java Platform Module System (JPMS) to enforce this.

Migration and maintenance cost implications #

The reality of almost every Java project is that cyclic dependencies exist between packages. The packages are also often not small and have many source classes that depend on each other. Having small, fine-grained modules per package is not just a question of how you configure your build logic; it often requires major refactoring, if not a complete rewrite.

Without seamless automated tooling, fine-grained build files in Bazel builds are very expensive to maintain. Internally at Google, such automation is provided by proprietary tools. For Bazel open source, users need to declare dependencies between packages manually. There have been various initiatives within the Bazel ecosystem to create such tooling for JVM builds. However, our investigation has found that all those projects are either archived or abandoned.

Gradle is fast without fine-grained build files. Fewer build files and less verbose build declarations mean reduced maintenance and less ceremony when making changes to the code being built.

JVM vs polyglot support #

The difference in approaches between the tools is also reflected in the depth of support in the JVM domain and the breadth of support of other languages.

Bazel provides an efficient execution engine that delegates to external commands and tools to do the actual work. It makes it easier for Bazel to provide support for more programming languages. As a result, Bazel includes official support for a number of non-JVM programming languages and platforms including C++, Python and Go.

On the other hand, Gradle is not just an efficient execution engine but a full-blown build tool. It offers the deepest level of support for JVM and Android. It makes it much easier to create and maintain high performing builds in these ecosystems. Still, Gradle provides official support for a number of non-JVM languages like C++ and Swift, and community plugins are available for several others including Python and Go.

Extensibility #

Complex builds rarely depend only on the built-in capabilities of the build tool. Typically users depend on extensions, with many being provided by the user community.

Custom plugins / rules #

Gradle and Bazel provide mechanisms to share reusable pieces of custom build logic. The primary mechanisms are custom plugins and rules, respectively. These are also the mechanisms used internally to provide built-in capabilities.

Bazel rules are implemented in Starlark, which is a language inspired by Python. In practice, many Bazel rules delegate to external tools and bash scripts.

Gradle plugins can be implemented in any JVM language (typically Java, Kotlin, or Groovy) and use rich Gradle APIs for common build needs. Using Gradle’s flexible DSL together with custom plugins allows Gradle users to create elegant and declarative build scripts. It can be leveraged for a very broad range of use cases and makes Gradle a powerful tool for even the most challenging integration scenarios.

Build correctness #

One challenge with work avoidance optimizations such as incremental building and build caching is to ensure build correctness. The result of the build should be the same regardless of whether it was a full build or some work was avoided.

All of the steps in a build can be described as actions that produce outputs based on inputs. For example, Java compilation produces class files from source files (and has some additional inputs such as compiler options etc). In order for many optimizations to work correctly, the inputs and outputs must be correctly declared.
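The following Python sketch shows why correct input declarations matter for work avoidance. It is a deliberately simplified model with hypothetical names (`IncrementalAction`, `render_banner`), not code from either tool: an action is re-executed only when the fingerprint of its declared inputs changes, and the sample action contains an intentional bug in that it reads a value it never declared.

```python
import hashlib


def fingerprint(declared_inputs):
    """Hash the declared inputs (name -> string value) in a stable order."""
    digest = hashlib.sha256()
    for name in sorted(declared_inputs):
        digest.update(f"{name}={declared_inputs[name]};".encode())
    return digest.hexdigest()


class IncrementalAction:
    """Re-runs `work` only when the fingerprint of the declared inputs changes."""

    def __init__(self, work):
        self.work = work
        self.last_fingerprint = None
        self.last_output = None
        self.executions = 0

    def run(self, declared_inputs, environment):
        current = fingerprint(declared_inputs)
        if current != self.last_fingerprint:
            self.last_output = self.work(declared_inputs, environment)
            self.last_fingerprint = current
            self.executions += 1
        return self.last_output


def render_banner(declared_inputs, environment):
    # Intentional bug: environment["user"] influences the output
    # but is not part of the declared inputs.
    return f"{declared_inputs['title']} built by {environment['user']}"
```

Running the action twice with the same declared `title` but a different `user` returns the stale first result, because the change is invisible to the fingerprint. A full build would have produced a different output, which is exactly the kind of mismatch that sandboxing or stricter validation is meant to expose.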

Both Gradle and Bazel provide built-in plugins/rules that are extensively tested for correctness. The main difference between the tools in this regard is in how correctness is ensured in custom build logic.

Bazel relies on sandboxing to ensure that all inputs and outputs are correctly declared. If they are not, the custom rules will simply not work, which makes correctness issues easier to discover. However, sandboxing incurs performance overhead and many Bazel users choose to disable it, especially on macOS. Sandboxing is also not supported on Windows.

Gradle relies on users to correctly declare inputs and outputs in their custom build logic and plugins. Users can test their custom code with the TestKit library. Many potential problems are caught automatically by the Gradle Plugin Development Plugin. Additionally, Gradle is becoming stricter in this regard. For example, Gradle 6.0 deprecated a number of potentially problematic behaviors that will be completely prevented in the next major version. Users can already opt in to failing their builds on such issues with the --warning-mode=fail flag.

Dependency management #

The vast majority of software projects require external dependencies. Managing them can be a cumbersome and error-prone process. Dependency management is a complex problem and a key feature of a build tool.

Modern JVM projects typically declare the dependencies they require and rely on the build tool to automate the process of obtaining the dependencies (and their dependencies) from an external repository, while managing version conflicts, caching, and many other considerations.

Both Gradle and Bazel provide support for dependency management workflows. Gradle has built-in support for both Maven and Ivy repositories. Bazel provides an official rule with support only for Maven repositories, and some boilerplate code is required to use it. Both tools provide some form of conflict resolution and cache artifacts locally. Gradle provides more advanced transitive dependency management capabilities as explained below.

Dynamic and rich version declarations #

Both Bazel and Gradle support declaring dependencies with static, deterministic versions.

Gradle also supports version ranges and dynamic versions to support various binary integration scenarios. In addition, rich version declarations allow defining dependency versions more precisely so that Gradle can make a better decision on which version to select and ultimately help users to avoid various dependency issues. Users can lock version ranges and dynamic versions to ensure reproducible builds.
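The interaction between version ranges and locking can be modelled in a few lines of Python. This is a conceptual sketch under simplifying assumptions, not Gradle's resolution engine: versions are assumed to be purely numeric dotted strings, the range is half-open, and the names `parse` and `pick` are hypothetical.

```python
def parse(version):
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def pick(available, lower, upper, lock=None):
    """Choose the highest available version in [lower, upper).

    If a locked version is given, it wins over the range, which keeps the
    result stable when new versions are published. Raises ValueError when
    no available version falls inside the range.
    """
    if lock is not None:
        return lock
    candidates = [v for v in available if parse(lower) <= parse(v) < parse(upper)]
    return max(candidates, key=parse)
```

With `1.2.0`, `1.9.0` and `2.0.0` available, the range from `1.0` up to but excluding `2.0` selects `1.9.0`. Once `1.10.0` is published, the same declaration selects `1.10.0`, so two builds of the same commit can differ. Passing `lock="1.9.0"` pins the earlier result and makes the build reproducible again.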

Transitive dependency management #

Transitive dependency management is a complex topic and a simple version conflict resolution strategy is not enough. Small changes in declared dependencies often lead to unexpected compilation or even runtime errors as the transitive dependencies change.

Some of the common transitive dependency management issues that Gradle provides first-class solutions for are:

- A dependency version needs to be downgraded because the project requires an older version of an external dependency than what is brought transitively and the default resolution strategy of selecting the newest version is not appropriate.
- Related dependencies need to be aligned on the same version to work together correctly (for example different modules of the popular Jackson library).
- Dependencies are mutually exclusive because they provide the same capabilities (for example multiple logger implementations).
- Dependencies provide different variants that are optimized for a specific Java version (for example the Guava library with its Android and JRE flavors).
- Dependencies provide wrong metadata.
- Dependencies require additional dependencies only for certain features.

In Bazel, none of the above use cases are supported. Users need to rely on explicit dependency declarations and excluding problematic versions to try to solve the issues above. For projects with non-trivial dependency graphs, this is expensive and hard to maintain. As the codebase evolves it will frequently lead to build and test failures that are hard to debug. If the built software is a library, it also pushes solving the same problems to consumers of the library.
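The first issue in the list, downgrading a transitive dependency, can be illustrated with a toy resolver in Python. It is a sketch under stated assumptions and not the resolution logic of either tool: versions are assumed to be numeric dotted strings, the default strategy is "newest wins", and a `strict` mapping (a hypothetical parameter) overrides that default for selected modules. The module names and version numbers are example data only.

```python
def parse(version):
    """Turn '2.10.1' into (2, 10, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def resolve(requested, strict=None):
    """Pick one version per module.

    requested: list of (module, version) pairs gathered from the whole graph.
    strict: optional {module: version} overrides that win over 'newest'.
    """
    strict = strict or {}
    selected = {}
    for module, version in requested:
        if module in strict:
            selected[module] = strict[module]
        elif module not in selected or parse(version) > parse(selected[module]):
            selected[module] = version
    return selected
```

Given requests for `guava` at `28.0` and `19.0`, the default strategy selects `28.0`. Calling `resolve(requested, strict={"guava": "19.0"})` forces the older version for that one module while every other module still resolves to its newest requested version. Without such a mechanism, the same effect has to be achieved by editing or excluding declarations throughout the graph.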

Other Features #

Build & Test Facilities #

Gradle provides many conveniences for building and testing Java & JVM projects. Bazel currently lacks built-in constructs for basic operations in JVM projects.

For example, Bazel users often rely on custom Python or shell scripts and handcrafted pom.xml files for publishing artifacts. In contrast, Gradle provides built-in support for publishing with a high-level DSL and reasonable defaults.

Testing in Bazel is also more involved as there is no test detection. Instead, users have to explicitly declare test classes or test suites to compile and run. To run multiple tests, Bazel users often rely on a custom step in their build to generate JUnit test suites. In contrast, Gradle runs all tests that it detects by default. Additionally, JUnit 5 is not officially supported in Bazel at this time while in Gradle it can be enabled with a single line of declarative code.

IDE Support #

Gradle is supported out of the box in IntelliJ IDEA and an official plugin exists for Eclipse. Red Hat provides support for Java including Gradle support in Visual Studio Code. Bazel provides an official plugin only for IDEA. Currently, the Bazel support for IDEA is on a “best-effort” basis for macOS and it is not supported on Windows. The official Eclipse support for Bazel is unmaintained.

Spring Boot Support #

The Spring Boot Gradle Plugin is maintained by the Spring team. At this time only unofficial Spring Boot rules and examples exist for Bazel.

Kotlin Support #

The Kotlin plugin for Gradle is officially supported and maintained by JetBrains and the vast majority of Kotlin projects use Gradle. Bazel provides official build rules for Kotlin. Kotlin Multiplatform makes use of the built-in variant-aware dependency management that Gradle offers.

Popularity #

The Gradle Build Tool is the most popular build tool for open-source JVM projects on GitHub. Most Java developers use it regularly according to the State of Developer Ecosystem 2019 survey by JetBrains. Gradle Build Tool downloads have doubled roughly every year since 2013 and it is now downloaded more than 15 million times per month. The Gradle Build Tool has a significantly larger plugin ecosystem (>4300 Gradle community plugins vs >120 Bazel rules maintained by either Google or the community).

Summary #

Both Gradle and Bazel are fast build systems with significant advantages compared to Maven, with Gradle Enterprise for Maven narrowing that gap. While we see a lot of merit in Bazel’s approach, particularly for extremely large monorepos like the one at Google, the analysis above suggests that Gradle is a better choice than Bazel for most JVM projects. Here is a summary of the key reasons:

- Gradle is faster than Bazel in almost all of the projects and scenarios measured, including large monolith projects consisting of millions of lines of code. This performance advantage arises from a number of factors including a deep integration with the Java toolchain.
- Optimizing projects for Bazel comes at a significant cost for build authoring and maintenance due to fine-grained and verbose build file declarations, which is not the case with Gradle. Taking full advantage of Bazel’s performance optimizations is impractical for most existing projects, especially without additional tooling that does not exist for JVM builds.
- Gradle provides more compelling features and conveniences for common use cases in JVM projects, including critical capabilities for dependency management and tool integration. Bazel currently lacks support for basic capabilities expected from a JVM build tool such as the ability to easily run all tests or publish artifacts.

Whether you are using Gradle or Bazel to improve your developer productivity, choosing the faster build tool is not enough if not combined with a dedicated and continuous effort to keep your toolchain fast and reliable. This is why we recommend embracing the discipline of Developer Productivity Engineering and a culture of developer productivity at all levels.
