Chapter 16: Large models — GPU acceleration using OpenCL

Introduction

GPU acceleration is an optional feature of glmbayes. All modeling functions — glmb(), lmb(), rglmb(), and related tools — run fully on the CPU regardless of whether OpenCL is available. No setup is needed for standard use.

Where GPU acceleration pays off is with large models: high-dimensional predictor sets or large posterior sample sizes. The computationally intensive work in glmbayes is envelope construction and evaluation — the gradient and log-posterior calculations at each point of the tangency grid grow with model dimension and are embarrassingly parallel. Dispatching them to a GPU with use_opencl = TRUE can substantially reduce wall time for these cases. See Chapter A10 for a technical explanation of what is accelerated and why.
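To make "embarrassingly parallel" concrete, here is a minimal base-R sketch; it is not the package's internal code. It evaluates a logistic-regression log-posterior at every point of a small grid. The simulated data, the N(0, 10^2) prior, and the three-values-per-coefficient grid are illustrative assumptions, not what glmbayes constructs.

```r
set.seed(1)
n <- 200
p <- 3

# Simulated design matrix (intercept plus two predictors) and binary response
X <- cbind(1, matrix(rnorm(n * (p - 1)), n))
beta_true <- c(-0.5, 1, 0.5)
y <- rbinom(n, 1, plogis(drop(X %*% beta_true)))

# Log-posterior: Bernoulli log-likelihood plus independent N(0, 10^2) log-priors
log_post <- function(beta) {
  eta <- drop(X %*% beta)
  sum(y * eta - log1p(exp(eta))) + sum(dnorm(beta, 0, 10, log = TRUE))
}

# Hypothetical stand-in grid: 3 candidate values per coefficient, 3^p rows
grid <- as.matrix(expand.grid(rep(list(c(-1, 0, 1)), p)))

# Each row is evaluated without reference to any other row
vals <- apply(grid, 1, log_post)
length(vals)
```

No evaluation depends on another, so the rows can be handed to separate workers. In this stand-in grid the number of points also grows quickly as coefficients are added, which is the kind of workload a GPU handles well.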

This chapter describes how to enable GPU acceleration. The process closely resembles a source install of any compiled R package: the only extra step is ensuring that the ‘opencltools’ dependency is in an OpenCL-ready state before the source install.

What you see when you load glmbayes

When glmbayes is loaded in an interactive session it checks, silently, whether GPU acceleration appears feasible. If has_opencl() is already TRUE — meaning this build was compiled with OpenCL support — attach is completely silent.

If has_opencl() is FALSE and the package detects a GPU or OpenCL stack on the host, you will see a message like:

Note: glmbayes provides full CPU capability in this session
(e.g. glmb(), lmb(), Prior_Setup()). GPU acceleration is recommended
for bigger models and appears available. Reinstall glmbayes from source
with OpenCL at compile time to enable it; see vignette("Chapter-16",
"glmbayes") for install instructions.

On a machine with no GPU and no OpenCL stack, attach is silent — the CPU-only install is entirely appropriate and no action is needed.

To suppress the message in scripts or automated workflows:

options(glmbayes.quiet_opencl_startup = TRUE)
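In a script, the option presumably has to be set before the package is attached, since the check runs at attach time. That ordering is an assumption; the text above only names the option. A minimal sketch, guarded so it also runs where glmbayes is not installed:

```r
# Set the option first; setting it after attach would presumably be too late (assumption).
options(glmbayes.quiet_opencl_startup = TRUE)

# Attach only if the package is installed, so this sketch runs anywhere.
if (requireNamespace("glmbayes", quietly = TRUE)) {
  library(glmbayes)
}

# Confirm the option is in effect for this session.
getOption("glmbayes.quiet_opencl_startup")
```

Placing the options() line in a project-level .Rprofile would apply it to every session started in that project.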

Enabling GPU acceleration: three steps

Work through these steps in order. After each step you can check whether you are done and skip the rest.

Step 1: Check whether OpenCL is already enabled

library(glmbayes)
has_opencl()

If this returns TRUE, GPU acceleration is already compiled in. Pass use_opencl = TRUE to glmb() and you are done. Otherwise continue to Step 2.
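In scripts that must run on both CPU-only and OpenCL-enabled machines, the result of this check can be stored and reused. The sketch below is one way to do that; the variable name gpu_ready is hypothetical.

```r
# TRUE only when glmbayes is installed and this build reports OpenCL support.
# The && short-circuits, so has_opencl() is never called if the package is missing.
gpu_ready <- requireNamespace("glmbayes", quietly = TRUE) &&
  isTRUE(glmbayes::has_opencl())

if (gpu_ready) {
  message("OpenCL build detected: use_opencl = TRUE is available.")
} else {
  message("CPU-only build: continue to Step 2, or keep use_opencl = FALSE.")
}
```

Passing use_opencl = gpu_ready to glmb() then selects the GPU path only where it is compiled in.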

Step 2: Ensure ‘opencltools’ is OpenCL-ready

opencltools is installed automatically as a dependency of glmbayes. It provides the host diagnostics and runtime checks that glmbayes relies on. For GPU acceleration to work in glmbayes, opencltools must itself be built with OpenCL support.

Check:

opencltools::has_opencl()

If this returns FALSE, follow vignette("Chapter-01", package = "opencltools") to install the required OpenCL components (GPU driver, headers, ICD loader) for your platform, then reinstall opencltools from source. That vignette is the maintained source of per-OS installation instructions.

For a host-level diagnostic that does not depend on the glmbayes build state:

opencltools::diagnose_glmbayes()

Once opencltools::has_opencl() returns TRUE, proceed to Step 3.

What you need on your system (brief summary; details in ‘opencltools’ Chapter 01):

| Component | What it provides | Needed for |
|---|---|---|
| GPU driver | Exposes hardware to the OS | Runtime |
| OpenCL headers (CL/cl.h) | Required at compile time | Source build |
| OpenCL ICD loader (OpenCL.dll / libOpenCL.so) | Dispatches to vendor runtime | Runtime |

All three must be present. The most common failure mode is having the driver but not the headers, or the headers but not the ICD loader.
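For a quick first look on a Linux host, the presence of headers and registered vendor runtimes can be probed from base R. This is a rough sketch, not a replacement for the package diagnostics. The paths are common defaults and may differ on your distribution (an assumption), and clinfo is an optional command-line tool that may not be installed.

```r
# Common default locations for the OpenCL header (assumption; adjust as needed)
header_paths <- c(
  "/usr/include/CL/cl.h",
  "/usr/local/include/CL/cl.h"
)

# Directory where the ICD loader conventionally looks for vendor .icd files on Linux
icd_dir <- "/etc/OpenCL/vendors"

probe <- c(
  headers   = any(file.exists(header_paths)),
  icd_files = length(list.files(icd_dir, pattern = "\\.icd$")) > 0,
  clinfo    = nzchar(unname(Sys.which("clinfo")))
)
print(probe)
```

A FALSE entry here is only a hint about where to look; headers or runtimes installed in non-default locations will not be detected.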

Step 3: Reinstall glmbayes from source

With the OpenCL environment confirmed, reinstall glmbayes from source. The configure / configure.win script runs automatically, detects the OpenCL headers and library, and sets -DUSE_OPENCL if everything is found.

Windows

Source installs on Windows require Rtools (see the end of this subsection). devtools (or remotes) is needed only to install the development version from GitHub. Install it first if you do not have it:

install.packages("devtools")

Then install glmbayes from source. From CRAN with source compilation:

install.packages("glmbayes", type = "source")

Or from GitHub if you need a development version:

devtools::install_github("knygren/glmbayes")

Rtools must be installed and on your PATH. If you have not yet installed Rtools, follow the instructions at https://cran.r-project.org/bin/windows/Rtools/.
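One quick way to see whether the build tools are visible to the current R session is to look them up on the PATH. This is a heuristic sketch; the tool names checked are an assumption about what a typical source build invokes.

```r
# Look up common build tools on the PATH; an empty string means "not found".
tools <- Sys.which(c("make", "gcc", "g++"))
print(tools)

if (all(nzchar(tools))) {
  message("Build tools found on PATH.")
} else {
  message("Missing from PATH: ",
          paste(names(tools)[!nzchar(tools)], collapse = ", "))
}
```

If the pkgbuild package is installed, pkgbuild::has_build_tools(debug = TRUE) performs a more thorough check by attempting a small test compile.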

Linux / macOS

install.packages("glmbayes", type = "source")

On macOS, Xcode Command Line Tools and GCC (via Homebrew) are required; see vignette("Chapter-01", package = "opencltools") for details.

After the install

Confirm the build succeeded:

library(glmbayes)
has_opencl()
#> [1] TRUE

Verifying the setup

Once has_opencl() returns TRUE, run a full diagnostic to confirm the complete stack:

diagnose_glmbayes()

A clean report looks like:

=== glmbayes OpenCL Diagnostic Report ===
Environment: linux

GPU: NVIDIA
  [OK] Driver installed
  [OK] OpenCL headers found (CL/cl.h)
  [OK] OpenCL runtime found (OpenCL.dll / ICD)
  [OK] OpenCL fully available (headers + runtime)
  [OK] Required PATH and library dirs present
  [OK] OpenCL runtime probe succeeded (platform available)

[OK] glmbayes was compiled with OpenCL support.

=== End of Diagnostic Report ===

Each line reports one layer of the stack. If any line shows [FAIL] or [WARN], the report indicates what is missing. Common resolutions:

On Windows, the Linux/WSL runtime probe step is skipped; rely on the driver and ICD checks instead.

For PATH-related warnings on Windows (CUDA Toolkit bin directory not in PATH), the diagnostic report lists the missing entries. Fix them via system settings or your shell profile; advanced users may use the helpers in opencltools directly (see ?opencltools::add_to_path).
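To see what the current R session has on its PATH, the variable can be split into entries with base R. The sketch below lists entries that mention CUDA; matching on the substring "cuda" is an illustrative assumption, since install directories vary.

```r
# Split PATH using the platform's separator (";" on Windows, ":" elsewhere).
path_entries <- strsplit(Sys.getenv("PATH"), .Platform$path.sep, fixed = TRUE)[[1]]

# Entries mentioning CUDA, case-insensitively; character(0) means none were found.
cuda_entries <- grep("cuda", path_entries, ignore.case = TRUE, value = TRUE)
print(cuda_entries)
```

Changes made in system settings only reach R after the session is restarted, so rerun the check in a fresh session.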

Running a GPU-accelerated model

Once set up, pass use_opencl = TRUE to glmb() or rglmb(). For a quick end-to-end check, run the built-in example:

example(Cleveland)

The built-in Cleveland example runs a CPU vs OpenCL comparison and is a convenient end-to-end test. The chunks below illustrate the pattern (not executed during the vignette build):

library(glmbayes)
data("Cleveland")

ps <- Prior_Setup(
  hd ~ age + sex + cp + trestbps + chol +
    fbs + restecg + thalach + exang + oldpeak + slope + ca + thal,
  family = binomial(logit),
  data = Cleveland
)

t_cpu <- system.time({
  fit_cpu <- glmb(
    hd ~ age + sex + cp + trestbps + chol +
      fbs + restecg + thalach + exang + oldpeak + slope + ca + thal,
    family       = binomial(link = "logit"),
    pfamily      = dNormal(mu = ps$mu, Sigma = ps$Sigma),
    data         = Cleveland,
    n            = 20000,
    Gridtype     = 2,
    use_parallel = TRUE,
    use_opencl   = FALSE,
    verbose      = FALSE
  )
})

t_gpu <- system.time({
  fit_gpu <- glmb(
    hd ~ age + sex + cp + trestbps + chol +
      fbs + restecg + thalach + exang + oldpeak + slope + ca + thal,
    family       = binomial(link = "logit"),
    pfamily      = dNormal(mu = ps$mu, Sigma = ps$Sigma),
    data         = Cleveland,
    n            = 20000,
    Gridtype     = 2,
    use_parallel = TRUE,
    use_opencl   = TRUE,
    verbose      = FALSE
  )
})

t_cpu
t_gpu

summary(fit_gpu)

The GPU path gives the same posterior results as the CPU path; only timing differs. GPU gains are most visible with larger models (more predictors, more posterior draws n, higher-dimensional tangency grids).
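The two timings can be reduced to a single ratio. The helper below is a small sketch; the name speedup is hypothetical, and stand-in timings are used so that it runs without a GPU.

```r
# Ratio of elapsed wall times; values above 1 mean the second run was faster.
speedup <- function(t_ref, t_new) {
  t_ref[["elapsed"]] / t_new[["elapsed"]]
}

# Stand-in timings for illustration; in practice pass t_cpu and t_gpu.
t_slow <- system.time(Sys.sleep(0.3))
t_fast <- system.time(Sys.sleep(0.1))
ratio <- speedup(t_slow, t_fast)
ratio
```

With the objects from the chunks above, the call would be speedup(t_cpu, t_gpu). A single run is a noisy measurement, so repeat the comparison a few times before drawing conclusions.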

Appendix A: AMD GPUs on Linux (ROCm OpenCL)

AMD provides multiple OpenCL implementations on Linux, but only ROCm OpenCL is fully supported and stable. If you are using an AMD GPU, install ROCm OpenCL on Ubuntu 22.04 or 24.04 LTS:

sudo apt-get install rocm-opencl-runtime

This installs the AMD OpenCL runtime, the ICD file (amdocl64.icd), and ROCm’s optimized OpenCL implementation.
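After installation, the ICD registration can be inspected from R. The sketch assumes the conventional Linux location /etc/OpenCL/vendors for the ICD file; your system may place it elsewhere.

```r
# Assumed default location of the AMD ICD file (may differ on your system)
icd_file <- "/etc/OpenCL/vendors/amdocl64.icd"
amd_icd_present <- file.exists(icd_file)

if (amd_icd_present) {
  # An .icd file is a small text file naming the vendor library to load.
  print(readLines(icd_file, warn = FALSE))
} else {
  message("Not found: ", icd_file)
}
```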

Supported AMD GPUs (ROCm):

Older GPUs (Polaris, Vega, Navi 1x/2x) are not supported by ROCm. Mesa Rusticl is a community alternative that may work but is not officially supported. AMDGPU-PRO OpenCL is legacy and not recommended.

For full per-distribution instructions and verification steps, see vignette("Chapter-01", package = "opencltools").