AI Accelerator Benchmarking

What We Offer

Three Ways to Work with Substrate Co

Each service is scoped to a specific stage of the benchmarking process. Pick the one that fits where your team is today.

Workload Benchmark Setup

A guided session to help your team define fair, repeatable benchmarks for your own AI workloads. Methods and assumptions are documented openly, so results can be reproduced and checked against real use.

One structured session with your engineers
Benchmark plan with documented assumptions
Reporting outline for future comparisons

RM 620

Comparative Benchmark Study

We run your defined benchmarks across candidate configurations and present the results side by side. We avoid declaring a single winner: the data informs the decision, not us.

Two-phase delivery with written report
Results dataset with clear caveats
Method note explaining each figure

RM 1,510

Benchmarking Advisory Retainer

An ongoing arrangement to help your team maintain and refresh its benchmarking practice as workloads and hardware evolve. We act as an independent advisor, not a vendor.

Three-month engagement with scheduled reviews
Living method library maintained over time
Review notes after each assessment cycle

RM 3,020

Why It Matters

What Sets Our Approach Apart

No Commercial Stake

We do not sell hardware, cloud capacity, or software. Our findings are not shaped by vendor relationships — only by the method and the data.

Documented Methods

Every figure comes with a method note. Assumptions, warm-up procedures, and edge conditions are written down so you can reproduce or challenge the results.

Workload-First Design

Benchmarks are built around your actual inference or training tasks — not synthetic suites designed for press releases.

Side-by-Side Presentation

Results are shown in a comparison frame that lets your team read figures directly. No summary narrative that pre-decides the outcome for you.

Repeatable Process

Benchmark templates are yours to keep. Re-run them in six months when a new option appears and compare apples to apples.

Plain Communication

Reports are written in language an engineering lead can read directly. We do not pad deliverables with charts that obscure rather than inform.

Ready to Start

Have a Benchmarking Question?

Whether you are at the early stage of defining what to measure, or already have results that need a second set of eyes, we are glad to talk through the specifics with you.

Request a Quote or call +60 3-2166 5392

Hardware We Benchmark

NVIDIA Accelerators & Our Benchmarking Approach

NVIDIA GPU Evaluation

NVIDIA's accelerator portfolio — from the H100 and H200 in data centre deployments to the L40S for inference-at-scale and the RTX series for on-premise workloads — covers a wide performance and cost range. Vendor specifications describe peak throughput under idealised conditions. What those figures mean for a specific team's inference pipeline or training loop is a separate question entirely.

Substrate Co designs benchmarks around the actual tasks: the model architecture, batch configuration, precision format, and memory footprint your workload uses in practice. We run them on the hardware you are evaluating and document every condition that could affect how well a figure holds up in production.

H100 / H200

Data centre training & large-batch inference benchmarks

L40S / A100

Multi-model inference and throughput comparison studies

RTX Series

On-premise and edge deployment evaluation

Cross-vendor

NVIDIA vs AMD vs custom ASIC side-by-side studies

Our AI Benchmarking Tooling

Substrate Co has developed an internal measurement stack built specifically for AI workload evaluation. Rather than running the standard test suites that hardware vendors optimise for, our tooling measures the performance characteristics that matter for real deployment decisions: latency percentiles under load, throughput degradation at varying batch sizes, memory bandwidth saturation, and precision-mode trade-offs across FP32, FP16, BF16, and INT8.

Every measurement is paired with a method note that documents the run conditions. If the number is only valid at a particular batch size or driver version, that is stated explicitly. The goal is results your team can stand behind in a procurement review — not figures that look strong in a summary slide but fall apart under questions.

Latency & throughput measurement across concurrency levels
Precision format benchmarking: FP32, FP16, BF16, INT8
Memory bandwidth and VRAM utilisation profiling
Multi-GPU scaling and NVLink / PCIe topology tests
Framework-level profiling: PyTorch, TensorRT, vLLM, TGI
Results dataset provided in CSV alongside the written report
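
As a rough illustration of the first item in the list above, the sketch below turns per-request latencies into percentiles and a throughput figure for one batch size. It uses only the Python standard library and is not our internal tooling: the function names `percentile` and `summarise_run`, the nearest-rank percentile definition, and the output keys are assumptions chosen for the example.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # 1-based nearest rank: the smallest rank covering pct percent of samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def summarise_run(latencies_ms, batch_size, wall_time_s):
    """Summarise one run: latency percentiles plus items processed per second.

    latencies_ms holds one latency per request; each request carries
    batch_size items, so total items = requests * batch_size.
    """
    if wall_time_s <= 0:
        raise ValueError("wall_time_s must be positive")
    return {
        "batch_size": batch_size,
        "p50_ms": percentile(latencies_ms, 50),
        "p95_ms": percentile(latencies_ms, 95),
        "p99_ms": percentile(latencies_ms, 99),
        "throughput_items_per_s": len(latencies_ms) * batch_size / wall_time_s,
    }
```

Calling `summarise_run` once per batch size and placing the rows next to each other is one simple way to see where throughput stops scaling while tail latency keeps growing.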

Why method notes matter for AI hardware decisions

An AI accelerator benchmark without a method note is a number without context. Factors like driver version, CUDA toolkit, inference server configuration, and batch size can shift throughput figures by 20–40% between runs on the same hardware. Every Substrate Co deliverable includes a method note covering exactly these conditions, so your team knows not just what the number is, but when it holds and when it does not.
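
To make the idea concrete, a method note can be treated as a small structured record stored with each figure. The sketch below is illustrative only and is not the format used in our deliverables: the `MethodNote` class, its field names, and the placeholder values are all assumptions. It shows how two runs can be checked for differing conditions before their figures are compared.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class MethodNote:
    """Run conditions recorded next to a benchmark figure (hypothetical fields)."""
    driver_version: str
    cuda_toolkit: str
    inference_server: str
    batch_size: int
    precision: str


def differing_conditions(a, b):
    """Return the sorted names of fields whose values differ between two notes."""
    da, db = asdict(a), asdict(b)
    return sorted(name for name in da if da[name] != db[name])


# Placeholder values, not real version strings.
note_a = MethodNote("driver-A", "toolkit-X", "server-default", 8, "fp16")
note_b = MethodNote("driver-B", "toolkit-X", "server-default", 16, "fp16")
print(differing_conditions(note_a, note_b))  # ['batch_size', 'driver_version']
```

An empty list means the two figures were taken under the same recorded conditions; anything else names the caveat that belongs beside the comparison.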

Discuss your evaluation needs →

Common Questions

Frequently Asked

What does an AI workload benchmark actually measure?

A workload benchmark measures how a specific hardware configuration performs on tasks you actually run — such as a particular model size at a given batch size and precision. It records throughput, latency percentiles, memory utilisation, and power draw under conditions you define. The goal is a figure you can reproduce and compare, not a marketing headline.
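
As a sketch of what one such record might look like, the example below groups the conditions and measurements named above into a single row and writes rows out as CSV. The `WorkloadResult` class, its column names, and the units are assumptions made for illustration, not the schema of our results dataset.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class WorkloadResult:
    # Conditions you define (hypothetical column names).
    configuration: str
    model: str
    batch_size: int
    precision: str
    # Measurements taken under those conditions.
    throughput_items_per_s: float
    p50_latency_ms: float
    p99_latency_ms: float
    peak_memory_gib: float
    mean_power_w: float


def results_to_csv(results):
    """Serialise results to CSV text: a header line, then one row per result."""
    buffer = io.StringIO()
    columns = [f.name for f in fields(WorkloadResult)]
    writer = csv.DictWriter(buffer, fieldnames=columns)
    writer.writeheader()
    for result in results:
        writer.writerow(asdict(result))
    return buffer.getvalue()
```

Keeping the conditions in the same row as the measurements means a figure cannot be copied out of the file without the context it depends on.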

How long does the Comparative Benchmark Study take?

The study runs across two phases. Phase one covers setup and a dry run with one configuration; phase two covers the remaining candidates and report writing. Elapsed time depends on how many configurations you are evaluating and how quickly we can access them, but most engagements complete within two to three weeks of the kick-off call.

Do we need to provide hardware access?

For the Comparative Benchmark Study and the Advisory Retainer, yes. We run benchmarks against the specific configurations you are evaluating, which means we need either direct access or a way to submit jobs. We discuss access arrangements during the scoping call and can work within most secure environments. For the Benchmark Setup service, hardware access is not required during the session itself.

Will the report tell us which option to choose?

We present data with caveats and context — not a verdict. Procurement decisions involve factors beyond benchmark numbers: budget, support terms, roadmap, and team familiarity. Our role is to give you a reliable data layer, not to remove the judgment call from your team.

What is included in the Advisory Retainer?

The retainer runs for three months and includes scheduled review sessions, updates to your method library when workloads or hardware options change, and written notes after each review. The idea is that benchmarking stays current rather than going stale between procurement cycles. Pricing is RM 3,020 for the full three-month engagement.

How do we get started?

Fill in the enquiry form below or call us directly. We will arrange a short scoping call — typically 30 minutes — to understand your workloads and evaluation timeline, then send a clear proposal with deliverables and timings.

Our Location

Find Us in Kuala Lumpur

Unit B-7-3, Megan Avenue II, 12 Jalan Yap Kwan Seng, 50450 Kuala Lumpur

Get in Touch

Send an Enquiry

Contact Details

Phone

+60 3-2166 5392

Email

[email protected]

Address

Unit B-7-3, Megan Avenue II
12 Jalan Yap Kwan Seng
50450 Kuala Lumpur, Malaysia

Working Hours

Monday – Friday: 9:00 AM – 6:00 PM MYT
Saturday: 10:00 AM – 2:00 PM MYT
Sunday: Closed

Note: For initial enquiries, a brief description of your AI workloads and the hardware options you are considering helps us prepare a more useful first response.

Measure What Your AI Workloads Actually Need