benchdnn: prohibit --mode=CP by kwieloch-intel · Pull Request #4950 · uxlfoundation/oneDNN

kwieloch-intel · 2026-04-03T10:25:55Z

This PR prohibits `--mode=CP` in benchdnn and auto-enables `no_ref_memory` for `--mode=P` on Intel GPUs to align performance data filling with `--mode=F`.

JIRA: MFDNN-14789

Problem description

Two issues with benchdnn performance mode data filling on Intel GPUs:

- `--mode=CP` is broken: the combined correctness + performance mode is unused and currently fails on both CPU and GPU. More importantly, correctness validation requires specific low-range data filling (less randomness in the mantissa to minimize round-off errors), which makes the data compressible by the GPU driver and produces unreliable performance numbers.
- `--mode=P` vs `--mode=F` data mismatch: in `--mode=P`, `fill_random_real_dense()` overwrites GPU buffers with deterministic CPU-generated data via `reorder()`, replacing the incompressible Philox PRNG data written by `gpu_fill_random()` during `dnn_mem_t::initialize()`. This leads to different (and potentially compressible) data patterns compared to `--mode=F`.
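To make the compressibility concern concrete, here is a rough sketch that compares the byte-level Shannon entropy of a low-range fill against a buffer of random 32-bit patterns. It is not benchdnn code: the helper names are hypothetical, `std::mt19937` stands in for the Philox generator, the `[-8, 8]` integer range is an assumed example of "low-range" data, and byte entropy is only a crude proxy for what a GPU driver's compression might exploit.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Shannon entropy of a buffer, in bits per byte (0 = constant, 8 = uniform).
double byte_entropy(const void *data, std::size_t size) {
    assert(size > 0);
    std::array<std::size_t, 256> hist {};
    const auto *bytes = static_cast<const unsigned char *>(data);
    for (std::size_t i = 0; i < size; ++i)
        ++hist[bytes[i]];

    double entropy = 0.0;
    for (std::size_t count : hist) {
        if (count == 0) continue;
        const double p = static_cast<double>(count) / static_cast<double>(size);
        entropy -= p * std::log2(p);
    }
    return entropy;
}

// Hypothetical correctness-style fill: small integers stored as f32, so the
// low mantissa bytes of every element are zero.
std::vector<float> fill_low_range(std::size_t n, std::uint32_t seed) {
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> dist(-8, 8); // assumed range
    std::vector<float> buf(n);
    for (auto &x : buf)
        x = static_cast<float>(dist(gen));
    return buf;
}

// Hypothetical perf-style fill: raw random 32-bit patterns
// (std::mt19937 used here as a stand-in for Philox).
std::vector<std::uint32_t> fill_random_bits(std::size_t n, std::uint32_t seed) {
    std::mt19937 gen(seed);
    std::vector<std::uint32_t> buf(n);
    for (auto &x : buf)
        x = static_cast<std::uint32_t>(gen());
    return buf;
}
```

Calling `byte_entropy()` on both buffers should show the low-range fill well below the 8 bits per byte that the random fill approaches, which is the kind of redundancy a compressing memory path could take advantage of.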

Proposed Solution

Two changes in engine_t constructor (dnnl_common.cpp):

- Prohibit `--mode=CP`: if both `corr` and `perf` mode bits are set, emit an error message and fail. Users should run `--mode=C` and `--mode=P` (or `--mode=F`) separately.
- Auto-enable `no_ref_memory` for `--mode=P` on Intel GPUs: when running perf-only mode (`perf && !corr`) with an Intel GPU engine, automatically set the `no_ref_memory` modifier. This skips CPU reference memory allocation and `fill_random_real_dense()` calls, so GPU buffers retain incompressible Philox PRNG data, aligning `--mode=P` data filling with `--mode=F`.
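The two rules above can be sketched as a small standalone function. This is an illustration under stated assumptions, not the actual benchdnn change: the enum values, the `settings_t` struct, and `apply_mode_rules()` are hypothetical stand-ins, and only the names `corr`, `perf`, and `no_ref_memory` and the error text come from this PR.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-ins for benchdnn's mode bits and modifiers.
enum class mode_bit_t : std::uint32_t { corr = 1u << 0, perf = 1u << 1 };
enum class mode_modifier_t : std::uint32_t { no_ref_memory = 1u << 0 };

// Hypothetical bundle of the state the rules need to inspect.
struct settings_t {
    std::uint32_t mode_bits = 0;
    std::uint32_t modifiers = 0;
    bool is_intel_gpu = false; // assumed to be known when the mode is parsed
};

inline std::uint32_t bits(mode_bit_t b) {
    return static_cast<std::uint32_t>(b);
}

// Returns an empty string on success, or an error message when the
// requested mode combination is rejected.
std::string apply_mode_rules(settings_t &s) {
    const bool corr = (s.mode_bits & bits(mode_bit_t::corr)) != 0;
    const bool perf = (s.mode_bits & bits(mode_bit_t::perf)) != 0;

    // Rule 1: correctness and performance can no longer be combined.
    if (corr && perf) {
        return "Error: --mode=CP is not supported. Use --mode=C and "
               "--mode=P (or --mode=F) separately.";
    }

    // Rule 2: perf-only runs on an Intel GPU skip reference memory, so the
    // device buffers keep whatever the GPU-side fill wrote.
    if (perf && !corr && s.is_intel_gpu) {
        s.modifiers
                |= static_cast<std::uint32_t>(mode_modifier_t::no_ref_memory);
    }
    return {};
}
```

Returning a message instead of aborting keeps the sketch easy to exercise; the PR itself describes emitting an error and failing.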

On CPU, `--mode=F` uses a 0x3F memset, which differs from `--mode=P` (which uses `fill_random_real_dense()`), so the existing CPU behavior is intentionally left unchanged.
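As a small aside on what a 0x3F memset means for the data, the sketch below reinterprets four 0x3F bytes as a single value. It assumes an IEEE-754 `f32` buffer, which the PR does not state explicitly, and the helper name is hypothetical.

```cpp
#include <cstring>

// Builds an f32 whose four bytes all equal `byte`, mimicking what a
// memset-filled buffer looks like when read back as floats.
float float_from_repeated_byte(unsigned char byte) {
    float value;
    std::memset(&value, byte, sizeof(value));
    return value;
}
```

Under that assumption, `float_from_repeated_byte(0x3F)` is roughly 0.747, so every element of such a buffer holds the same finite, normal value, unlike the varied data produced by `fill_random_real_dense()`.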

Example

Before: --mode=CP silently runs but produces NaN in correctness output (broken) and unreliable perf numbers:

> benchdnn --mode=CP --matmul --engine=gpu --dt=f16 128x128:128x128

[   0][DST][0:0] exp_f32:     2438.02 exp:        2438 got:        -nan diff:     nan rdiff:     nan
[   1][DST][0:1] exp_f32:     115.184 exp:     115.188 got:        -nan diff:     nan rdiff:     nan
[   2][DST][0:2] exp_f32:     258.576 exp:       258.5 got:        -nan diff:     nan rdiff:     nan
...
perf,gpu,jit:gemm:any,,--mode=CP --matmul --engine=gpu --dt=f16:f16:f16 128x128:128x128,0.0041943,...
tests:1 passed:1 skipped:0 ... failed:0

After: --mode=CP is explicitly prohibited:

> benchdnn --mode=CP --matmul --engine=gpu --dt=f16 128x128:128x128

Error: --mode=CP is not supported. Use --mode=C and --mode=P (or --mode=F) separately.

dzarukin · 2026-04-03T16:21:26Z

tests/benchdnn/dnnl_common.cpp

+            && !has_bench_mode_bit(mode_bit_t::corr)) {
+        bench_mode_modifier |= mode_modifier_t::no_ref_memory;
+    }
+#endif


All of these parts should be handled in utils/parser.cpp, not here.

fbfda9c I’ve moved it to utils/parser.cpp.

dzarukin · 2026-04-07T17:00:20Z

tests/benchdnn/utils/parser.cpp

+            && !has_bench_mode_bit(mode_bit_t::corr)) {
+        bench_mode_modifier |= mode_modifier_t::no_ref_memory;
+    }
+#endif


I'm sorry I was not quite clear in my intentions. There's already logic handling this: https://github.com/uxlfoundation/oneDNN/blob/main/tests/benchdnn/utils/parser.cpp#L1568

In this function you'll find places for both updates you want to do.

kwieloch-intel added 4 commits March 19, 2026 10:27

benchdnn: skip reorder in fill_random_real_dense

6a0db71

benchdnn: refill in measure_perf

6778143

benchdnn: prohibit CP and auto-set no_ref_memory for P + Intel GPU

ad29396

benchdnn: prohibit PC/CP

caa9a7d

github-actions bot added the component:tests Codeowner: @oneapi-src/onednn-arch label Apr 3, 2026

kwieloch-intel changed the title ~~benchdnn: prohibits --mode=CP~~ benchdnn: prohibit --mode=CP Apr 3, 2026

benchdnn: clang format fix

02d5230

dzarukin reviewed Apr 3, 2026

kwieloch-intel added 2 commits April 7, 2026 13:21

benchdnn: move logic to parser.cpp

fbfda9c

benchdnn: remove unnecessary newline

cb74be8

dzarukin reviewed Apr 7, 2026

kwieloch-intel wants to merge 7 commits into uxlfoundation:main from kwieloch-intel:skip-fill-mode-p
