Skip to content

[GPU] fixup matmul ref implementation#4958

Open
dyoussif wants to merge 4 commits intomainfrom
dyoussif/gemm_ref_scales
Open

[GPU] fixup matmul ref implementation#4958
dyoussif wants to merge 4 commits intomainfrom
dyoussif/gemm_ref_scales

Conversation

@dyoussif
Copy link
Copy Markdown
Contributor

@dyoussif dyoussif commented Apr 6, 2026

closes MFDNN-14852
MFDNN-14853

  1. Ref implementation incorrectly uses dst layout for dynamic scales layout; scales layout should always be plain.
  2. Ref gemm does not properly support runtime dims; defer to ref matmul.
  3. jit gemm expects plain layout for dst when applying dynamic scales

@dyoussif dyoussif requested review from a team as code owners April 6, 2026 22:31
@github-actions github-actions bot added platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel component:tests Codeowner: @oneapi-src/onednn-arch labels Apr 6, 2026
@dyoussif dyoussif changed the title Dyoussif/gemm ref scales [GPU] fixup matmul ref implementation Apr 6, 2026
@dyoussif dyoussif force-pushed the dyoussif/gemm_ref_scales branch from a3665fa to b1d324b Compare April 6, 2026 22:37
@dyoussif
Copy link
Copy Markdown
Contributor Author

dyoussif commented Apr 6, 2026

make test
set test_scope=NIGHTLY
disable test_device_cpu
disable benchdnn_all
enable benchdnn_matmul
enable benchdnn_ip

c_stride_d3);
#else
#if NDIMS == 5
scale_off = DST_SCALE_OFF(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like DST_SCALE_OFF is only used here so either we should fix that macro or drop it from src/gpu/intel/include/types_specific.h.

scale_off = scale_off_dst(d0 % DST_D0, m, n, groupSize);
#else
scale_off = DST_SCALE_OFF(m, n, 0, 0, 0, groupSize, 1);
scale_off = scale_off_dst(m, n, groupSize);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we could normalize these calls to scale_off = scale_off_dst(n, m, d0 % DST_D0, d1 %DST_D1, ...) and consolidate all the #ifdef logic at the function definition.

!utils::one_of(DNNL_RUNTIME_DIM_VAL, desc()->m(),
desc()->n(), desc()->k(), desc()->lda(),
desc()->ldb(), desc()->ldc(), desc()->batch()),
VERBOSE_RUNTIMEDIM_UNSUPPORTED);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment: The OpenCL kernel looks like it supports runtime dimensions, so we are likely just missing some logic in the execute function, it might be good to explain why were are disabling it here (in general, I think ref_gemm should just be removed, but RNN relies on it as I recall).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

component:tests Codeowner: @oneapi-src/onednn-arch platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants