
multiplication of small matrices errors due to scalar indexing #712

Open
simeonschaub opened this issue Dec 23, 2024 · 6 comments

@simeonschaub

julia> AMDGPU.rand(3, 3) * AMDGPU.rand(3, 3)
ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore should be avoided.

If you want to allow scalar iteration, use `allowscalar` or `@allowscalar`
to enable scalar iteration globally or for the operations in question.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:44
  [2] errorscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/aNaXo/src/GPUArraysCore.jl:151
  [3] _assertscalar(op::String, behavior::GPUArraysCore.ScalarIndexing)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/aNaXo/src/GPUArraysCore.jl:124
  [4] assertscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/aNaXo/src/GPUArraysCore.jl:112
  [5] getindex
    @ ~/.julia/packages/GPUArrays/sBzM5/src/host/indexing.jl:50 [inlined]
  [6] scalar_getindex
    @ ~/.julia/packages/GPUArrays/sBzM5/src/host/indexing.jl:36 [inlined]
  [7] _getindex
    @ ~/.julia/packages/GPUArrays/sBzM5/src/host/indexing.jl:19 [inlined]
  [8] getindex
    @ ~/.julia/packages/GPUArrays/sBzM5/src/host/indexing.jl:17 [inlined]
  [9] __matmul3x3_elements(tA::Char, A::ROCArray{Float32, 2, AMDGPU.Runtime.Mem.HIPBuffer})
    @ LinearAlgebra /julia/usr/share/julia/stdlib/v1.12/LinearAlgebra/src/matmul.jl:1138
 [10] __matmul3x3_elements
    @ /julia/usr/share/julia/stdlib/v1.12/LinearAlgebra/src/matmul.jl:1175 [inlined]
 [11] _matmul3x3_elements
    @ /julia/usr/share/julia/stdlib/v1.12/LinearAlgebra/src/matmul.jl:1132 [inlined]
 [12] matmul2x2or3x3_nonzeroalpha!(C::ROCArray{Float32, 2, AMDGPU.Runtime.Mem.HIPBuffer}, tA::Char, tB::Char, A::ROCArray{Float32, 2, AMDGPU.Runtime.Mem.HIPBuffer}, B::ROCArray{Float32, 2, AMDGPU.Runtime.Mem.HIPBuffer}, α::Bool, β::Bool)
    @ LinearAlgebra /julia/usr/share/julia/stdlib/v1.12/LinearAlgebra/src/matmul.jl:431
 [13] generic_matmatmul_wrapper!(C::ROCArray{…}, tA::Char, tB::Char, A::ROCArray{…}, B::ROCArray{…}, α::Bool, β::Bool, val::Val{…})
    @ LinearAlgebra /julia/usr/share/julia/stdlib/v1.12/LinearAlgebra/src/matmul.jl:449
 [14] _mul!
    @ /julia/usr/share/julia/stdlib/v1.12/LinearAlgebra/src/matmul.jl:326 [inlined]
 [15] mul!
    @ /julia/usr/share/julia/stdlib/v1.12/LinearAlgebra/src/matmul.jl:295 [inlined]
 [16] mul!
    @ /julia/usr/share/julia/stdlib/v1.12/LinearAlgebra/src/matmul.jl:263 [inlined]
 [17] *(A::ROCArray{Float32, 2, AMDGPU.Runtime.Mem.HIPBuffer}, B::ROCArray{Float32, 2, AMDGPU.Runtime.Mem.HIPBuffer})
    @ LinearAlgebra /julia/usr/share/julia/stdlib/v1.12/LinearAlgebra/src/matmul.jl:134
 [18] top-level scope
    @ REPL[21]:1
Some type information was truncated. Use `show(err)` to see complete types.
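The error message itself points at the standard escape hatch. As a temporary workaround (a sketch that papers over the symptom, not a fix for the dispatch problem), the multiplication can be wrapped in `@allowscalar` from GPUArraysCore, which permits the slow element-by-element CPU fallback:

```julia
using AMDGPU
using GPUArraysCore  # provides @allowscalar

A = AMDGPU.rand(3, 3)
B = AMDGPU.rand(3, 3)

# Permit scalar indexing for this call only; the 3x3 product then runs
# through the CPU fallback path, slowly but without erroring.
C = GPUArraysCore.@allowscalar A * B
```

The underlying issue remains that LinearAlgebra's 2x2/3x3 fast path (`matmul2x2or3x3_nonzeroalpha!` in the stacktrace) indexes the GPU array element-wise instead of dispatching to a GPU-side method.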

This is on Julia nightly, but built with LLVM 17, since 18 doesn't seem to be supported by AMDGPU.jl yet. I wasn't able to test on Julia 1.11 due to LLVM 16 not supporting gfx1100 APUs.

julia> versioninfo()
Julia Version 1.12.0-DEV.1789
Commit 083b24eaa4* (2024-12-20 19:38 UTC)
Build Info:
  DEBUG build
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 16 × AMD Ryzen AI 7 PRO 360 w/ Radeon 880M
  WORD_SIZE: 64
  LLVM: libLLVM-17.0.6 (ORCJIT, generic)
Threads: 1 default, 0 interactive, 1 GC (on 16 virtual cores)
Environment:
  LD_LIBRARY_PATH = /opt/ompi/lib:/opt/rocm/lib:/usr/local/lib:

julia> AMDGPU.versioninfo()
[ Info: AMDGPU versioninfo
┌───────────┬──────────────────┬───────────┬────────────────────────────────────────────────────────────────────────────────┐
│ Available │ Name             │ Version   │ Path                                                                           │
├───────────┼──────────────────┼───────────┼────────────────────────────────────────────────────────────────────────────────┤
│     +     │ LLD              │ -         │ /opt/rocm/llvm/bin/ld.lld                                                      │
│     +     │ Device Libraries │ -         │ /root/.julia/artifacts/5ad5ecb46e3c334821f54c1feecc6c152b7b6a45/amdgcn/bitcode │
│     +     │ HIP              │ 6.3.42131 │ /opt/rocm/lib/libamdhip64.so                                                   │
│     +     │ rocBLAS          │ 4.3.0     │ /opt/rocm/lib/librocblas.so                                                    │
│     +     │ rocSOLVER        │ 3.27.0    │ /opt/rocm/lib/librocsolver.so                                                  │
│     +     │ rocSPARSE        │ -         │ /opt/rocm/lib/librocsparse.so                                                  │
│     +     │ rocRAND          │ 2.10.5    │ /opt/rocm/lib/librocrand.so                                                    │
│     +     │ rocFFT           │ 1.0.27    │ /opt/rocm/lib/librocfft.so                                                     │
│     +     │ MIOpen           │ 3.3.0     │ /opt/rocm/lib/libMIOpen.so                                                     │
└───────────┴──────────────────┴───────────┴────────────────────────────────────────────────────────────────────────────────┘

[ Info: AMDGPU devices
┌────┬─────────────────────┬──────────┬───────────┬────────────┬────────────────
│ Id │                Name │ GCN arch │ Wavefront │     Memory │ Shared Memory ⋯
├────┼─────────────────────┼──────────┼───────────┼────────────┼────────────────
│  1 │ AMD Radeon Graphics │  gfx1100 │        32 │ 15.244 GiB │    64.000 KiB ⋯
└────┴─────────────────────┴──────────┴───────────┴────────────┴────────────────

(I know there is no official upstream support for these APUs, but this seems like an orthogonal issue)


pxl-th commented Dec 23, 2024

Looks like it is dispatching onto the wrong path.
Maybe something needs to be updated for 1.12, I haven't looked into it yet.

Also, you may want to try disabling the artifact device libraries with JULIA_LLVM_ARGS="-opaque-pointers", because the ones it is currently using are from ROCm 5.5 (I think), which are quite old and may also lack support for your APU.
We use device libraries for native Julia kernels, so you can test it with things like .*.

@simeonschaub
Author

> Looks like it is dispatching onto the wrong path. Maybe something needs to be updated for 1.12, I haven't looked into it yet.
>
> Also, you may want to try disabling the artifact device libraries with JULIA_LLVM_ARGS="-opaque-pointers", because the ones it is currently using are from ROCm 5.5 (I think), which are quite old and may also lack support for your APU. We use device libraries for native Julia kernels, so you can test it with things like .*.

That doesn't seem to work anymore on nightly, is there another way to disable artifacts?

# JULIA_LLVM_ARGS="-opaque-pointers" /julia/julia --project=/amdgpu/
julia: Unknown command line argument '-opaque-pointers'.  Try: 'julia --help'

Trying the following fails due to incompatible LLVM versions:

julia> ENV["JULIA_LLVM_ARGS"] = "-opaque-pointers"
"-opaque-pointers"

julia> using AMDGPU

julia> AMDGPU.rand(10, 10) .* AMDGPU.rand(10, 10)
error: Invalid record (Producer: 'LLVM18.0.0git' Reader: 'LLVM 17.0.6jl')

Without that setting pretty much everything works fine though as long as I set HSA_OVERRIDE_GFX_VERSION="11.0.0". The only thing I'm still having trouble with is MIOpen sometimes crashing, will investigate
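For reference, the override mentioned above can also be applied from inside Julia. A minimal sketch, assuming the environment variable is set before AMDGPU initializes the HSA runtime (i.e. before the first GPU call); "11.0.0" maps the gfx1150 APU onto the gfx1100 ISA that rocBLAS supports:

```julia
# HSA_OVERRIDE_GFX_VERSION must be in the environment before the HSA
# runtime initializes, so set it before loading AMDGPU.
ENV["HSA_OVERRIDE_GFX_VERSION"] = "11.0.0"

using AMDGPU

# With the override active, the runtime reports gfx1100 and both native
# kernels (broadcasting) and rocBLAS calls should work.
AMDGPU.rand(10, 10) .* AMDGPU.rand(10, 10)
```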


pxl-th commented Dec 23, 2024

> That doesn't seem to work anymore on nightly, is there another way to disable artifacts?

Not yet, that needs changes in the code. Back then, no Julia version was using them by default.

> Trying the following fails due to incompatible LLVM versions:

I was planning to run the LLVM downgrader on the device libraries to match Julia's LLVM version. That should hopefully fix these issues.

> Without that setting pretty much everything works fine though as long as I set HSA_OVERRIDE_GFX_VERSION="11.0.0".

We can make AMDGPU.jl set this automatically, similar to how we did it for gfx103x GPUs.
What GFX version does your GPU have? Is it gfx1103?

@simeonschaub
Author

It's actually gfx1150, but I override it to gfx1100 because rocBLAS only seems to support the latter.


pxl-th commented Dec 23, 2024

Can you show the output of /sys/class/kfd/kfd/topology/nodes/<gpu-id>/name?
This way we can use this name to override GFX before initializing HIP.

@simeonschaub
Author

Hmm, that doesn't seem to work (note this is inside a Docker container; the result is the same when run locally though):

root@00e6c168dc99:~# cat /sys/class/kfd/kfd/topology/nodes/0/name

root@00e6c168dc99:~# cat /sys/class/kfd/kfd/topology/nodes/1/name
ip discovery
