The OpenCL Working Group has published a draft extension (cl_khr_cooperative_matrix) that brings cooperative matrix operations to OpenCL, developed with Arm, Intel, and Qualcomm. Cooperative matrix operations let the work-items of a sub-group jointly load, multiply, and accumulate matrix tiles, which maps directly onto hardware matrix acceleration units used for ML inference. The extension enables OpenCL implementations to accept SPIR-V modules that use SPV_KHR_cooperative_matrix, and it adds a device query API for the supported matrix sizes and types. A companion extension that exposes these capabilities in OpenCL C, without hand-authored SPIR-V, is also in progress, and a Clang frontend RFC has been published on LLVM Discourse. Both drafts are open for community feedback before finalization.
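To make the "load, multiply, and accumulate matrix tiles" idea concrete, here is a minimal scalar sketch in plain C of the arithmetic a cooperative matrix operation performs on one tile. It does not use the draft extension's API. The tile shape (`TILE_M`, `TILE_N`, `TILE_K`) and the function names `tile_mul_acc` and `tiled_matmul_acc` are hypothetical, chosen only for illustration; on real hardware the supported shapes and types would come from the device query, and the inner tile step would be shared across a sub-group rather than run by one thread.

```c
#include <stddef.h>

/* Hypothetical tile shape, for illustration only. Real shapes would be
   reported by the device, not hard-coded. */
enum { TILE_M = 4, TILE_N = 4, TILE_K = 4 };

/* Scalar reference for one tile-level multiply-accumulate: acc += a * b.
   a is TILE_M x TILE_K, b is TILE_K x TILE_N, acc is TILE_M x TILE_N.
   All tiles are row-major views with leading dimensions lda, ldb, ldc. */
static void tile_mul_acc(const float *a, size_t lda,
                         const float *b, size_t ldb,
                         float *acc, size_t ldc)
{
    for (size_t i = 0; i < TILE_M; ++i) {
        for (size_t j = 0; j < TILE_N; ++j) {
            float sum = acc[i * ldc + j];
            for (size_t k = 0; k < TILE_K; ++k)
                sum += a[i * lda + k] * b[k * ldb + j];
            acc[i * ldc + j] = sum;
        }
    }
}

/* C (m x n) += A (m x k) * B (k x n), all row-major.
   Assumes m, n, and k are multiples of the tile dimensions. */
static void tiled_matmul_acc(const float *A, const float *B, float *C,
                             size_t m, size_t n, size_t k)
{
    for (size_t i = 0; i < m; i += TILE_M)
        for (size_t j = 0; j < n; j += TILE_N)
            for (size_t p = 0; p < k; p += TILE_K)
                tile_mul_acc(&A[i * k + p], k,   /* tile of A */
                             &B[p * n + j], n,   /* tile of B */
                             &C[i * n + j], n);  /* accumulator tile */
}
```

The outer loops walk the output in tiles and the innermost call is the step that a matrix acceleration unit would execute as a single cooperative operation.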