Commit c4d1b3d1 authored by Pekka Jääskeläinen

User override for the CPU local mem size

Added a new POCL_CPU_LOCAL_MEM_SIZE environment variable for overriding
the local memory size of the CPU devices.

Also added a better error message for the case where the total automatic
local size exceeds the local memory size.
parent e3c126b2
@@ -9,7 +9,7 @@ Notable User Facing Changes
 - Added support for generic address spaces in the CPU drivers
 - Added basic support for cl_khr_subgroups for CPUs: A single
   subgroup that always executes the whole X-dimension's WIs.
 - Added initial (incomplete) support for
   cl_intel_required_subgroup_size for CPUs
 - AlmaIF's OpenASIP backend now supports a standalone mode.
   It generates a standalone C program from a kernel launch, which
@@ -23,7 +23,9 @@ Notable User Facing Changes
   CBS is expected to work for kernels that POCL's current kernel compiler
   does not support. Currently, CBS can be manually enabled by setting
   the environment variable `POCL_WORK_GROUP_METHOD=cbs`.
+- Added a new POCL_CPU_LOCAL_MEM_SIZE environment variable for overriding
+  the local memory size of the CPU devices.
 Notable Fixes
 -------------
......
@@ -145,6 +145,11 @@ pocl.
  default cache directory will be used, which is ``$XDG_CACHE_HOME/pocl/kcache``
  (if set) or ``$HOME/.cache/pocl/kcache/`` on Unix-like systems.
+- **POCL_CPU_LOCAL_MEM_SIZE**
+ Set the local memory size of the CPU devices (pthread, basic) to the
+ given amount in bytes instead of the default one.
 - **POCL_DEBUG**
  Enables debug messages to stderr. This will be mostly messages from error
......
@@ -247,9 +247,9 @@ pocl_basic_init (unsigned j, cl_device_id device, const char* parameters)
 #endif
   /* hwloc probes OpenCL device info at its initialization in case
      the OpenCL extension is enabled. This causes to printout
      an unimplemented property error because hwloc is used to
      initialize global_mem_size which it is not yet. Just put
      a nonzero there for now. */
   device->global_mem_size = 1;
   err = pocl_topology_detect_device_info(device);
@@ -266,6 +266,9 @@ pocl_basic_init (unsigned j, cl_device_id device, const char* parameters)
   pocl_cpuinfo_detect_device_info(device);
   pocl_set_buffer_image_limits(device);
+  device->local_mem_size = pocl_get_int_option ("POCL_CPU_LOCAL_MEM_SIZE",
+                                                device->local_mem_size);
   if (device->vendor_id == 0)
     device->vendor_id = CL_KHRONOS_VENDOR_ID_POCL;
......
@@ -222,6 +222,9 @@ pocl_pthread_init (unsigned j, cl_device_id device, const char* parameters)
   pocl_cpuinfo_detect_device_info(device);
   pocl_set_buffer_image_limits(device);
+  device->local_mem_size = pocl_get_int_option ("POCL_CPU_LOCAL_MEM_SIZE",
+                                                device->local_mem_size);
   /* in case hwloc doesn't provide a PCI ID, let's generate
      a vendor id that hopefully is unique across vendors. */
   const char *magic = "pocl";
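Both drivers use the same pattern: read an integer from the environment and fall back to the detected value when the variable is not set. The sketch below illustrates that pattern only. `env_int_or_default` is a hypothetical helper written for this note, not PoCL's `pocl_get_int_option`, whose parsing and error handling may differ. Because the value passes through an `int`, this sketch cannot express sizes above `INT_MAX` bytes.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not part of PoCL): returns the integer value of the
   environment variable `name`, or `default_value` when the variable is
   unset, empty, not a decimal integer, or out of int range. */
static int
env_int_or_default (const char *name, int default_value)
{
  assert (name != NULL);

  const char *text = getenv (name);
  if (text == NULL || *text == '\0')
    return default_value;

  char *end = NULL;
  errno = 0;
  long value = strtol (text, &end, 10);

  /* Reject overflow, trailing garbage, and values that do not fit an int. */
  if (errno != 0 || *end != '\0' || value < INT_MIN || value > INT_MAX)
    return default_value;

  return (int)value;
}
```

With such a helper, a call like `env_int_or_default ("POCL_CPU_LOCAL_MEM_SIZE", detected_size)` would keep the detected size unless the user exports a valid override.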
......
@@ -235,9 +235,21 @@ setup_kernel_arg_array_with_locals (void **arguments, void **arguments2,
           size_t size = meta->local_sizes[i];
           arguments[j] = &arguments2[j];
           arguments2[j] = start;
+          if ((size_t)(start - local_mem + size) > local_mem_size)
+            {
+              size_t total_auto_local_size = 0;
+              for (j = 0; j < meta->num_locals; ++j)
+                {
+                  total_auto_local_size += meta->local_sizes[j];
+                }
+              POCL_ABORT (
+                  "PoCL detected an OpenCL program error: "
+                  "%d automatic local buffer(s) with total size %lu "
+                  "bytes doesn't fit to the local memory of size %lu\n",
+                  meta->num_locals, total_auto_local_size, local_mem_size);
+            }
           start += size;
           start = align_ptr (start);
-          assert ((size_t)(start - local_mem) <= local_mem_size);
         }
     }
 }
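The hunk above moves the bounds check in front of each placement and reports the total requested size instead of tripping an assert after the fact. The following is a minimal standalone sketch of that idea, using offsets rather than pointers. `place_local_buffers`, `align_up` and the 16-byte `SKETCH_ALIGNMENT` are assumptions made for illustration; PoCL's `align_ptr` and its actual alignment are not shown in this diff.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_ALIGNMENT 16 /* assumed alignment, not taken from PoCL */

/* Rounds `offset` up to the next multiple of SKETCH_ALIGNMENT. */
static size_t
align_up (size_t offset)
{
  return (offset + SKETCH_ALIGNMENT - 1) & ~(size_t)(SKETCH_ALIGNMENT - 1);
}

/* Hypothetical helper: assigns each of `count` buffers an offset inside a
   local memory region of `local_mem_size` bytes. Always stores the sum of
   all requested sizes in *total_out so a caller can report it. Returns 0
   when every buffer fits and -1 as soon as one does not. */
static int
place_local_buffers (const size_t *sizes, size_t count,
                     size_t local_mem_size, size_t *offsets,
                     size_t *total_out)
{
  assert (sizes != NULL && offsets != NULL && total_out != NULL);

  size_t total = 0;
  for (size_t k = 0; k < count; ++k)
    total += sizes[k];
  *total_out = total;

  size_t start = 0;
  for (size_t i = 0; i < count; ++i)
    {
      /* Check before placing, written so the comparison cannot overflow. */
      if (start > local_mem_size || sizes[i] > local_mem_size - start)
        return -1;
      offsets[i] = start;
      start = align_up (start + sizes[i]);
    }
  return 0;
}
```

A caller could print `*total_out` next to `local_mem_size` on failure, which mirrors the information in the new `POCL_ABORT` message.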
......