Commit 4ae25e8e authored by Michal Babej

update documentation & rename old env variables

parent 2a6935fe
......@@ -47,6 +47,8 @@ Notable User Facing Changes
- `-cl-fast-relaxed-math` now defaults to `-ffp-contract=fast`, previously it was `-ffp-contract=on`.
- CPU drivers: renamed 'basic' to 'cpu-minimal' and 'pthread' driver to 'cpu',
to reflect the hardware they're driving instead of implementation details.
+ - POCL_MAX_PTHREAD_COUNT renamed to POCL_CPU_MAX_CU_COUNT; the old env. variable is deprecated
+ but still works
Notable Fixes
-------------
......@@ -59,6 +61,13 @@ Notable Fixes
- CPU driver: Made kernel debug info more robust especially
when building the device program from SPIR-V.
+ Known issues
+ ------------
+ - With LLVM 16, when running CTS with the offline compilation mode (= via SPIR-V), Clang + SPIR-V translator
+ produces invalid SPIR-V for several tests.
+ Related Khronos issue: https://github.com/KhronosGroup/SPIRV-LLVM-Translator/issues/2008
3.1 December 2022
=================
......
......@@ -91,8 +91,8 @@ Setup:
This will cause all kernels to compile with debuginfo.
* ``export POCL_LEAVE_KERNEL_COMPILER_TEMP_FILES=1``
This will leave the source files in PoCL's cache.
- * Optional: ``export POCL_MAX_PTHREAD_COUNT=1``
- This limits the pthread driver to a single worker thread.
+ * Optional: ``export POCL_CPU_MAX_CU_COUNT=1``
+ This limits the cpu driver to a single worker thread.
* Run your application with gdb, as usual.
Example 1:
......@@ -160,7 +160,7 @@ Example 2:
Lets say we want to step the "dot_product" kernel from the previous example. Launch gdb::
- POCL_MAX_PTHREAD_COUNT=1 gdb ./example
+ POCL_CPU_MAX_CU_COUNT=1 gdb ./example
Make a breakpoint on the kernel name::
......
......@@ -85,11 +85,10 @@ pocl.
- **POCL_AFFINITY**
- Linux-only, specific to pthread driver. If set to 1, each thread of
- the pthread CPU driver sets its affinity to its index. This may be
- useful with very long running kernels, or when using subdevices
- (lets any idle cores enter deeper sleep). Defaults to 0 (most
- people don't need this).
+ Linux-only, specific to 'cpu' driver. If set to 1, each thread of
+ the driver sets its affinity to its index. This may be useful
+ with very long running kernels, or when using subdevices.
+ Defaults to 0 (most people don't need this).
- **POCL_BINARY_SPECIALIZE_WG**
......@@ -178,7 +177,7 @@ pocl.
kernels on the host CPU. No multithreading.
* **cpu** Execution of OpenCL kernels on the host CPU using
- all CPU threads.
+ (by default) all available CPU threads.
* **cuda** An experimental driver that uses libcuda to execute on NVIDIA GPUs.
......@@ -195,16 +194,16 @@ pocl.
* **level0** An experimental driver that uses libze to execute on Intel GPUs.
- If POCL_DEVICES is not set, one pthread device will be used.
+ If POCL_DEVICES is not set, one cpu device will be used.
To specify parameters for drivers, the POCL_<drivername><instance>_PARAMETERS
environment variable can be specified (where drivername is in uppercase).
Example::
- export POCL_DEVICES="pthread ttasim ttasim"
+ export POCL_DEVICES="cpu ttasim ttasim"
export POCL_TTASIM0_PARAMETERS="/path/to/my/machine0.adf"
export POCL_TTASIM1_PARAMETERS="/path/to/my/machine1.adf"
- Creates three devices, one CPU device with pthread multithreading and two
+ Creates three devices, one 'cpu' device with multithreading and two
TTA device simulated with the ttasim. The ttasim devices gets a path to
the architecture description file of the tta to simulate as a parameter.
POCL_TTASIM0_PARAMETERS will be passed to the first ttasim driver instantiated
......@@ -252,10 +251,10 @@ pocl.
kernel bitcode (parallel.bc) only with some drivers).
Defaults to 0 if CMAKE_BUILD_TYPE=Debug and 1 otherwise.
- - **POCL_MAX_PTHREAD_COUNT**
+ - **POCL_CPU_MAX_CU_COUNT**
The maximum number of threads created for work group execution in the
- pthread device driver. The default is to determine this from the number of
+ 'cpu' device driver. The default is to determine this from the number of
hardware threads available in the CPU.
- **POCL_MAX_WORK_GROUP_SIZE**
......
......@@ -142,6 +142,6 @@ add_test(NAME "${TS_NAME}"
set_tests_properties(${TS_NAME}
PROPERTIES
- ENVIRONMENT "POCL_MAX_PTHREAD_COUNT=1"
+ ENVIRONMENT "POCL_CPU_MAX_CU_COUNT=1"
PASS_REGULAR_EXPRESSION "100% tests passed, 0 tests failed out of"
LABELS "chip-spv")
......@@ -237,15 +237,15 @@ pocl_pthread_init (unsigned j, cl_device_id device, const char* parameters)
* but if the user requests, lower it */
/* old env variable */
- int max_threads = pocl_get_int_option ("POCL_MAX_PTHREAD_COUNT", fallback);
+ int max_threads = pocl_get_int_option ("POCL_MAX_PTHREAD_COUNT", 0);
- if (max_threads < 0)
- max_threads = pocl_get_int_option ("POCL_CPU_MAX_COUNT", fallback);
+ if (max_threads <= 0)
+ max_threads = pocl_get_int_option ("POCL_CPU_MAX_CU_COUNT", fallback);
/* old env variable */
- int min_threads = pocl_get_int_option ("POCL_PTHREAD_MIN_THREADS", 1);
- if (min_threads < 0)
- min_threads = pocl_get_int_option ("POCL_CPU_MIN_COUNT", 1);
+ int min_threads = pocl_get_int_option ("POCL_PTHREAD_MIN_THREADS", 0);
+ if (min_threads <= 0)
+ min_threads = pocl_get_int_option ("POCL_CPU_MIN_CU_COUNT", 1);
device->max_compute_units
= max ((unsigned)max_threads, (unsigned)min_threads);
......
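The lookup order in the new version of the hunk above can be sketched in isolation. This is a minimal illustration, not PoCL code: `get_int_option` is a hypothetical stand-in for `pocl_get_int_option`, assumed here to return the supplied default when the variable is unset, and `resolve_compute_units` is a name invented for this example.

```c
#include <stdlib.h>

/* Hypothetical stand-in for pocl_get_int_option (an assumption, not the
   PoCL implementation): returns the variable's integer value, or def if
   the variable is unset. */
static int get_int_option(const char *name, int def) {
    const char *val = getenv(name);
    return val ? (int)strtol(val, NULL, 10) : def;
}

/* Mirrors the new lookup order: the deprecated variable is read first with
   a default of 0, and the renamed variable is consulted only when the old
   one is unset or not positive. */
static unsigned resolve_compute_units(int fallback) {
    int max_threads = get_int_option("POCL_MAX_PTHREAD_COUNT", 0);
    if (max_threads <= 0)
        max_threads = get_int_option("POCL_CPU_MAX_CU_COUNT", fallback);

    int min_threads = get_int_option("POCL_PTHREAD_MIN_THREADS", 0);
    if (min_threads <= 0)
        min_threads = get_int_option("POCL_CPU_MIN_CU_COUNT", 1);

    /* Same effect as max((unsigned)max_threads, (unsigned)min_threads). */
    unsigned hi = (unsigned)max_threads, lo = (unsigned)min_threads;
    return hi > lo ? hi : lo;
}
```

Under these assumptions, a positive value in the deprecated variable takes precedence over the renamed one, which matches the release note that the old variable "still works".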