Pekka Jääskeläinen authored
Setting POCL_TRACING=cq collects kernel execution times by force-enabling the command queue profiling feature, and dumps the collected stats at exit via atexit().

The purpose of this feature is to enable minimally intrusive profile collection: the profile data collector can choose when it gathers the timestamp data from the events. The impact on the observed execution profile is minimized by not writing logs, copying objects, or doing similar work while collecting data during execution. It relies on the standard event timestamps, which lets devices update them as (and when) they see fit during execution.

The drawback is the accumulation of cl_object garbage, which should be taken into account when choosing the data collection interval: the collector should release the events, and the extra data objects they hold, often enough to keep memory consumption from becoming a problem. The current version does not perform garbage collection; it assumes that keeping the OpenCL objects alive until exit is not a problem. This holds for most OpenCL programs, which are rather simple: they are not long running and do not launch many commands over their lifetime.

The default profile data collector currently counts only kernel commands. Collecting stats of data transfers would be a useful addition.
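The deferred-collection idea described above can be illustrated with a minimal, self-contained C sketch. It is not the PoCL implementation: the `sketch_event` and `sketch_collector` types and the `sketch_track` and `sketch_drain` functions are hypothetical stand-ins that mimic retaining a `cl_event` while commands run and reading its start/end timestamps only later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an OpenCL event. In real OpenCL the timestamps
   would be read with clGetEventProfilingInfo() and the reference count
   managed with clRetainEvent()/clReleaseEvent(). */
typedef struct {
    uint64_t start_ns; /* analogous to CL_PROFILING_COMMAND_START */
    uint64_t end_ns;   /* analogous to CL_PROFILING_COMMAND_END */
    int refcount;      /* analogous to the cl_event reference count */
} sketch_event;

#define SKETCH_MAX_PENDING 1024

/* Hypothetical collector state: events held until the next drain. */
typedef struct {
    sketch_event *pending[SKETCH_MAX_PENDING];
    size_t num_pending;
    uint64_t total_ns;    /* accumulated kernel execution time */
    uint64_t num_kernels; /* number of kernel commands counted */
} sketch_collector;

/* Hot path: only retain the event and remember it. No timestamp reads,
   no logging, no copying, so the observed execution is barely disturbed. */
static void sketch_track(sketch_collector *c, sketch_event *ev) {
    assert(c->num_pending < SKETCH_MAX_PENDING);
    ev->refcount++;
    c->pending[c->num_pending++] = ev;
}

/* Cold path: read the timestamps the device filled in, accumulate the
   stats, and release the held events so garbage does not pile up. */
static void sketch_drain(sketch_collector *c) {
    for (size_t i = 0; i < c->num_pending; i++) {
        sketch_event *ev = c->pending[i];
        c->total_ns += ev->end_ns - ev->start_ns;
        c->num_kernels++;
        ev->refcount--;
    }
    c->num_pending = 0;
}
```

In this sketch, calling `sketch_drain` only once at program exit corresponds to the behavior described for the current version, while calling it periodically would bound the number of events kept alive at the cost of doing some work during execution.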
Commit 0c3147ce