Commit 99ecbd01 authored 2 years ago by Topi Leppanen

AlmaIF: update CHANGES and documentation

parent b9412839

Showing 2 changed files with 80 additions and 24 deletions

CHANGES: 15 additions, 1 deletion
doc/sphinx/source/almaif.rst: 65 additions, 23 deletions
CHANGES (+15 −1)
@@ -13,7 +13,21 @@ Notable User Facing Changes
  * and other small fixes and improvements
- All device names used to control the platform setup via
  POCL_DEVICES are now lower case (as documented in the user manual).
- Built-in kernel support for CPU, CUDA and Accel-devices
  * The .cl files in the driver directories are compiled to built-in kernels.
  * Experimental CUDNN support for Cuda-driver through built-in kernels.
- Major update to the custom device driver (Accel -> Now renamed to AlmaIF)
  * Support more device types implementing AlmaIF
    + Memory-mapped accelerator (for SoC FPGAs)
    + Experimental XRT backend (for PCIe FPGAs)
    + Instruction-set simulator (TTASim)
    + Emulation device on host CPU
- TTASim-driver will be deprecated as soon as all of its functionality
  can be achieved with the almaif-driver. Most of it has already been
  ported to almaif/TTASimDevice.cc
  * This move allows controlling TTASIM simulator and FPGA device
    (and in future, RTL simulator) with the same runtime driver,
    which reduces code duplication.
3.0 June 2022
=============
doc/sphinx/source/almaif.rst (+65 −23)
@@ -6,6 +6,19 @@ The ``almaif`` driver can be used for easy integration of custom fixed-function
accelerators through a standardized hardware interface and a standardized
procedure for enqueuing commands.
More information on the principles behind this driver can be found in the
following two publications:

T. Leppänen, P. Mousouliotis, G. Keramidas, J. Multanen and P. Jääskeläinen,
"Unified OpenCL Integration Methodology for FPGA Designs,"
2021 IEEE Nordic Circuits and Systems Conference (NorCAS), 2021, pp. 1-7,
https://doi.org/10.1109/NorCAS53631.2021.9599861

T. Leppänen, A. Lotvonen and P. Jääskeläinen,
"Cross-vendor programming abstraction for diverse heterogeneous platforms,"
Frontiers in Computer Science, vol. 4, 2022,
https://doi.org/10.3389/fcomp.2022.945652

Interface
---------
@@ -163,7 +176,6 @@ in the ``clCreateProgramWithBuiltInKernels`` call:
- 65535
- Special flag to communicate that device supports compiled kernels.
This list will be expanded in the future.
The full list of currently supported built-in kernels is maintained in
lib/CL/devices/builtin_kernels.{cc,hh}
@@ -175,7 +187,7 @@ instructions in the `OpenASIP manual <http://openasip.org/user_manual/TCE.pdf>`_
in the section titled System-on-a-Chip design with AlmaIF Integrator. Make sure
to check the accelerator base address from Vivado.
Alternatively, to run tests that generate both TTA and High-level synthesis
Additionally, to run tests that generate both TTA and High-level synthesis
(HLS) based accelerators for PYNQ-Z1 device you need to enable a few variables
in the CMAKE configuration.
First, set CMAKE variable VIVADO_PATH to point to the directory with the
@@ -194,9 +206,10 @@ The bitstreams themselves are not automatically built with PoCL build process, b
with a separate 'make bitstreams' command. This generates the bitstreams to
build/examples/accel/bitstreams and build/examples/accel/hls/bitstreams directories.
Once bitstreams have been built, build PoCL on the PYNQ-Z1 device.
(You don't need to set ENABLE_TCE or VITIS/VIVADO_HLS_PATH) on it.
Set the environment variable PYNQ_AVAILABLE=1 to enable the FPGA tests.
(You don't need to set ENABLE_TCE or VITIS/VIVADO_HLS_PATH).
Copy the bitstreams directories (and in case of TTA, also the firmware_imgs
directory, hashes.txt and example0_*.poclbins)
directory and example0_*.poclbins)
to their correct PoCL build directories on PYNQ.
Finally, run ``../tools/scripts/run_almaif_tests --pynq`` to run the test program
on the FPGA device.
@@ -205,16 +218,26 @@ on the FPGA device.
Driver arguments are used to tell pocl where the accelerator is and what
functions it supports. To run this example manually, execute::
functions it supports. To run examples manually, after programming the
fpga, execute::
    POCL_DEVICES=almaif POCL_ALMAIF0_PARAMETERS=0x43C00000,<device_name>,1,2 ./accel_example
    POCL_DEVICES=almaif POCL_ALMAIF0_PARAMETERS=0x40000000,<device_name>,1,2 ./accel_example
The environment variables define an accelerator with base physical address of
0x43C0_0000 that can execute pocl.add.i32 and pocl.mul.i32. If the device requires
0x4000_0000 that can execute pocl.add.i32 and pocl.mul.i32. If the device requires
firmware to be loaded in, pocl will attempt to load it from <device_name>.img.
When running the example, verify that the address given in the parameter matches
the base address of the accelerator.
Note that as the driver requires write access to ``/dev/mem`` for memory
mapping, you may need to execute the application with elevated privileges. In
this case, note that ``sudo`` by default overrides your environment variables.
You can either assign them in the same command, or use ``sudo`` with the
``--preserve-env`` switch.
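
The two options in the note above can be sketched in Python as follows. This is a minimal illustration and not part of PoCL: the helper name ``sudo_command`` is hypothetical, and whether ``sudo`` accepts variable assignments on its command line depends on the local sudoers policy.

```python
# Names of the variables used in this document to configure the almaif driver.
POCL_VARS = ("POCL_DEVICES", "POCL_ALMAIF0_PARAMETERS")


def sudo_command(program, env, preserve_env=False):
    """Build an argv list that runs `program` under sudo.

    `sudo_command` is a hypothetical helper for illustration only.
    """
    if preserve_env:
        # Option 2 from the note: ask sudo to keep the caller's environment.
        return ["sudo", "--preserve-env", program]
    # Option 1 from the note: assign the variables in the same command.
    assignments = [f"{name}={env[name]}" for name in POCL_VARS if name in env]
    return ["sudo", *assignments, program]
```

The returned list can be passed to ``subprocess.run``. With ``preserve_env=True`` the variables must already be set in the calling process's environment.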
The driver supports instruction-set simulation for TTA devices. To enable it,
set the base address to 0xB, and set the <device_name> to point to a TTA
device's .adf-file and compiled firmware binary (.tpef-file). PoCL will then
@@ -224,16 +247,9 @@ start up the simulation with <device_name>.adf and, if it exists, <device_name>.
There's an alternative way to emulate the accelerator in software by
setting the base physical address to 0xE. This directs the driver to instead
use a software emulating function from almaif.cc. No changes to accel_example.cpp
are needed to run the emulation.
Note that as the driver requires write access to ``/dev/mem`` for memory
mapping, you may need to execute the application with elevated privileges. In
this case, note that ``sudo`` by default overrides your environment variables.
You can either assign them in the same command, or use ``sudo`` with the
``--preserve-env`` switch.
use a software emulating function from almaif/EmulationDevice.cc.
No changes to the source OpenCL host program (e.g. accel_example.cpp) are needed
when switching between emulation, instruction-set simulation or FPGA execution.
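
As an illustration of how the parameter string selects a backend, here is a minimal Python sketch that splits a ``POCL_ALMAIF0_PARAMETERS`` value into the fields described above. It is not the driver's actual parsing code: the function name, the dataclass and the backend labels are hypothetical, and it only distinguishes the two special base addresses mentioned in this document (0xB and 0xE), treating every other value as a physical base address.

```python
from dataclasses import dataclass
from typing import List

# Special base-address values described in the text above.
TTASIM_ADDRESS = 0xB      # instruction-set simulation for TTA devices
EMULATION_ADDRESS = 0xE   # software emulation on the host


@dataclass
class AlmaifParameters:
    base_address: int
    device_name: str
    kernel_ids: List[int]
    backend: str  # descriptive label chosen for this sketch only


def parse_almaif_parameters(value):
    """Split '<base address>,<device_name>,<kernel id>,...' into fields."""
    fields = value.split(",")
    if len(fields) < 2:
        raise ValueError("expected at least a base address and a device name")
    base_address = int(fields[0], 16)  # accepts an optional 0x prefix
    kernel_ids = [int(field) for field in fields[2:]]
    if base_address == TTASIM_ADDRESS:
        backend = "instruction-set simulation"
    elif base_address == EMULATION_ADDRESS:
        backend = "software emulation"
    else:
        # Assumption: any other value is a physical base address.
        backend = "memory-mapped hardware"
    return AlmaifParameters(base_address, fields[1], kernel_ids, backend)
```

For example, ``parse_almaif_parameters("0xE,dummy,1,2")`` yields the software-emulation label with kernel IDs 1 and 2, while the same string with ``0x40000000`` is treated as a physical base address.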
Wrapping new hardware component
-------------------------------
@@ -268,13 +284,39 @@ be copied to PYNQ-Z1.
The generate_hls_project.tcl file sets the base address of the accelerator
to a physical address 0x40000000. This base address is given to PoCL through
an environment variable::
export POCL_DEVICES=almaif
export POCL_ALMAIF0_PARAMETERS="0x40000000,dummy,1,2"
export POCL_DEVICES=almaif
export POCL_ALMAIF0_PARAMETERS="0x40000000,dummy,1,2"
The bitstream can be loaded on the FPGA in various ways. The PYNQ-Z1 image
includes a Python library for this, which can be used with the following one-liner::
sudo -E python -c "from pynq import Overlay;Overlay('examples/accel/hls/bitstreams/vecadd_1.bit')"
After that, it's possible to run the examples/accel/accel_example tests.
The ctest that runs vector addition and multiplication is the following::
sudo -E ctest -R "examples/accel/.*i32"
sudo -E python -c "from pynq import Overlay;Overlay('examples/accel/hls/bitstreams/vecadd_1.bit')"
After that, it's possible to run the examples/accel/accel_example program.
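
The one-liner above can also be written as a short script. The following is a sketch under the assumption that it runs on the PYNQ image where the ``pynq`` package is installed. The function name ``load_bitstream`` and the up-front file check are additions for illustration and are not part of PoCL or PYNQ.

```python
from pathlib import Path


def load_bitstream(bit_path):
    """Program the FPGA with the given .bit file via the pynq library.

    Hypothetical helper: checks the path first so that a wrong build
    directory gives a clear error before pynq is imported.
    """
    path = Path(bit_path)
    if not path.is_file():
        raise FileNotFoundError(f"bitstream not found: {path}")
    # Imported lazily because pynq is only available on the PYNQ image.
    from pynq import Overlay
    # Constructing the Overlay downloads the bitstream by default.
    return Overlay(str(path))
```

Calling ``load_bitstream('examples/accel/hls/bitstreams/vecadd_1.bit')`` from the PoCL build directory corresponds to the one-liner. As with the one-liner, it may need to be run with elevated privileges.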
Using this work
---------------
If you are utilizing, further developing or comparing to the AlmaIF driver of POCL
in your academic work, please cite one of the relevant publications::
@INPROCEEDINGS{leppanen2021,
AUTHOR={Leppänen, Topi and Mousouliotis, Panagiotis and Keramidas, Georgios and Multanen, Joonas and Jääskeläinen, Pekka},
BOOKTITLE={2021 IEEE Nordic Circuits and Systems Conference (NorCAS)},
TITLE={Unified OpenCL Integration Methodology for FPGA Designs},
YEAR={2021},
PAGES={1-7},
DOI={10.1109/NorCAS53631.2021.9599861}
}
@ARTICLE{leppanen2022,
AUTHOR={Leppänen, Topi and Lotvonen, Atro and Jääskeläinen, Pekka},
TITLE={Cross-vendor programming abstraction for diverse heterogeneous platforms},
JOURNAL={Frontiers in Computer Science},
VOLUME={4},
YEAR={2022},
URL={https://www.frontiersin.org/articles/10.3389/fcomp.2022.945652},
DOI={10.3389/fcomp.2022.945652},
ISSN={2624-9898},
}