Initialization

Selecting backend

VexCL provides the following backends:

OpenCL, built on top of the Khronos C++ API. The backend is selected when the VEXCL_BACKEND_OPENCL macro is defined, or by default. Link with libOpenCL.so on Unix-like systems or with OpenCL.dll on Windows.
Boost.Compute. This backend is also based on OpenCL, but uses the core functionality of the Boost.Compute library instead of the somewhat outdated Khronos C++ API. An additional advantage is the increased interoperability between VexCL and Boost.Compute. The backend is selected when the VEXCL_BACKEND_COMPUTE macro is defined. Link with libOpenCL.so/OpenCL.dll and make sure that the Boost.Compute headers are in the include path.
CUDA, which uses the NVIDIA CUDA technology. The backend is selected when the VEXCL_BACKEND_CUDA macro is defined. Link with libcuda.so/cuda.dll. For the CUDA backend to work, the CUDA Toolkit has to be installed, and the NVIDIA CUDA compiler driver nvcc has to be in the executable PATH and usable at runtime.

Whichever backend is selected, you will need to link to the Boost.System and Boost.Filesystem libraries. Some systems may also require linking to Boost.Thread and Boost.Date_Time. All of these are part of the Boost library collection.
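The macro-based selection can be pictured with a small preprocessor sketch. This is not VexCL's actual header logic: `selected_backend` is a hypothetical helper, and the precedence applied when several macros are defined at once is an assumption made for illustration. It only shows that the choice is fixed at compile time by macros that are visible before the header is included, and that OpenCL is the fallback.

```cpp
#include <string>

// Hypothetical helper (not part of VexCL): reports which backend the
// macros visible at this point would select. The order of the checks
// is an assumption made for this sketch.
inline std::string selected_backend() {
#if defined(VEXCL_BACKEND_CUDA)
    return "CUDA";
#elif defined(VEXCL_BACKEND_COMPUTE)
    return "Boost.Compute";
#else
    // Reached when VEXCL_BACKEND_OPENCL is defined, or when no backend
    // macro is defined at all (the documented default).
    return "OpenCL";
#endif
}
```

In a real project the macro would typically be defined either at the top of the source file, before `#include <vexcl/vexcl.hpp>`, or on the compiler command line.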

Context initialization

VexCL transparently works with multiple compute devices that are present in the system. A VexCL context is initialized with a device filter, which is just a functor that takes a const reference to a vex::backend::device instance and returns a boolean value. Several standard filters are provided (see below), and one can easily add a custom functor. Filters may be combined with logical operators. All compute devices that satisfy the provided filter are added to the created context. In the example below all GPU devices that support double precision arithmetic are selected:

#include <iostream>
#include <stdexcept>
#include <vexcl/vexcl.hpp>

int main() {
    vex::Context ctx( vex::Filter::GPU && vex::Filter::DoublePrecision );

    if (!ctx) throw std::runtime_error("No devices available.");

    // Print out list of selected devices:
    std::cout << ctx << std::endl;
}
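To make the filter mechanics concrete, here is a minimal, library-free model of them. `Device`, `Filter`, `both`, `select_devices`, and `demo_devices` are hypothetical stand-ins invented for this sketch, not VexCL types; in VexCL itself the combination is written with `&&`, as in the example above.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for vex::backend::device.
struct Device {
    std::string name;
    bool is_gpu;
    bool has_fp64;
};

// A filter is any callable that takes a const Device& and returns bool.
using Filter = std::function<bool(const Device&)>;

// Logical AND of two filters. The second filter is evaluated only if
// the first one accepts the device (short-circuit evaluation).
inline Filter both(Filter a, Filter b) {
    return [a, b](const Device &d) { return a(d) && b(d); };
}

// Keeps every device accepted by the filter, mirroring the rule that
// all devices satisfying the filter are added to the context.
inline std::vector<Device> select_devices(const std::vector<Device> &all,
                                          const Filter &f) {
    std::vector<Device> out;
    for (const Device &d : all)
        if (f(d)) out.push_back(d);
    return out;
}

// Made-up device list used only for demonstration.
inline std::vector<Device> demo_devices() {
    return { Device{"cpu0", false, true},
             Device{"gpu0", true,  false},
             Device{"gpu1", true,  true} };
}
```

With the made-up device list, combining a GPU filter with a double precision filter keeps only `gpu1`.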

One of the most convenient filters is vex::Filter::Env, which selects compute devices based on environment variables. It makes it possible to switch the compute device without recompiling the program.
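The idea behind an environment-driven filter can be sketched without the library. `matches` and `accepted_by_env` are hypothetical helpers, not VexCL functions; the sketch handles only the `OCL_DEVICE` variable (described in the Device filters section as the device name or its substring) and assumes that an unset variable imposes no restriction.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: true when `pattern` is null (variable not set,
// assumed to mean "no restriction") or is a substring of `name`.
inline bool matches(const std::string &name, const char *pattern) {
    return pattern == nullptr || name.find(pattern) != std::string::npos;
}

// Reads OCL_DEVICE at run time, so the selection can change between
// runs of the same binary.
inline bool accepted_by_env(const std::string &device_name) {
    return matches(device_name, std::getenv("OCL_DEVICE"));
}
```

Because the variable is read when the program starts, launching the same executable with a different `OCL_DEVICE` value changes which devices are accepted.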

Each stateful object in VexCL, like vex::vector<T>, takes an STL vector of vex::backend::command_queue instances. The vex::Context class is just a convenient way to initialize and hold the command queues. Since it provides the corresponding type conversion operator, it also may be used directly for object initialization:

vex::vector<double> x(ctx, n);

However, users are not required to create a vex::Context instance at all. They may use command queues that were initialized elsewhere. In the following example Boost.Compute is used as the backend and takes care of initializing the OpenCL context:

#include <iostream>
#include <boost/compute.hpp>

#define VEXCL_BACKEND_COMPUTE
#include <vexcl/vexcl.hpp>

int main() {
    boost::compute::command_queue bcq = boost::compute::system::default_queue();

    // Use Boost.Compute queue to allocate VexCL vectors:
    vex::vector<int> x({bcq}, 16);
}
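Both initialization paths work because the stateful objects only ask for a vector of command queues. The following library-free model shows that relationship; `Queue`, `Context`, and `Vector` are simplified, hypothetical stand-ins for `vex::backend::command_queue`, `vex::Context`, and `vex::vector<T>`, and they do no device work at all.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for vex::backend::command_queue.
struct Queue { int device_id; };

// Stand-in for vex::Context: it merely owns the queues.
class Context {
public:
    explicit Context(std::vector<Queue> q) : queues_(std::move(q)) {}

    // Conversion operator: lets a Context be passed wherever a vector
    // of queues is expected.
    operator const std::vector<Queue>&() const { return queues_; }

private:
    std::vector<Queue> queues_;
};

// Stand-in for vex::vector<T>: constructed from a vector of queues.
template <class T>
class Vector {
public:
    Vector(const std::vector<Queue> &q, std::size_t n)
        : nqueues_(q.size()), size_(n) {}

    std::size_t size() const { return size_; }
    std::size_t queue_count() const { return nqueues_; }

private:
    std::size_t nqueues_;
    std::size_t size_;
};
```

In this model `Vector<double> x(ctx, 16)` compiles through the conversion operator, while `Vector<double> y(queues, 8)` accepts a plain vector of queues with no context involved.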

Device filters

Common filters

These filters are supported for all backends:

vex::Filter::Any. Selects all available devices.
vex::Filter::DoublePrecision. Selects devices that support double precision arithmetic.
vex::Filter::Count(n). Selects the first n devices that are passed through the filter. This filter should be the last in the filter chain, which ensures that it is applied only to devices that passed all other filters. Otherwise, you could get fewer devices than planned (every time this filter is applied, its internal counter is decremented).
vex::Filter::Position(n). Selects single device at the given position.

vex::Filter::Env. Selects devices with respect to environment variables. Recognized variables are:

`OCL_DEVICE`	Name of the device or its substring.
`OCL_MAX_DEVICES`	Maximum number of devices to select. The effect is similar to the `vex::Filter::Count` filter above.
`OCL_POSITION`	Single device with the specified position in the list of available devices. The effect is similar to the `vex::Filter::Position` filter above.
`OCL_PLATFORM`	OpenCL platform name or its substring. Only supported for OpenCL-based backends.
`OCL_VENDOR`	OpenCL device vendor name or its substring. Only supported for OpenCL-based backends.
`OCL_TYPE`	OpenCL device type. Possible values are `CPU`, `GPU`, `ACCELERATOR`. Only supported for OpenCL-based backends.
`OCL_EXTENSION`	OpenCL device supporting the specified extension. Only supported for OpenCL-based backends.

vex::Filter::Exclusive(filter). This is a filter wrapper that provides exclusive access to compute devices. This may be helpful if several compute devices are present in the system and several processes are trying to grab a single device. Exclusivity is only guaranteed between processes that use the Exclusive filter wrapper.
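The advice to put vex::Filter::Count last in the chain follows from short-circuit evaluation combined with the decrementing counter. The sketch below reproduces the effect with plain lambdas; `Device` and `count_selected` are hypothetical names, and the counting rule is a model of the behaviour described above rather than VexCL's code.

```cpp
#include <vector>

// Hypothetical stand-in for a compute device.
struct Device { bool is_gpu; };

// Counts how many devices pass "GPU and at most `limit` devices",
// evaluating the counting filter either first or last.
inline int count_selected(const std::vector<Device> &devices,
                          int limit, bool count_first) {
    int counter = limit;
    auto is_gpu = [](const Device &d) { return d.is_gpu; };
    // Model of the counting filter: every call decrements the counter.
    auto count = [&counter](const Device &) { return counter-- > 0; };

    int selected = 0;
    for (const Device &d : devices) {
        // && short-circuits, so the right-hand filter only runs for
        // devices accepted by the left-hand one.
        const bool ok = count_first ? (count(d) && is_gpu(d))
                                    : (is_gpu(d) && count(d));
        if (ok) ++selected;
    }
    return selected;
}
```

For two CPUs followed by three GPUs and a limit of 2, counting last selects two GPUs, whereas counting first spends the whole budget on the CPUs and selects nothing.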

OpenCL-specific filters

These filters are only available for OpenCL and Boost.Compute backends:

vex::Filter::CLVersion(major,minor). Selects devices that support the specified version of OpenCL standard.
vex::Filter::Extension(string). Selects devices that provide the specified extension.
vex::Filter::GLSharing. Selects devices that support OpenGL sharing extension. This is a shortcut for vex::Filter::Extension("cl_khr_gl_sharing").
vex::Filter::Type(cl_device_type). Selects devices with the specified device type. The device type is a bit mask.
vex::Filter::GPU. Selects GPU devices. This is a shortcut for vex::Filter::Type(CL_DEVICE_TYPE_GPU).
vex::Filter::CPU. Selects CPU devices. This is a shortcut for vex::Filter::Type(CL_DEVICE_TYPE_CPU).
vex::Filter::Accelerator. Selects Accelerator devices. This is a shortcut for vex::Filter::Type(CL_DEVICE_TYPE_ACCELERATOR).
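Since the device type is a bit mask, several types can be combined with bitwise OR. The sketch below redefines the relevant constants locally, with the same values as `CL_DEVICE_TYPE_CPU`, `CL_DEVICE_TYPE_GPU`, and `CL_DEVICE_TYPE_ACCELERATOR` in the OpenCL headers, so that it builds without OpenCL. `type_matches` is a hypothetical helper, and the "non-empty intersection" rule it applies is an assumption about how such a filter would test the mask.

```cpp
#include <cstdint>

// Local copies of the OpenCL device type bits, so that the sketch
// does not need the OpenCL headers.
using device_type = std::uint64_t;
constexpr device_type TYPE_CPU         = 1u << 1;
constexpr device_type TYPE_GPU         = 1u << 2;
constexpr device_type TYPE_ACCELERATOR = 1u << 3;

// Assumed semantics: a device matches when its type shares at least
// one bit with the requested mask.
inline bool type_matches(device_type device, device_type mask) {
    return (device & mask) != 0;
}
```

Under that assumption, a mask of `TYPE_GPU | TYPE_CPU` accepts both GPUs and CPUs and rejects accelerators.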

Custom filters

In case more complex functionality is required than provided by the builtin filters, the users may introduce their own functors:

// Select a GPU with at least 4GiB of global memory:
vex::Context ctx(vex::Filter::GPU &&
                 [](const vex::backend::device &d) {
                     size_t GiB = 1024 * 1024 * 1024;
                     return d.getInfo<CL_DEVICE_GLOBAL_MEM_SIZE>() >= 4 * GiB;
                 });
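A custom filter does not have to be a lambda; a small function object with its own state works as well. The following library-free sketch uses hypothetical names (`Device`, `MinGlobalMem`, `GiB`) and a plain struct in place of `vex::backend::device`. It keeps the threshold in a 64-bit integer, because `4 * GiB` does not fit into a 32-bit `size_t`.

```cpp
#include <cstdint>

// Hypothetical stand-in for a device with a known global memory size.
struct Device {
    std::uint64_t global_mem_bytes;
    bool is_gpu;
};

// One gibibyte, computed in 64-bit arithmetic.
constexpr std::uint64_t GiB = 1024ull * 1024ull * 1024ull;

// Stateful filter: accepts devices with at least `min_bytes` of
// global memory.
struct MinGlobalMem {
    std::uint64_t min_bytes;

    bool operator()(const Device &d) const {
        return d.global_mem_bytes >= min_bytes;
    }
};
```

A filter built as `MinGlobalMem{4 * GiB}` accepts a device with exactly 4 GiB, matching the `>=` comparison in the example above.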

Reference

class vex::Context

VexCL context.

Holds vectors of vex::backend::context and vex::backend::command_queue instances.

Public Functions

template<class DevFilter> inline explicit Context(DevFilter &&filter, vex::backend::command_queue_properties properties = 0): Initializes context from the device filter.

inline Context(std::vector<vex::backend::context> c, std::vector<vex::backend::command_queue> q): Initializes context from the user-supplied vectors of vex::backend::context and vex::backend::command_queue instances.

inline const std::vector<vex::backend::context> &context() const: Returns reference to the vector of initialized vex::backend::context instances.

inline vex::backend::context &context(unsigned d): Returns reference to the specified vex::backend::context instance.

inline const std::vector<vex::backend::command_queue> &queue() const: Returns reference to the vector of initialized vex::backend::command_queue instances.

inline operator const std::vector<vex::backend::command_queue>&() const: Returns reference to the vector of initialized vex::backend::command_queue instances.

inline const vex::backend::command_queue &queue(unsigned d) const: Returns reference to the specified vex::backend::command_queue instance.

inline vex::backend::device device(unsigned d) const: Returns the specified vex::backend::device instance.

inline size_t size() const: Returns number of initialized devices.

inline bool empty() const: Checks if the context is empty.

inline operator bool() const: Checks if the context is not empty, that is, whether it holds at least one device.

inline void finish() const: Waits for completion of all command queues in the context.

template<class DevFilter> std::vector<vex::backend::device> vex::backend::device_list(DevFilter &&filter): Returns a vector of compute devices satisfying the given criteria without trying to initialize contexts on the devices.