Managing memory

Allocating

The vex::vector<T> class constructor accepts a const reference to std::vector<vex::backend::command_queue>. A vex::Context instance may be conveniently converted to this type, but it is also possible to initialize the command queues elsewhere (e.g. with the OpenCL backend vex::backend::command_queue is typedefed to cl::CommandQueue), thus completely eliminating the need to create a vex::Context. Each command queue in the list should uniquely identify a single compute device.

The contents of the created vector will be partitioned across all devices that were present in the queue list. The size of each partition will be proportional to the device bandwidth, which is measured the first time the device is used. All vectors of the same size are guaranteed to be partitioned consistently, which minimizes inter-device communication.
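The proportional split described above can be illustrated with a minimal sketch. It is not VexCL's actual implementation: the helper names `partition_by_weight` and `owner_of` are hypothetical, and the weights stand in for whatever bandwidth figures the library measures. Because the result depends only on the vector size and the weights, two vectors of the same size always receive the same partitioning.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of VexCL): splits n elements across devices
// in proportion to the given weights, e.g. measured bandwidths.
// Returns nparts + 1 offsets; partition d covers [offsets[d], offsets[d + 1]).
std::vector<std::size_t> partition_by_weight(std::size_t n,
                                             const std::vector<double> &weights) {
    assert(!weights.empty());
    const double total = std::accumulate(weights.begin(), weights.end(), 0.0);
    assert(total > 0.0);

    std::vector<std::size_t> offsets(weights.size() + 1, 0);
    double cumulative = 0.0;
    for (std::size_t d = 0; d < weights.size(); ++d) {
        cumulative += weights[d];
        offsets[d + 1] = static_cast<std::size_t>(n * (cumulative / total));
    }
    offsets.back() = n; // Guard against floating-point rounding.
    return offsets;
}

// Hypothetical helper: returns the partition that owns global element index i.
std::size_t owner_of(const std::vector<std::size_t> &offsets, std::size_t i) {
    assert(i < offsets.back());
    // The first offset strictly greater than i marks the end of the owning part.
    auto it = std::upper_bound(offsets.begin() + 1, offsets.end(), i);
    return static_cast<std::size_t>(it - offsets.begin()) - 1;
}
```

With the made-up weights {2.0, 1.0, 1.0} and n = 1024, the first device would hold half of the elements and the other two a quarter each.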

In the example below, three device vectors of the same size are allocated. Vector A is copied from the host vector a, and the other vectors are created uninitialized:

const size_t n = 1024 * 1024;
vex::Context ctx( vex::Filter::Any );

std::vector<double> a(n, 1.0);

vex::vector<double> A(ctx, a);
vex::vector<double> B(ctx, n);
vex::vector<double> C(ctx, n);

Assuming that the current system has an NVIDIA GPU, an AMD GPU, and an Intel CPU installed, a possible partitioning may look like this:

[Figure: partitioning of the vectors A, B, and C across the three devices]

template<typename T> class vex::vector : public vex::vector_expression<Expr>

Device vector.

Public Functions

inline vector(): Empty constructor.

inline vector(const vector &v): Copy constructor.

inline vector(vector &&v) noexcept: Move constructor.

inline vector(const backend::command_queue &q, const backend::device_vector<T> &buffer, size_t size = 0)

Wraps a native buffer without owning it.

May be used to apply VexCL functions to buffers allocated and managed outside of VexCL.

inline vector(const std::vector<backend::command_queue> &queue, size_t size, const T *host = 0, backend::mem_flags flags = backend::MEM_READ_WRITE): Creates vector of the given size and optionally copies host data.

inline vector(size_t size, const T *host = 0, backend::mem_flags flags = backend::MEM_READ_WRITE)

Creates vector of the given size and optionally copies host data.

This version uses the most recently created VexCL context.

inline vector(const std::vector<backend::command_queue> &queue, const std::vector<T> &host, backend::mem_flags flags = backend::MEM_READ_WRITE): Creates new device vector and copies the host vector.

inline vector(const std::vector<T> &host, backend::mem_flags flags = backend::MEM_READ_WRITE)

Creates new device vector and copies the host vector.

This version uses the most recently created VexCL context.

template<class Expr> inline vector(const Expr &expr)

Constructs new vector from vector expression.

This will fail if VexCL is unable to automatically determine the expression size and the compute devices to use.

inline void swap(vector &v): Swap function.

inline void resize(const vector &v, backend::mem_flags flags = backend::MEM_READ_WRITE)

Resizes the vector.

Borrows devices, size, and data from the given vector. Any data contained in the resized vector will be lost as a result.

inline void resize(const std::vector<backend::command_queue> &queue, size_t size, const T *host = 0, backend::mem_flags flags = backend::MEM_READ_WRITE)

Resizes the vector with the given parameters.

This is equivalent to reconstructing the vector with the given parameters. Any data contained in the resized vector will be lost as a result.

inline void resize(const std::vector<backend::command_queue> &queue, const std::vector<T> &host, backend::mem_flags flags = backend::MEM_READ_WRITE)

Resizes the vector.

This is equivalent to reconstructing the vector with the given parameters. Any data contained in the resized vector will be lost as a result.

inline void resize(size_t size, const T *host = 0, backend::mem_flags flags = backend::MEM_READ_WRITE): Resizes the vector.

inline void clear()

Fills vector with zeros.

This does not change the vector size!

inline const backend::device_vector<T> &operator()(unsigned d = 0) const: Returns memory buffer located on the given device.

inline backend::device_vector<T> &operator()(unsigned d = 0): Returns memory buffer located on the given device.

inline const_iterator begin() const: Returns const iterator to the first element of the vector.

inline const_iterator end() const: Returns const iterator referring to the past-the-end element in the vector.

inline iterator begin(): Returns iterator to the first element of the vector.

inline iterator end(): Returns iterator referring to the past-the-end element in the vector.

inline const element operator[](size_t index) const: Access vector element.

inline element operator[](size_t index): Access vector element.

inline const element at(size_t index) const: at() style access is identical to operator[].

inline element at(size_t index): at() style access is identical to operator[].

inline size_t size() const: Returns vector size.

inline size_t nparts() const

Returns number of vector parts.

Each partition is located on a single device.

inline size_t part_size(unsigned d) const: Returns vector part size on the given device.

inline size_t part_start(unsigned d) const: Returns index of the first element located on the given device.

inline const std::vector<backend::command_queue> &queue_list() const: Returns reference to the vector of command queues used to construct the vector.

inline backend::device_vector<T>::mapped_array map(unsigned d = 0)

Maps vector part located on the given device to a host array.

This returns a smart pointer that will be unmapped automatically upon destruction.

inline backend::device_vector<T>::mapped_array map(unsigned d = 0) const

Maps vector part located on the given device to a host array.

This returns a smart pointer that will be unmapped automatically upon destruction.

inline const vector &operator=(const vector &x): Copy assignment.

inline const vector &operator=(vector &&v): Move assignment.

template<class Expr> inline auto operator=(const Expr &expr) -> typename std::enable_if<boost::proto::matches<typename boost::proto::result_of::as_expr<Expr>::type, vector_expr_grammar>::value, const vector&>::type: Expression assignment operator.

template<class Expr> inline auto operator+=(const Expr &expr) -> typename std::enable_if<boost::proto::matches<typename boost::proto::result_of::as_expr<Expr>::type, vector_expr_grammar>::value, const vector&>::type: Expression assignment operator.

template<class Expr> inline auto operator-=(const Expr &expr) -> typename std::enable_if<boost::proto::matches<typename boost::proto::result_of::as_expr<Expr>::type, vector_expr_grammar>::value, const vector&>::type: Expression assignment operator.

template<class Expr> inline auto operator*=(const Expr &expr) -> typename std::enable_if<boost::proto::matches<typename boost::proto::result_of::as_expr<Expr>::type, vector_expr_grammar>::value, const vector&>::type: Expression assignment operator.

template<class Expr> inline auto operator/=(const Expr &expr) -> typename std::enable_if<boost::proto::matches<typename boost::proto::result_of::as_expr<Expr>::type, vector_expr_grammar>::value, const vector&>::type: Expression assignment operator.

template<class Expr> inline auto operator%=(const Expr &expr) -> typename std::enable_if<boost::proto::matches<typename boost::proto::result_of::as_expr<Expr>::type, vector_expr_grammar>::value, const vector&>::type: Expression assignment operator.

template<class Expr> inline auto operator&=(const Expr &expr) -> typename std::enable_if<boost::proto::matches<typename boost::proto::result_of::as_expr<Expr>::type, vector_expr_grammar>::value, const vector&>::type: Expression assignment operator.

template<class Expr> inline auto operator|=(const Expr &expr) -> typename std::enable_if<boost::proto::matches<typename boost::proto::result_of::as_expr<Expr>::type, vector_expr_grammar>::value, const vector&>::type: Expression assignment operator.

template<class Expr> inline auto operator^=(const Expr &expr) -> typename std::enable_if<boost::proto::matches<typename boost::proto::result_of::as_expr<Expr>::type, vector_expr_grammar>::value, const vector&>::type: Expression assignment operator.

template<class Expr> inline auto operator<<=(const Expr &expr) -> typename std::enable_if<boost::proto::matches<typename boost::proto::result_of::as_expr<Expr>::type, vector_expr_grammar>::value, const vector&>::type: Expression assignment operator.

template<class Expr> inline auto operator>>=(const Expr &expr) -> typename std::enable_if<boost::proto::matches<typename boost::proto::result_of::as_expr<Expr>::type, vector_expr_grammar>::value, const vector&>::type: Expression assignment operator.

class element

template<class vector_type, class element_type> class iterator_type : public boost::iterator_facade<iterator_type<vector_type, element_type>, T, std::random_access_iterator_tag, element_type>

Copying

The vex::copy() function copies data between host and compute device memory spaces. The function has two forms: a simple one that accepts whole vectors, and an STL-like one that accepts pairs of iterators:

std::vector<double> h(n);       // Host vector.
vex::vector<double> d(ctx, n);  // Device vector.

// Simple form:
vex::copy(h, d);    // Copy data from host to device.
vex::copy(d, h);    // Copy data from device to host.

// STL-like form:
vex::copy(h.begin(), h.end(), d.begin()); // Copy data from host to device.
vex::copy(d.begin(), d.end(), h.begin()); // Copy data from device to host.

The STL-like variant can copy sub-ranges of the vectors, or copy data from/to raw host pointers.

Vectors also overload the array subscript operator, vex::vector::operator[](), so that users may directly read or write individual vector elements. This operation is highly inefficient and should be used with caution. Iterators allow element access as well, so STL algorithms may in principle be used with device vectors. This would be very slow, but may serve as a temporary building block.
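To see why element-wise access is costly, consider a minimal sketch of a proxy-based subscript operator. It is written under the assumption that each read or write of a single element triggers its own host-device transfer. All types here (`fake_device_buffer`, `element_proxy`, `fake_device_vector`) are hypothetical stand-ins that only count transfers; they are not VexCL classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for device memory that counts host-device transfers.
// Hypothetical type for illustration; not a VexCL class.
struct fake_device_buffer {
    std::vector<double> storage;
    std::size_t transfers = 0;

    explicit fake_device_buffer(std::size_t n) : storage(n, 0.0) {}

    double read(std::size_t i)            { ++transfers; return storage[i]; }
    void   write(std::size_t i, double v) { ++transfers; storage[i] = v; }

    // Bulk copy: a single transfer regardless of the number of elements.
    void write_all(const std::vector<double> &host) {
        assert(host.size() == storage.size());
        ++transfers;
        storage = host;
    }
};

// Proxy returned by operator[]: reading or assigning it triggers a transfer.
class element_proxy {
public:
    element_proxy(fake_device_buffer &buf, std::size_t index)
        : buf_(buf), index_(index) {}

    operator double() const { return buf_.read(index_); }
    element_proxy &operator=(double v) { buf_.write(index_, v); return *this; }

private:
    fake_device_buffer &buf_;
    std::size_t index_;
};

struct fake_device_vector {
    fake_device_buffer buf;
    explicit fake_device_vector(std::size_t n) : buf(n) {}
    element_proxy operator[](std::size_t i) { return element_proxy(buf, i); }
};
```

In this model, filling a vector of n elements through operator[] costs n transfers, while one bulk copy costs a single transfer. That is the reason to prefer a whole-vector or range copy whenever more than a handful of elements is involved.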

Another option for host-device data transfer is mapping a device memory buffer to a host array. The mapped array may then be transparently read or written. The method vex::vector::map() takes a partition number d, maps the d-th partition of the vector, and returns the mapped array:

vex::vector<double> X(ctx, n);
{
    auto mapped_ptr = X.map(0); // Unmapped automatically when it goes out of scope
    for(size_t i = 0; i < X.part_size(0); ++i)
        mapped_ptr[i] = host_function(i);
}
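The automatic unmapping can be understood as an application of the RAII idiom. Below is a minimal sketch that assumes the mapped array behaves like a std::unique_ptr with a custom deleter that unmaps the buffer instead of freeing it. The class `fake_buffer` and its members are hypothetical and merely simulate a mappable buffer on the host; they do not reproduce VexCL's backend code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Stand-in for a device buffer that must be explicitly mapped and unmapped.
// Hypothetical type for illustration; not a VexCL class.
class fake_buffer {
public:
    explicit fake_buffer(std::size_t n) : data_(n, 0.0) {}

    // Deleter that "unmaps" the buffer instead of freeing memory.
    struct unmapper {
        fake_buffer *owner;
        void operator()(double *) const { --owner->mapped_; }
    };
    typedef std::unique_ptr<double[], unmapper> mapped_array;

    // Maps the buffer; the mapping ends when the returned pointer is destroyed.
    mapped_array map() {
        ++mapped_;
        return mapped_array(data_.data(), unmapper{this});
    }

    int active_mappings() const { return mapped_; }
    double at(std::size_t i) const { return data_[i]; }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<double> data_;
    int mapped_ = 0;
};
```

Enclosing the mapped pointer in its own scope, as in the example above, ensures that the mapping is released before the buffer is used on the device again.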

Shared virtual memory

Both OpenCL 2.0 and CUDA 6.0 allow the host and the compute devices to share the same virtual address range, so that there is no longer any need to copy buffers between them. In other words, there is no need to keep track of buffers and explicitly copy them across devices; pointers into the shared address space may be used instead. OpenCL 2.0 calls this concept Shared Virtual Memory (SVM), while CUDA 6.0 calls it Unified Memory. In VexCL, both are abstracted by the vex::svm_vector<T> class.

The vex::svm_vector<T> constructor, as opposed to that of vex::vector<T>, takes a single instance of vex::backend::command_queue. This is because an SVM vector has to be associated with a single device context. SVM vectors in VexCL may be used in the same way as normal vectors.

Example:

// Allocate SVM vector for the first device in context:
vex::svm_vector<int> x(ctx.queue(0), n);

// Fill the vector on the host.
{
    auto p = x.map(vex::backend::MAP_WRITE);
    for(int i = 0; i < n; ++i)
        p[i] = i * 2;
}

template<typename T> class vex::svm_vector : public vex::vector_expression<Expr>

Shared Virtual Memory wrapper class.

Public Functions

inline svm_vector(const cl::CommandQueue &q, size_t n): Allocates SVM vector on the given device.

inline size_t size() const: Returns size of the SVM vector.

inline const cl::CommandQueue &queue() const: Returns reference to the command queue associated with the SVM vector.

inline mapped_pointer map(cl_map_flags map_flags = CL_MAP_READ | CL_MAP_WRITE)

Returns host pointer ready to be either read or written by the host.

This returns a smart pointer that will be unmapped automatically upon destruction.

inline const svm_vector &operator=(const svm_vector &other): Copy assignment operator.

struct unmapper