Mathematical functions - Built-in functions

Chapter 5. The OpenCL C language

6. Built-in functions

results[idx - offset]= 0;

return;

}

results[idx - offset]= 1;

}

In the next sample code, the arguments of the kernel are two pointers to one- or two-dimensional datasets (input, output). The goal of the kernel is to find the maximum⁴ over the data element of input identified by the index of the work-item and its direct neighbors⁵, and to write the result to the position with the same index in the dataset output.

Example 5.11. localMaximum.k

__kernel void localMaximum(__global float* input, __global float* output) {
  uint dim= get_work_dim(), pos, x, size0, size;
  float maxValue;

  if ( dim == 1 ) {
    pos= get_global_id(0);
    size= get_global_size(0);
    maxValue= input[pos];
    // pos is unsigned: test pos > 0, because pos - 1 >= 0 is always true
    if ( pos > 0 && input[pos-1] > maxValue ) maxValue= input[pos-1];
    if ( pos + 1 < size && input[pos+1] > maxValue ) maxValue= input[pos+1];
    output[pos]= maxValue;
  } else {
    x= get_global_id(0);
    size0= get_global_size(0);
    size= size0*get_global_size(1);
    pos= get_global_id(1)*size0 + x;
    maxValue= input[pos];
    // neighbors in the previous and in the next row
    if ( pos >= size0 && input[pos - size0] > maxValue ) maxValue= input[pos-size0];
    if ( pos + size0 < size && input[pos + size0] > maxValue ) maxValue= input[pos+size0];
    // neighbors in the same row; the column index is tested so that rows do not wrap around
    if ( x > 0 && input[pos - 1] > maxValue ) maxValue= input[pos-1];
    if ( x + 1 < size0 && input[pos + 1] > maxValue ) maxValue= input[pos+1];
    output[pos]= maxValue;
  }
}
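To check the output of such a kernel on the host, a plain C reference for the one-dimensional case can be handy. The following is only a minimal sketch and not part of the original sample: the function name localMaximum1D and its signature are assumptions made for illustration, and the loop index i plays the role of the global ID of the work-item.

```c
#include <stddef.h>

/* Hypothetical host-side reference for the 1D case:
   out[i] = max(in[i-1], in[i], in[i+1]), skipping neighbors outside the array. */
void localMaximum1D(const float* in, float* out, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        float m = in[i];
        /* i is unsigned, so it is compared to 0 before subtracting */
        if (i > 0 && in[i - 1] > m) m = in[i - 1];
        if (i + 1 < n && in[i + 1] > m) m = in[i + 1];
        out[i] = m;
    }
}
```

For the input {1, 5, 2, 0, 3} this reference produces {5, 5, 5, 3, 3}, which can be compared element by element with the buffer read back from the device.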

⁴ In digital signal processing or image processing, this operation is called grayscale dilation. The structuring element of this morphological operation covers the direct neighbors only.

⁵ In the case of one-dimensional datasets, the direct neighbors of the element with index i are the elements with indices i-1 and i+1.

6.2. Mathematical functions

The functions declared in the math.h header of ANSI C (cos, sin, tan, acos, asin, atan, atan2, cosh, sinh, tanh, exp, frexp, ldexp, log, log10, modf, pow, sqrt, ceil, fabs, floor, fmod) are available under the same names for float, double, floatn and doublen type parameters as well. In each case, the type of the returned value is the same as the type of the argument. When the functions are called with vector data types, the operations are performed componentwise. When scalar values of other arithmetic types are used as arguments, one can rely on their implicit conversion. However, when the arguments have non-floating-point vector types, explicit conversion functions (for example convert_float4) have to be applied before passing them to these functions. For example, the function sqrt cannot be called with an integer vector without first converting it to a floating-point vector type.

The remaining functions are only listed here; their detailed specification and description can be found in the OpenCL specification. The aim of the list is to give the reader an impression of the mathematical operations that can be carried out by simply calling the proper built-in function of the OpenCL C environment:

• further trigonometric and exponential functions (atanh, atanpi, atan2pi, exp2, exp10, ilogb, ldexp, pown, powr, rootn, etc.),

• functions to find minima/maxima (fmax, fmin, maxmag, minmag, etc.),

• functions determining the fractional part of real numbers (fract, etc.),

• extraction of mantissa and exponent, or of integral and fractional parts (frexp, modf),

• computation of the gamma and log-gamma functions (tgamma, lgamma, lgamma_r),

• fast MAD⁶ operation: a*b + c (mad),

• functions related to floating-point division (remainder, remquo),

• faster variants of some functions that take float arguments but guarantee only reduced accuracy (half_*),

• variants of some functions that map to the native instructions of the device, with implementation-defined accuracy (native_*),

• conversion of degrees to radians (radians),

• sign, step and smoothstep functions (sign, step, smoothstep),

• function returning the number of components of a vector (vec_step),

• functions producing permutations of vector components (shuffle, shuffle2).
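To make a few entries of the list more concrete, the following plain C sketch mimics what sign, step and smoothstep compute for scalar float arguments, following their definitions in the OpenCL specification. The ref_ names are hypothetical, the handling of NaN arguments is omitted, and the real built-ins also accept vector types.

```c
/* Plain C sketches of three OpenCL C built-ins for scalar float arguments.
   The ref_ prefix is hypothetical; NaN arguments are not handled. */

/* 1 for positive x, -1 for negative x, and x itself (a signed zero) otherwise */
float ref_sign(float x) {
    return (x > 0.0f) ? 1.0f : (x < 0.0f) ? -1.0f : x;
}

/* 0 below the edge, 1 at or above it */
float ref_step(float edge, float x) {
    return (x < edge) ? 0.0f : 1.0f;
}

/* smooth Hermite interpolation between 0 and 1 on [edge0, edge1] */
float ref_smoothstep(float edge0, float edge1, float x) {
    float t = (x - edge0) / (edge1 - edge0);
    if (t < 0.0f) t = 0.0f;
    if (t > 1.0f) t = 1.0f;
    return t * t * (3.0f - 2.0f * t);
}
```

With edge0 = 0 and edge1 = 1, ref_smoothstep returns 0 for x ≤ 0, 1 for x ≥ 1, and 0.5 at the midpoint x = 0.5.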

Besides the mathematical functions, one can use the following constants when working with single- and double-precision floating-point numbers (note that the constants are available in the host program as well, using the prefix CL_):

float        double       Description
FLT_MAX      DBL_MAX      The largest finite positive value that can be represented.
FLT_MIN      DBL_MIN      The smallest positive normalized value that can be represented.
FLT_EPSILON  DBL_EPSILON  The difference between 1 and the smallest representable value greater than 1.
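The role of these limits is easy to misread: FLT_MIN is a positive number (the most negative finite float is -FLT_MAX), and FLT_EPSILON describes the spacing of floats at 1.0 rather than the smallest positive float. Standard C defines the same names in float.h, so this can be checked on the host. The sketch below is illustrative only; the helper name changesOne is hypothetical.

```c
#include <float.h>

/* Hypothetical helper: returns 1 if adding eps to 1.0f gives a float
   different from 1.0f. The volatile store forces the sum to be rounded
   to single precision before the comparison. */
int changesOne(float eps) {
    volatile float sum = 1.0f + eps;
    return sum != 1.0f;
}
```

With IEEE 754 single precision, changesOne(FLT_EPSILON) is 1, while changesOne(FLT_EPSILON / 2.0f) is 0, because the half-step sum is rounded back to 1.0f.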

The macros defining the most important mathematical constants are summarized below:

float         double      Description
M_E_F         M_E         The value of e.
M_LOG2E_F     M_LOG2E     The value of log2 e.
M_LOG10E_F    M_LOG10E    The value of log10 e.
M_LN2_F       M_LN2       The value of loge 2.
M_LN10_F      M_LN10      The value of loge 10.
M_PI_F        M_PI        The value of π.
M_PI_2_F      M_PI_2      The value of π/2.
M_PI_4_F      M_PI_4      The value of π/4.
M_1_PI_F      M_1_PI      The value of 1/π.
M_2_PI_F      M_2_PI      The value of 2/π.
M_2_SQRTPI_F  M_2_SQRTPI  The value of 2/√π.
M_SQRT2_F     M_SQRT2     The value of √2.
M_SQRT1_2_F   M_SQRT1_2   The value of 1/√2.

⁶ MAD (Multiply-ADd) is a ternary operation; the result is MAD(a, b, c) ≈ a*b + c. The MAD functions may compute the result only approximately, but they can be much faster than computing the multiplication and the addition separately. Since this elementary operation is very common in a wide range of algorithms, some processors already have variants of it in their low-level instruction set. When the MAD operation is used many times in a program, the use of the approximating function can greatly accelerate the application, provided that the highest available computational precision is not required.

The β function is a very important element of probability theory and statistics, defined by the Γ function, in the following way:

$$\beta(x, y) = \frac{\Gamma(x)\,\Gamma(y)}{\Gamma(x + y)} \qquad (5.1)$$

In the next sample code the β function is evaluated for several combinations of the x and y parameters. The arguments of the kernel function are the arrays containing the x and y values, and another array is passed to store the results (output). The assumption is that the index range of parallel execution is one-dimensional, and the size of the range equals the size of the array arguments.

Example 5.12. betaKernel.k

__kernel void betaKernel(__global float* input1, __global float* input2, __global float* output) {
  uint idx= get_global_id(0);

  output[idx]= tgamma(input1[idx])*tgamma(input2[idx])/tgamma(input1[idx] + input2[idx]);
}
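A simple way to sanity-check the results of this kernel is to feed it positive integer arguments, since Γ(n) = (n-1)! holds for them and the expected values can be computed without any gamma implementation. The following host-side C sketch does exactly that; the function names gammaOfInteger and betaOfIntegers are hypothetical and not part of the original sample.

```c
/* Hypothetical helper: Γ(n) = (n-1)! for a positive integer n, as a double. */
double gammaOfInteger(unsigned n) {
    double f = 1.0;
    for (unsigned k = 2; k < n; ++k) f *= (double)k;
    return f;
}

/* β(m, n) for positive integers m and n, using the same formula as the kernel. */
double betaOfIntegers(unsigned m, unsigned n) {
    return gammaOfInteger(m) * gammaOfInteger(n) / gammaOfInteger(m + n);
}
```

For example, β(2, 3) = 1!·2!/4! = 1/12, so for input1[idx] = 2 and input2[idx] = 3 the kernel should return a value close to 0.0833, within single-precision tolerance.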

Note that the function tgamma used to evaluate the Γ function is not part of the ANSI C (C89) header math.h; it was added to the C standard library only in C99. If an application restricted to ANSI C has to carry out statistical tests or evaluate functions defined through the Γ function (like the β function, or the probability density and cumulative distribution functions of the χ² distribution), third-party numerical libraries have to be used.

There are also many built-in functions operating on signed and unsigned, scalar and vector integer types. These functions are briefly overviewed in the following list:

• absolute value functions (abs, abs_diff),

• addition and its variants (add_sat, hadd),

• multiplication and its variants (mul_hi, mul24),

• finding minimum and maximum (min, max, clamp, etc.),

• the MAD operation and its variants (mad_hi, mad_sat, mad24, etc.),

• combining two values into one of double width: result= (hi << 8*sizeof(hi)) | lo, where the type of the result is twice as wide as the type of hi (upsample),

• and finally, a handy function to determine the number of bits set in an integer: popcount.
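The semantics of some of these integer built-ins can be illustrated by plain C emulations for one fixed width each. This is only a sketch: the ref_ names are hypothetical, and the real OpenCL C functions are overloaded for all integer scalar and vector types.

```c
#include <stdint.h>

/* upsample for 8-bit inputs: hi fills the upper byte, lo the lower byte */
uint16_t ref_upsample(uint8_t hi, uint8_t lo) {
    return (uint16_t)(((uint16_t)hi << 8) | lo);
}

/* add_sat for 32-bit signed integers: clamps instead of wrapping around */
int32_t ref_add_sat(int32_t x, int32_t y) {
    int64_t s = (int64_t)x + (int64_t)y;
    if (s > INT32_MAX) return INT32_MAX;
    if (s < INT32_MIN) return INT32_MIN;
    return (int32_t)s;
}

/* hadd for 32-bit unsigned integers: (x + y) >> 1 without intermediate overflow */
uint32_t ref_hadd(uint32_t x, uint32_t y) {
    return (x & y) + ((x ^ y) >> 1);
}

/* mul_hi for 32-bit unsigned integers: upper 32 bits of the 64-bit product */
uint32_t ref_mul_hi(uint32_t x, uint32_t y) {
    return (uint32_t)(((uint64_t)x * y) >> 32);
}
```

For instance, ref_upsample(0x12, 0x34) yields 0x1234, and ref_add_sat(INT32_MAX, 1) stays at INT32_MAX instead of overflowing.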

The lower and upper bounds of the ranges of integer types are defined as constants with descriptive names:

type    minimum    maximum
char    CHAR_MIN   CHAR_MAX
char    SCHAR_MIN  SCHAR_MAX
int     INT_MIN    INT_MAX
long    LONG_MIN   LONG_MAX
short   SHRT_MIN   SHRT_MAX
uchar   -          UCHAR_MAX
ushort  -          USHRT_MAX
uint    -          UINT_MAX
ulong   -          ULONG_MAX

There is a further constant called CHAR_BIT, giving the number of bits used to represent character types. The values of the constants are available in the host program by using the prefix CL_.

The next sample code implements the following functionality. Two arrays of unsigned long integers are given: array input of length n and array x of length m. An output array (result) of length m, used to store integers, is also given. The value written to result[idx] is computed in the following way⁷: take the number x[idx] and determine the greatest number of set bits it has in common with any of the elements input[i], i ∈ [0, ..., n-1]. One can assume that the size of the index range equals the size of the array x.

Example 5.13. commonBits.k

__kernel void bitknn(__global ulong* input, uint n, __global ulong* x, __global uint* result) {
  uint idx= get_global_id(0);
  uint i;
  uint maxCommonBits= 0;

  // popcount returns ulong for ulong arguments; the cast gives both arguments of max the same type
  for ( i= 0; i < n; ++i )
    maxCommonBits= max((uint)popcount(input[i] & x[idx]), maxCommonBits);

  result[idx]= maxCommonBits;
}
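The same computation can be written as a sequential host-side reference in plain C, for example to verify the buffer read back from the device. The sketch below is an illustration only: the names countBits and commonBitsReference are hypothetical, the outer loop over idx replaces the work-items, and the bit count is implemented by hand because standard C has no popcount function.

```c
#include <stddef.h>
#include <stdint.h>

/* Counts the set bits of v by repeatedly clearing the lowest set bit. */
unsigned countBits(uint64_t v) {
    unsigned c = 0;
    while (v) { v &= v - 1; ++c; }
    return c;
}

/* Hypothetical host-side reference: result[idx] is the largest number of
   set bits that x[idx] has in common with any input[i]. */
void commonBitsReference(const uint64_t* input, size_t n,
                         const uint64_t* x, size_t m, unsigned* result) {
    for (size_t idx = 0; idx < m; ++idx) {
        unsigned best = 0;
        for (size_t i = 0; i < n; ++i) {
            unsigned c = countBits(input[i] & x[idx]);
            if (c > best) best = c;
        }
        result[idx] = best;
    }
}
```

With input = {0xF0, 0x0F, 0xFF} and x = {0x3C, 0x01}, the reference writes {4, 1} to result.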

In document: György Kovács, OpenCL (pages 122-125)