
In document György Kovács OpenCL (pages 116-120)


Chapter 5. The OpenCL C language

2. Data types

3.1. Arithmetic conversion

An important question about arithmetic expressions is the following: when a conversion has to be applied to the operands of an operation, which operand is converted, and what is the type of the result? The list below answers these questions briefly. It does not cover every case, but following these rules is sufficient to write correct code. The special cases are described in the OpenCL specification.

• The conversion of atomic types follows the rules of the C99 standard.

• There is no way to perform arithmetic operations on different vector types, since the implicit conversion of vector types is not allowed.

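• If the expression contains exactly one vector type and the other operands are atomic, each atomic operand (of type T1) is converted to the element type of the vector (T2) and then replicated into every component. According to the OpenCL specification, this is allowed only when the rank of T1 is not greater than the rank of T2; otherwise a compilation error occurs. T1 has greater rank than T2 if one of the following conditions is met: both are floating-point types and the precision of T1 is greater than that of T2; T1 is a floating-point type and T2 is an integer type; both are integer types and the precision of T1 is greater than that of T2; T1 and T2 are integer types with the same precision but T1 is unsigned and T2 is signed; T2 is the logical type, which has the lowest rank.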
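The rank rule for mixing one vector with atomic operands can be modelled on the host in plain C. The sketch below is only an illustration, not kernel code: the descriptor type_info and the functions scalar_rank_exceeds and example_usage are hypothetical names, and the floating-point case is simplified to a comparison of bit widths. The function returns 1 exactly when the atomic type outranks the element type of the vector, the case the OpenCL specification treats as an error.

```c
#include <assert.h>

/* Hypothetical descriptor of an atomic type or of a vector's element type. */
typedef struct {
    int is_float;     /* 1 for floating-point types, 0 for integer types */
    int bits;         /* size of the representation in bits */
    int is_unsigned;  /* meaningful for integer types only */
} type_info;

/* Returns 1 if atomic type t1 has greater rank than vector element type t2
   (mixing them is an error); returns 0 if t1 may be converted to t2 and
   replicated into every component of the vector. */
int scalar_rank_exceeds(type_info t1, type_info t2)
{
    if (t1.is_float && t2.is_float)
        return t1.bits > t2.bits;             /* e.g. double with a floatn */
    if (t1.is_float)
        return 1;                             /* floating-point with an integer vector */
    if (t2.is_float)
        return 0;                             /* integer with a floating-point vector */
    if (t1.bits != t2.bits)
        return t1.bits > t2.bits;             /* e.g. long with an intn */
    return t1.is_unsigned && !t2.is_unsigned; /* e.g. uint with an intn */
}

/* Usage: an int next to a floatn is accepted, a double next to a floatn is not. */
void example_usage(void)
{
    type_info t_int = {0, 32, 0}, t_float = {1, 32, 0}, t_double = {1, 64, 0};
    assert(scalar_rank_exceeds(t_int, t_float) == 0);
    assert(scalar_rank_exceeds(t_double, t_float) == 1);
}
```

In this model an int operand may be combined with a float4, while a double with a float4 or a uint with an int4 is rejected.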

3.2. Operators

The operators can be used according to the rules described previously. The operands can be both atomic and vector types:

• if both operands are atomic types, implicit conversion is carried out;

• when one of the operands is atomic and the other one is vector type, conversion is carried out according to the rules described in the previous section;

• when both operands are vector types, a compilation error occurs unless the types are the same.

When at least one of the operands is vector type, the type of the result is also a vector, and the operations are carried out componentwise.
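The componentwise evaluation of a mixed vector and atomic expression can be imitated in plain C. This is a host-side sketch under stated assumptions, not OpenCL C: the struct int4_model stands in for the built-in type int4, and splat, add4, add_scalar and example_usage are hypothetical helper names. It models the expression v + x for an int4 v and an int x: the atomic operand is first replicated into a vector, then the addition is carried out component by component.

```c
#include <assert.h>

/* Plain C stand-in for an OpenCL int4 (hypothetical name). */
typedef struct { int s[4]; } int4_model;

/* Widens an atomic value into a vector by replicating it into every component. */
int4_model splat(int x)
{
    int4_model v = {{x, x, x, x}};
    return v;
}

/* Componentwise addition of two vectors of the same type. */
int4_model add4(int4_model a, int4_model b)
{
    int4_model r;
    for (int n = 0; n < 4; ++n)
        r.s[n] = a.s[n] + b.s[n];
    return r;
}

/* Models the OpenCL C expression v + x with an int4 v and an int x. */
int4_model add_scalar(int4_model v, int x)
{
    return add4(v, splat(x));
}

/* Usage: (1, 2, 3, 4) + 2 gives (3, 4, 5, 6). */
void example_usage(void)
{
    int4_model v = {{1, 2, 3, 4}};
    int4_model r = add_scalar(v, 2);
    assert(r.s[0] == 3 && r.s[1] == 4 && r.s[2] == 5 && r.s[3] == 6);
}
```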

3.2.1. Arithmetic operators

Addition (+), subtraction (-), multiplication (*) and division (/) can be applied to operands of any types, provided they can be converted by the rules described above. The unary operators + and - can be applied to any built-in type.

However, the increment (++) and decrement (--) operators can be applied to built-in integer types only; floating-point values can be neither incremented nor decremented with the operators ++ and --.

3.2.2. Comparison and logical operators

The comparison operators (>, <, >=, <=, ==, !=) yield an int value when the operands are atomic types: the result is 1 if the relation holds, and 0 otherwise. When the operands are vectors of n components, the result is a vector of n signed integers whose components have the same size as those of the operands (see footnote 3): the value of the ith component is -1 if the relation holds for the ith components of the operands, and 0 otherwise.

The bitwise operators (&, |, ^, ~) can be used with all built-in integer types.

The logical operators (&&, ||) follow the rule of short-circuit evaluation only when the operands are atomic types; with vector operands, both operands are always evaluated. The result follows the rules of the comparison operators. The operator ! can be applied to both atomic and vector types, and its result also follows the rules of the comparison operators.
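The difference between atomic and vector comparison results can be made concrete with a small host-side model in plain C. It is a sketch, not kernel code: int4_model is a hypothetical stand-in for int4, and less_atomic, less4, not4 and example_usage are illustrative names. The atomic comparison returns 1 or 0, while the vector versions return -1 or 0 in each component.

```c
#include <assert.h>

/* Plain C stand-in for an OpenCL int4 (hypothetical name). */
typedef struct { int s[4]; } int4_model;

/* Atomic comparison: as in C99, the result is the int value 1 or 0. */
int less_atomic(int a, int b)
{
    return a < b;
}

/* Vector comparison: a component of the result is -1 if the relation holds
   for the corresponding pair of components, and 0 otherwise. */
int4_model less4(int4_model a, int4_model b)
{
    int4_model r;
    for (int n = 0; n < 4; ++n)
        r.s[n] = (a.s[n] < b.s[n]) ? -1 : 0;
    return r;
}

/* Vector logical negation follows the same -1 / 0 convention. */
int4_model not4(int4_model a)
{
    int4_model r;
    for (int n = 0; n < 4; ++n)
        r.s[n] = (a.s[n] == 0) ? -1 : 0;
    return r;
}

/* Usage: (4, 5, 6, 7) < (9, 8, 7, 6) gives (-1, -1, -1, 0). */
void example_usage(void)
{
    int4_model a = {{4, 5, 6, 7}}, b = {{9, 8, 7, 6}};
    int4_model r = less4(a, b);
    assert(less_atomic(4, 9) == 1);
    assert(r.s[0] == -1 && r.s[1] == -1 && r.s[2] == -1 && r.s[3] == 0);
}
```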

3 For example, when charn or ucharn vectors are compared, the result is of type charn. When longn, ulongn or doublen vectors are compared, the result is of type longn.

3.2.3. Other operators

The first operand of the ternary operator (?:) can be of atomic or vector type. For atomic types, the operator follows the rules of the C99 standard; for vector types, the result is computed componentwise. When this operator is used, the rules of implicit conversion must be respected. The shift operators (<<, >>) can be applied to any integer atomic or vector type; in the latter case, the result is computed componentwise. The comma operator (,) follows the rules of the C99 standard: the expressions on the left and right hand sides of the operator are evaluated from left to right, and the result of the operator is the value of the expression on the right hand side. The indirection (*) and address-of (&) operators follow the rules specified in the C99 standard.
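The evaluation order of the comma operator can be checked in plain C99, whose rules OpenCL C adopts for this operator. The sketch below is host-side C, not kernel code; eval_log, step, comma_demo and example_usage are hypothetical helper names. It records which operand is evaluated first: C99 evaluates the left operand, then the right one, and the value of the right operand is the result.

```c
#include <assert.h>

/* Records the order in which operands are evaluated (hypothetical helper type). */
typedef struct {
    int order[2];
    int n;
} eval_log;

/* Notes that the operand with the given id was evaluated and passes value through. */
int step(eval_log *log, int id, int value)
{
    log->order[log->n++] = id;
    return value;
}

/* Evaluates a comma expression whose operands have the ids 1 (left) and 2 (right). */
int comma_demo(eval_log *log)
{
    log->n = 0;
    return (step(log, 1, 10), step(log, 2, 20));
}

/* Usage: the left operand runs first and the right operand supplies the value. */
void example_usage(void)
{
    eval_log log;
    int result = comma_demo(&log);
    assert(result == 20);
    assert(log.order[0] == 1 && log.order[1] == 2);
}
```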

The assignment operator (=) and its variants (+=, -=, *=, /=, %=, <<=, >>=, &=, |=, ^=) can be used with the same syntax and semantics as specified in the C99 standard; however, the programmer must take care to respect the rules of implicit conversion.

The use of operators can be easily interpreted. With a few exceptions, the operators for atomic types follow the rules of the C99 standard. However, in the case of vector operands one must be careful with the limited capabilities of implicit conversion. The use of some operators is presented in the following sample code.

Example 5.8. operators.k:3-11

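// values below assume i == (1, 2, 3, 4) and j == (9, 8, 7, 6); the declarations precede this excerpt
k= i*j;        //k == (9, 16, 21, 24)
k= i%j;        //k == (1, 2, 3, 4)
++k;           //k == (2, 3, 4, 5)
k+= i;         //k == (3, 5, 7, 9)
k= k + 2;      //k == (5, 7, 9, 11)
j= k < j;      //j == (-1, -1, 0, 0)
k= !(i == j);  //k == (-1, -1, -1, -1)
}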

4. Qualifiers

Besides the half, bool and vector data types, another novelty of OpenCL C is a set of qualifiers related to address spaces, kernels and access privileges. These qualifiers are keywords of the language.

When the memory model of OpenCL devices was introduced, four types of memory were described: global, constant, local and private. At the conceptual level, the main difference lies in the access privileges provided for the host program and the workitems. At the physical level, the different kinds of memory can reside in different parts of the physical device; thus, there are significant differences in their sizes and in the speed of reading and writing them. Usually, reading the constant memory is much faster than reading the global memory. However, access to the local and private memories can be even faster. The address spaces corresponding to the different memories are identified by the qualifiers __global, __constant, __local and __private. These keywords can be used in declarations and in the argument lists of function definitions. Semantically, the keywords specify the memory in which the object is allocated. Syntactically, they are used like type modifiers.

• Pointers declared with the __global qualifier can take values from the address space of the global memory. The __global qualifier may appear in argument lists and in the declarations of local pointer variables as well.

• Pointers qualified by __constant can take values from the address space of the constant memory. Program-level constants must be declared with the qualifier __constant, and can be initialized with literals or constant expressions at compile time. Attempting to write to an object qualified by __constant at run time results in an error.

• Variables declared with __local are created in the local memory and pointers declared with __local take values from the address space of the local memories. The regions allocated in the local memory can be accessed by each workitem in a workgroup, thus, shared variables can be implemented by them. Variables with __local qualifier can be declared only in kernel functions, and in each workgroup only one region of memory is allocated, even though all the workitems of the group seem to contain an individual declaration.

Variables declared with __local cannot be initialized in the declaration, but values can be assigned to them in distinct assignment statements.

• Variables declared with __private are allocated in the private memory, and pointers with the qualifier __private can take values from the address space of the private memory. By default, local variables of kernel functions declared without an address space qualifier are __private; the formal arguments of both kernel and non-kernel functions are also __private, and the local variables of non-kernel functions can only be __private. Furthermore, pointers declared without an address space qualifier can take values from the address space of the private memory only. In practice, the private memory is small, consisting of only some registers, so access to it is extremely fast. However, the programmer should use private memory carefully: when too many variables are allocated in it, neither a compilation nor a runtime error occurs, but regions of the global memory are used to extend the private memory. Since reading the global memory is much slower than reading the private memory, a significant drop in the performance of the application can be expected.
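The difference between one shared __local region per workgroup and one private copy per workitem can be pictured with a host-side model in plain C. This is only a sketch under loud assumptions: GROUP_SIZE, local_region, private_region, run_group and example_usage are hypothetical names, and the workitems are run one after another, whereas a real device runs them concurrently and would need synchronization for the shared update.

```c
#include <assert.h>

#define GROUP_SIZE 4  /* hypothetical number of workitems in the workgroup */

/* One instance per workgroup: models a variable declared with __local. */
typedef struct { int t; } local_region;

/* One instance per workitem: models a variable in the private memory. */
typedef struct { int p; } private_region;

/* Runs the workitems of one group sequentially (an assumption of this model). */
void run_group(local_region *local, private_region priv[GROUP_SIZE])
{
    for (int id = 0; id < GROUP_SIZE; ++id) {
        priv[id].p = id * 10;  /* each workitem writes its own private copy */
        local->t += 1;         /* every workitem updates the same shared region */
    }
}

/* Usage: the shared counter sees all updates, the private values stay separate. */
void example_usage(void)
{
    local_region shared = {0};
    private_region priv[GROUP_SIZE];
    run_group(&shared, priv);
    assert(shared.t == GROUP_SIZE);
    assert(priv[1].p == 10 && priv[2].p == 20);
}
```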

Besides the address space qualifiers, another keyword is introduced to qualify kernel functions: __kernel. Kernel functions are very similar to ordinary C functions, and provide the entry points for parallel execution on the OpenCL device.

The following sample code contains the definition of a valid kernel function we use to demonstrate the use cases of address space qualifiers.

Example 5.9. qualifiers.k

__constant int i= 1;

int functionExample(__local int* l)
{
  *l= 0;
  return 1;
}

__kernel void kernelExample(__global float* g, __constant float* c, __local float* l)
{
  __local int t;
  float p;

  *g= p= functionExample(&t) + *c + *l + i;
}

The first line of the code declares the program-level constant i and initializes it with the value 1. The only argument of function functionExample is a pointer to an integer in the local memory, and function kernelExample takes three addresses as arguments, from the global, constant and local memories, respectively.

The first step of function kernelExample is to declare the variable t in the local memory. Note that all the workitems of a workgroup refer to the same region of memory through this variable. In the next line, the variable p is created in the private memory. Then an arithmetic expression is evaluated that uses all the variables and arguments, as well as the function functionExample.

Using the kernel and address space qualifiers is easy; however, some rules have to be followed to avoid compilation errors:

1. The return value of kernel functions is always void. A compilation error occurs if the return value of kernelExample in the previous sample code is replaced by the type int:

clBuildProgram: Program build failure.
:9:1: error: a __kernel function must have void return type
__kernel int kernelExample(__global float* g, __constant float* c, __local float* l)
^

2. Kernel functions cannot take pointers to pointers as arguments. However, this construction can be used in the body of a kernel function, and non-kernel functions can take pointers to pointers as arguments. A compilation error occurs if the type of the first parameter of function kernelExample is replaced by __global float**:

clBuildProgram: Program build failure.
:9:44: error: invalid address space for pointee of pointer argument to __kernel function
__kernel void kernelExample(__global float** g, __constant float* c, __local float* l)

3. Kernel functions cannot take arguments of the types bool, half, size_t, ptrdiff_t, intptr_t and uintptr_t, nor of composite data types having fields of these types. A compilation error occurs when the type of the first parameter of the kernel function is replaced by bool:

clBuildProgram: Program build failure.
:9:34: error: __kernel function cannot have argument whose type is, or contains, type _Bool
__kernel void kernelExample(bool g, __constant float* c, __local float* l)

4. Recursion is not allowed in either kernel or non-kernel functions. When the return value 1 of function functionExample is replaced by the recursive call functionExample(l), an error is reported at run time and the host program terminates with an error code.

5. The use of function pointers is not allowed.

6. All the pointer arguments of kernel functions have to be qualified by __global, __constant or __local, that is, kernel functions cannot take pointers to the private memory as arguments. A compilation error occurs if the qualifier __global is left out from the specification of the first argument of kernelExample.

clBuildProgram: Program build failure.
:9:34: error: invalid address space for pointee of pointer argument to __kernel function
__kernel void kernelExample(float* g, __constant float* c, __local float* l)

7. Variables created in the local memory cannot be initialized in the declaration. A compilation error occurs if the declaration of variable t in function kernelExample is extended with an initialization (__local int t= 1):

clBuildProgram: Program build failure.
:11:15: error: __local variables cannot have initializers
__local int t= 1;
^

8. Arguments of non-pointer types cannot be qualified by __global, __constant or __local. Compilation error occurs when the types of the arguments of the function kernelExample are replaced by __global float, __constant float and __local float, respectively:

clBuildProgram: Program build failure.
:9:44: error: invalid address space for argument to __kernel function
__kernel void kernelExample(__global float g, __constant float c, __local float l)
^
:9:44: error: parameter may not be qualified with an address space
:9:64: error: invalid address space for argument to __kernel function
__kernel void kernelExample(__global float g, __constant float c, __local float l)
^
:9:64: error: parameter may not be qualified with an address space
:9:81: error: invalid address space for argument to __kernel function
__kernel void kernelExample(__global float g, __constant float c, __local float l)
^

9. Non-pointer local variables cannot be declared with the qualifiers __global and __constant. A compilation error occurs when non-pointer local variables are declared with the qualifiers __global or __constant in either kernelExample or functionExample:

clBuildProgram: Program build failure.
:5:16: error: automatic variable qualified with an address space
__global int a;
^
:6:18: error: automatic variable qualified with an address space
__constant int b;
^
:16:16: error: automatic variable qualified with an address space
__global int a;
^
:17:18: error: automatic variable qualified with an address space
__constant int b;
^

10. Neither implicit conversion, nor assignment is allowed between pointer types with different address space qualifiers. Compilation error occurs when a pair of pointers with different address space qualifiers are used on the left and right hand side of an assignment operator:

clBuildProgram: Program build failure.
:15:6: error: illegal implicit conversion between two pointers with different address spaces
g= c;
^
:15:4: warning: assigning to 'float __attribute__((address_space(1))) *' from 'float __attribute__((address_space(2))) *' discards qualifiers
g= c;
^ ~
:16:6: error: illegal implicit conversion between two pointers with different address spaces
g= l;
^
:16:4: warning: assigning to 'float __attribute__((address_space(1))) *' from 'float __attribute__((address_space(3))) *' discards qualifiers
g= l;
^ ~
:17:6: error: illegal implicit conversion between two pointers with different address spaces
g= pp;
^~
:17:4: warning: assigning to 'float __attribute__((address_space(1))) *' from 'float *' discards qualifiers
g= pp;
^ ~~

11. Program-level variables can be declared only with the qualifier __constant. Compilation error occurs when any other address space qualifier is used in the declaration of program-level variables:

clBuildProgram: Program build failure.
:2:14: error: program scope variables must be declared in the __constant address space
__global int j;
^
:3:13: error: program scope variables must be declared in the __constant address space
__local int k;
^
:4:15: error: program scope variables must be declared in the __constant address space
__private int l;

... functions. Recursion and the use of function pointers are not allowed. The assignment of pointers with different address space qualifiers is not allowed.
