Comparing OpenCL/OpenCL.pm (file contents):
Revision 1.5 by root, Wed Nov 16 00:35:30 2011 UTC vs.
Revision 1.16 by root, Thu Nov 17 03:42:03 2011 UTC

6 6
7 use OpenCL; 7 use OpenCL;
8 8
9=head1 DESCRIPTION 9=head1 DESCRIPTION
10 10
11This is an early release which might be useful, but hasn't seen any testing. 11This is an early release which might be useful, but hasn't seen much testing.
12 12
13=head2 OpenCL FROM 10000 FEET HEIGHT
14
15Here is a high level overview of OpenCL:
16
17First you need to find one or more OpenCL::Platforms (kind of like
18vendors) - usually there is only one.
19
20Each platform gives you access to a number of OpenCL::Device objects, e.g.
21your graphics card.
22
23From a platform and some device(s), you create an OpenCL::Context, which is
24a very central object in OpenCL: Once you have a context you can create
25most other objects:
26
27OpenCL::Program objects, which store source code and, after building for a
28specific device ("compiling and linking"), also binary programs. For each
29kernel function in a program you can then create an OpenCL::Kernel object
30which represents basically a function call with argument values.
31
32OpenCL::Memory objects of various flavours: OpenCL::Buffer objects (flat
33memory areas, think arrays or structs) and OpenCL::Image objects (think 2d
34or 3d array) for bulk data and input and output for kernels.
35
36OpenCL::Sampler objects, which are kind of like texture filter modes in
37OpenGL.
38
39OpenCL::Queue objects - command queues, which allow you to submit memory
40reads, writes and copies, as well as kernel calls to your devices. They
41also offer a variety of methods to synchronise request execution, for
42example with barriers or OpenCL::Event objects.
43
44OpenCL::Event objects are used to signal when something is complete.
45
13=head1 HELPFUL RESOURCES 46=head2 HELPFUL RESOURCES
14 47
15The OpenCL spec used to develop this module (1.2 spec was available, but 48The OpenCL spec used to develop this module (1.2 spec was available, but
16no implementation was available to me :). 49no implementation was available to me :).
17 50
18 http://www.khronos.org/registry/cl/specs/opencl-1.1.pdf 51 http://www.khronos.org/registry/cl/specs/opencl-1.1.pdf
19 52
20OpenCL manpages: 53OpenCL manpages:
21 54
22 http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/ 55 http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/
23 56
57Here's a tutorial from AMD (very AMD-centric, too), not sure how useful it
58is, but at least it's free of charge:
59
60 http://developer.amd.com/zones/OpenCLZone/courses/Documents/Introduction_to_OpenCL_Programming%20Training_Guide%20%28201005%29.pdf
61
62If you are into UML class diagrams, the following diagram might help - if
63not, it will be mildly confusing:
64
65 http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/classDiagram.html
66
67=head1 BASIC WORKFLOW
68
69To get something done, you basically have to do this once (refer to the
70examples below for actual code, this is just a high-level description):
71
72Find some platform (e.g. the first one) and some device(s) (e.g. the first
73device of the platform), and create a context from those.
74
75Create program objects from your OpenCL source code, then build (compile)
76the programs for each device you want to run them on.
77
78Create kernel objects for all kernels you want to use (surprisingly, these
79are not device-specific).
80
81Then, to execute stuff, you repeat these steps, possibly reusing or
82sharing some buffers:
83
84Create some input and output buffers from your context. Set these as
85arguments to your kernel.
86
87Enqueue buffer writes to initialise your input buffers (when not
88initialised at creation time).
89
90Enqueue the kernel execution.
91
92Enqueue buffer reads for your output buffer to read results.
93
24=head1 EXAMPLES 94=head1 EXAMPLES
25 95
26=head2 Enumerate all devices and get contexts for them. 96=head2 Enumerate all devices and get contexts for them.
27 97
98Best run this once to get a feel for the platforms and devices in your
99system.
100
28 for my $platform (OpenCL::platforms) { 101 for my $platform (OpenCL::platforms) {
29 warn $platform->info (OpenCL::PLATFORM_NAME); 102 printf "platform: %s\n", $platform->info (OpenCL::PLATFORM_NAME);
30 warn $platform->info (OpenCL::PLATFORM_EXTENSIONS); 103 printf "extensions: %s\n", $platform->info (OpenCL::PLATFORM_EXTENSIONS);
31 for my $device ($platform->devices) { 104 for my $device ($platform->devices) {
32 warn $device->info (OpenCL::DEVICE_NAME); 105 printf "+ device: %s\n", $device->info (OpenCL::DEVICE_NAME);
33 my $ctx = $device->context_simple; 106 my $ctx = $device->context;
34 # do stuff 107 # do stuff
35 } 108 }
36 } 109 }
37 110
38=head2 Get a useful context and a command queue. 111=head2 Get a useful context and a command queue.
39 112
40 my $dev = ((OpenCL::platforms)[0]->devices)[0]; 113This is a useful boilerplate for any OpenCL program that only wants to use
41 my $ctx = $dev->context_simple; 114one device:
42 my $queue = $ctx->command_queue_simple ($dev); 115
116 my ($platform) = OpenCL::platforms; # find first platform
117 my ($dev) = $platform->devices; # find first device of platform
118 my $ctx = $platform->context (undef, [$dev]); # create context out of those
119 my $queue = $ctx->queue ($dev); # create a command queue for the device
43 120
44=head2 Print all supported image formats of a context. 121=head2 Print all supported image formats of a context.
45 122
123Best run this once for your context, to see what's available and how to
124gather information.
125
46 for my $type (OpenCL::MEM_OBJECT_IMAGE2D, OpenCL::MEM_OBJECT_IMAGE3D) { 126 for my $type (OpenCL::MEM_OBJECT_IMAGE2D, OpenCL::MEM_OBJECT_IMAGE3D) {
47 say "supported image formats for ", OpenCL::enum2str $type; 127 print "supported image formats for ", OpenCL::enum2str $type, "\n";
48 128
49 for my $f ($ctx->supported_image_formats (0, $type)) { 129 for my $f ($ctx->supported_image_formats (0, $type)) {
50 printf " %-10s %-20s\n", OpenCL::enum2str $f->[0], OpenCL::enum2str $f->[1]; 130 printf " %-10s %-20s\n", OpenCL::enum2str $f->[0], OpenCL::enum2str $f->[1];
51 } 131 }
52 } 132 }
55then asynchronously. 135then asynchronously.
56 136
57 my $buf = $ctx->buffer_sv (OpenCL::MEM_COPY_HOST_PTR, "helmut"); 137 my $buf = $ctx->buffer_sv (OpenCL::MEM_COPY_HOST_PTR, "helmut");
58 138
59 $queue->enqueue_read_buffer ($buf, 1, 1, 3, my $data); 139 $queue->enqueue_read_buffer ($buf, 1, 1, 3, my $data);
60 warn $data; 140 print "$data\n";
61 141
62 my $ev = $queue->enqueue_read_buffer ($buf, 0, 1, 3, my $data); 142 my $ev = $queue->enqueue_read_buffer ($buf, 0, 1, 3, my $data);
63 $ev->wait; 143 $ev->wait;
64 warn $data; 144 print "$data\n"; # prints "elm"
65 145
66=head2 Create and build a program, then create a kernel out of one of its 146=head2 Create and build a program, then create a kernel out of one of its
67functions. 147functions.
68 148
69 my $src = ' 149 my $src = '
70 __kernel void 150 __kernel void
71 squareit (__global float *input, __global float *output) 151 squareit (__global float *input, __global float *output)
72 { 152 {
73 size_t id = get_global_id (0); 153 size_t id = get_global_id (0);
74 output [id] = input [id] * input [id]; 154 output [id] = input [id] * input [id];
75 } 155 }
76 '; 156 ';
77 157
78 my $prog = $ctx->program_with_source ($src); 158 my $prog = $ctx->program_with_source ($src);
79 159
160 # build croaks on compile errors, so catch it and print the compile errors
80 eval { $prog->build ($dev); 1 } 161 eval { $prog->build ($dev); 1 }
81 or die $prog->build_info ($dev, OpenCL::PROGRAM_BUILD_LOG); 162 or die $prog->build_info ($dev, OpenCL::PROGRAM_BUILD_LOG);
82 163
83 my $kernel = $prog->kernel ("squareit"); 164 my $kernel = $prog->kernel ("squareit");
84 165
85=head2 Create some input and output float buffers, then call squareit on them. 166=head2 Create some input and output float buffers, then call the
167'squareit' kernel on them.
86 168
87 my $input = $ctx->buffer_sv (OpenCL::MEM_COPY_HOST_PTR, pack "f*", 1, 2, 3, 4.5); 169 my $input = $ctx->buffer_sv (OpenCL::MEM_COPY_HOST_PTR, pack "f*", 1, 2, 3, 4.5);
88 my $output = $ctx->buffer (0, OpenCL::SIZEOF_FLOAT * 5); 170 my $output = $ctx->buffer (0, OpenCL::SIZEOF_FLOAT * 5);
89 171
90 # set buffer 172 # set buffer
96 178
97 # enqueue a synchronous read 179 # enqueue a synchronous read
98 $queue->enqueue_read_buffer ($output, 1, 0, OpenCL::SIZEOF_FLOAT * 4, my $data); 180 $queue->enqueue_read_buffer ($output, 1, 0, OpenCL::SIZEOF_FLOAT * 4, my $data);
99 181
100 # print the results: 182 # print the results:
101 say join ", ", unpack "f*", $data; 183 printf "%s\n", join ", ", unpack "f*", $data;
102 184
103=head2 The same enqueue operations as before, but assuming an out-of-order queue, 185=head2 The same enqueue operations as before, but assuming an out-of-order queue,
104showing off barriers. 186showing off barriers.
105 187
106 # execute it for all 4 numbers 188 # execute it for all 4 numbers
129 211
130=head1 DOCUMENTATION 212=head1 DOCUMENTATION
131 213
132=head2 BASIC CONVENTIONS 214=head2 BASIC CONVENTIONS
133 215
134This is not a 1:1 C-style translation of OpenCL to Perl - instead I 216This is not a one-to-one C-style translation of OpenCL to Perl - instead
135attempted to make the interface as type-safe as possible and introducing 217I attempted to make the interface as type-safe as possible by introducing
136object syntax where it makes sense. There are a number of important 218object syntax where it makes sense. There are a number of important
137differences between the OpenCL C API and this module: 219differences between the OpenCL C API and this module:
138 220
139=over 4 221=over 4
140 222
145=item * OpenCL uses CamelCase for function names (C<clGetPlatformInfo>), 227=item * OpenCL uses CamelCase for function names (C<clGetPlatformInfo>),
146while this module uses underscores as word separator and often leaves out 228while this module uses underscores as word separator and often leaves out
147prefixes (C<< $platform->info >>). 229prefixes (C<< $platform->info >>).
148 230
149=item * OpenCL often specifies fixed vector function arguments as short 231=item * OpenCL often specifies fixed vector function arguments as short
150arrays (C<size_t origin[3]>), while this module explicitly expects the 232arrays (C<size_t origin[3]>), while this module explicitly expects the
151components as separate arguments. 233components as separate arguments.
152 234
235=item * Structures are often specified with their components, and returned
236as arrayrefs.
237
153=item * Where possible, the row_pitch value is calculated from the perl 238=item * Where possible, one of the pitch values is calculated from the
154scalar length and need not be specified. 239perl scalar length and need not be specified.
155 240
156=item * When enqueuing commands, the wait list is specified by adding 241=item * When enqueuing commands, the wait list is specified by adding
157extra arguments to the function - everywhere a C<$wait_events...> argument 242extra arguments to the function - anywhere a C<$wait_events...> argument
158is documented this can be any number of event objects. 243is documented this can be any number of event objects.
159 244
160=item * When enqueuing commands, if the enqueue method is called in void 245=item * When enqueuing commands, if the enqueue method is called in void
161context, no event is created. In all other contexts an event is returned 246context, no event is created. In all other contexts an event is returned
162by the method. 247by the method.
165other status is returned the function will throw an exception, so you 250other status is returned the function will throw an exception, so you
166don't normally have to do any error checking. 251don't normally have to do any error checking.
167 252
168=back 253=back
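
The void-context convention above relies on Perl telling a function how it was called. The following is only a minimal pure-Perl sketch of that idea: C<enqueue_something> and its hash-based "event" are hypothetical stand-ins, not part of this module, which does its work in XS.

```perl
use strict;
use warnings;

my $events_created = 0;

# Hypothetical stand-in for an enqueue method (not the real OpenCL API).
sub enqueue_something {
    # wantarray is undef in void context, defined in scalar or list context.
    return unless defined wantarray;    # void context: create no event

    $events_created++;
    return { id => $events_created };   # otherwise hand back an "event"
}

enqueue_something ();                   # void context, nothing is created
my $ev = enqueue_something ();          # scalar context, an event is returned

printf "events created: %d\n", $events_created;
```

Skipping the event when nobody asks for it avoids allocating an object that would be thrown away immediately.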
169 254
255=head2 PERL AND OPENCL TYPES
256
257This handy(?) table lists OpenCL types and their perl, PDL and pack/unpack
258format equivalents:
259
260 OpenCL perl PDL pack/unpack
261 char IV - c
262 uchar IV byte C
263 short IV short s
264 ushort IV ushort S
265 int IV long? l
266 uint IV - L
267 long IV longlong q
268 ulong IV - Q
269 float NV float f
270 half IV ushort S
271 double NV double d
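
As a small illustration of the pack/unpack column, here is a sketch that prepares host data for a buffer of OpenCL C<float> values and decodes it again. It uses only core Perl; the variable names are made up for this example, and it assumes the native C<float> is 4 bytes wide, which is the usual case.

```perl
use strict;
use warnings;

# Host-side data for a buffer of OpenCL "float" values uses the "f" format.
my @values = (1, 2, 3, 4.5);
my $bytes  = pack "f*", @values;

# Assumption: a native float is 4 bytes, so this is normally 4 * 4 bytes.
my $size = length $bytes;

# Decoding data read back from a buffer is the reverse step.
my @roundtrip = unpack "f*", $bytes;

printf "%d bytes: %s\n", $size, join ", ", @roundtrip;
```

The same pattern applies to the other rows of the table: pick the pack format that matches the element type the kernel expects.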
272
170=head2 THE OpenCL PACKAGE 273=head2 THE OpenCL PACKAGE
171 274
172=over 4 275=over 4
173 276
174=item $int = OpenCL::errno 277=item $int = OpenCL::errno
175 278
176The last error returned by a function - it's only changed on errors. 279The last error returned by a function - it's only valid after an error occurred
280and before calling another OpenCL function.
177 281
178=item $str = OpenCL::err2str $errval 282=item $str = OpenCL::err2str $errval
179 283
180Converts an error value into a human readable string. 284Converts an error value into a human readable string.
181 285
182=item $str = OpenCL::err2str $enum 286=item $str = OpenCL::enum2str $enum
183 287
184Converts most enum values (info parameter names, image format constants, 288Converts most enum values (info parameter names, image format constants,
185object types, addressing and filter modes, command types etc.) into a 289object types, addressing and filter modes, command types etc.) into a
186human readable string. When confronted with some random integer it can be 290human readable string. When confronted with some random integer it can be
187very helpful to pass it through this function to maybe get some readable 291very helpful to pass it through this function to maybe get some readable
191 295
192Returns all available OpenCL::Platform objects. 296Returns all available OpenCL::Platform objects.
193 297
194L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clGetPlatformIDs.html> 298L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clGetPlatformIDs.html>
195 299
196=item $ctx = OpenCL::context_from_type_simple $type = OpenCL::DEVICE_TYPE_DEFAULT 300=item $ctx = OpenCL::context_from_type $properties, $type = OpenCL::DEVICE_TYPE_DEFAULT, $notify = undef
197 301
198Tries to create a context from a default device and platform - never worked for me. 302Tries to create a context from a default device and platform - never worked for me.
199 303
200L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateContextFromType.html> 304L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateContextFromType.html>
201 305
221 325
222=item @devices = $platform->devices ($type = OpenCL::DEVICE_TYPE_ALL) 326=item @devices = $platform->devices ($type = OpenCL::DEVICE_TYPE_ALL)
223 327
224Returns a list of matching OpenCL::Device objects. 328Returns a list of matching OpenCL::Device objects.
225 329
226=item $ctx = $platform->context_from_type_simple ($type = OpenCL::DEVICE_TYPE_DEFAULT) 330=item $ctx = $platform->context_from_type ($properties, $type = OpenCL::DEVICE_TYPE_DEFAULT, $notify = undef)
227 331
228Tries to create a context. Never worked for me. 332Tries to create a context. Never worked for me, and you need devices explicitly anyway.
229 333
230L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateContextFromType.html> 334L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateContextFromType.html>
231 335
336=item $ctx = $device->context ($properties = undef, @$devices, $notify = undef)
337
338Create a new OpenCL::Context object using the given device object(s) - a
339CL_CONTEXT_PLATFORM property is supplied automatically.
340
341L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateContext.html>
342
232=back 343=back
233 344
234=head2 THE OpenCL::Device CLASS 345=head2 THE OpenCL::Device CLASS
235 346
236=over 4 347=over 4
239 350
240See C<< $platform->info >> for details. 351See C<< $platform->info >> for details.
241 352
242L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clGetDeviceInfo.html> 353L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clGetDeviceInfo.html>
243 354
244=item $ctx = $device->context_simple
245
246Convenience function to create a new OpenCL::Context object.
247
248L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateContext.html>
249
250=back 355=back
251 356
252=head2 THE OpenCL::Context CLASS 357=head2 THE OpenCL::Context CLASS
253 358
254=over 4 359=over 4
257 362
258See C<< $platform->info >> for details. 363See C<< $platform->info >> for details.
259 364
260L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clGetContextInfo.html> 365L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clGetContextInfo.html>
261 366
262=item $queue = $ctx->command_queue_simple ($device) 367=item $queue = $ctx->queue ($device, $properties)
263 368
264Convenience function to create a new OpenCL::Queue object from the context and the given device. 369Create a new OpenCL::Queue object from the context and the given device.
265 370
266L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateCommandQueue.html> 371L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateCommandQueue.html>
267 372
268=item $ev = $ctx->user_event 373=item $ev = $ctx->user_event
269 374
327They also allow you to specify any number of other event objects that this 432They also allow you to specify any number of other event objects that this
328request has to wait for before it starts executing, by simply passing the 433request has to wait for before it starts executing, by simply passing the
329event objects as extra parameters to the enqueue methods. 434event objects as extra parameters to the enqueue methods.
330 435
331Queues execute in-order by default, without any parallelism, so in most 436Queues execute in-order by default, without any parallelism, so in most
332cases it's not necessary to wait for or create event objects. 437cases (i.e. you use only one queue) it's not necessary to wait for or
438create event objects.
333 439
334=over 4 440=over 4
335 441
336=item $packed_value = $ctx->info ($name) 442=item $packed_value = $ctx->info ($name)
337 443
361 467
362=item $ev = $queue->enqueue_write_image ($src, $blocking, $x, $y, $z, $width, $height, $depth, $row_pitch, $data, $wait_events...) 468=item $ev = $queue->enqueue_write_image ($src, $blocking, $x, $y, $z, $width, $height, $depth, $row_pitch, $data, $wait_events...)
363 469
364L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueWriteImage.html> 470L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueWriteImage.html>
365 471
366=item $ev = $queue->enqueue_copy_buffer_rect ($src, $dst, $src_x, $src_y, $src_z, $dst_x, $dst_y, $dst_z, $width, $height, $depth, $src_row_pitch, $src_slice_pitch, $dst_row_pitch, $dst_slice_pitch, $ait_event...) 472=item $ev = $queue->enqueue_copy_buffer_rect ($src, $dst, $src_x, $src_y, $src_z, $dst_x, $dst_y, $dst_z, $width, $height, $depth, $src_row_pitch, $src_slice_pitch, $dst_row_pitch, $dst_slice_pitch, $wait_event...)
367 473
368Yeah. 474Yeah.
369 475
370L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyBufferRect.html> 476L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyBufferRect.html>
371 477
372=item $ev = $queue->enqueue_copy_buffer_to_image (OpenCL::Buffer src, OpenCL::Image dst, size_t src_offset, size_t dst_x, size_t dst_y, size_t dst_z, size_t width, size_t height, size_t depth, ...) 478=item $ev = $queue->enqueue_copy_buffer_to_image ($src_buffer, $dst_image, $src_offset, $dst_x, $dst_y, $dst_z, $width, $height, $depth, $wait_events...)
373 479
374L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyBufferToImage.html> 480L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyBufferToImage.html>
375 481
376=item $ev = $queue->enqueue_copy_image (OpenCL::Image src, OpenCL::Buffer dst, size_t src_x, size_t src_y, size_t src_z, size_t dst_x, size_t dst_y, size_t dst_z, size_t width, size_t height, size_t depth, ...) 482=item $ev = $queue->enqueue_copy_image ($src_image, $dst_image, $src_x, $src_y, $src_z, $dst_x, $dst_y, $dst_z, $width, $height, $depth, $wait_events...)
377 483
378L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyImage.html> 484L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyImage.html>
379 485
380=item $ev = $queue->enqueue_copy_image_to_buffer (OpenCL::Image src, OpenCL::Buffer dst, size_t src_x, size_t src_y, size_t src_z, size_t width, size_t height, size_t depth, size_t dst_offset, ...) 486=item $ev = $queue->enqueue_copy_image_to_buffer ($src_image, $dst_buffer, $src_x, $src_y, $src_z, $width, $height, $depth, $dst_offset, $wait_events...)
381 487
382L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyImageToBuffer.html> 488L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyImageToBuffer.html>
383 489
384=item $ev = $queue->enqueue_task ($kernel, $wait_events...) 490=item $ev = $queue->enqueue_task ($kernel, $wait_events...)
385 491
550package OpenCL; 656package OpenCL;
551 657
552use common::sense; 658use common::sense;
553 659
554BEGIN { 660BEGIN {
555 our $VERSION = '0.03'; 661 our $VERSION = '0.14';
556 662
557 require XSLoader; 663 require XSLoader;
558 XSLoader::load (__PACKAGE__, $VERSION); 664 XSLoader::load (__PACKAGE__, $VERSION);
559 665
560 @OpenCL::Buffer::ISA = 666 @OpenCL::Buffer::ISA =
