
Comparing OpenCL/OpenCL.pm (file contents):
Revision 1.8 by root, Wed Nov 16 06:22:20 2011 UTC vs.
Revision 1.19 by root, Sat Nov 19 19:54:04 2011 UTC

8 8
9=head1 DESCRIPTION 9=head1 DESCRIPTION
10 10
11This is an early release which might be useful, but hasn't seen much testing. 11This is an early release which might be useful, but hasn't seen much testing.
12 12
13=head2 OpenCL FROM 10000 FEET HEIGHT
14
15Here is a high level overview of OpenCL:
16
17First you need to find one or more OpenCL::Platforms (kind of like
18vendors) - usually there is only one.
19
20Each platform gives you access to a number of OpenCL::Device objects, e.g.
21your graphics card.
22
23From a platform and some device(s), you create an OpenCL::Context, which is
24a very central object in OpenCL: Once you have a context you can create
25most other objects:
26
27OpenCL::Program objects, which store source code and, after building for a
28specific device ("compiling and linking"), also binary programs. For each
29kernel function in a program you can then create an OpenCL::Kernel object
30which represents basically a function call with argument values.
31
32OpenCL::Memory objects of various flavours: OpenCL::Buffer objects (flat
33memory areas, think arrays or structs) and OpenCL::Image objects (think 2d
34or 3d array) for bulk data and input and output for kernels.
35
36OpenCL::Sampler objects, which are kind of like texture filter modes in
37OpenGL.
38
39OpenCL::Queue objects - command queues, which allow you to submit memory
40reads, writes and copies, as well as kernel calls to your devices. They
41also offer a variety of methods to synchronise request execution, for
42example with barriers or OpenCL::Event objects.
43
44OpenCL::Event objects are used to signal when something is complete.
45
13=head1 HELPFUL RESOURCES 46=head2 HELPFUL RESOURCES
14 47
15The OpenCL spec used to develop this module (1.2 spec was available, but 48The OpenCL spec used to develop this module (1.2 spec was available, but
16no implementation was available to me :). 49no implementation was available to me :).
17 50
18 http://www.khronos.org/registry/cl/specs/opencl-1.1.pdf 51 http://www.khronos.org/registry/cl/specs/opencl-1.1.pdf
19 52
20OpenCL manpages: 53OpenCL manpages:
21 54
22 http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/ 55 http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/
23 56
57If you are into UML class diagrams, the following diagram might help - if
58not, it will be mildly confusing:
59
60 http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/classDiagram.html
61
62Here's a tutorial from AMD (very AMD-centric, too), not sure how useful it
63is, but at least it's free of charge:
64
65 http://developer.amd.com/zones/OpenCLZone/courses/Documents/Introduction_to_OpenCL_Programming%20Training_Guide%20%28201005%29.pdf
66
67And here's NVIDIA's OpenCL Best Practices Guide:
68
69 http://developer.download.nvidia.com/compute/cuda/3_2/toolkit/docs/OpenCL_Best_Practices_Guide.pdf
70
71=head1 BASIC WORKFLOW
72
73To get something done, you basically have to do this once (refer to the
74examples below for actual code, this is just a high-level description):
75
76Find some platform (e.g. the first one) and some device(s) (e.g. the first
77device of the platform), and create a context from those.
78
79Create program objects from your OpenCL source code, then build (compile)
80the programs for each device you want to run them on.
81
82Create kernel objects for all kernels you want to use (surprisingly, these
83are not device-specific).
84
85Then, to execute stuff, you repeat these steps, possibly reusing or
86sharing some buffers:
87
88Create some input and output buffers from your context. Set these as
89arguments to your kernel.
90
91Enqueue buffer writes to initialise your input buffers (when not
92initialised at creation time).
93
94Enqueue the kernel execution.
95
96Enqueue buffer reads for your output buffer to read results.
97
24=head1 EXAMPLES 98=head1 EXAMPLES
25 99
26=head2 Enumerate all devices and get contexts for them. 100=head2 Enumerate all devices and get contexts for them.
27 101
102Best run this once to get a feel for the platforms and devices in your
103system.
104
28 for my $platform (OpenCL::platforms) { 105 for my $platform (OpenCL::platforms) {
29 warn $platform->info (OpenCL::PLATFORM_NAME); 106 printf "platform: %s\n", $platform->info (OpenCL::PLATFORM_NAME);
30 warn $platform->info (OpenCL::PLATFORM_EXTENSIONS); 107 printf "extensions: %s\n", $platform->info (OpenCL::PLATFORM_EXTENSIONS);
31 for my $device ($platform->devices) { 108 for my $device ($platform->devices) {
32 warn $device->info (OpenCL::DEVICE_NAME); 109 printf "+ device: %s\n", $device->info (OpenCL::DEVICE_NAME);
33 my $ctx = $device->context_simple; 110 my $ctx = $device->context;
34 # do stuff 111 # do stuff
35 } 112 }
36 } 113 }
37 114
38=head2 Get a useful context and a command queue. 115=head2 Get a useful context and a command queue.
39 116
40 my $dev = ((OpenCL::platforms)[0]->devices)[0]; 117This is a useful boilerplate for any OpenCL program that only wants to use
41 my $ctx = $dev->context_simple; 118one device:
42 my $queue = $ctx->command_queue_simple ($dev); 119
120 my ($platform) = OpenCL::platforms; # find first platform
121 my ($dev) = $platform->devices; # find first device of platform
122 my $ctx = $platform->context (undef, [$dev]); # create context out of those
123 my $queue = $ctx->queue ($dev); # create a command queue for the device
43 124
44=head2 Print all supported image formats of a context. 125=head2 Print all supported image formats of a context.
45 126
127Best run this once for your context, to see what's available and how to
128gather information.
129
46 for my $type (OpenCL::MEM_OBJECT_IMAGE2D, OpenCL::MEM_OBJECT_IMAGE3D) { 130 for my $type (OpenCL::MEM_OBJECT_IMAGE2D, OpenCL::MEM_OBJECT_IMAGE3D) {
47 say "supported image formats for ", OpenCL::enum2str $type; 131 print "supported image formats for ", OpenCL::enum2str $type, "\n";
48 132
49 for my $f ($ctx->supported_image_formats (0, $type)) { 133 for my $f ($ctx->supported_image_formats (0, $type)) {
50 printf " %-10s %-20s\n", OpenCL::enum2str $f->[0], OpenCL::enum2str $f->[1]; 134 printf " %-10s %-20s\n", OpenCL::enum2str $f->[0], OpenCL::enum2str $f->[1];
51 } 135 }
52 } 136 }
55then asynchronously. 139then asynchronously.
56 140
57 my $buf = $ctx->buffer_sv (OpenCL::MEM_COPY_HOST_PTR, "helmut"); 141 my $buf = $ctx->buffer_sv (OpenCL::MEM_COPY_HOST_PTR, "helmut");
58 142
59 $queue->enqueue_read_buffer ($buf, 1, 1, 3, my $data); 143 $queue->enqueue_read_buffer ($buf, 1, 1, 3, my $data);
60 warn $data; 144 print "$data\n";
61 145
62 my $ev = $queue->enqueue_read_buffer ($buf, 0, 1, 3, my $data); 146 my $ev = $queue->enqueue_read_buffer ($buf, 0, 1, 3, my $data);
63 $ev->wait; 147 $ev->wait;
64 warn $data; 148 print "$data\n"; # prints "elm"
65 149
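The read example above passes an offset of 1 and a length of 3, which is why it ends up with "elm". As a minimal sketch, assuming the two numeric arguments after the blocking flag are a byte offset and a byte length (my reading of the example, not a documented signature), the same slice can be reproduced in plain Perl without any OpenCL device:

```perl
use strict;
use warnings;

# Pure-Perl stand-in for the buffer contents; no OpenCL calls are made here.
my $host = "helmut";

# Assumed meaning of the two numbers in the example: byte offset, byte length.
my ($offset, $len) = (1, 3);

# substr takes the same (offset, length) pair as the read in the example.
my $data = substr $host, $offset, $len;

print "$data\n"; # prints "elm"
```

This only mirrors the indexing; the real read copies from device memory and, in the asynchronous case, is complete only after the event has been waited on.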
66=head2 Create and build a program, then create a kernel out of one of its 150=head2 Create and build a program, then create a kernel out of one of its
67functions. 151functions.
68 152
69 my $src = ' 153 my $src = '
70 __kernel void 154 __kernel void
71 squareit (__global float *input, __global float *output) 155 squareit (__global float *input, __global float *output)
72 { 156 {
73 size_t id = get_global_id (0); 157 size_t id = get_global_id (0);
74 output [id] = input [id] * input [id]; 158 output [id] = input [id] * input [id];
75 } 159 }
76 '; 160 ';
77 161
78 my $prog = $ctx->program_with_source ($src); 162 my $prog = $ctx->program_with_source ($src);
79 163
164 # build croaks on compile errors, so catch it and print the compile errors
80 eval { $prog->build ($dev); 1 } 165 eval { $prog->build ($dev); 1 }
81 or die $prog->build_info ($dev, OpenCL::PROGRAM_BUILD_LOG); 166 or die $prog->build_info ($dev, OpenCL::PROGRAM_BUILD_LOG);
82 167
83 my $kernel = $prog->kernel ("squareit"); 168 my $kernel = $prog->kernel ("squareit");
84 169
85=head2 Create some input and output float buffers, then call squareit on them. 170=head2 Create some input and output float buffers, then call the
171'squareit' kernel on them.
86 172
87 my $input = $ctx->buffer_sv (OpenCL::MEM_COPY_HOST_PTR, pack "f*", 1, 2, 3, 4.5); 173 my $input = $ctx->buffer_sv (OpenCL::MEM_COPY_HOST_PTR, pack "f*", 1, 2, 3, 4.5);
88 my $output = $ctx->buffer (0, OpenCL::SIZEOF_FLOAT * 5); 174 my $output = $ctx->buffer (0, OpenCL::SIZEOF_FLOAT * 5);
89 175
90 # set buffer 176 # set buffer
96 182
97 # enqueue a synchronous read 183 # enqueue a synchronous read
98 $queue->enqueue_read_buffer ($output, 1, 0, OpenCL::SIZEOF_FLOAT * 4, my $data); 184 $queue->enqueue_read_buffer ($output, 1, 0, OpenCL::SIZEOF_FLOAT * 4, my $data);
99 185
100 # print the results: 186 # print the results:
101 say join ", ", unpack "f*", $data; 187 printf "%s\n", join ", ", unpack "f*", $data;
102 188
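The float buffers in this example are ordinary Perl strings built with C<pack "f*">. The following is a small host-only sketch of that round trip, with the squaring done in Perl as a stand-in for the kernel; the variable names are illustrative and nothing here touches the OpenCL module:

```perl
use strict;
use warnings;

# Four host values packed as native single-precision floats, as in the example.
my @values = (1, 2, 3, 4.5);
my $input  = pack "f*", @values;

# Size of one packed float in bytes (4 on common platforms).
my $sizeof_float = length (pack "f", 0);
printf "%d floats, %d bytes each, %d bytes total\n",
   scalar @values, $sizeof_float, length ($input);

# Host-side stand-in for the squareit kernel: square every element.
my $output = pack "f*", map { $_ * $_ } unpack "f*", $input;

# Unpack the result the same way the example unpacks the data it read back.
printf "%s\n", join ", ", unpack "f*", $output; # prints "1, 4, 9, 20.25"
```

All four results are exactly representable in single precision, so the printed values carry no rounding noise; arbitrary inputs generally will.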
103=head2 The same enqueue operations as before, but assuming an out-of-order queue, 189=head2 The same enqueue operations as before, but assuming an out-of-order queue,
104showing off barriers. 190showing off barriers.
105 191
106 # execute it for all 4 numbers 192 # execute it for all 4 numbers
129 215
130=head1 DOCUMENTATION 216=head1 DOCUMENTATION
131 217
132=head2 BASIC CONVENTIONS 218=head2 BASIC CONVENTIONS
133 219
134This is not a 1:1 C-style translation of OpenCL to Perl - instead I 220This is not a one-to-one C-style translation of OpenCL to Perl - instead
135attempted to make the interface as type-safe as possible and introducing 221I attempted to make the interface as type-safe as possible by introducing
136object syntax where it makes sense. There are a number of important 222object syntax where it makes sense. There are a number of important
137differences between the OpenCL C API and this module: 223differences between the OpenCL C API and this module:
138 224
139=over 4 225=over 4
140 226
141=item * Object lifetime management is automatic - there is no need 227=item * Object lifetime management is automatic - there is no need
142to free objects explicitly (C<clReleaseXXX>), the release function 228to free objects explicitly (C<clReleaseXXX>), the release function
143is called automatically once all Perl references to it go away. 229is called automatically once all Perl references to it go away.
144 230
145=item * OpenCL uses CamelCase for function names (C<clGetPlatformInfo>), 231=item * OpenCL uses CamelCase for function names (e.g. C<clGetPlatformIDs>, C<clGetPlatformInfo>),
146while this module uses underscores as word separator and often leaves out 232while this module uses underscores as word separator and often leaves out
147prefixes (C<< $platform->info >>). 233prefixes (C<OpenCL::platforms>, C<< $platform->info >>).
148 234
149=item * OpenCL often specifies fixed vector function arguments as short 235=item * OpenCL often specifies fixed vector function arguments as short
150arrays (C<size_t origin[3]>), while this module explicitly expects the 236arrays (C<size_t origin[3]>), while this module explicitly expects the
151components as separate arguments- 237components as separate arguments (C<$orig_x, $orig_y, $orig_z>) in
238function calls.
152 239
153=item * Where possible, the row_pitch value is calculated from the perl 240=item * Structures are often specified by flattening out their components
154scalar length and need not be specified. 241as with short vectors, and returned as arrayrefs.
155 242
156=item * When enqueuing commands, the wait list is specified by adding 243=item * When enqueuing commands, the wait list is specified by adding
157extra arguments to the function - everywhere a C<$wait_events...> argument 244extra arguments to the function - anywhere a C<$wait_events...> argument
158is documented this can be any number of event objects. 245is documented this can be any number of event objects.
159 246
160=item * When enqueuing commands, if the enqueue method is called in void 247=item * When enqueuing commands, if the enqueue method is called in void
161context, no event is created. In all other contexts an event is returned 248context, no event is created. In all other contexts an event is returned
162by the method. 249by the method.
189 276
190=over 4 277=over 4
191 278
192=item $int = OpenCL::errno 279=item $int = OpenCL::errno
193 280
194The last error returned by a function - it's only changed on errors. 281The last error returned by a function - it's only valid after an error occurred
282and before calling another OpenCL function.
195 283
196=item $str = OpenCL::err2str $errval 284=item $str = OpenCL::err2str $errval
197 285
198Converts an error value into a human readable string. 286Converts an error value into a human readable string.
199 287
200=item $str = OpenCL::err2str $enum 288=item $str = OpenCL::enum2str $enum
201 289
202Converts most enum values (info parameter names, image format constants, 290Converts most enum values (info parameter names, image format constants,
203object types, addressing and filter modes, command types etc.) into a 291object types, addressing and filter modes, command types etc.) into a
204human readable string. When confronted with some random integer it can be 292human readable string. When confronted with some random integer it can be
205very helpful to pass it through this function to maybe get some readable 293very helpful to pass it through this function to maybe get some readable
209 297
210Returns all available OpenCL::Platform objects. 298Returns all available OpenCL::Platform objects.
211 299
212L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clGetPlatformIDs.html> 300L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clGetPlatformIDs.html>
213 301
214=item $ctx = OpenCL::context_from_type_simple $type = OpenCL::DEVICE_TYPE_DEFAULT 302=item $ctx = OpenCL::context_from_type $properties, $type = OpenCL::DEVICE_TYPE_DEFAULT, $notify = undef
215 303
216Tries to create a context from a default device and platform - never worked for me. 304Tries to create a context from a default device and platform - never worked for me.
217 305
218L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateContextFromType.html> 306L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateContextFromType.html>
219 307
239 327
240=item @devices = $platform->devices ($type = OpenCL::DEVICE_TYPE_ALL) 328=item @devices = $platform->devices ($type = OpenCL::DEVICE_TYPE_ALL)
241 329
242Returns a list of matching OpenCL::Device objects. 330Returns a list of matching OpenCL::Device objects.
243 331
244=item $ctx = $platform->context_from_type_simple ($type = OpenCL::DEVICE_TYPE_DEFAULT) 332=item $ctx = $platform->context_from_type ($properties, $type = OpenCL::DEVICE_TYPE_DEFAULT, $notify = undef)
245 333
246Tries to create a context. Never worked for me. 334Tries to create a context. Never worked for me, and you need devices explicitly anyway.
247 335
248L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateContextFromType.html> 336L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateContextFromType.html>
249 337
338=item $ctx = $device->context ($properties = undef, @$devices, $notify = undef)
339
340Create a new OpenCL::Context object using the given device object(s) - a
341CL_CONTEXT_PLATFORM property is supplied automatically.
342
343L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateContext.html>
344
250=back 345=back
251 346
252=head2 THE OpenCL::Device CLASS 347=head2 THE OpenCL::Device CLASS
253 348
254=over 4 349=over 4
257 352
258See C<< $platform->info >> for details. 353See C<< $platform->info >> for details.
259 354
260L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clGetDeviceInfo.html> 355L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clGetDeviceInfo.html>
261 356
262=item $ctx = $device->context_simple
263
264Convenience function to create a new OpenCL::Context object.
265
266L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateContext.html>
267
268=back 357=back
269 358
270=head2 THE OpenCL::Context CLASS 359=head2 THE OpenCL::Context CLASS
271 360
272=over 4 361=over 4
275 364
276See C<< $platform->info >> for details. 365See C<< $platform->info >> for details.
277 366
278L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clGetContextInfo.html> 367L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clGetContextInfo.html>
279 368
280=item $queue = $ctx->command_queue_simple ($device) 369=item $queue = $ctx->queue ($device, $properties)
281 370
282Convenience function to create a new OpenCL::Queue object from the context and the given device. 371Create a new OpenCL::Queue object from the context and the given device.
283 372
284L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateCommandQueue.html> 373L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateCommandQueue.html>
285 374
286=item $ev = $ctx->user_event 375=item $ev = $ctx->user_event
287 376
297 386
298=item $buf = $ctx->buffer_sv ($flags, $data) 387=item $buf = $ctx->buffer_sv ($flags, $data)
299 388
300Creates a new OpenCL::Buffer object and initialises it with the given data values. 389Creates a new OpenCL::Buffer object and initialises it with the given data values.
301 390
302=item $img = $ctx->image2d ($flags, $channel_order, $channel_type, $width, $height, $data) 391=item $img = $ctx->image2d ($flags, $channel_order, $channel_type, $width, $height, $row_pitch = 0, $data = undef)
303 392
304Creates a new OpenCL::Image2D object and optionally initialises it with the given data values. 393Creates a new OpenCL::Image2D object and optionally initialises it with the given data values.
305 394
306L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateImage2D.html> 395L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateImage2D.html>
307 396
308=item $img = $ctx->image3d ($flags, $channel_order, $channel_type, $width, $height, $depth, $slice_pitch, $data) 397=item $img = $ctx->image3d ($flags, $channel_order, $channel_type, $width, $height, $depth, $row_pitch = 0, $slice_pitch = 0, $data = undef)
309 398
310Creates a new OpenCL::Image3D object and optionally initialises it with the given data values. 399Creates a new OpenCL::Image3D object and optionally initialises it with the given data values.
311 400
312L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateImage3D.html> 401L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateImage3D.html>
313 402
376 465
377=item $ev = $queue->enqueue_read_image ($src, $blocking, $x, $y, $z, $width, $height, $depth, $row_pitch, $slice_pitch, $data, $wait_events...) 466=item $ev = $queue->enqueue_read_image ($src, $blocking, $x, $y, $z, $width, $height, $depth, $row_pitch, $slice_pitch, $data, $wait_events...)
378 467
379L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueReadImage.html> 468L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueReadImage.html>
380 469
381=item $ev = $queue->enqueue_write_image ($src, $blocking, $x, $y, $z, $width, $height, $depth, $row_pitch, $data, $wait_events...) 470=item $ev = $queue->enqueue_write_image ($src, $blocking, $x, $y, $z, $width, $height, $depth, $row_pitch, $slice_pitch, $data, $wait_events...)
382 471
383L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueWriteImage.html> 472L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueWriteImage.html>
384 473
385=item $ev = $queue->enqueue_copy_buffer_rect ($src, $dst, $src_x, $src_y, $src_z, $dst_x, $dst_y, $dst_z, $width, $height, $depth, $src_row_pitch, $src_slice_pitch, 4dst_row_pitch, $dst_slice_pitch, $ait_event...) 474=item $ev = $queue->enqueue_copy_buffer_rect ($src, $dst, $src_x, $src_y, $src_z, $dst_x, $dst_y, $dst_z, $width, $height, $depth, $src_row_pitch, $src_slice_pitch, $dst_row_pitch, $dst_slice_pitch, $wait_events...)
386 475
387Yeah. 476Yeah.
388 477
389L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyBufferRect.html> 478L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyBufferRect.html>
390 479
391=item $ev = $queue->enqueue_copy_buffer_to_image (OpenCL::Buffer src, OpenCL::Image dst, size_t src_offset, size_t dst_x, size_t dst_y, size_t dst_z, size_t width, size_t height, size_t depth, ...) 480=item $ev = $queue->enqueue_copy_buffer_to_image ($src_buffer, $dst_image, $src_offset, $dst_x, $dst_y, $dst_z, $width, $height, $depth, $wait_events...)
392 481
393L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyBufferToImage.html>. 482L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyBufferToImage.html>.
394 483
395=item $ev = $queue->enqueue_copy_image (OpenCL::Image src, OpenCL::Buffer dst, size_t src_x, size_t src_y, size_t src_z, size_t dst_x, size_t dst_y, size_t dst_z, size_t width, size_t height, size_t depth, ...) 484=item $ev = $queue->enqueue_copy_image ($src_image, $dst_image, $src_x, $src_y, $src_z, $dst_x, $dst_y, $dst_z, $width, $height, $depth, $wait_events...)
396 485
397L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyImage.html> 486L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyImage.html>
398 487
399=item $ev = $queue->enqueue_copy_image_to_buffer (OpenCL::Image src, OpenCL::Buffer dst, size_t src_x, size_t src_y, size_t src_z, size_t width, size_t height, size_t depth, size_t dst_offset, ...) 488=item $ev = $queue->enqueue_copy_image_to_buffer ($src_image, $dst_buffer, $src_x, $src_y, $src_z, $width, $height, $depth, $dst_offset, $wait_events...)
400 489
401L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyImageToBuffer.html> 490L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyImageToBuffer.html>
402 491
403=item $ev = $queue->enqueue_task ($kernel, $wait_events...) 492=item $ev = $queue->enqueue_task ($kernel, $wait_events...)
404 493
569package OpenCL; 658package OpenCL;
570 659
571use common::sense; 660use common::sense;
572 661
573BEGIN { 662BEGIN {
574 our $VERSION = '0.03'; 663 our $VERSION = '0.15';
575 664
576 require XSLoader; 665 require XSLoader;
577 XSLoader::load (__PACKAGE__, $VERSION); 666 XSLoader::load (__PACKAGE__, $VERSION);
578 667
579 @OpenCL::Buffer::ISA = 668 @OpenCL::Buffer::ISA =
