
Comparing OpenCL/OpenCL.pm (file contents):
Revision 1.10 by root, Thu Nov 17 02:10:39 2011 UTC vs.
Revision 1.19 by root, Sat Nov 19 19:54:04 2011 UTC

18vendors) - usually there is only one. 18vendors) - usually there is only one.
19 19
20Each platform gives you access to a number of OpenCL::Device objects, e.g. 20Each platform gives you access to a number of OpenCL::Device objects, e.g.
21your graphics card. 21your graphics card.
22 22
23From a platform and some devices, you create an OpenCL::Context, which is 23From a platform and some device(s), you create an OpenCL::Context, which is
24a very central object in OpenCL: Once you have a context you can create 24a very central object in OpenCL: Once you have a context you can create
25most other objects: 25most other objects:
26 26
27OpenCL::Program objects, which store source code and, after building 27OpenCL::Program objects, which store source code and, after building for a
28("compiling and linking"), also binary programs. For each kernel function 28specific device ("compiling and linking"), also binary programs. For each
29in a program you can then create an OpenCL::Kernel object which represents 29kernel function in a program you can then create an OpenCL::Kernel object
30basically a function call with argument values. 30which represents basically a function call with argument values.
31 31
32OpenCL::Memory objects of various flavours: OpenCL::Buffer objects (flat 32OpenCL::Memory objects of various flavours: OpenCL::Buffer objects (flat
33memory areas, think array) and OpenCL::Image objects (think 2d or 3d 33memory areas, think arrays or structs) and OpenCL::Image objects (think 2d
34array) for bulk data and input and output for kernels. 34or 3d array) for bulk data and input and output for kernels.
35 35
36OpenCL::Sampler objects, which are kind of like texture filter modes in 36OpenCL::Sampler objects, which are kind of like texture filter modes in
37OpenGL. 37OpenGL.
38 38
39OpenCL::Queue objects - command queues, which allow you to submit memory 39OpenCL::Queue objects - command queues, which allow you to submit memory
52 52
53OpenCL manpages: 53OpenCL manpages:
54 54
55 http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/ 55 http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/
56 56
57If you are into UML class diagrams, the following diagram might help - if
58not, it will be mildly confusing:
59
60 http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/classDiagram.html
61
62Here's a tutorial from AMD (very AMD-centric, too), not sure how useful it
63is, but at least it's free of charge:
64
65 http://developer.amd.com/zones/OpenCLZone/courses/Documents/Introduction_to_OpenCL_Programming%20Training_Guide%20%28201005%29.pdf
66
67And here's NVIDIA's OpenCL Best Practices Guide:
68
69 http://developer.download.nvidia.com/compute/cuda/3_2/toolkit/docs/OpenCL_Best_Practices_Guide.pdf
70
57=head1 BASIC WORKFLOW 71=head1 BASIC WORKFLOW
58 72
59To get something done, you basically have to do this once: 73To get something done, you basically have to do this once (refer to the
74examples below for actual code, this is just a high-level description):
60 75
61Find some platform (e.g. the first one) and some device (e.g. the first 76Find some platform (e.g. the first one) and some device(s) (e.g. the first
62device you can find), and create a context from those. 77device of the platform), and create a context from those.
63 78
64Create a command queue from your context, and program objects from your 79Create program objects from your OpenCL source code, then build (compile)
65OpenCL source code, build the programs. 80the programs for each device you want to run them on.
66 81
67Create kernel objects for all kernels you want to use. 82Create kernel objects for all kernels you want to use (surprisingly, these
83are not device-specific).
68 84
69Then, to execute stuff, you repeat this: 85Then, to execute stuff, you repeat these steps, possibly reusing or
86sharing some buffers:
70 87
71Create some input and output buffers from your context. Initialise the 88Create some input and output buffers from your context. Set these as
72input buffers with data. Set these as arguments to your kernel. 89arguments to your kernel.
90
91Enqueue buffer writes to initialise your input buffers (when not
92initialised at creation time).
73 93
74Enqueue the kernel execution. 94Enqueue the kernel execution.
75 95
76Enqueue buffer reads for your output buffer to read results. 96Enqueue buffer reads for your output buffer to read results.
77 97
78The next section shows how this can be done.
79
80=head1 EXAMPLES 98=head1 EXAMPLES
81 99
82=head2 Enumerate all devices and get contexts for them. 100=head2 Enumerate all devices and get contexts for them.
101
102Best run this once to get a feel for the platforms and devices in your
103system.
83 104
84 for my $platform (OpenCL::platforms) { 105 for my $platform (OpenCL::platforms) {
85 printf "platform: %s\n", $platform->info (OpenCL::PLATFORM_NAME); 106 printf "platform: %s\n", $platform->info (OpenCL::PLATFORM_NAME);
86 printf "extensions: %s\n", $platform->info (OpenCL::PLATFORM_EXTENSIONS); 107 printf "extensions: %s\n", $platform->info (OpenCL::PLATFORM_EXTENSIONS);
87 for my $device ($platform->devices) { 108 for my $device ($platform->devices) {
91 } 112 }
92 } 113 }
93 114
94=head2 Get a useful context and a command queue. 115=head2 Get a useful context and a command queue.
95 116
96 my $dev = ((OpenCL::platforms)[0]->devices)[0]; 117This is a useful boilerplate for any OpenCL program that only wants to use
97 my $ctx = $dev->context; 118one device:
98 my $queue = $ctx->queue ($dev); 119
120 my ($platform) = OpenCL::platforms; # find first platform
121 my ($dev) = $platform->devices; # find first device of platform
122 my $ctx = $platform->context (undef, [$dev]); # create context out of those
123 my $queue = $ctx->queue ($dev); # create a command queue for the device
99 124
100=head2 Print all supported image formats of a context. 125=head2 Print all supported image formats of a context.
126
127Best run this once for your context, to see what's available and how to
128gather information.
101 129
102 for my $type (OpenCL::MEM_OBJECT_IMAGE2D, OpenCL::MEM_OBJECT_IMAGE3D) { 130 for my $type (OpenCL::MEM_OBJECT_IMAGE2D, OpenCL::MEM_OBJECT_IMAGE3D) {
103 print "supported image formats for ", OpenCL::enum2str $type, "\n"; 131 print "supported image formats for ", OpenCL::enum2str $type, "\n";
104 132
105 for my $f ($ctx->supported_image_formats (0, $type)) { 133 for my $f ($ctx->supported_image_formats (0, $type)) {
124 152
125 my $src = ' 153 my $src = '
126 __kernel void 154 __kernel void
127 squareit (__global float *input, __global float *output) 155 squareit (__global float *input, __global float *output)
128 { 156 {
129 size_t id = get_global_id (0); 158 size_t id = get_global_id (0);
130 output [id] = input [id] * input [id]; 158 output [id] = input [id] * input [id];
131 } 159 }
132 '; 160 ';
133 161
134 my $prog = $ctx->program_with_source ($src); 162 my $prog = $ctx->program_with_source ($src);
135 163
164 # build croaks on compile errors, so catch it and print the compile errors
136 eval { $prog->build ($dev); 1 } 165 eval { $prog->build ($dev); 1 }
137 or die $prog->build_info ($dev, OpenCL::PROGRAM_BUILD_LOG); 166 or die $prog->build_info ($dev, OpenCL::PROGRAM_BUILD_LOG);
138 167
139 my $kernel = $prog->kernel ("squareit"); 168 my $kernel = $prog->kernel ("squareit");
140 169
141=head2 Create some input and output float buffers, then call squareit on them. 170=head2 Create some input and output float buffers, then call the
171'squareit' kernel on them.
142 172
143 my $input = $ctx->buffer_sv (OpenCL::MEM_COPY_HOST_PTR, pack "f*", 1, 2, 3, 4.5); 173 my $input = $ctx->buffer_sv (OpenCL::MEM_COPY_HOST_PTR, pack "f*", 1, 2, 3, 4.5);
144 my $output = $ctx->buffer (0, OpenCL::SIZEOF_FLOAT * 5); 174 my $output = $ctx->buffer (0, OpenCL::SIZEOF_FLOAT * 5);
145 175
146 # set buffer 176 # set buffer
185 215
186=head1 DOCUMENTATION 216=head1 DOCUMENTATION
187 217
188=head2 BASIC CONVENTIONS 218=head2 BASIC CONVENTIONS
189 219
190This is not a 1:1 C-style translation of OpenCL to Perl - instead I 220This is not a one-to-one C-style translation of OpenCL to Perl - instead
191attempted to make the interface as type-safe as possible and introducing 221I attempted to make the interface as type-safe as possible by introducing
192object syntax where it makes sense. There are a number of important 222object syntax where it makes sense. There are a number of important
193differences between the OpenCL C API and this module: 223differences between the OpenCL C API and this module:
194 224
195=over 4 225=over 4
196 226
197=item * Object lifetime management is automatic - there is no need 227=item * Object lifetime management is automatic - there is no need
198to free objects explicitly (C<clReleaseXXX>), the release function 228to free objects explicitly (C<clReleaseXXX>), the release function
199is called automatically once all Perl references to it go away. 229is called automatically once all Perl references to it go away.
200 230
201=item * OpenCL uses CamelCase for function names (C<clGetPlatformInfo>), 231=item * OpenCL uses CamelCase for function names (e.g. C<clGetPlatformIDs>, C<clGetPlatformInfo>),
202while this module uses underscores as word separator and often leaves out 232while this module uses underscores as word separator and often leaves out
203prefixes (C<< $platform->info >>). 233prefixes (C<OpenCL::platforms>, C<< $platform->info >>).
204 234
205=item * OpenCL often specifies fixed vector function arguments as short 235=item * OpenCL often specifies fixed vector function arguments as short
206arrays (C<size_t origin[3]>), while this module explicitly expects the 236arrays (C<size_t origin[3]>), while this module explicitly expects the
207components as separate arguments- 237components as separate arguments (C<$orig_x, $orig_y, $orig_z>) in
238function calls.
208 239
209=item * Where possible, one of the pitch values is calculated from the 240=item * Structures are often specified by flattening out their components
210perl scalar length and need not be specified. 241as with short vectors, and returned as arrayrefs.
211 242
212=item * When enqueuing commands, the wait list is specified by adding 243=item * When enqueuing commands, the wait list is specified by adding
213extra arguments to the function - anywhere a C<$wait_events...> argument 244extra arguments to the function - anywhere a C<$wait_events...> argument
214is documented this can be any number of event objects. 245is documented this can be any number of event objects.
215 246
245 276
246=over 4 277=over 4
247 278
248=item $int = OpenCL::errno 279=item $int = OpenCL::errno
249 280
250The last error returned by a function - it's only changed on errors. 281The last error returned by a function - it's only valid after an error occurred
282and before calling another OpenCL function.
251 283
252=item $str = OpenCL::err2str $errval 284=item $str = OpenCL::err2str $errval
253 285
254Converts an error value into a human-readable string. 286Converts an error value into a human-readable string.
255 287
297 329
298Returns a list of matching OpenCL::Device objects. 330Returns a list of matching OpenCL::Device objects.
299 331
300=item $ctx = $platform->context_from_type ($properties, $type = OpenCL::DEVICE_TYPE_DEFAULT, $notify = undef) 332=item $ctx = $platform->context_from_type ($properties, $type = OpenCL::DEVICE_TYPE_DEFAULT, $notify = undef)
301 333
302Tries to create a context. Never worked for me. 334Tries to create a context. Never worked for me, and you need devices explicitly anyway.
303 335
304L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateContextFromType.html> 336L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateContextFromType.html>
305 337
338=item $ctx = $platform->context ($properties = undef, @$devices, $notify = undef)
339
340Create a new OpenCL::Context object using the given device object(s) - a
341CL_CONTEXT_PLATFORM property is supplied automatically.
342
343L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateContext.html>
344
306=back 345=back
307 346
308=head2 THE OpenCL::Device CLASS 347=head2 THE OpenCL::Device CLASS
309 348
310=over 4 349=over 4
313 352
314See C<< $platform->info >> for details. 353See C<< $platform->info >> for details.
315 354
316L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clGetDeviceInfo.html> 355L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clGetDeviceInfo.html>
317 356
318=item $ctx = $device->context ($properties = undef, $notify = undef)
319
320Create a new OpenCL::Context object.
321
322L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateContext.html>
323
324=back 357=back
325 358
326=head2 THE OpenCL::Context CLASS 359=head2 THE OpenCL::Context CLASS
327 360
328=over 4 361=over 4
353 386
354=item $buf = $ctx->buffer_sv ($flags, $data) 387=item $buf = $ctx->buffer_sv ($flags, $data)
355 388
356Creates a new OpenCL::Buffer object and initialises it with the given data values. 389Creates a new OpenCL::Buffer object and initialises it with the given data values.
357 390
358=item $img = $ctx->image2d ($flags, $channel_order, $channel_type, $width, $height, $data) 391=item $img = $ctx->image2d ($flags, $channel_order, $channel_type, $width, $height, $row_pitch = 0, $data = undef)
359 392
360Creates a new OpenCL::Image2D object and optionally initialises it with the given data values. 393Creates a new OpenCL::Image2D object and optionally initialises it with the given data values.
361 394
362L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateImage2D.html> 395L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateImage2D.html>
363 396
364=item $img = $ctx->image3d ($flags, $channel_order, $channel_type, $width, $height, $depth, $slice_pitch, $data) 397=item $img = $ctx->image3d ($flags, $channel_order, $channel_type, $width, $height, $depth, $row_pitch = 0, $slice_pitch = 0, $data = undef)
365 398
366Creates a new OpenCL::Image3D object and optionally initialises it with the given data values. 399Creates a new OpenCL::Image3D object and optionally initialises it with the given data values.
367 400
368L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateImage3D.html> 401L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateImage3D.html>
369 402
432 465
433=item $ev = $queue->enqueue_read_image ($src, $blocking, $x, $y, $z, $width, $height, $depth, $row_pitch, $slice_pitch, $data, $wait_events...) 466=item $ev = $queue->enqueue_read_image ($src, $blocking, $x, $y, $z, $width, $height, $depth, $row_pitch, $slice_pitch, $data, $wait_events...)
434 467
435L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueReadImage.html> 468L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueReadImage.html>
436 469
437=item $ev = $queue->enqueue_write_image ($src, $blocking, $x, $y, $z, $width, $height, $depth, $row_pitch, $data, $wait_events...) 470=item $ev = $queue->enqueue_write_image ($src, $blocking, $x, $y, $z, $width, $height, $depth, $row_pitch, $slice_pitch, $data, $wait_events...)
438 471
439L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueWriteImage.html> 472L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueWriteImage.html>
440 473
441=item $ev = $queue->enqueue_copy_buffer_rect ($src, $dst, $src_x, $src_y, $src_z, $dst_x, $dst_y, $dst_z, $width, $height, $depth, $src_row_pitch, $src_slice_pitch, 4dst_row_pitch, $dst_slice_pitch, $ait_event...) 474=item $ev = $queue->enqueue_copy_buffer_rect ($src, $dst, $src_x, $src_y, $src_z, $dst_x, $dst_y, $dst_z, $width, $height, $depth, $src_row_pitch, $src_slice_pitch, $dst_row_pitch, $dst_slice_pitch, $wait_event...)
442 475
443Yeah. 476Yeah.
444 477
445L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyBufferRect.html> 478L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyBufferRect.html>
446 479
447=item $ev = $queue->enqueue_copy_buffer_to_image (OpenCL::Buffer src, OpenCL::Image dst, size_t src_offset, size_t dst_x, size_t dst_y, size_t dst_z, size_t width, size_t height, size_t depth, ...) 480=item $ev = $queue->enqueue_copy_buffer_to_image ($src_buffer, $dst_image, $src_offset, $dst_x, $dst_y, $dst_z, $width, $height, $depth, $wait_events...)
448 481
449L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyBufferToImage.html>. 482L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyBufferToImage.html>.
450 483
451=item $ev = $queue->enqueue_copy_image (OpenCL::Image src, OpenCL::Buffer dst, size_t src_x, size_t src_y, size_t src_z, size_t dst_x, size_t dst_y, size_t dst_z, size_t width, size_t height, size_t depth, ...) 484=item $ev = $queue->enqueue_copy_image ($src_image, $dst_image, $src_x, $src_y, $src_z, $dst_x, $dst_y, $dst_z, $width, $height, $depth, $wait_events...)
452 485
453L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyImage.html> 486L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyImage.html>
454 487
455=item $ev = $queue->enqueue_copy_image_to_buffer (OpenCL::Image src, OpenCL::Buffer dst, size_t src_x, size_t src_y, size_t src_z, size_t width, size_t height, size_t depth, size_t dst_offset, ...) 488=item $ev = $queue->enqueue_copy_image_to_buffer ($src_image, $dst_buffer, $src_x, $src_y, $src_z, $width, $height, $depth, $dst_offset, $wait_events...)
456 489
457L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyImageToBuffer.html> 490L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyImageToBuffer.html>
458 491
459=item $ev = $queue->enqueue_task ($kernel, $wait_events...) 492=item $ev = $queue->enqueue_task ($kernel, $wait_events...)
460 493
625package OpenCL; 658package OpenCL;
626 659
627use common::sense; 660use common::sense;
628 661
629BEGIN { 662BEGIN {
630 our $VERSION = '0.03'; 663 our $VERSION = '0.15';
631 664
632 require XSLoader; 665 require XSLoader;
633 XSLoader::load (__PACKAGE__, $VERSION); 666 XSLoader::load (__PACKAGE__, $VERSION);
634 667
635 @OpenCL::Buffer::ISA = 668 @OpenCL::Buffer::ISA =
