/cvs/OpenCL/OpenCL.pm

Comparing OpenCL/OpenCL.pm (file contents):
Revision 1.8 by root, Wed Nov 16 06:22:20 2011 UTC vs.
Revision 1.16 by root, Thu Nov 17 03:42:03 2011 UTC

8 8
9=head1 DESCRIPTION 9=head1 DESCRIPTION
10 10
11This is an early release which might be useful, but hasn't seen much testing. 11This is an early release which might be useful, but hasn't seen much testing.
12 12
13=head2 OpenCL FROM 10000 FEET HEIGHT
14
15Here is a high level overview of OpenCL:
16
17First you need to find one or more OpenCL::Platforms (kind of like
18vendors) - usually there is only one.
19
20Each platform gives you access to a number of OpenCL::Device objects, e.g.
21your graphics card.
22
23From a platform and some device(s), you create an OpenCL::Context, which is
24a very central object in OpenCL: Once you have a context you can create
25most other objects:
26
27OpenCL::Program objects, which store source code and, after building for a
28specific device ("compiling and linking"), also binary programs. For each
29kernel function in a program you can then create an OpenCL::Kernel object
30which represents basically a function call with argument values.
31
32OpenCL::Memory objects of various flavours: OpenCL::Buffers objects (flat
33memory areas, think arrays or structs) and OpenCL::Image objects (think 2d
34or 3d array) for bulk data and input and output for kernels.
35
36OpenCL::Sampler objects, which are kind of like texture filter modes in
37OpenGL.
38
39OpenCL::Queue objects - command queues, which allow you to submit memory
40reads, writes and copies, as well as kernel calls to your devices. They
41also offer a variety of methods to synchronise request execution, for
42example with barriers or OpenCL::Event objects.
43
44OpenCL::Event objects are used to signal when something is complete.
45
13=head1 HELPFUL RESOURCES 46=head2 HELPFUL RESOURCES
14 47
15The OpenCL spec used to develop this module (1.2 spec was available, but 48The OpenCL spec used to develop this module (1.2 spec was available, but
16no implementation was available to me :). 49no implementation was available to me :).
17 50
18 http://www.khronos.org/registry/cl/specs/opencl-1.1.pdf 51 http://www.khronos.org/registry/cl/specs/opencl-1.1.pdf
19 52
20OpenCL manpages: 53OpenCL manpages:
21 54
22 http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/ 55 http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/
23 56
57Here's a tutorial from AMD (very AMD-centric, too), not sure how useful it
58is, but at least it's free of charge:
59
60 http://developer.amd.com/zones/OpenCLZone/courses/Documents/Introduction_to_OpenCL_Programming%20Training_Guide%20%28201005%29.pdf
61
62If you are into UML class diagrams, the following diagram might help - if
63not, it will be mildly confusing:
64
65 http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/classDiagram.html
66
67=head1 BASIC WORKFLOW
68
69To get something done, you basically have to do this once (refer to the
70examples below for actual code, this is just a high-level description):
71
72Find some platform (e.g. the first one) and some device(s) (e.g. the first
73device of the platform), and create a context from those.
74
75Create program objects from your OpenCL source code, then build (compile)
76the programs for each device you want to run them on.
77
78Create kernel objects for all kernels you want to use (surprisingly, these
79are not device-specific).
80
81Then, to execute stuff, you repeat these steps, possibly reusing or
82sharing some buffers:
83
84Create some input and output buffers from your context. Set these as
85arguments to your kernel.
86
87Enqueue buffer writes to initialise your input buffers (when not
88initialised at creation time).
89
90Enqueue the kernel execution.
91
92Enqueue buffer reads for your output buffer to read results.
93
24=head1 EXAMPLES 94=head1 EXAMPLES
25 95
26=head2 Enumerate all devices and get contexts for them. 96=head2 Enumerate all devices and get contexts for them.
27 97
98Best run this once to get a feel for the platforms and devices in your
99system.
100
28 for my $platform (OpenCL::platforms) { 101 for my $platform (OpenCL::platforms) {
29 warn $platform->info (OpenCL::PLATFORM_NAME); 102 printf "platform: %s\n", $platform->info (OpenCL::PLATFORM_NAME);
30 warn $platform->info (OpenCL::PLATFORM_EXTENSIONS); 103 printf "extensions: %s\n", $platform->info (OpenCL::PLATFORM_EXTENSIONS);
31 for my $device ($platform->devices) { 104 for my $device ($platform->devices) {
32 warn $device->info (OpenCL::DEVICE_NAME); 105 printf "+ device: %s\n", $device->info (OpenCL::DEVICE_NAME);
33 my $ctx = $device->context_simple; 106 my $ctx = $platform->context (undef, [$device]);
34 # do stuff 107 # do stuff
35 } 108 }
36 } 109 }
37 110
38=head2 Get a useful context and a command queue. 111=head2 Get a useful context and a command queue.
39 112
40 my $dev = ((OpenCL::platforms)[0]->devices)[0]; 113This is a useful boilerplate for any OpenCL program that only wants to use
41 my $ctx = $dev->context_simple; 114one device.
42 my $queue = $ctx->command_queue_simple ($dev); 115
116 my ($platform) = OpenCL::platforms; # find first platform
117 my ($dev) = $platform->devices; # find first device of platform
118 my $ctx = $platform->context (undef, [$dev]); # create context out of those
119 my $queue = $ctx->queue ($dev); # create a command queue for the device
43 120
44=head2 Print all supported image formats of a context. 121=head2 Print all supported image formats of a context.
45 122
123Best run this once for your context, to see what's available and how to
124gather information.
125
46 for my $type (OpenCL::MEM_OBJECT_IMAGE2D, OpenCL::MEM_OBJECT_IMAGE3D) { 126 for my $type (OpenCL::MEM_OBJECT_IMAGE2D, OpenCL::MEM_OBJECT_IMAGE3D) {
47 say "supported image formats for ", OpenCL::enum2str $type; 127 print "supported image formats for ", OpenCL::enum2str $type, "\n";
48 128
49 for my $f ($ctx->supported_image_formats (0, $type)) { 129 for my $f ($ctx->supported_image_formats (0, $type)) {
50 printf " %-10s %-20s\n", OpenCL::enum2str $f->[0], OpenCL::enum2str $f->[1]; 130 printf " %-10s %-20s\n", OpenCL::enum2str $f->[0], OpenCL::enum2str $f->[1];
51 } 131 }
52 } 132 }
55then asynchronously. 135then asynchronously.
56 136
57 my $buf = $ctx->buffer_sv (OpenCL::MEM_COPY_HOST_PTR, "helmut"); 137 my $buf = $ctx->buffer_sv (OpenCL::MEM_COPY_HOST_PTR, "helmut");
58 138
59 $queue->enqueue_read_buffer ($buf, 1, 1, 3, my $data); 139 $queue->enqueue_read_buffer ($buf, 1, 1, 3, my $data);
60 warn $data; 140 print "$data\n";
61 141
62 my $ev = $queue->enqueue_read_buffer ($buf, 0, 1, 3, my $data); 142 my $ev = $queue->enqueue_read_buffer ($buf, 0, 1, 3, my $data);
63 $ev->wait; 143 $ev->wait;
64 warn $data; 144 print "$data\n"; # prints "elm"
65 145
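The offset and length arguments in the reads above behave like a byte-wise substring of the buffer contents. A minimal pure-Perl sketch of that arithmetic, with no OpenCL involved (the helper name C<read_range> is hypothetical and only mimics what the read returns):

```perl
use strict;
use warnings;

# Hypothetical stand-in for the contents of the buffer created from "helmut".
my $contents = "helmut";

# Mimic a read of $len bytes starting at byte $offset (assumption: the
# buffer read copies exactly this byte range into the target scalar).
sub read_range {
   my ($bytes, $offset, $len) = @_;
   return substr $bytes, $offset, $len;
}

my $data = read_range ($contents, 1, 3);
print "$data\n"; # prints "elm"
```

This is only an illustration of which bytes are selected; the real read goes through the command queue and may complete asynchronously.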
66=head2 Create and build a program, then create a kernel out of one of its 146=head2 Create and build a program, then create a kernel out of one of its
67functions. 147functions.
68 148
69 my $src = ' 149 my $src = '
70 __kernel void 150 __kernel void
71 squareit (__global float *input, __global float *output) 151 squareit (__global float *input, __global float *output)
72 { 152 {
73 size_t id = get_global_id (0); 153 size_t id = get_global_id (0);
74 output [id] = input [id] * input [id]; 154 output [id] = input [id] * input [id];
75 } 155 }
76 '; 156 ';
77 157
78 my $prog = $ctx->program_with_source ($src); 158 my $prog = $ctx->program_with_source ($src);
79 159
160 # build croaks on compile errors, so catch it and print the compile errors
80 eval { $prog->build ($dev); 1 } 161 eval { $prog->build ($dev); 1 }
81 or die $prog->build_info ($dev, OpenCL::PROGRAM_BUILD_LOG); 162 or die $prog->build_info ($dev, OpenCL::PROGRAM_BUILD_LOG);
82 163
83 my $kernel = $prog->kernel ("squareit"); 164 my $kernel = $prog->kernel ("squareit");
84 165
85=head2 Create some input and output float buffers, then call squareit on them. 166=head2 Create some input and output float buffers, then call the
167'squareit' kernel on them.
86 168
87 my $input = $ctx->buffer_sv (OpenCL::MEM_COPY_HOST_PTR, pack "f*", 1, 2, 3, 4.5); 169 my $input = $ctx->buffer_sv (OpenCL::MEM_COPY_HOST_PTR, pack "f*", 1, 2, 3, 4.5);
88 my $output = $ctx->buffer (0, OpenCL::SIZEOF_FLOAT * 5); 170 my $output = $ctx->buffer (0, OpenCL::SIZEOF_FLOAT * 5);
89 171
90 # set buffer 172 # set buffer
96 178
97 # enqueue a synchronous read 179 # enqueue a synchronous read
98 $queue->enqueue_read_buffer ($output, 1, 0, OpenCL::SIZEOF_FLOAT * 4, my $data); 180 $queue->enqueue_read_buffer ($output, 1, 0, OpenCL::SIZEOF_FLOAT * 4, my $data);
99 181
100 # print the results: 182 # print the results:
101 say join ", ", unpack "f*", $data; 183 printf "%s\n", join ", ", unpack "f*", $data;
102 184
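To check the values read back from the device, the same computation can be done on the host in plain Perl. This is a sketch of a CPU reference only, assuming the native float format used by C<pack "f*"> matches the device's C<float>; C<square_packed> is a hypothetical helper, not part of the module:

```perl
use strict;
use warnings;

# Square every native float in a packed string and return a packed string,
# mirroring what the 'squareit' kernel does per element.
sub square_packed {
   my ($packed) = @_;
   return pack "f*", map { $_ * $_ } unpack "f*", $packed;
}

my $input    = pack "f*", 1, 2, 3, 4.5;
my $expected = square_packed ($input);

# same number of bytes out as in
printf "%d bytes\n", length $expected;
printf "%s\n", join ", ", unpack "f*", $expected; # prints "1, 4, 9, 20.25"
```

Comparing the unpacked device output against C<$expected> gives a quick sanity check; for inputs that are not exactly representable as floats, compare with a small tolerance instead of string equality.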
103=head2 The same enqueue operations as before, but assuming an out-of-order queue, 185=head2 The same enqueue operations as before, but assuming an out-of-order queue,
104showing off barriers. 186showing off barriers.
105 187
106 # execute it for all 4 numbers 188 # execute it for all 4 numbers
129 211
130=head1 DOCUMENTATION 212=head1 DOCUMENTATION
131 213
132=head2 BASIC CONVENTIONS 214=head2 BASIC CONVENTIONS
133 215
134This is not a 1:1 C-style translation of OpenCL to Perl - instead I 216This is not a one-to-one C-style translation of OpenCL to Perl - instead
135attempted to make the interface as type-safe as possible and introducing 217I attempted to make the interface as type-safe as possible by introducing
136object syntax where it makes sense. There are a number of important 218object syntax where it makes sense. There are a number of important
137differences between the OpenCL C API and this module: 219differences between the OpenCL C API and this module:
138 220
139=over 4 221=over 4
140 222
145=item * OpenCL uses CamelCase for function names (C<clGetPlatformInfo>), 227=item * OpenCL uses CamelCase for function names (C<clGetPlatformInfo>),
146while this module uses underscores as word separator and often leaves out 228while this module uses underscores as word separator and often leaves out
147prefixes (C<< $platform->info >>). 229prefixes (C<< $platform->info >>).
148 230
149=item * OpenCL often specifies fixed vector function arguments as short 231=item * OpenCL often specifies fixed vector function arguments as short
150arrays (C<size_t origin[3]>), while this module explicitly expects the 232arrays (C<size_t origin[3]>), while this module explicitly expects the
151components as separate arguments. 233components as separate arguments.
152 234
235=item * Structures are often specified with their components, and returned
236as arrayrefs.
237
153=item * Where possible, the row_pitch value is calculated from the perl 238=item * Where possible, one of the pitch values is calculated from the
154scalar length and need not be specified. 239perl scalar length and need not be specified.
155 240
156=item * When enqueuing commands, the wait list is specified by adding 241=item * When enqueuing commands, the wait list is specified by adding
157extra arguments to the function - everywhere a C<$wait_events...> argument 242extra arguments to the function - anywhere a C<$wait_events...> argument
158is documented this can be any number of event objects. 243is documented this can be any number of event objects.
159 244
160=item * When enqueuing commands, if the enqueue method is called in void 245=item * When enqueuing commands, if the enqueue method is called in void
161context, no event is created. In all other contexts an event is returned 246context, no event is created. In all other contexts an event is returned
162by the method. 247by the method.
189 274
190=over 4 275=over 4
191 276
192=item $int = OpenCL::errno 277=item $int = OpenCL::errno
193 278
194The last error returned by a function - it's only changed on errors. 279The last error returned by a function - it's only valid after an error occurred
280and before calling another OpenCL function.
195 281
196=item $str = OpenCL::err2str $errval 282=item $str = OpenCL::err2str $errval
197 283
198Converts an error value into a human readable string. 284Converts an error value into a human readable string.
199 285
200=item $str = OpenCL::err2str $enum 286=item $str = OpenCL::enum2str $enum
201 287
202Converts most enum values (info parameter names, image format constants, 288Converts most enum values (info parameter names, image format constants,
203object types, addressing and filter modes, command types etc.) into a 289object types, addressing and filter modes, command types etc.) into a
204human readable string. When confronted with some random integer it can be 290human readable string. When confronted with some random integer it can be
205very helpful to pass it through this function to maybe get some readable 291very helpful to pass it through this function to maybe get some readable
209 295
210Returns all available OpenCL::Platform objects. 296Returns all available OpenCL::Platform objects.
211 297
212L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clGetPlatformIDs.html> 298L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clGetPlatformIDs.html>
213 299
214=item $ctx = OpenCL::context_from_type_simple $type = OpenCL::DEVICE_TYPE_DEFAULT 300=item $ctx = OpenCL::context_from_type $properties, $type = OpenCL::DEVICE_TYPE_DEFAULT, $notify = undef
215 301
216Tries to create a context from a default device and platform - never worked for me. 302Tries to create a context from a default device and platform - never worked for me.
217 303
218L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateContextFromType.html> 304L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateContextFromType.html>
219 305
239 325
240=item @devices = $platform->devices ($type = OpenCL::DEVICE_TYPE_ALL) 326=item @devices = $platform->devices ($type = OpenCL::DEVICE_TYPE_ALL)
241 327
242Returns a list of matching OpenCL::Device objects. 328Returns a list of matching OpenCL::Device objects.
243 329
244=item $ctx = $platform->context_from_type_simple ($type = OpenCL::DEVICE_TYPE_DEFAULT) 330=item $ctx = $platform->context_from_type ($properties, $type = OpenCL::DEVICE_TYPE_DEFAULT, $notify = undef)
245 331
246Tries to create a context. Never worked for me. 332Tries to create a context. Never worked for me, and you need devices explicitly anyway.
247 333
248L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateContextFromType.html> 334L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateContextFromType.html>
249 335
336=item $ctx = $platform->context ($properties = undef, @$devices, $notify = undef)
337
338Create a new OpenCL::Context object using the given device object(s) - a
339CL_CONTEXT_PLATFORM property is supplied automatically.
340
341L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateContext.html>
342
250=back 343=back
251 344
252=head2 THE OpenCL::Device CLASS 345=head2 THE OpenCL::Device CLASS
253 346
254=over 4 347=over 4
257 350
258See C<< $platform->info >> for details. 351See C<< $platform->info >> for details.
259 352
260L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clGetDeviceInfo.html> 353L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clGetDeviceInfo.html>
261 354
262=item $ctx = $device->context_simple
263
264Convenience function to create a new OpenCL::Context object.
265
266L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateContext.html>
267
268=back 355=back
269 356
270=head2 THE OpenCL::Context CLASS 357=head2 THE OpenCL::Context CLASS
271 358
272=over 4 359=over 4
275 362
276See C<< $platform->info >> for details. 363See C<< $platform->info >> for details.
277 364
278L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clGetContextInfo.html> 365L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clGetContextInfo.html>
279 366
280=item $queue = $ctx->command_queue_simple ($device) 367=item $queue = $ctx->queue ($device, $properties)
281 368
282Convenience function to create a new OpenCL::Queue object from the context and the given device. 369Create a new OpenCL::Queue object from the context and the given device.
283 370
284L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateCommandQueue.html> 371L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateCommandQueue.html>
285 372
286=item $ev = $ctx->user_event 373=item $ev = $ctx->user_event
287 374
380 467
381=item $ev = $queue->enqueue_write_image ($src, $blocking, $x, $y, $z, $width, $height, $depth, $row_pitch, $data, $wait_events...) 468=item $ev = $queue->enqueue_write_image ($src, $blocking, $x, $y, $z, $width, $height, $depth, $row_pitch, $data, $wait_events...)
382 469
383L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueWriteImage.html> 470L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueWriteImage.html>
384 471
385=item $ev = $queue->enqueue_copy_buffer_rect ($src, $dst, $src_x, $src_y, $src_z, $dst_x, $dst_y, $dst_z, $width, $height, $depth, $src_row_pitch, $src_slice_pitch, $dst_row_pitch, $dst_slice_pitch, $ait_event...) 472=item $ev = $queue->enqueue_copy_buffer_rect ($src, $dst, $src_x, $src_y, $src_z, $dst_x, $dst_y, $dst_z, $width, $height, $depth, $src_row_pitch, $src_slice_pitch, $dst_row_pitch, $dst_slice_pitch, $wait_event...)
386 473
387Yeah. 474Yeah.
388 475
389L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyBufferRect.html> 476L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyBufferRect.html>
390 477
391=item $ev = $queue->enqueue_copy_buffer_to_image (OpenCL::Buffer src, OpenCL::Image dst, size_t src_offset, size_t dst_x, size_t dst_y, size_t dst_z, size_t width, size_t height, size_t depth, ...) 478=item $ev = $queue->enqueue_copy_buffer_to_image ($src_buffer, $dst_image, $src_offset, $dst_x, $dst_y, $dst_z, $width, $height, $depth, $wait_events...)
392 479
393L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyBufferToImage.html>. 480L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyBufferToImage.html>.
394 481
395=item $ev = $queue->enqueue_copy_image (OpenCL::Image src, OpenCL::Buffer dst, size_t src_x, size_t src_y, size_t src_z, size_t dst_x, size_t dst_y, size_t dst_z, size_t width, size_t height, size_t depth, ...) 482=item $ev = $queue->enqueue_copy_image ($src_image, $dst_image, $src_x, $src_y, $src_z, $dst_x, $dst_y, $dst_z, $width, $height, $depth, $wait_events...)
396 483
397L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyImage.html> 484L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyImage.html>
398 485
399=item $ev = $queue->enqueue_copy_image_to_buffer (OpenCL::Image src, OpenCL::Buffer dst, size_t src_x, size_t src_y, size_t src_z, size_t width, size_t height, size_t depth, size_t dst_offset, ...) 486=item $ev = $queue->enqueue_copy_image_to_buffer ($src_image, $dst_buffer, $src_x, $src_y, $src_z, $width, $height, $depth, $dst_offset, $wait_events...)
400 487
401L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyImageToBuffer.html> 488L<http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clEnqueueCopyImageToBuffer.html>
402 489
403=item $ev = $queue->enqueue_task ($kernel, $wait_events...) 490=item $ev = $queue->enqueue_task ($kernel, $wait_events...)
404 491
569package OpenCL; 656package OpenCL;
570 657
571use common::sense; 658use common::sense;
572 659
573BEGIN { 660BEGIN {
574 our $VERSION = '0.03'; 661 our $VERSION = '0.14';
575 662
576 require XSLoader; 663 require XSLoader;
577 XSLoader::load (__PACKAGE__, $VERSION); 664 XSLoader::load (__PACKAGE__, $VERSION);
578 665
579 @OpenCL::Buffer::ISA = 666 @OpenCL::Buffer::ISA =
