Wrapping C with Python: 3D image segmentation with region growing

chestervonwinch | 85 points

People interested in mixing image processing, Python, and C code for high performance might also enjoy tinkering with a combination of NumPy and PyOpenCL. It gives you some powerful mechanisms to manipulate n-dimensional arrays and then offload some brute-force work to your GPU or multi-core CPU.

OpenCL is comparable to CUDA. It's essentially a C dialect that overlaps a lot with GLSL (the OpenGL shading language), with intrinsics for certain SIMD operations.

You write your outer data-wrangling code in Python and put your little OpenCL kernels into the program as multi-line strings. These get compiled at runtime into parallel processing routines, which are dispatched by whichever OpenCL drivers you have installed. I've used my NVIDIA GPUs and Intel multi-core/SIMD CPUs to good effect.

This kind of parallel processing leads to turning your mind inside-out a little and using signal-processing techniques. You want to think in terms of cooperative algorithms built out of a large number of independent, localized operations, rather than a single point of focus that sequentially wanders around a buffer.
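To make that concrete for region growing, here is a rough NumPy-only sketch (no OpenCL involved) of the "many independent local operations" style. It is not the article's algorithm: the function name `grow_region`, the `tolerance` parameter, the "intensity within tolerance of the seed" criterion, and the 6-connected neighbourhood are all assumptions picked for illustration. Each pass expands the whole region by one voxel at once, instead of walking a stack one voxel at a time.

```python
import numpy as np


def grow_region(volume, seed, tolerance):
    """Grow a region from `seed` using only whole-array operations.

    Hypothetical helper: a voxel may join if its intensity is within
    `tolerance` of the seed voxel's intensity (assumed criterion).
    """
    # Every voxel decides independently whether it is allowed to join.
    allowed = np.abs(volume.astype(np.float64) - float(volume[seed])) <= tolerance

    region = np.zeros(volume.shape, dtype=bool)
    region[seed] = True

    while True:
        # Expand the region by one voxel along each axis (face neighbours only).
        grown = region.copy()
        for axis in range(volume.ndim):
            lo = [slice(None)] * volume.ndim
            hi = [slice(None)] * volume.ndim
            lo[axis] = slice(None, -1)
            hi[axis] = slice(1, None)
            lo, hi = tuple(lo), tuple(hi)
            grown[hi] |= region[lo]  # spread toward higher indices
            grown[lo] |= region[hi]  # spread toward lower indices

        # Keep only the voxels that satisfy the intensity criterion.
        grown &= allowed

        # Stop once a full pass adds nothing new.
        if np.array_equal(grown, region):
            return region
        region = grown
```

Every pass touches the whole buffer, so this does far more total work than a stack-based fill, but each step is the kind of uniform per-voxel operation that maps onto a GPU kernel. The loop runs once per "layer" of the region, so long thin structures need many passes.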

saltcured | 7 years ago

Why was the stack implemented as a linked list? Could this be turned into a block-allocated array to improve cache locality (and get rid of an entire int)? 64 % 8 = 0, so you won't have any alignment issues. You'd also avoid calling free() on every loop iteration.
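A rough illustration of what I mean, written in Python/NumPy rather than C so it is only a sketch of the idea and not the article's code. The names `flood_fill` and `tolerance`, the seed-intensity criterion, and the 6-connected neighbourhood are my assumptions. The stack is one preallocated block of flat voxel indices plus a `top` counter, so there is no per-node allocation or free:

```python
import numpy as np


def flood_fill(volume, seed, tolerance):
    """Region growing with an array-backed stack (hypothetical sketch)."""
    shape = volume.shape
    flat = np.ascontiguousarray(volume).ravel()
    # Element strides of a C-ordered array, e.g. (ny * nx, nx, 1) in 3D.
    strides = [int(np.prod(shape[axis + 1:])) for axis in range(len(shape))]

    in_region = np.zeros(flat.size, dtype=bool)
    # One contiguous block. A voxel is marked when pushed, so it is pushed
    # at most once and the stack can never outgrow the volume.
    stack = np.empty(flat.size, dtype=np.intp)
    top = 0

    start = sum(c * s for c, s in zip(seed, strides))
    seed_value = float(flat[start])
    stack[top] = start
    top += 1
    in_region[start] = True

    while top > 0:
        top -= 1  # pop: just move the counter, nothing to free
        idx = int(stack[top])
        for axis, stride in enumerate(strides):
            coord = (idx // stride) % shape[axis]
            for step in (-1, 1):
                if not 0 <= coord + step < shape[axis]:
                    continue  # neighbour would fall outside the volume
                neighbor = idx + step * stride
                if in_region[neighbor]:
                    continue
                if abs(float(flat[neighbor]) - seed_value) <= tolerance:
                    in_region[neighbor] = True
                    stack[top] = neighbor  # push: write and bump the counter
                    top += 1

    return in_region.reshape(shape)
```

In C the same layout would be a single malloc'd buffer of indices and an integer top-of-stack, which is where the cache-locality benefit would come from; the Python version only shows the bookkeeping.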

On a more general note, if you're using the stack to queue up subsequent computation, why not just opt for tail recursion, which will be optimized out?

Also, why are you using f2py rather than just writing a C module? [0]

[0] - https://csl.name/post/c-functions-python/

gravypod | 7 years ago