[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Optimising first bwlabeln implementation
From: |
Jordi Gutiérrez Hermoso |
Subject: |
Optimising first bwlabeln implementation |
Date: |
Sun, 8 May 2011 19:59:32 -0500 |
I'm not sure if I should be discussing this in the Octave-Forge
mailing list instead, but I figure most people who read the
Octave-Forge list read these too.
Attached is my first implementation of bwlabeln for the image package.
I am freely using C++0x, so if you're using gcc you will need a
somewhat recent version (4.4 works) and pass the -std=c++0x
compilation CXXFLAG to mkoctfile. C++0x could be avoided, but I don't
see a strong reason to do so anymore.
I know this initial implementation is very dirty, but it appears to
work. It's implemented with a union-find structure per the description
of Matlab's own bwlabeln. It uses too much memory and it's slow. After
profiling with oprof, it seems evident that I am spending all the CPU
time on hashing coordinates.
I am asking for advice on how to optimise it.
A quick high-level run-down of what it currently does:
1) Check arguments (this is very poorly done right now)
2) Creates an std::set of neighbours to check based on the
connectivity mask that the user passes, making sure to not
include redundant neighbours. The connectivity mask is assumed
to be symmetric for this purpose, although this is not
currently checked.
3) Iterates with a linear index through all of the voxels in
boolNDArray that represents the image
4) For each voxel, first find it with union-find so that it gets
assigned to a set of the union-find partition
5) Check each neighbour in the std::set obtained from the
connectivity mask, performing a union with the current voxel
if necessary.
6) Run again through the union-find structure again resetting
labels to be sequential and start from 1.
7) Output the labelled NDArray and the number of labels.
The union-find structure is implemented, perhaps in much greater
generality than necessary, in the union-find.h++ template header.
Notice I'm using a C++0x unordered_map which is doing lots and lots of
hashing. The first obvious optimisation is to not use coordinates but
stick to linear indices, but then this has the problem that figuring
out when a voxel is at a boundary when checking its neighbours becomes
more complicated.
I have another general question, about how to properly use Octave
classes. You'll notice I wrote a bunch of silly operators for
Array<octave_idx_type> because I couldn't find a way to manipulate
coordinates otherwise. This point will become moot once I modify the
code to not use coordinates anymore or at least not as directly, but
just out of curiousity, if I did want to do computations with the
coordinates, how should I have done it?
Thanks in advance, and I hope this is interesting.
- Jordi G. H.
bwlabeln.cc
Description: Text Data
union-find.h++
Description: Text Data
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- Optimising first bwlabeln implementation,
Jordi Gutiérrez Hermoso <=