gnuastro-commits

From: Mohammad Akhlaghi
Subject: [gnuastro-commits] master 14651f9: Segment: final labels are now thread-safe and reproducible
Date: Wed, 16 Sep 2020 13:29:33 -0400 (EDT)

branch: master
commit 14651f94c7b3a3cf1c57b63e8c1b92e6c93ececd
Author: Mohammad Akhlaghi <mohammad@akhlaghi.org>
Commit: Mohammad Akhlaghi <mohammad@akhlaghi.org>

    Segment: final labels are now thread-safe and reproducible
    
    Until now, the final labeled images of Segment were not thread-safe, hence
    when run on multiple threads, the resulting labels were not reproducible:
    the same clumps/objects would get different labels in each run! This is
    because Segment finds the clumps and objects over each detection in
    parallel and there is no way to guarantee which thread is run at which
    moment, without drastically decreasing the efficiency of threads!
    
    With this commit, the problem is fixed with a simple strategy: after the
    final labels are found, all the object labels (or clumps if Segment is run
    in '--onlyclumps' mode) are reset based on their position in the image. In
    short, we just parse the pixels from the bottom-left to the top-right and
    give new labels in this order. This ensures the final output labels are the
    same in any run.
    
    This bug was reported by Joanna Sakowska.
    
    This fixes bug #59017.
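
The relabeling strategy described above can be illustrated with a minimal, standalone sketch. This is not the code added by this commit (that is the segment_reproducible_labels function in the diff below). The function name relabel_in_scan_order and its parameters are hypothetical. It also differs from the commit in two ways that are assumptions of the sketch: it relabels a plain int32_t array in place instead of allocating a new dataset, and it uses 0 rather than a blank value to mark labels that have not been seen yet. It assumes the array is stored so that index 0 is the first pixel of the scan.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of Gnuastro): relabel 'labels' in place
   so that positive labels become 1, 2, 3, ... in the order in which they
   are first met while scanning the array from index 0 upwards.
   Non-positive values (for example 0 for "no label") are left untouched.
   'numlabels' is the largest label that may occur in the input.
   Returns 0 on success, or -1 (input untouched) on allocation failure
   or when a label larger than 'numlabels' is found. */
int
relabel_in_scan_order(int32_t *labels, size_t size, size_t numlabels)
{
  size_t i;
  int32_t next=0, *map;

  assert(labels!=NULL || size==0);

  /* map[old] holds the new label of 'old'; 0 means "not seen yet". */
  map=calloc(numlabels+1, sizeof *map);
  if(map==NULL) return -1;

  /* First pass: give each old label a new one on its first occurrence. */
  for(i=0;i<size;++i)
    if(labels[i]>0)
      {
        if((size_t)labels[i]>numlabels) { free(map); return -1; }
        if(map[ labels[i] ]==0) map[ labels[i] ] = ++next;
      }

  /* Second pass: write the new labels over the old ones. */
  for(i=0;i<size;++i)
    if(labels[i]>0) labels[i]=map[ labels[i] ];

  free(map);
  return 0;
}
```

Two inputs that mark the same regions with different label values, as can happen between multi-threaded runs, come out identical after this pass, because the new labels depend only on where each region first appears in the scan.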
---
 NEWS                         |  2 ++
 bin/segment/segment.c        | 55 ++++++++++++++++++++++++++++++++++++++++++++
 doc/announce-acknowledge.txt |  4 ++++
 3 files changed, 61 insertions(+)

diff --git a/NEWS b/NEWS
index c509137..c42f598 100644
--- a/NEWS
+++ b/NEWS
@@ -59,6 +59,7 @@ See the end of the file for license conditions.
 
 ** Bugs fixed
   bug #59105: Column arithmetic operator degree-to-ra, returning to dec
+  bug #59017: Segment's object IDs are not thread-safe (i.e., reproducible).
 
 
 
@@ -213,6 +214,7 @@ See the end of the file for license conditions.
 
 
 
+
 * Noteworthy changes in release 0.12 (library 10.0.0) (2020-05-20) [stable]
 
 ** New features
diff --git a/bin/segment/segment.c b/bin/segment/segment.c
index bd89d55..bbfcfa4 100644
--- a/bin/segment/segment.c
+++ b/bin/segment/segment.c
@@ -824,6 +824,54 @@ segment_save_sn_table(struct clumps_params *clprm)
 
 
 
+/* Avoid non-reproducible labels (when built on multiple threads). Note
+   that when working with objects, the clump labels don't need to be
+   re-labeled (they always start from 1 within each object and are thus
+   already thread-safe).*/
+static void
+segment_reproducible_labels(struct segmentparams *p)
+{
+  size_t i;
+  gal_data_t *new;
+  int32_t currentlab=0, *oldarr, *newarr, *newlabs;
+  gal_data_t *old = p->onlyclumps ? p->clabel : p->olabel;
+  size_t numlabsplus1 = (p->onlyclumps ? p->numclumps : p->numobjects) + 1;
+
+  /* Allocate the necessary datasets. */
+  new=gal_data_alloc(NULL, old->type, old->ndim, old->dsize, old->wcs, 0,
+                     p->cp.minmapsize, p->cp.quietmmap, old->name, old->unit,
+                     old->comment);
+  newlabs=gal_pointer_allocate(old->type, numlabsplus1, 0, __func__,
+                               "newlabs");
+
+  /* Initialize the newlabs array to blank (so we don't relabel
+     things). */
+  for(i=0;i<numlabsplus1;++i) newlabs[i]=GAL_BLANK_INT32;
+
+  /* Parse the old dataset and set the new labels. */
+  oldarr=old->array;
+  for(i=0;i<old->size;++i)
+    if( oldarr[i] > 0 && newlabs[ oldarr[i] ]==GAL_BLANK_INT32 )
+      newlabs[ oldarr[i] ] = ++currentlab;
+
+  /* For a check.
+  for(i=0;i<numlabsplus1;++i) printf("%zu --> %d\n", i, newlabs[i]);
+  */
+
+  /* Fill the newly labeled dataset. */
+  newarr=new->array;
+  for(i=0;i<old->size;++i)
+    newarr[i] = oldarr[i]>0 ? newlabs[ oldarr[i] ] : oldarr[i];
+
+  /* Clean up. */
+  free(newlabs);
+  if(p->onlyclumps) { gal_data_free(p->clabel); p->clabel=new; }
+  else              { gal_data_free(p->olabel); p->olabel=new; }
+}
+
+
+
+
 /* Find true clumps over the detected regions. */
 static void
 segment_detections(struct segmentparams *p)
@@ -1028,6 +1076,13 @@ segment_detections(struct segmentparams *p)
   p->numobjects=clprm.totobjects;
 
 
+  /* Correct the final object labels to start from the bottom of the
+     image. This is necessary because we define objects on multiple
+     threads, so every time a program is run, an object can have a
+     different label! */
+  segment_reproducible_labels(p);
+
+
   /* Clean up allocated structures and destroy the mutex. */
   gal_data_array_free(clprm.sn, p->numdetections+1, 1);
   gal_data_array_free(labindexs, p->numdetections+1, 1);
diff --git a/doc/announce-acknowledge.txt b/doc/announce-acknowledge.txt
index eba44ad..6221717 100644
--- a/doc/announce-acknowledge.txt
+++ b/doc/announce-acknowledge.txt
@@ -2,9 +2,13 @@ Alphabetically ordered list to acknowledge in the next release.
 
 Sebastian Luna Valero
 Samane Raji
+Joanna Sakowska
 Sachin Kumar Singh
 
 
+
+
+
 Copyright (C) 2015-2020 Free Software Foundation, Inc.
 
 Permission is granted to copy, distribute and/or modify this document under


