Re: [PATCH] core-count: A new program to count the number of cpu cores


From: Bruno Haible
Subject: Re: [PATCH] core-count: A new program to count the number of cpu cores
Date: Sun, 1 Nov 2009 14:58:42 +0100
User-agent: KMail/1.9.9

Pádraig Brady wrote:
> num_processors() already uses sysconf (_SC_NPROCESSORS_ONLN) (online
> processors), so I then wondered how this would differ from the value
> returned by pthread_getaffinity_np() ?
> 
> A quick google for cpuset shows:
> http://www.kernel.org/doc/man-pages/online/pages/man7/cpuset.7.html
> 
> Also this is what sysconf seems to query for the variables above:
> $ strace -e open getconf _NPROCESSORS_ONLN
> open("/proc/stat"
> $ strace -e open getconf _NPROCESSORS_CONF
> open("/sys/devices/system/cpu"
> 
> Looking at the /proc/stat code:
> http://lxr.linux.no/#linux+v2.6.31/fs/proc/stat.c
> shows that it calls for_each_online_cpu(), which according to the
> following covers each CPU available to the scheduler:
> http://lxr.linux.no/#linux+v2.6.31/include/linux/cpumask.h#L451
> However, that is system-wide, and a particular process
> could be in a smaller cpuset.
> 
> pthread_getaffinity_np instead calls sched_getaffinity which
> can return a smaller set as seen here:
> http://lxr.linux.no/#linux+v2.6.31/kernel/sched.c#L6484

Thanks for presenting these investigations.

> I do wonder though whether it would be better
> to have num_processors() try to return this by default?

Certainly, yes. The implementation of omp_get_num_threads() in
GCC's libgomp does the same thing.
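
To make the difference between the two counts concrete, here is a minimal
sketch, assuming Linux with glibc >= 2.3.4. The helper names
online_cpu_count and affinity_cpu_count are made up for illustration and
are not part of gnulib or of the patch below.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE   /* glibc <sched.h> needs this for sched_getaffinity */
#endif
#include <sched.h>
#include <unistd.h>

/* System-wide number of online processors, or 0 if it cannot be
   determined.  */
static long
online_cpu_count (void)
{
  long n = sysconf (_SC_NPROCESSORS_ONLN);
  return n > 0 ? n : 0;
}

/* Number of processors in the affinity mask of the calling thread,
   or 0 if sched_getaffinity fails.  */
static long
affinity_cpu_count (void)
{
  cpu_set_t set;
  long count = 0;
  int i;

  if (sched_getaffinity (0, sizeof set, &set) != 0)
    return 0;
  /* Count by hand, since CPU_COUNT exists only in glibc >= 2.6.  */
  for (i = 0; i < CPU_SETSIZE; i++)
    if (CPU_ISSET (i, &set))
      count++;
  return count;
}
```

When a program using these helpers is started under a restricted mask
(for example through taskset), the second count is expected to be the
smaller one, while the first stays system-wide.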

> Also I'm wondering why you used the pthread interface to this?
> I didn't notice pthread_getaffinity_np() in POSIX for example
> (is that what the _np represents?), so why not call sched_getaffinity
> directly, without needing to link with the pthread library?
> From experience the sched_getaffinity() call has been a moving target:
> http://www.pixelbeat.org/programming/gcc/c_c++_notes.html#affinity
> but it has been stable for a long time and one could just check
> for the current stable interface.

Good point. Additionally, NetBSD 5 also has a pthread_getaffinity_np
function, but with a different API! (cpu_set_t vs. cpuset_t.) On that
platform, it's based on sched_getaffinity_np() which also has a
different API than sched_getaffinity() in glibc. But at least it's
a different function name.

> Right. So in that case I would push the sched_getaffinity()
> down into num_processors in gnulib.

Yes, and by the same argument the check of the environment
variable OMP_NUM_THREADS (which I don't see in Giuseppe's patch)
belongs here as well.
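
For illustration, the parsing rule for such a value can be sketched in
isolation. This is only a sketch under assumptions: parse_thread_count is
a hypothetical name, it uses the locale-dependent <ctype.h> functions
where the patch below uses gnulib's c-ctype, and it returns 0 for an
unusable value so that the caller can fall back to counting processors.

```c
#include <ctype.h>
#include <stdlib.h>

/* Parse a thread count such as the value of OMP_NUM_THREADS.
   Leading and trailing white space is accepted.  Return 0 if the
   string is NULL or not a plain decimal number; otherwise return
   the value, rounded up to at least 1.  */
static unsigned long
parse_thread_count (const char *s)
{
  char *end;
  unsigned long value;

  if (s == NULL)
    return 0;
  while (isspace ((unsigned char) *s))
    s++;
  /* Insist on a digit here, because strtoul would also accept a sign.  */
  if (!isdigit ((unsigned char) *s))
    return 0;
  value = strtoul (s, &end, 10);
  while (isspace ((unsigned char) *end))
    end++;
  if (*end != '\0')
    return 0;
  return value > 0 ? value : 1;
}
```

A caller would pass it getenv ("OMP_NUM_THREADS") and use the processor
count only when the result is 0.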

Here is a proposed change to the gnulib 'nproc' module. It will
require changes (simplification) on Giuseppe's side, of course.
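
On the caller's side, the change could look roughly like the sketch
below. To keep it compilable without gnulib, num_processors_stub is a
hypothetical stand-in that consults only sysconf; it ignores affinity
masks and OMP_NUM_THREADS, unlike the function in the patch.
choose_worker_count is likewise a made-up caller.

```c
#include <unistd.h>

/* Same enumerators as in the proposed lib/nproc.h.  */
enum nproc_query
{
  NPROC_ALL,
  NPROC_CURRENT,
  NPROC_CURRENT_OVERRIDABLE
};

/* Hypothetical stand-in for num_processors, based on sysconf only.
   The result is at least 1.  */
static unsigned long int
num_processors_stub (enum nproc_query query)
{
  long int n = -1;

#if defined _SC_NPROCESSORS_CONF
  if (query == NPROC_ALL)
    n = sysconf (_SC_NPROCESSORS_CONF);
#endif
#if defined _SC_NPROCESSORS_ONLN
  if (query != NPROC_ALL)
    n = sysconf (_SC_NPROCESSORS_ONLN);
#endif
  return n > 0 ? (unsigned long int) n : 1;
}

/* Hypothetical caller: choose a number of worker threads, never more
   than there are jobs and never less than 1.  */
static unsigned long int
choose_worker_count (unsigned long int n_jobs)
{
  unsigned long int n = num_processors_stub (NPROC_CURRENT_OVERRIDABLE);

  if (n_jobs == 0)
    return 1;
  return n_jobs < n ? n_jobs : n;
}
```

The point of the 'query' argument is that the caller no longer needs its
own code for the environment variable or for the affinity mask.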


2009-11-01  Bruno Haible  <address@hidden>

        Make num_processors more flexible and consistent.
        * lib/nproc.h (enum nproc_query): New type.
        (num_processors): Add a 'query' argument.
        * lib/nproc.c: Include <stdlib.h>, <sched.h>, c-ctype.h.
        (num_processors): Add a 'query' argument. Test the value of the
        OMP_NUM_THREADS environment variable if requested. On Linux, NetBSD,
        mingw, count the number of CPUs available for the current process.
        * m4/nproc.m4 (gl_PREREQ_NPROC): Require AC_USE_SYSTEM_EXTENSIONS.
        Check for sched_getaffinity and sched_getaffinity_np.
        * modules/nproc (Depends-on): Add c-ctype, extensions.

*** NEWS.orig   2009-11-01 14:55:37.000000000 +0100
--- NEWS        2009-11-01 14:20:47.000000000 +0100
***************
*** 6,11 ****
--- 6,13 ----
  
  Date        Modules         Changes
  
+ 2009-11-01  nproc           The num_processors function now takes an argument.
+ 
  2009-10-10  utimens         The use of this module now requires linking with
                              $(LIB_CLOCK_GETTIME).
  
*** lib/nproc.h.orig    2009-11-01 14:55:37.000000000 +0100
--- lib/nproc.h 2009-11-01 14:20:57.000000000 +0100
***************
*** 16,29 ****
     along with this program; if not, write to the Free Software Foundation,
     Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.  */
  
! /* Written by Glen Lenker.  */
  
  /* Allow the use in C++ code.  */
  #ifdef __cplusplus
  extern "C" {
  #endif
  
! unsigned long int num_processors (void);
  
  #ifdef __cplusplus
  }
--- 16,46 ----
     along with this program; if not, write to the Free Software Foundation,
     Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.  */
  
! /* Written by Glen Lenker and Bruno Haible.  */
  
  /* Allow the use in C++ code.  */
  #ifdef __cplusplus
  extern "C" {
  #endif
  
! /* A "processor" in this context means a thread execution unit, that is either
!    - an execution core in a (possibly multi-core) chip, in a (possibly multi-
!      chip) module, in a single computer, or
!    - a thread execution unit inside a core
!      (hyper-threading, see <http://en.wikipedia.org/wiki/Hyper-threading>).
!    Which of the two definitions is used, is unspecified.  */
! 
! enum nproc_query
! {
!   NPROC_ALL,                 /* total number of processors */
!   NPROC_CURRENT,             /* processors available to the current process */
!   NPROC_CURRENT_OVERRIDABLE  /* likewise, but overridable through the
!                                 OMP_NUM_THREADS environment variable */
! };
! 
! /* Return the number of processors, as selected by QUERY.  The result is
!    guaranteed to be at least 1.  */
! extern unsigned long int num_processors (enum nproc_query query);
  
  #ifdef __cplusplus
  }
*** lib/nproc.c.orig    2009-11-01 14:55:37.000000000 +0100
--- lib/nproc.c 2009-11-01 14:54:52.000000000 +0100
***************
*** 16,28 ****
     along with this program; if not, write to the Free Software Foundation,
     Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.  */
  
! /* Written by Glen Lenker.  */
  
  #include <config.h>
  #include "nproc.h"
  
  #include <unistd.h>
  
  #include <sys/types.h>
  
  #if HAVE_SYS_PSTAT_H
--- 16,37 ----
     along with this program; if not, write to the Free Software Foundation,
     Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.  */
  
! /* Written by Glen Lenker and Bruno Haible.  */
  
  #include <config.h>
  #include "nproc.h"
  
+ #include <stdlib.h>
  #include <unistd.h>
  
+ #if HAVE_PTHREAD_AFFINITY_NP && 0
+ # include <pthread.h>
+ # include <sched.h>
+ #endif
+ #if HAVE_SCHED_GETAFFINITY || HAVE_SCHED_GETAFFINITY_NP
+ # include <sched.h>
+ #endif
+ 
  #include <sys/types.h>
  
  #if HAVE_SYS_PSTAT_H
***************
*** 46,73 ****
  # include <windows.h>
  #endif
  
  #define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))
  
- /* Return the total number of processors.  The result is guaranteed to
-    be at least 1.  */
  unsigned long int
! num_processors (void)
  {
  #if defined _SC_NPROCESSORS_ONLN
!   { /* This works on glibc, MacOS X 10.5, FreeBSD, AIX, OSF/1, Solaris, Cygwin,
!        Haiku.  */
!     long int nprocs = sysconf (_SC_NPROCESSORS_ONLN);
!     if (0 < nprocs)
!       return nprocs;
!   }
  #endif
  
  #if HAVE_PSTAT_GETDYNAMIC
    { /* This works on HP-UX.  */
      struct pst_dynamic psd;
!     if (0 <= pstat_getdynamic (&psd, sizeof psd, 1, 0)
!       && 0 < psd.psd_proc_cnt)
!       return psd.psd_proc_cnt;
    }
  #endif
  
--- 55,269 ----
  # include <windows.h>
  #endif
  
+ #include "c-ctype.h"
+ 
  #define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))
  
  unsigned long int
! num_processors (enum nproc_query query)
  {
+   if (query == NPROC_CURRENT_OVERRIDABLE)
+     {
+       /* Test the environment variable OMP_NUM_THREADS, recognized also by all
+        programs that are based on OpenMP.  The OpenMP spec says that the
+        value assigned to the environment variable "may have leading and
+        trailing white space". */
+       const char *envvalue = getenv ("OMP_NUM_THREADS");
+ 
+       if (envvalue != NULL)
+       {
+         while (*envvalue != '\0' && c_isspace (*envvalue))
+           envvalue++;
+         /* Convert it from decimal to 'unsigned long'.  */
+         if (c_isdigit (*envvalue))
+           {
+             char *endptr = NULL;
+             unsigned long int value = strtoul (envvalue, &endptr, 10);
+ 
+             if (endptr != NULL)
+               {
+                 while (*endptr != '\0' && c_isspace (*endptr))
+                   endptr++;
+                 if (*endptr == '\0')
+                   return (value > 0 ? value : 1);
+               }
+           }
+       }
+ 
+       query = NPROC_CURRENT;
+     }
+   /* Here query is one of NPROC_ALL, NPROC_CURRENT.  */
+ 
+   if (query == NPROC_CURRENT)
+     {
+       /* glibc >= 2.3.3 with NPTL and NetBSD 5 have pthread_getaffinity_np,
+        but with different APIs.  Also it requires linking with -lpthread.
+        Therefore this code is not enabled.
+        glibc >= 2.3.4 has sched_getaffinity whereas NetBSD 5 has
+        sched_getaffinity_np.  */
+ #if HAVE_PTHREAD_AFFINITY_NP && defined __GLIBC__ && 0
+       {
+       cpu_set_t set;
+ 
+       if (pthread_getaffinity_np (pthread_self (), sizeof (set), &set) == 0)
+         {
+           unsigned long count;
+ 
+ # ifdef CPU_COUNT
+           /* glibc >= 2.6 has the CPU_COUNT macro.  */
+           count = CPU_COUNT (&set);
+ # else
+           size_t i;
+ 
+           count = 0;
+           for (i = 0; i < CPU_SETSIZE; i++)
+             if (CPU_ISSET (i, &set))
+               count++;
+ # endif
+           if (count > 0)
+             return count;
+         }
+       }
+ #elif HAVE_PTHREAD_AFFINITY_NP && defined __NetBSD__ && 0
+       {
+       cpuset_t *set;
+ 
+       set = cpuset_create ();
+       if (set != NULL)
+         {
+           unsigned long count = 0;
+ 
+           if (pthread_getaffinity_np (pthread_self (), cpuset_size (set), set)
+               == 0)
+             {
+               cpuid_t i;
+ 
+               for (i = 0;; i++)
+                 {
+                   int ret = cpuset_isset (i, set);
+                   if (ret < 0)
+                     break;
+                   if (ret > 0)
+                     count++;
+                 }
+             }
+           cpuset_destroy (set);
+           if (count > 0)
+             return count;
+         }
+       }
+ #elif HAVE_SCHED_GETAFFINITY /* glibc >= 2.3.4 */
+       {
+       cpu_set_t set;
+ 
+       if (sched_getaffinity (0, sizeof (set), &set) == 0)
+         {
+           unsigned long count;
+ 
+ # ifdef CPU_COUNT
+           /* glibc >= 2.6 has the CPU_COUNT macro.  */
+           count = CPU_COUNT (&set);
+ # else
+           size_t i;
+ 
+           count = 0;
+           for (i = 0; i < CPU_SETSIZE; i++)
+             if (CPU_ISSET (i, &set))
+               count++;
+ # endif
+           if (count > 0)
+             return count;
+         }
+       }
+ #elif HAVE_SCHED_GETAFFINITY_NP /* NetBSD >= 5 */
+       {
+       cpuset_t *set;
+ 
+       set = cpuset_create ();
+       if (set != NULL)
+         {
+           unsigned long count = 0;
+ 
+           if (sched_getaffinity_np (getpid (), cpuset_size (set), set) == 0)
+             {
+               cpuid_t i;
+ 
+               for (i = 0;; i++)
+                 {
+                   int ret = cpuset_isset (i, set);
+                   if (ret < 0)
+                     break;
+                   if (ret > 0)
+                     count++;
+                 }
+             }
+           cpuset_destroy (set);
+           if (count > 0)
+             return count;
+         }
+       }
+ #endif
+ 
+ #if (defined _WIN32 || defined __WIN32__) && ! defined __CYGWIN__
+       { /* This works on native Windows platforms.  */
+       DWORD_PTR process_mask;
+       DWORD_PTR system_mask;
+ 
+       if (GetProcessAffinityMask (GetCurrentProcess (),
+                                   &process_mask, &system_mask))
+         {
+           DWORD_PTR mask = process_mask;
+           unsigned long count = 0;
+ 
+           for (; mask != 0; mask = mask >> 1)
+             if (mask & 1)
+               count++;
+           if (count > 0)
+             return count;
+         }
+       }
+ #endif
+ 
  #if defined _SC_NPROCESSORS_ONLN
!       { /* This works on glibc, MacOS X 10.5, FreeBSD, AIX, OSF/1, Solaris,
!          Cygwin, Haiku.  */
!       long int nprocs = sysconf (_SC_NPROCESSORS_ONLN);
!       if (nprocs > 0)
!         return nprocs;
!       }
! #endif
!     }
!   else /* query == NPROC_ALL */
!     {
! #if defined _SC_NPROCESSORS_CONF
!       { /* This works on glibc, MacOS X 10.5, FreeBSD, AIX, OSF/1, Solaris,
!          Cygwin, Haiku.  */
!       long int nprocs = sysconf (_SC_NPROCESSORS_CONF);
!       if (nprocs > 0)
!         return nprocs;
!       }
  #endif
+     }
  
  #if HAVE_PSTAT_GETDYNAMIC
    { /* This works on HP-UX.  */
      struct pst_dynamic psd;
!     if (pstat_getdynamic (&psd, sizeof psd, 1, 0) >= 0)
!       {
!       /* The field psd_proc_cnt contains the number of active processors.
!          In newer releases of HP-UX 11, the field psd_max_proc_cnt includes
!          deactivated processors.  */
!       if (query == NPROC_CURRENT)
!         {
!           if (psd.psd_proc_cnt > 0)
!             return psd.psd_proc_cnt;
!         }
!       else
!         {
!           if (psd.psd_max_proc_cnt > 0)
!             return psd.psd_max_proc_cnt;
!         }
!       }
    }
  #endif
  
***************
*** 75,87 ****
    { /* This works on IRIX.  */
      /* MP_NPROCS yields the number of installed processors.
         MP_NAPROCS yields the number of processors available to unprivileged
!        processes.  We need the latter.  */
!     int nprocs = sysmp (MP_NAPROCS);
!     if (0 < nprocs)
        return nprocs;
    }
  #endif
  
  #if HAVE_SYSCTL && defined HW_NCPU
    { /* This works on MacOS X, FreeBSD, NetBSD, OpenBSD.  */
      int nprocs;
--- 271,289 ----
    { /* This works on IRIX.  */
      /* MP_NPROCS yields the number of installed processors.
         MP_NAPROCS yields the number of processors available to unprivileged
!        processes.  */
!     int nprocs =
!       sysmp (query == NPROC_CURRENT && getuid () != 0
!            ? MP_NAPROCS
!            : MP_NPROCS);
!     if (nprocs > 0)
        return nprocs;
    }
  #endif
  
+   /* Finally, as fallback, use the APIs that don't distinguish between
+      NPROC_CURRENT and NPROC_ALL.  */
+ 
  #if HAVE_SYSCTL && defined HW_NCPU
    { /* This works on MacOS X, FreeBSD, NetBSD, OpenBSD.  */
      int nprocs;
*** m4/nproc.m4.orig    2009-11-01 14:55:37.000000000 +0100
--- m4/nproc.m4 2009-11-01 14:31:13.000000000 +0100
***************
*** 1,4 ****
! # nproc.m4 serial 3
  dnl Copyright (C) 2009 Free Software Foundation, Inc.
  dnl This file is free software; the Free Software Foundation
  dnl gives unlimited permission to copy and/or distribute it,
--- 1,4 ----
! # nproc.m4 serial 4
  dnl Copyright (C) 2009 Free Software Foundation, Inc.
  dnl This file is free software; the Free Software Foundation
  dnl gives unlimited permission to copy and/or distribute it,
***************
*** 12,17 ****
--- 12,19 ----
  # Prerequisites of lib/nproc.c.
  AC_DEFUN([gl_PREREQ_NPROC],
  [
+   dnl Persuade glibc <sched.h> to declare CPU_SETSIZE, CPU_ISSET etc.
+   AC_REQUIRE([AC_USE_SYSTEM_EXTENSIONS])
    AC_CHECK_HEADERS([sys/pstat.h sys/sysmp.h sys/param.h],,,
      [AC_INCLUDES_DEFAULT])
    dnl <sys/sysctl.h> requires <sys/param.h> on OpenBSD 4.0.
***************
*** 21,25 ****
       # include <sys/param.h>
       #endif
      ])
!   AC_CHECK_FUNCS([pstat_getdynamic sysmp sysctl])
  ])
--- 23,28 ----
       # include <sys/param.h>
       #endif
      ])
!   AC_CHECK_FUNCS([sched_getaffinity sched_getaffinity_np \
!                   pstat_getdynamic sysmp sysctl])
  ])
*** modules/nproc.orig  2009-11-01 14:55:37.000000000 +0100
--- modules/nproc       2009-11-01 14:31:44.000000000 +0100
***************
*** 7,12 ****
--- 7,14 ----
  m4/nproc.m4
  
  Depends-on:
+ c-ctype
+ extensions
  unistd
  
  configure.ac:



