Re: cuda compilation
From: Tomas Oberhuber
Subject: Re: cuda compilation
Date: Fri, 8 Jan 2010 16:33:02 +0100
User-agent: KMail/1.12.2 (Linux/2.6.31-14-generic; KDE/4.3.2; x86_64; ; )
Hi Ralf,
On Wednesday 06 January 2010 08:44:57 Ralf Wildenhues wrote:
> Hello Tomas,
>
> * Tomas Oberhuber wrote on Sat, Jan 02, 2010 at 11:33:46AM CET:
> > Now I try to compile whole project with nvcc. It seems to work but I get
> > this
> >
> > libtool: link: nvcc -shared -nostdlib
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugStructure.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-parse.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugGroup.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugParser.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebug.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugScanner.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlParameterContainer.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlString.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTimerCPU.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTimerRT.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescription.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescriptionScanner.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-mpi-supp.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTester.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-parse.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescriptionParser.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlObject.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-compress-file.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-mfilename.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlLogger.o
> > .libs/libtnl-0.1.lax/libtnlmatrix-0.1.a/libtnlmatrix_0_1_la-tnlBaseMatrix.o
> > -L/usr/local/cuda/lib64 -lcppunit -lcudart -Wl,-soname -Wl,libtnl-0.1.so.0
> > -o .libs/libtnl-0.1.so.0.0.0
> > nvcc fatal : Unknown option 'nostdlib'
> >
> > which means that nvcc is also used as linker. Even if I remove -nostdlib,
> > nvcc complains about other parameters. So I think it would be better to
> > link with g++. Can I change linker somehow? And in that case if I do it
> > by hand (copy the command on the command line and replace nvcc by g++) I
> > get this
> >
> > g++ -shared -nostdlib
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugStructure.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-parse.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugGroup.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugParser.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebug.o
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugScanner.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlParameterContainer.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlString.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTimerCPU.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTimerRT.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescription.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescriptionScanner.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-mpi-supp.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlTester.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-parse.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlConfigDescriptionParser.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlObject.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-compress-file.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-mfilename.o
> > .libs/libtnl-0.1.lax/libtnlcore-0.1.a/libtnlcore_0_1_la-tnlLogger.o
> > .libs/libtnl-0.1.lax/libtnlmatrix-0.1.a/libtnlmatrix_0_1_la-tnlBaseMatrix.o
> > -L/usr/local/cuda/lib64 -lcppunit -lcudart -Wl,-soname -Wl,libtnl-0.1.so.0
> > -o .libs/libtnl-0.1.so.0.0.0
> > /usr/bin/ld: .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugStructure.o: relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared object; recompile with -fPIC
> > .libs/libtnl-0.1.lax/libtnldebug-0.1.a/libtnldebug_0_1_la-tnlDebugStructure.o: could not read symbols: Bad value
> > collect2: ld returned 1 exit status
> >
> > Or maybe we can solve it using -Xcompiler and -Xlinker. May I ask what
> > libtool does now when we use nvcc to compile or link?
>
> You're right. Libtool doesn't support CXX=nvcc yet, and we also forgot
> some bits of CC=nvcc support. This still needs to be done in Libtool.
>
> Thanks,
> Ralf
>
After my last patch to 'fix' compilation with CUDA, I have been working on the
problem with dependencies. It seems to be solved completely; however, the
solution is not as elegant as I would like.
First, I added this to depcomp in automake:
diff -r automake/lib/depcomp
/home/oberhuber/workspace/automake-1.11.1/lib/depcomp
124a125,147
> nvcc)
> ## nVidia CUDA 2.3 compiler combined with gcc3
> ## here we just add -Xcompiler parameter to pass
> ## gcc3 parameters to gcc3
> for arg
> do
> case $arg in
> -c) set fnord "$@" -Xcompiler -MT -Xcompiler "$object" -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler "$tmpdepfile" "$arg" ;;
> *) set fnord "$@" "$arg" ;;
> esac
> shift # fnord
> shift # $arg
> done
> "$@"
> stat=$?
> if test $stat -eq 0; then :
> else
> rm -f "$tmpdepfile"
> exit $stat
> fi
> mv "$tmpdepfile" "$depfile"
> ;;
>
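To make the argument juggling in that case branch easier to follow, here is a
minimal stand-alone sketch of the same loop. The function name
rewrite_nvcc_args and the choice to print the resulting words instead of
running them are only for illustration; depcomp itself executes "$@" directly.

```shell
#!/bin/sh
# Sketch only: rewrite_nvcc_args is a hypothetical helper, not part of depcomp.
# Usage: rewrite_nvcc_args OBJECT TMPDEPFILE COMMAND...
# Prints the rewritten command line, one word per line.
rewrite_nvcc_args() {
  object=$1
  tmpdepfile=$2
  shift 2
  # "for arg" walks the original arguments; each pass appends the (possibly
  # expanded) argument to the end of "$@" and drops the original from the front.
  for arg
  do
    case $arg in
    -c) set fnord "$@" -Xcompiler -MT -Xcompiler "$object" \
            -Xcompiler -MD -Xcompiler -MP \
            -Xcompiler -MF -Xcompiler "$tmpdepfile" "$arg" ;;
    *)  set fnord "$@" "$arg" ;;
    esac
    shift # fnord (dummy word so "set" never sees an option first)
    shift # the original $arg
  done
  printf '%s\n' "$@"
}

rewrite_nvcc_args foo.o foo.TPo nvcc -c foo.cu -o foo.o
```

Every gcc dependency option needs its own -Xcompiler prefix, because nvcc
forwards only the single word that follows -Xcompiler to the host compiler.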
With the depcomp change, ./configure now reports
"checking dependency style of nvcc... nvcc"
As I then learned, the gcc3 mode does not go through depcomp; it uses fast
dependency tracking (fastdep) instead. Therefore I introduced fastdepnvcc as
follows:
diff -r automake/m4/depend.m4
/home/oberhuber/workspace/automake-1.11.1/m4/depend.m4
155a156,158
> AM_CONDITIONAL([am__fastdepnvcc$1], [
> test "x$enable_dependency_tracking" != xno \
> && test "$am_cv_$1_dependencies_compiler_type" = nvcc])
The idea was then to generate the same piece of code in the makefiles as for
gcc, but with -Xcompiler inserted, like this:
diff -r automake/lib/am/depend2.am
/home/oberhuber/workspace/automake-1.11.1/lib/am/depend2.am
73a74,84
> if %FASTDEPNVCC%
> ## Fast-dep mode for nvcc is similar to gcc
> ## We just add -Xcompiler flag.
> ?!GENERIC? %VERBOSE%%COMPILE% -Xcompiler -MT -Xcompiler %OBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJ% %SOURCEFLAG%`test -f '%SOURCE%' || echo '$(srcdir)/'`%SOURCE%
> ?!GENERIC? %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> ?GENERIC??!SUBDIROBJ? %VERBOSE%%COMPILE% -Xcompiler -MT -Xcompiler %OBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJ% %SOURCEFLAG%%SOURCE%
> ?GENERIC??!SUBDIROBJ? %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> ?GENERIC??SUBDIROBJ? %VERBOSE%depbase=`echo %OBJ% | sed 's|[^/]*$$|$(DEPDIR)/&|;s|\.o$$||'`;\
> ?GENERIC??SUBDIROBJ? %COMPILE% -Xcompiler -MT -Xcompiler %OBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJ% %SOURCEFLAG%%SOURCE% &&\
> ?GENERIC??SUBDIROBJ? $(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> else !%FASTDEPNVCC%
86a98
> endif !%FASTDEPNVCC%
88a101
>
101a115,125
> if %FASTDEPNVCC%
> ## In fast-dep mode, we can always use -o.
> ## For non-suffix rules, we must emulate a VPATH search on %SOURCE%.
> ?!GENERIC? %VERBOSE%%COMPILE% -Xcompiler -MT -Xcompiler %OBJOBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJOBJ% %SOURCEFLAG%`if test -f '%SOURCE%'; then $(CYGPATH_W) '%SOURCE%'; else $(CYGPATH_W) '$(srcdir)/%SOURCE%'; fi`
> ?!GENERIC? %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> ?GENERIC??!SUBDIROBJ? %VERBOSE%%COMPILE% -Xcompiler -MT -Xcompiler %OBJOBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJOBJ% %SOURCEFLAG%`$(CYGPATH_W) '%SOURCE%'`
> ?GENERIC??!SUBDIROBJ? %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> ?GENERIC??SUBDIROBJ? %VERBOSE%depbase=`echo %OBJ% | sed 's|[^/]*$$|$(DEPDIR)/&|;s|\.obj$$||'`;\
> ?GENERIC??SUBDIROBJ? %COMPILE% -Xcompiler -MT -Xcompiler %OBJOBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %OBJOBJ% %SOURCEFLAG%`$(CYGPATH_W) '%SOURCE%'` &&\
> ?GENERIC??SUBDIROBJ? $(am__mv) %DEPBASE%.Tpo %DEPBASE%.Po
> else !%FASTDEPNVCC%
114a139
> endif !%FASTDEPNVCC%
131a157,166
> if %FASTDEPNVCC%
> ## fast-dep mode for nvcc only add -Xcompiler
> ?!GENERIC? %VERBOSE%%LTCOMPILE% -Xcompiler -MT -Xcompiler %LTOBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %LTOBJ% %SOURCEFLAG%`test -f '%SOURCE%' || echo '$(srcdir)/'`%SOURCE%
> ?!GENERIC? %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Plo
> ?GENERIC??!SUBDIROBJ? %VERBOSE%%LTCOMPILE% -Xcompiler -MT -Xcompiler %LTOBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %LTOBJ% %SOURCEFLAG%%SOURCE%
> ?GENERIC??!SUBDIROBJ? %SILENT%$(am__mv) %DEPBASE%.Tpo %DEPBASE%.Plo
> ?GENERIC??SUBDIROBJ? %VERBOSE%depbase=`echo %OBJ% | sed 's|[^/]*$$|$(DEPDIR)/&|;s|\.lo$$||'`;\
> ?GENERIC??SUBDIROBJ? %LTCOMPILE% -Xcompiler -MT -Xcompiler %LTOBJ% -Xcompiler -MD -Xcompiler -MP -Xcompiler -MF -Xcompiler %DEPBASE%.Tpo %-c% -o %LTOBJ% %SOURCEFLAG%%SOURCE% &&\
> ?GENERIC??SUBDIROBJ? $(am__mv) %DEPBASE%.Tpo %DEPBASE%.Plo
> else !%FASTDEPNVCC%
140a176
> endif !%FASTDEPNVCC%
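A side note on the SUBDIROBJ rules in the patch above: the depbase expression
is taken unchanged from the gcc rules. The sketch below shows what it computes
when run as plain shell. The helper name depbase_of is made up for this
example, and I assume DEPDIR has its usual automake default of .deps.

```shell
#!/bin/sh
# Sketch only: depbase_of is a hypothetical helper mirroring the sed call in
# the SUBDIROBJ rules. In a Makefile "$$" is an escaped "$"; in plain shell it
# is written as a single "$".
# Usage: depbase_of OBJECT SUFFIX   (SUFFIX is o, obj or lo, without the dot)
depbase_of() {
  # First substitution: put $(DEPDIR)/ in front of the file name part.
  # Second substitution: strip the object suffix.
  echo "$1" | sed "s|[^/]*\$|${DEPDIR:-.deps}/&|;s|\\.$2\$||"
}

depbase_of sub/foo.o o     # prints sub/.deps/foo
depbase_of bar.lo lo       # prints .deps/bar
```

The compile rule then writes its dependencies to that base name plus .Tpo and
renames the file to .Po (or .Plo for libtool objects) once nvcc has succeeded.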
It was also necessary to introduce am__fastdepnvcc to automake:
diff -r automake/automake.in
/home/oberhuber/workspace/automake-1.11.1/automake.in
1392,1395c1378,1387
< my ($AMDEP, $FASTDEP) =
< (option 'no-dependencies' || $lang->autodep eq 'no')
< ? ('FALSE', 'FALSE') : ('AMDEP', "am__fastdep$fpfx");
<
---
> # my ($AMDEP, $FASTDEP, $FASTDEPNVCC) =
> # (option 'no-dependencies' || $lang->autodep eq 'no')
> # ? ('FALSE', 'FALSE', 'FALSE' ) : ('AMDEP', "am__fastdep$fpfx", "am__fastdepnvcc$fpfx");
> #
> # print $FASTDEPNVCC
>
> my ($AMDEP, $FASTDEP, $FASTDEPNVCC) =
> (option 'no-dependencies' || $lang->autodep eq 'no')
> ? ('FALSE', 'FALSE', 'FALSE' ) : ('AMDEP', "am__fastdep$fpfx", "am__fastdepnvcc$fpfx");
>
1403a1396
> 'FASTDEPNVCC' => $FASTDEPNVCC,
6369a6340
> am__fastdepnvccCC => 'AC_PROG_CC',
6371a6343
> am__fastdepnvccCXX => 'AC_PROG_CXX',
At this point I had a correct Makefile, but the -Xcompiler argument was
filtered out by libtool. I fixed it like this:
diff -r libtool/libltdl/config/ltmain.m4sh
/home/oberhuber/workspace/libtool-2.2.7a/libltdl/config/ltmain.m4sh
724,727c724,729
< -Xcompiler)
< arg_mode=arg # the next one goes into the "base_compile" arg list
< continue # The current "srcfile" will either be retained or
< ;; # replaced later. I would guess that would be a bug.
---
> # -Xcompiler)
> # arg_mode=arg # the next one goes into the "base_compile" arg list
> # continue # The current "srcfile" will either be retained or
> # ;; # replaced later. I would guess that would be a bug.
> # I think that this is a bug. Usually we want to pass this to nvcc, which
> # then passes the next arg to gcc.
Now I was able to compile my project completely, and I hope the dependencies
are correct. However, I found that my .cu sources are still ignored by
automake. I did not understand your suggestion (or rather, how to implement it):
>Alternatively, you could write a .cu.lo rule that looks like the
>automake-generated .c.lo rule, has --tag=CC but uses $(NVCC); you'd then
>still need a nvcc-wrapper that translates '-fPIC' to '-Xcompiler -fPIC'
>for nvcc. Ugly, yes, but I'm not sure how to do this any nicer at the
>moment.
So I just told automake that .cu files are accepted by the CXX compiler:
diff -r automake/automake.in
/home/oberhuber/workspace/automake-1.11.1/automake.in
766c766
< 'extensions' => ['.c++', '.cc', '.cpp', '.cxx', '.C']);
---
> 'extensions' => ['.c++', '.cc', '.cpp', '.cxx', '.C',
> '.cu']);
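For comparison, the nvcc wrapper from the quoted suggestion might look roughly
like the sketch below. This is only my reading of it: the function name
translate_for_nvcc is made up, and treating just -fPIC and -fpic is an
assumption (the suggestion mentions only -fPIC).

```shell
#!/bin/sh
# Sketch only: translate_for_nvcc is a hypothetical helper. It prefixes the
# PIC flags with -Xcompiler so that nvcc forwards them to the host compiler,
# and prints the resulting words one per line.
translate_for_nvcc() {
  for arg
  do
    case $arg in
    -fPIC|-fpic) set fnord "$@" -Xcompiler "$arg" ;;
    *)           set fnord "$@" "$arg" ;;
    esac
    shift # fnord
    shift # the original $arg
  done
  # A real wrapper script would finish with:  exec nvcc "$@"
  printf '%s\n' "$@"
}

translate_for_nvcc -c -fPIC -DPIC foo.cu -o foo.o
```

I have not used such a wrapper myself; the .cu extension change above is what
I actually applied.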
It seems to work now, but I see that it is not a very clean solution. I would
prefer to introduce new languages, CUDA C and CUDA C++, for example like this:
register_language ('name' => 'nvc',
'Name' => 'CUDA C',
'config_vars' => ['NVCC'],
'ansi' => 1,
'autodep' => '',
'flags' => ['NVCFLAGS', 'NVCPPFLAGS'],
'ccer' => 'NVCC',
'compiler' => 'COMPILE',
'compile' => '$(NVCC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(NVCPPFLAGS) $(AM_CFLAGS) $(NVCFLAGS)',
'lder' => 'CCLD',
'ld' => '$(CC)',
'linker' => 'LINK',
'link' => '$(CCLD) $(AM_CFLAGS) $(CFLAGS) $(AM_LDFLAGS) $(LDFLAGS) -o $@',
'compile_flag' => '-c',
'libtool_tag' => 'NVCC',
'extensions' => ['.cu'],
'_finish' => \&lang_c_finish);
and then compile only .cu files with nvcc. I have tried to do so, but it would
require much more work. I am willing to do it if someone would guide me. I
think that some autoconf tests like AC_PROG_NVCC and AC_PROG_NVCXX might be
useful. These tests could define the NVCC and NVCXX variables. We should
probably start there. I would be glad if you could incorporate my patches, even
though I know they are not very nice :). If you have any other suggestions, I
would be glad to read them.
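To give an idea of what I mean by such a test, here is a rough shell sketch of
the PATH search that an AC_PROG_NVCC check (a macro that does not exist yet)
could boil down to at configure time. The helper name find_prog and the
message format are assumptions of mine, not existing autoconf code.

```shell
#!/bin/sh
# Sketch only: find_prog is a hypothetical helper.
# Usage: find_prog NAME [PATHLIST]
# Prints the first executable NAME found in the colon-separated PATHLIST
# (default: $PATH) and returns 0, or returns 1 if there is none.
find_prog() {
  name=$1
  pathlist=${2-$PATH}
  save_IFS=$IFS
  IFS=:
  for dir in $pathlist
  do
    IFS=$save_IFS
    test -z "$dir" && dir=.   # an empty PATH entry means the current directory
    if test -f "$dir/$name" && test -x "$dir/$name"; then
      printf '%s\n' "$dir/$name"
      return 0
    fi
  done
  IFS=$save_IFS
  return 1
}

# Set NVCC the way a configure check might, and report the result.
if NVCC=$(find_prog nvcc); then
  echo "checking for nvcc... $NVCC"
else
  echo "checking for nvcc... no"
fi
```

A check for NVCXX could reuse the same search; which program name it should
look for is still an open question to me.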
Cheers, Tomas.