Re: [Gomp-discuss] GOMP Requirements v1.1
Ioannis E. Venetis
Fri, 12 Nov 2004 19:18:16 +0200
Mozilla Thunderbird 0.9 (X11/20041109)
Scott Robert Ladd wrote:
Ioannis E. Venetis wrote:
Although I agree that this should be the default behaviour, I would be
very happy to have a way to change this behaviour during linking of an
Commercial compilers impose a threading model without providing
alternatives. OdinMP, if memory serves, allows specification of a
threading model at compile time.
I'm told that the gcj Java compiler uses the model defined by
--enable-threads=xxx during configuration. I think OpenMP will have a
better chance of success if it follows that same pattern.
Hm, now I seem to remember that this has been discussed before. Maybe you
are right about having better chances of success with this policy. But I
think that I have been misunderstood. What I really meant in my previous
mail is the following:
1) If a user gives the option -fopenmp to the compiler, linking will be
done with libgomp.
2) Suppose that I write my own library (let's say, libmygomp) which
implements the same API as the default libgomp that is distributed with
gcc. In addition to -fopenmp, maybe we could have an option
-fopenmp-lib=mygomp (and use the standard -L option to say where to look
for that library) that will link libmygomp with the executable, INSTEAD
of libgomp. This way nothing changes in the way the compiler produces
code, but we are able to link with a different library that is
implemented the way we like, with the threads we like.
In any case, someone could do that by hand, by just renaming the default
libgomp library and putting his/her own in its place. I just thought
that it would be much cleaner this way and that it would give people
the opportunity to experiment with GCC and OpenMP.
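To make the idea concrete, here is a minimal sketch of what such a replacement library could look like. Everything named here is an assumption for illustration: the entry point `mygomp_parallel`, the `region_fn` signature and the `demo_sum_of_marks` caller are hypothetical and are not libgomp's real interface. The point is only that the calling code depends on the API and nothing else, so a different implementation with the same signature could be linked in its place.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical runtime API: the only contract between compiled code and
   whichever library gets linked. This is NOT libgomp's real interface. */
typedef void (*region_fn)(void *data, int thread_id, int num_threads);
void mygomp_parallel(region_fn fn, void *data, int num_threads);

/* ---- One possible implementation, built on POSIX threads ---- */
struct launch { region_fn fn; void *data; int id; int n; };

static void *trampoline(void *p) {
    struct launch *l = p;
    l->fn(l->data, l->id, l->n);
    return NULL;
}

void mygomp_parallel(region_fn fn, void *data, int num_threads) {
    assert(num_threads >= 1);
    pthread_t *tids = malloc(sizeof *tids * (size_t)num_threads);
    struct launch *ls = malloc(sizeof *ls * (size_t)num_threads);
    assert(tids != NULL && ls != NULL);

    /* Start workers 1..n-1; the calling thread acts as thread 0. */
    for (int i = 1; i < num_threads; i++) {
        ls[i] = (struct launch){ fn, data, i, num_threads };
        int rc = pthread_create(&tids[i], NULL, trampoline, &ls[i]);
        assert(rc == 0);
        (void)rc;
    }
    fn(data, 0, num_threads);

    /* Joining every worker doubles as the barrier that ends the region. */
    for (int i = 1; i < num_threads; i++)
        pthread_join(tids[i], NULL);
    free(ls);
    free(tids);
}

/* ---- Caller side: stands in for code the compiler would emit ---- */
static void mark_slot(void *data, int thread_id, int num_threads) {
    (void)num_threads;
    ((int *)data)[thread_id] = thread_id + 1;  /* each thread owns one slot */
}

int demo_sum_of_marks(int num_threads) {
    int slots[8] = {0};
    assert(num_threads >= 1 && num_threads <= 8);
    mygomp_parallel(mark_slot, slots, num_threads);
    int sum = 0;
    for (int i = 0; i < num_threads; i++)
        sum += slots[i];
    return sum;
}
```

If the implementation part were built as its own library, the caller could be linked against it with the standard `-L` and `-l` options; a library using a different thread package would only need to export the same `mygomp_parallel` symbol.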
It is my understanding that development will proceed more or less
along the lines of the document that Ross Towle has posted some time
Ross' document is a good piece of work, but it is not a complete design
document. We have several issues that need to be seriously considered
before we know all the details of how this is going to work. In many
ways, I think we've gotten the cart before the horse; we've implemented
a support library with questionable copyright legalities, and we have
designs before we even define what all the requirements are.
Then this was my fault. Judging from the responses to the referenced
document, I concluded that this would be the way to go. If not, I
apologize for the misunderstanding. I am willing to go back in front
of the cart :-)
I'm not a stick in the mud about documentation, but I do believe we've
rushed ahead on some issues without fully discussing the consequences.
This proposed requirements document, simple as it is, has already raised
issues that people have not previously considered.
But I thought that this was the intention of this document. To attract
people and start a discussion :-) I believe that all ideas should be put
forward now. Whether we implement them or not is a different matter
entirely. But having all ideas together will make it easier to make a
good design and choose from all available options.
Changing the library to support this new scheme was quite easy and I
expect that other libraries could be changed quite easily too, to
support such a scheme.
I'm not objecting to the concept, but I have seen resistance to it in
the mainstream GCC community (mostly from people who erroneously believe
OpenMP == Java threads).
Well, this is obviously wrong. OpenMP can be implemented with a number
of different thread packages. This is something that we should explain
in the document that will discuss implementation.
I have also seen some concerns about the statement "in some cases, how
code is reorganized depends on the threading model in use". I tend to
agree with Ross on this matter, that the correct approach is to use
only generic functions and implement them in a library. From my
experience, I am convinced that most (if not all) performance issues
with multithreaded applications can be effectively solved in the
The perspective on this depends on your use of OpenMP. As Lars pointed
out, using architecture-specific techniques is an optimization necessary
for optimal performance. For production work, people are going to want
to produce optimal parallel code.
Hm, maybe you are right on this. But from my point of view, there are
basically two options for implementing OpenMP in a compiler. The first is
that the compiler generates only generic functions, as proposed, and the
library implements the rest. This is (or at least should be) easy to
implement and maintain on the side of the compiler, but makes the
implementation of the library more difficult. For optimal performance,
the library must take into account the architecture of the machine and
implement sophisticated algorithms for load-balancing, locking, etc.
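As a rough illustration of this first option, the sketch below shows by hand what a compiler might emit for a simple parallel loop: the loop body is moved into a separate function, the shared variables are packed into a structure, and a single generic runtime call replaces the loop. All names (`rt_parallel`, `loop_shared`, `loop_outlined`, `scale_by_two`) are hypothetical, and the runtime here is a deliberately trivial stand-in that runs the "threads" one after another. It is an assumption for illustration, not how any real library does it.

```c
#include <assert.h>

/* Source as the user might have written it (shown only as a comment):
 *
 *     #pragma omp parallel for
 *     for (i = 0; i < n; i++)
 *         a[i] = 2.0 * b[i];
 */

/* Hypothetical generic runtime entry point. */
typedef void (*region_fn)(void *shared, int thread_id, int num_threads);

/* Stand-in runtime: calls the region once per "thread", sequentially.
   A real library would start threads here; the generated code below
   would not have to change. */
static void rt_parallel(region_fn fn, void *shared, int num_threads) {
    for (int t = 0; t < num_threads; t++)
        fn(shared, t, num_threads);
}

/* Shared variables of the loop, packed for passing to the runtime. */
struct loop_shared {
    double *a;
    const double *b;
    int n;
};

/* Outlined loop body: each thread handles one contiguous block. */
static void loop_outlined(void *shared, int thread_id, int num_threads) {
    struct loop_shared *s = shared;
    int chunk = (s->n + num_threads - 1) / num_threads;  /* ceiling division */
    int lo = thread_id * chunk;
    int hi = (lo + chunk < s->n) ? lo + chunk : s->n;
    for (int i = lo; i < hi; i++)
        s->a[i] = 2.0 * s->b[i];
}

/* What remains at the original loop site: pack, then one generic call. */
void scale_by_two(double *a, const double *b, int n, int num_threads) {
    assert(num_threads >= 1);
    struct loop_shared s = { a, b, n };
    rt_parallel(loop_outlined, &s, num_threads);
}
```

In this arrangement, scheduling, load-balancing and locking decisions all live behind `rt_parallel`, which is why the library becomes the harder part to write.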
The other option is to let the compiler analyze everything related to
the OpenMP directives (shared variables, reductions, etc) and let it
produce directly the best threaded code it can, without generic
functions. This is much harder to implement in the compiler, but allows
better use of all other optimizations and finally allows the library to
provide only basic functionality (create/join threads, locks, barriers,
etc). For optimal performance, it is now the compiler that must take
into account the architecture and produce for every kind of parallel
construct the best code. As stated in my first mail to this list, I know
very little about compilers, but I feel that this must be very hard to
do, and I don't know how far a compiler can go in doing it. This is the
reason why I lean towards the first solution. I work on threaded
libraries and I know that they can achieve very high levels of
performance on every kind of architecture, if correctly designed. Maybe
someone who knows compilers can clarify this.
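For comparison, here is a hand-written sketch of what the second option could produce for a sum reduction, assuming the library offers nothing beyond thread creation, joining and a lock (POSIX threads stand in for that basic library). The fixed thread count, the interleaved distribution of iterations and all names (`sum_shared`, `sum_worker`, `parallel_sum`) are assumptions made for the example; a real compiler would choose these per construct and per architecture.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_THREADS 4  /* assumed fixed here; a compiler could pick per target */

struct sum_shared {
    const double *x;
    int n;
    double total;           /* the reduction variable */
    pthread_mutex_t lock;   /* protects total */
};

struct sum_arg {
    struct sum_shared *shared;
    int id;
};

static void *sum_worker(void *p) {
    struct sum_arg *arg = p;
    struct sum_shared *s = arg->shared;

    /* Private partial sum, so the loop itself needs no locking. */
    double local = 0.0;
    for (int i = arg->id; i < s->n; i += NUM_THREADS)
        local += s->x[i];

    /* Combine once per thread under the lock. */
    pthread_mutex_lock(&s->lock);
    s->total += local;
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

double parallel_sum(const double *x, int n) {
    struct sum_shared s;
    struct sum_arg args[NUM_THREADS];
    pthread_t tids[NUM_THREADS];

    s.x = x;
    s.n = n;
    s.total = 0.0;
    int rc = pthread_mutex_init(&s.lock, NULL);
    assert(rc == 0);

    for (int t = 0; t < NUM_THREADS; t++) {
        args[t].shared = &s;
        args[t].id = t;
        rc = pthread_create(&tids[t], NULL, sum_worker, &args[t]);
        assert(rc == 0);
    }
    for (int t = 0; t < NUM_THREADS; t++)
        pthread_join(tids[t], NULL);

    pthread_mutex_destroy(&s.lock);
    (void)rc;
    return s.total;
}
```

Here the decisions about privatization, work distribution and how partial results are combined are all baked into the generated code, and the library only supplies the primitives.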
Of course there is also a hybrid solution, to allow only some
optimization from the side of the compiler. Again, I really don't know
what these optimizations would be and, more importantly, how they would
interact with the library. I can't see how they can interact, therefore
replacing the default library with a custom library should actually
cause no problems.
A wild idea: Perhaps we need to consider additional tuning options, such
as "-fopenmp=generic" for link-time selection of a threading model, and
"-fopenmp=native" for platform-specific code? Such a concept increases
complexity in exchange for satisfying more people.
From the above I conclude that you had something like the hybrid model
in mind. As stated above, I don't really understand what kind of
optimizations the compiler is able to do on its own and how they
interact with the library. If they don't interact, then replacing the
library is not a problem. As I feel that this is something very
important for the implementation, I would like to learn more on this.
Could you please clarify things for me or even better give some examples?
The real question is: Does GCC care about being competitive in terms of
performance? If intellectual freedom is the only goal, then we can by
all means approach this from a generic perspective. If we want a
compiler that is a practical alternative to commercial products, we need
to concern ourselves with performance issues.
I would like to see GCC be competitive in terms of performance. But
as I said, performance issues can be dealt with at the library level. I
know that, I have been doing this for the last 6 years :-) It is my
feeling (just my feeling) that these problems are very difficult to
solve directly in the compiler. If you have another opinion,
please make it clear to me why you believe that. I am just trying to
understand where we are heading and based on which facts.