Re: [Tinycc-devel] Speed of development of a compiler.
From: AlexandreFressange
Subject: Re: [Tinycc-devel] Speed of development of a compiler.
Date: Tue, 24 Nov 2015 22:07:20 +0100

Maybe I should translate the language to C and let gcc/llvm do the rest, and
increase the budget on machines so that those compilers fit.
The translator could be written in days or weeks (again, a simple language
leads to simple C code) and would benefit from the gcc framework by passing
the right flags.
I guess that is much more realistic for now.
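
A rough sketch of what "translate to C" can look like, assuming a hypothetical
toy language (one statement per line, either `name = expr` or `print expr`,
integers only). The grammar and the names `check_expr` and `translate` are
invented for illustration; this is not the actual language discussed in this
thread.

```python
import re

# Hypothetical toy language: one statement per line, either
#   name = <expr>      (integer assignment)
#   print <expr>
# Expressions use integers, names, + - * / and parentheses. Those are
# already valid C, so they are tokenised, checked and passed through.
TOKEN = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|([-+*/()]))")

def check_expr(text):
    """Return the expression with normalised spacing, or raise ValueError.

    Only the tokens are checked; parenthesis balance and operator placement
    are left to the C compiler.
    """
    text = text.strip()
    if not text:
        raise ValueError("empty expression")
    pos, out = 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"bad expression: {text!r}")
        out.append(m.group(m.lastindex))
        pos = m.end()
    return " ".join(out)

def translate(source):
    """Translate toy source text into a complete C program (as a string)."""
    declared, body = set(), []
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("print "):
            body.append(f'    printf("%ld\\n", (long)({check_expr(line[6:])}));')
        else:
            name, sep, expr = line.partition("=")
            name = name.strip()
            if not sep or not name.isidentifier():
                raise ValueError(f"bad statement: {line!r}")
            # Declare a variable the first time it is assigned.
            decl = "" if name in declared else "long "
            declared.add(name)
            body.append(f"    {decl}{name} = {check_expr(expr)};")
    return "\n".join(["#include <stdio.h>", "", "int main(void)", "{",
                      *body, "    return 0;", "}", ""])
```

The returned text can be written to a file and handed to an existing C
compiler (for example `gcc -O2 prog.c`), which then does all of the
optimization work.
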
--
Alex
24.11.2015, 21:14, "AlexandreFressange" <address@hidden>:
> Thanks a lot Basile,
>
> this is exactly the kind of answer I was expecting: digging around the
> problem, showing it with much intelligence and proposing multiple ways to
> solve it.
>
> Now, I am about to finish my OS, which will not be open source (no need, it
> is not a community project); I was working on some performance optimizations
> for memory management. I have been coding for 15 years now, but my initial
> training is unrelated to CS. This OS is for very specific purposes that are
> currently being developed for a startup (I won't say much more right now,
> right here, but will happily do so in a little while). The language resembles
> Python but is even simpler (but has pointers; we always need them :) ),
> follows no ISO standard, steals good ideas from C (atomics and so on), and
> communicates liberally with the OS; I've rewritten the musl library to fit
> the OS needs.
>
> I am used to gas and nasm for x86 assembly coding.
>
> I guess you are right, I should create a github for it and its compiler.
>
> Why I need that: there will be a need to update the code in environments
> without internet access, and llvm/gcc are too heavyweight. Moreover, a
> simpler language can lead to fewer bugs.
> And to say the least, I don't want to be dependent on C++. It narrows the
> choice a lot.
>
> I seriously considered whether a new compiler is needed when others already
> exist. But on the other hand, I am curious to see if I can create a
> lightweight alternative that would only fit my OS (on those archs) and
> language.
>
> I concur, that is a crazy task.
>
> I've been thinking about and designing the whole thing since last year, when
> I also started the OS project (which borrows some pieces from FreeBSD,
> but not much). But, as you know, a compiler is.. something.
>
> My participation in forums is very limited. Stack Overflow already has a
> great many answers, actually. I prefer the exchanges we can have on mailing
> lists (easier to express/discuss ideas away from the crowd).
>
> --
> Alex
>
> 24.11.2015, 19:30, "Basile Starynkevitch" <address@hidden>:
>> On 11/24/2015 05:09 PM, AlexandreFressange wrote:
>>> Hello,
>>>
>>> I saw the dates on the tcc page and wonder how much time it would
>>> *realistically* take to create a compiler supporting one simple language
>>> like C (but not C) and two architectures (x86_64 and arm).
>>
>> You should tell much more, and more precisely, what exactly is the
>> language you want to code your compiler for, on what architectures and
>> what operating systems (both host & target).
>>
>>> The optimizing part is obviously the biggest "issue" (<-> skills). I
>>> hack kernels and have a pretty good understanding of the optimizations out
>>> there and of low-level stuff. I have also read about compiler
>>> optimization.
>>
>> You should tell what kind of optimization you want. I blindly guess that
>> you want performance similar to the code produced by gcc -O1. You should
>> also tell a lot more about your skills (what programs have you written,
>> what studies have you done, what programming languages and operating
>> systems do you know very well, what forums are you participating in,
>> ....). Remember that programming is hard, and everyone needs at least
>> ten years to learn it. http://norvig.com/21-days.html
>>
>>> There isn't one answer to this question, really. I basically need your
>>> experience/opinion on this. From insiders.
>>
>> It depends upon your skills, your objectives, and to a lesser extent the
>> tools or languages you are using. But I imagine you'll need (assuming
>> you have a small team of 3 to 5 persons working full time with you)
>> several years (more than 5, less than 15) to get a C99 (nearly)
>> compliant compiler able to produce, for x86-64/Linux (e.g. most PCs
>> running some recent Debian distribution) and ARM/Linux (e.g. a
>> RaspberryPi running some Debian), some object code about as efficient as
>> GCC5 is producing with -O1.
>> This is still a guess, but a bit of an educated one.
>>
>> Look for example into CompCert. http://compcert.inria.fr/compcert-C.html
>> it is not free software, but the source code is available for academic
>> usage; Xavier Leroy is probably the brightest computer scientist in
>> activity that I have met (and worked with) in person. AFAIK he has been
>> working for 8 to 10 years on CompCert (and there are also other bright
>> people and top-class research scientists). Of course, he is also
>> teaching and publishing papers, and advising PhD students.
>>
>> Look also into TinyCC and http://nwcc.sourceforge.net/ ; they probably
>> don't support all of C99. They surely produce slower code than GCC
>> does with -O1 (probably object code that is pathetically slower, by more
>> than a factor of 3x w.r.t. `gcc -O1`). And both have been developed over
>> several years (albeit by a single person initially).
>>
>> I am working on MELT, see http://gcc-melt.org/ for more. It is mostly a
>> simple domain specific language (and GPLv3 free software) to hack and
>> customize GCC, in principle not a big deal. But I have been working nearly
>> full time (mostly alone, with some very minor outside contributions) on it
>> since 2009. Look notably into the documentation available on
>> http://gcc-melt.org/docum.html since I have hundreds of slides and many
>> links there relevant to compilers. Please read some of them, they will
>> be useful to you (in particular to explain that a compiler is not mostly
>> parsing).
>>
>> If you care about designing a programming language which can have some
>> (compiled implementation providing an) ABI compatible with C and if you
>> have (as I do) more fun in designing the language than in coding a
>> compiler ex-nihilo for it, do yourself a favor, base your work on
>> existing compilers. You could generate C code (many compilers are doing
>> that, see http://programmers.stackexchange.com/a/273895/40065 &
>> http://programmers.stackexchange.com/a/257873/40065 etc...); you could
>> use GCCJIT https://gcc.gnu.org/onlinedocs/jit/ or LLVM http://llvm.org/
>> or even some simpler JIT libraries like libjit...).
>> You could use or generate Common Lisp & SBCL, see http://sbcl.org/; you
>> could generate Java or JVM bytecode. You could generate Ocaml or D or Go
>> code. But don't waste your time on low-level optimizations; instead, build
>> your work on existing compilers or libraries doing that and focus on the
>> programming language and the front-end (the backend being the existing
>> tool: a C compiler if you generate C, an Ocaml compiler if you generate
>> Ocaml, or GCCJIT or LLVM or libjit, etc...). IMHO generating C++ is
>> generally not worth the effort (unless you have to).
>>
>> But I guess that you want to make some C compiler for ARM & x86-64.
>>
>> So some suggestions:
>>
>> First, make your compiler free software from day 0. Start with an
>> empty github project today.
>> (There is absolutely no market for any proprietary compiler; if I am
>> wrong, you already have found the several millions of euros in venture
>> capital to fund your project, and you won't ask here). At the very
>> least, you'll be able to show your work, ask for help (e.g. specific
>> technical questions on StackOverflow or other forums), and perhaps
>> attract other contributor(s) and get some feedback by nice people
>> testing your thing. Notice that there are not many proprietary compilers
>> today (even Microsoft is opensourcing theirs)!
>>
>> Then, read an ISO C standard (either C99 or C11) in its entirety, and some
>> other reference manual about languages you like. You'll be able to download
>> the latest C99 or C11 or C++11 draft from the web (see wikipedia pages
>> on C99 or C11).
>>
>> Study some existing compilers, and notably their internal
>> representations (IR). Read at least about Gimple & Tree in GCC, and
>> about Clang and LLVM. Understand that IR is a hard point, and that
>> optimization passes are mostly IR -> IR transformations. The bulk of the
>> work of any compiler is not its parser (building some AST from source),
>> or its code generator (emitting assembler from an IR) but the
>> optimization passes which are transforming some IR into another IR (very
>> often, both source and destination IR of a given optimization pass are
>> of the same type, and GCC has hundreds of such passes!)
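
A minimal sketch of an IR -> IR pass, to make the idea concrete. It assumes a
hypothetical three-address IR (a list of `(dest, op, lhs, rhs)` tuples) and
straight-line code without branches; the IR shape and the name
`fold_constants` are invented for illustration and are far simpler than
GCC's Gimple.

```python
import operator

# Hypothetical three-address IR: each instruction is (dest, op, lhs, rhs).
# Operands are ints (constants) or strs (names); "mov" uses rhs=None.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def fold_constants(ir):
    """Constant folding and propagation: IR in, IR of the same shape out.

    Only valid for straight-line code (no branches or loops).
    """
    known = {}   # names whose value is a compile-time constant
    out = []
    for dest, op, lhs, rhs in ir:
        # Replace operands that are known constants by their value.
        if isinstance(lhs, str):
            lhs = known.get(lhs, lhs)
        if isinstance(rhs, str):
            rhs = known.get(rhs, rhs)
        if op == "mov" and isinstance(lhs, int):
            known[dest] = lhs
        elif op in OPS and isinstance(lhs, int) and isinstance(rhs, int):
            # Both operands are constant: compute now, emit a plain move.
            op, lhs, rhs = "mov", OPS[op](lhs, rhs), None
            known[dest] = lhs
        else:
            known.pop(dest, None)   # dest no longer holds a known constant
        out.append((dest, op, lhs, rhs))
    return out

example = [
    ("a", "mov", 2, None),
    ("b", "mov", 3, None),
    ("c", "add", "a", "b"),
    ("d", "mul", "c", "x"),
]
folded = fold_constants(example)
```

Because the input and output have the same type, such passes can be chained
freely; a dead-code pass run afterwards could, for instance, drop the now
unused moves into `a` and `b`.
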
>>
>> Decide also in what programming language you'll code your compiler. This
>> is a difficult decision. Some points.
>>
>> I don't think that coding your compiler in manually written C is
>> worthwhile. You won't do better than TinyCC or nwcc for several years.
>> And you probably won't have much fun. But if you do, start by building a
>> compiler infrastructure: you'll need an efficient memory manager, and
>> that practically means a garbage collector (able to deal with all the
>> circular references any compiler has to work with). You'll need nice
>> dumping routines to print IR and any internal data. You might need some
>> persistence machinery (maybe as simple as storing some IR in JSON format
>> in sqlite). You could want to make a multithreaded compiler (there is
>> none AFAIK, and I believe it is useful today, but to code any kind of
>> multithreaded compiler you need to start from scratch.). So for the
>> first year, work on the infrastructure, not on the compiler itself.
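
A small sketch of two of the infrastructure pieces mentioned above: a dumping
routine and JSON-in-sqlite persistence. The `(dest, op, lhs, rhs)` IR shape,
the table layout and the function names are assumptions made for this
example; only the standard `json` and `sqlite3` modules are used.

```python
import json
import sqlite3

def dump_ir(ir):
    """Render a hypothetical (dest, op, lhs, rhs) IR as readable text."""
    lines = []
    for dest, op, lhs, rhs in ir:
        args = ", ".join(str(a) for a in (lhs, rhs) if a is not None)
        lines.append(f"{dest} = {op} {args}")
    return "\n".join(lines)

def save_ir(db, name, ir):
    """Store one function's IR as a JSON string keyed by name."""
    db.execute("CREATE TABLE IF NOT EXISTS ir (name TEXT PRIMARY KEY, body TEXT)")
    db.execute("INSERT OR REPLACE INTO ir VALUES (?, ?)", (name, json.dumps(ir)))
    db.commit()

def load_ir(db, name):
    """Load the IR saved under name, or raise KeyError."""
    row = db.execute("SELECT body FROM ir WHERE name = ?", (name,)).fetchone()
    if row is None:
        raise KeyError(name)
    # JSON has no tuples, so rebuild them from the decoded lists.
    return [tuple(instr) for instr in json.loads(row[0])]

db = sqlite3.connect(":memory:")          # use a file path to persist on disk
sample = [("a", "mov", 2, None), ("d", "mul", "a", "x")]
save_ir(db, "main", sample)
restored = load_ir(db, "main")
text = dump_ir(restored)
```
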
>>
>> You could choose some higher-level language to code your compiler in.
>> I've got some opinions and hints on that.
>>
>> You could code in Scheme, or Javascript, or Common Lisp or some other
>> dynamically typed language (avoid Python or Perl, they are probably too
>> slow). Dynamic typing and garbage collection are a huge plus.
>> You might perhaps choose some implementation which generates C code or
>> which is written in C. For example, if using Scheme, consider Bigloo or
>> Chicken. Both are generating C code, and that generated C code is a very
>> good test for your own compiler (this is one of the benefits of
>> bootstrapping compilers, and it is a very significant one).
>>
>> You could code in Ocaml or in Haskell or some other statically typed
>> functional language with type inference. The type inference machinery
>> would help finding simple bugs (but not hard ones). The functional
>> aspect (which Javascript, Scheme & Lisp also have) is essential: you'll
>> use functional values to code future computations (read more about
>> continuation & continuation passing style, start with wikipedia). The
>> garbage collection is a must.
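
A tiny sketch of continuation passing style, where a function value stands for
"the rest of the computation". The nested-tuple expression format and the name
`eval_cps` are assumptions made for this example.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}

def eval_cps(expr, k):
    """Evaluate expr (an int or an (op, left, right) tuple) in CPS.

    Instead of returning a value to its caller, each step hands its
    result to the continuation k, a function representing what to do next.
    """
    if isinstance(expr, int):
        return k(expr)
    op, left, right = expr

    def after_left(lv):
        # Continuation run once the left operand is known.
        def after_right(rv):
            # Continuation run once both operands are known.
            return k(OPS[op](lv, rv))
        return eval_cps(right, after_right)

    return eval_cps(left, after_left)

def identity(v):
    return v

result = eval_cps(("+", 1, ("*", 2, 3)), identity)
```

Python does not eliminate tail calls, so very deep expressions would hit the
recursion limit here; languages such as Scheme or Ocaml do not have that
restriction.
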
>>
>> You could design your own domain specific language or DSL (exactly like
>> I did for MELT). If you want to code a C compiler, I strongly invite you
>> to think that way. Notice that GCC itself has about a dozen
>> specialized C (or C++) code generators which are generating parts of the
>> compiler, and you might look at that as saying that GCC has a dozen
>> DSLs inside it (even if most of GCC code is sadly C++). You might even
>> design yourDSL and implement a yourDSL->C translator (that takes at
>> least one year but it is fun, notably if you start from scratch). Then
>> the generated C code will be a very good testcase for your C compiler.
>>
>> If you have not read them, I recommend reading several books.
>>
>> SICP https://mitpress.mit.edu/sicp/ is an absolute must; if you only
>> read one book, read this one
>>
>> Concepts, Techniques and Models of Computer Programming
>> https://www.info.ucl.ac.be/~pvr/book.html
>>
>> Lisp In Small Pieces
>> https://pages.lip6.fr/Christian.Queinnec/WWW/LiSP.html ; if you read
>> French, read the latest French edition from the publisher ParaCampus
>>
>> Programming Language Pragmatics
>> http://www.cs.rochester.edu/~scott/pragmatics/
>>
>> Artificial Beings: The Conscience of a Conscious Machine
>> http://eu.wiley.com/WileyCDA/WileyTitle/productCd-1848211015.html ; this
>> book by J.Pitrat is apparently far from compilation, but it is
>> thought-provoking and much more relevant to programming language design
>> than the title suggests. Read also his blog on
>> http://bootstrappingartificialintelligence.fr/WordPress3/
>>
>> Hope this helps. I'm waiting to read more about your skills and your
>> languages and your efforts, and other opinions on the subject.
>>
>> BTW, if you are young enough, find some PhD where your goals could fit.
>>
>> Cheers
>>
>> --
>> Basile STARYNKEVITCH http://starynkevitch.net/Basile/
>> email: basile<at>starynkevitch<dot>net mobile: +33 6 8501 2359
>> 8, rue de la Faiencerie, 92340 Bourg La Reine, France
>> *** opinions {are only mine, sont seulement les miennes} ***
>