octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: octave and LLVM google summer of code project


From: David Bateman
Subject: Re: octave and LLVM google summer of code project
Date: Tue, 08 Apr 2008 17:17:02 +0200
User-agent: Thunderbird 2.0.0.12 (X11/20080306)

Duncan Sands wrote:
> Hi, over at the LLVM project (http://llvm.org) we've just received
> a google summer of code project proposal from Max Rietmann who wants
> to use LLVM's just-in-time optimizers and code generation to speed
> up octave's interpreter.  I've pasted his application at the end of
> this email.  While some of us have used octave, none of the LLVM
> people know anything about its internals so it is hard for us to tell
> if this project makes sense or not.  Is this something octave would
> be interested in?  Is it do-able in 10 weeks?  Finally, if the project
> looks good it would be helpful to have an octave expert as a mentor
> as well as someone from LLVM - any suggestions?
>
> Best wishes,
>
> Duncan.
>
>   
This not only makes sense but is probably one of the major stumbling
blocks for widespread acceptance of Octave relative to Matlab (ie Matlab
has a JIT). Whether its doable in 10 weeks, well he'd have to be good to
have a chance, but even a good start that supplies a template of how to
modify the rest of the code would be a major help.. I'm sure that any
help that is requested on the Octave internals would be supplied rapidly
on the address@hidden, though as this will necessarily impact
the parser, its John Eaton who has the best understanding of how that
works..

D.



> PS: The project proposal:
>
> I will write an Octave to LLVM bytecode compiler. I plan to write it in 
> Python or C++, both of which I know well from previous personal projects. 
> Octave is an open-source numerical programming toolkit that implements 
> similar functionality to that of Mathworks Matlab. Despite recent 
> improvements, in many cases Octave runs much slower than Matlab.
>  
>  Octave developers have discussed how to remedy this and I propose to use 
> LLVM as the interpreter and use it to compile down to native code when 
> desired. Using LLVMs optimizations, Octave could become a great numerical 
> method library, speeding up development for those in the computational 
> sciences, who often write their final code in C or Fortran. Octave's Vector 
> operations make it easy to code numerical methods and by using LLVM to 
> optimize these, it will be much more competitive than it currently is - 
> Perhaps it could even surpass Matlab in speed.
>  
>  
>    Detailed Description 
> GSoC 2008 Application
>  Mentoring Organization: LLVM
>  
>  Max Rietmann
>  
>  Background: I studied electrical and computer engineering at Cornell 
> university as an undergraduate, but upon graduating I realized that my 
> talents and interests lie in a more mathematical realm. I am currently one 
> semester into a masters in applied mathematics at San Diego State University 
> (SDSU). My specific program has a concentration on (nonlinear) Dynamical 
> Systems, which can range from modeling cellular activity to finding chaotic 
> orbits around galaxies. I'm currently interested in chaotic behavior and 
> where it occurs in a phase space.
>   Much of my work is done with the help some sort of mathematical toolkit 
> like Mathematica or Matlab. For much of my numerical calculations I use 
> Matlab, but only out of impatience. Before I was able to obtain a license to 
> Matlab, I used Octave, which is an open source implementation of most of 
> Matlab's functionality. Even after getting a Matlab license, I mostly used 
> Octave because It loads into the command line faster than Matlab can load its 
> gui. Unfortunately I have found Octave to be noticeably slower than Matlab 
> (up to an order of magnitude), which has caused me to use Matlab exclusively.
>   In reading the Octave mailing list they spoke of trying to re-implement the 
> Octave runtime on something like LLVM or the JVM so as to take advantage of 
> better optimizations. For example, when Matlab encounters loops, it attempts 
> to use JIT techniques to vectorize the loop speeding up the code 
> tremendously. The Octave developers would love to be able to do this, but 
> they are unsure of how to progress forward. They are not a mentoring 
> organization this year, but LLVM is. I have read through the tutorial on 
> making a LLVM compiler for a simple language called Kaleidoscope that 
> compiles down to LLVM bytecode. After reading the tutorial, I feel that I 
> would be capable of implementing most of the Octave functionality in LLVM.
>  
>  Project Description and Outline: I will write an Octave to LLVM bytecode 
> compiler. I plan to write it in Python or C++, both of which I know well from 
> previous personal projects. My preference language preference is Python, but 
> since one of the tutorials is using C++ and since Octave is written using 
> mostly C++, I might use that to be consistent.
>   My Knowledge of the Octave backend is such: It parses the language and 
> generates its own internal representation. The interpreter then runs through 
> the representation and makes calls to various numerical libraries when 
> appropriate. For example, it uses libraries lapack and fftpack for matrix and 
> fft operations, which are both written in fortran. From discussions on the 
> mailing list, its clear that Octave's speed problems do not lie in its lack 
> of ability in numerical operations, but instead are due to slow 
> interpretation of the base language. I specifically notice the biggest speed 
> issues when looping. In some cases, JIT compilation could be used to 
> vectorize the looping code. But this still does not account for its 
> relatively slow looping over non-vectorizeable code.
>   Because it uses a large number of libraries to execute its numerical 
> subroutines, I think that LLVM makes a wonderful choice as a bytecode 
> solution and runtime. From my readings, calling to external libraries from 
> LLVM is relatively straightforward and most importantly fast.
>   Despite my excitement for this project, I am less knowledgeable about 
> compiler-design as I want to be. There are a lot of details I do not yet 
> understand or even know about, but I will try to outline my plan in as much 
> detail as I can.
>   The first big step will be to write a parser for the Octave code. From my 
> own source-code digging, it seems that Octave uses a parser-generator to 
> parse its code and generate an op-code style language for the interpreter. It 
> has a large library of op-codes (66 in the OPERATORS folder). Many of the 
> op-codes are matrix based and converting matrix operations to LLVM bytecode 
> will be a significant challenge. Because Octave has its own op-code language 
> I imagine it will be difficult to use much existing code to make a compiler 
> for LLVM. My parser will have to interpret the Matrix operations from Octave 
> and appropriately break them apart to either call the appropriate numerics 
> toolbox or generate the functional LLVM code.
>   So instead of overwhelming myself talking about Matrix operations I will 
> describe the easy part. First, I will implement the basic functionality of 
> the language leaving out vectors and matrices to learn more about LLVM and 
> writing a parser. I've written a parser tree in Objective-C for a math 
> program I started, so I already have some ideas about how that will work. I 
> was planning on writing my own parser, but I might instead use a parser 
> generator to generate the data-object-tree (I think they call it an AST). 
> From there, I will need to build a tree to LLVM bytecode generator. For the 
> basic operations without vectors and matrices this should be straightforward. 
> Once this is all implemented and working I will begin with adding vectors. 
> Mathematically, vectors are just a type of Matrix, but I am unsure if 
> generalizing this much will make the fastest code. I imagine that I will 
> handle vectors separately and optimize them by hand. Next will be matrix 
> operations. Since octave u!
>  ses a matrix library to "outsource" its matrix operations, I would also try 
> to do something similar. I will try to use the same libraries Octave uses, 
> but I do not know much about calling libraries from LLVM, that will take 
> research and advice from the LLVM devs.
>   I think the biggest thing for me to keep in mind is looking at the project 
> incrementally. First start simple with the parser and basic code like:
>  * simple assignment: x=2;
>  * all the various arithmetic ops: +,-,*,/
>  * control structures (if,then,else)
>  * looping (for,while)
>  For loops use vectors, so I would have to implement a basic vector to do 
> this.
>  * Function definition and calling
>  With these built, one would be able to build most programs sans 
> vector/matrix operations.
>  Now I add:
>  *vectors as a variable type and all that comes with that:
>    * Vector addition/multiplication x=v1+v2, x=v1*v2 and x=v1.*v2
>  (the .* multiplication means to multiply corresponding entries)
>  The final step will be the Matrix operations, which are an entirely new bag 
> of tricks.
>  
>  Overall Goals: My main goal for the end of the summer is to have the Octave 
> language running on LLVM. Improving Octave's speed is the motivation for the 
> project, but that can always come later. I hope to lay the ground work for 
> the Octave team to pick up my LLVM compiler, develop it further and migrate 
> the current runtime from its C++ roots to LLVM. I see Octave's speed as a 
> major drawback and helping to fix this would not only help me in my academic 
> pursuits, it might make Octave more mainstream, which will save buying (or 
> pirating in many cases) Matlab licenses by those poor starving engineering 
> students of the world.
>  
>  Can I do this?: I have quite a lot of experience coding and I've worked in 
> assembly before. I also know the octave (and matlab) code very well because 
> of my coursework. I also have the summer free and would use the stipend from 
> Google as my summer job. I would spend the summer coding with breaks to go 
> surfing and climbing. I don't have a specific mentor picked out, but I 
> imagine that the existing mentors from LLVM would be extremely helpful and 
> that the Octave developers would be extremely excited to have someone like me 
> lay the groundwork for restructuring the Octave runtime. As I said, they've 
> already mentioned this on the Octave mailing list, so I'm sure there are 
> Octave developers who would love to help when I need advice and to help test. 
> The tutorials on the LLVM website should get me up and running quickly and 
> introduce me to the more difficult concepts that I am still learning about.
>   One positive part of this project I see is that its relatively clear way 
> forward. The Octave language is well documented and my work at the beginning 
> will just be to implement each part of the language in LLVM. I start first 
> with arithmetic operators (+,-,*,/) and move on to variables, functions, 
> loops, and then finally to vectors and matrices. Because the LLVM language 
> makes writing a compiler so much easier, I really feel like I can at least 
> get the basics working relatively quickly.
>
>
>   


-- 
David Bateman                                address@hidden
Motorola Labs - Paris                        +33 1 69 35 48 04 (Ph) 
Parc Les Algorithmes, Commune de St Aubin    +33 6 72 01 06 33 (Mob) 
91193 Gif-Sur-Yvette FRANCE                  +33 1 69 35 77 01 (Fax) 

The information contained in this communication has been classified as: 

[x] General Business Information 
[ ] Motorola Internal Use Only 
[ ] Motorola Confidential Proprietary





reply via email to

[Prev in Thread] Current Thread [Next in Thread]