Re: [Axiom-developer] No progress on notangle


From: C Y
Subject: Re: [Axiom-developer] No progress on notangle
Date: Sat, 17 Feb 2007 21:50:03 -0800 (PST)

--- Waldek Hebisch <address@hidden> wrote:

> A fast way to scan strings is to use a finite automaton.  The code
> below shows how this can be done in Lisp. 

Eeeep.  I'm impressed!

> - The main function 'scan-for-chunks' is a rat's nest of gotos.
>   But that is correct, because a finite automaton _is_ a rat's nest
>   of transitions, and one of the simplest ways to implement
>   transitions is to use gotos.  One can make it shorter by
>   using appropriate macros.  Ideally the automaton code should
>   be mechanically generated -- writing in C I would use flex
>   (the task is trivial for flex, but not so trivial by hand).

Perhaps they could be prototyped in flex and then mapped to the Lisp
macro system?
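
If I'm reading the approach right, the hand-written tagbody/go style
would look roughly like the sketch below.  This is only my own
illustration (a made-up function, not Waldek's code): it scans a line
for the first <<name>> reference using explicit states and gotos, and
is the kind of transition table flex would generate mechanically.

;; Illustration only -- hypothetical function, not Waldek's code.
;; Returns the start (inclusive) and end (exclusive) of the first
;; <<name>> reference in LINE, or NIL if there is none.
(defun find-chunk-reference (line)
  (let ((len (length line))
        (i 0)
        (name-start nil))
    (when (zerop len)
      (return-from find-chunk-reference nil))
    (macrolet ((next (state)
                 ;; consume one character, give up at end of line
                 `(progn (incf i)
                         (when (>= i len)
                           (return-from find-chunk-reference nil))
                         (go ,state))))
      (tagbody
       start                        ; hunting for the first #\<
         (if (char= (char line i) #\<) (next open2) (next start))
       open2                        ; saw one #\<, need a second
         (if (char= (char line i) #\<)
             (progn (setf name-start (1+ i)) (next in-name))
             (next start))
       in-name                      ; inside <<...>>, hunting for #\>
         (if (char= (char line i) #\>) (next close2) (next in-name))
       close2                       ; saw one #\>, need a second
         (if (char= (char line i) #\>)
             (return-from find-chunk-reference
               (values name-start (1- i)))
             (next in-name))))))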

This raises a question for me - I know Tim needs a very fast notangle
for his work, but does Axiom actually need this kind of speed?  The
complexity of this code makes me wonder if the extra 30 seconds of
Axiom build time are worth the added obfuscation.  I suppose if we get
the macros written it won't be so formidable, though...

I did a little bit of analysis, which Waldek's work makes more or less
moot, but I'll add it anyway in case it winds up being of use to someone
else who is interested in basic profiling.  One point - unless I'm
misunderstanding what the finite automaton solution is, this analysis
actually does point in the correct optimization direction - which is to
say, not read-line or reading speed but rather string manipulation
speed.  Perhaps similar (but more advanced) study will wind up being
useful for other parts of Axiom.

Among the interesting features of SBCL, it turns out there is a neat
little utility called sb-profile that lets one watch how much time
individual functions take.  I tried it on v0.3 of cl-notangle, and it
suggests that read-line is not the only (or even the primary) thing
taking the time.

Here's what I did:

SETUP> (load "/home/cyapp/cl-notangle-v0.3.lisp")
T
SETUP> (sb-profile:profile MAKE-NEW-CHUNK GET-CHUNK-CONTENTS
ADD-TO-CHUNK-CONTENTS READ-IN-CHUNK CHECK-FOR-TOPLEVEL-CHUNK
CHECK-FOR-REFERENCED-CHUNK LOAD-PAMPHLET-FILE WRITE-OUTPUT-FILE)
; No value

That sets up the profiling on the defined functions.  I then run my
benchmark test, which is essentially all of boot and interp:

SETUP> (load "./setup/runlispnotangle.lisp")
T

This takes anywhere from 10 to 13 seconds, based on earlier testing. 
Looking at the report:

SETUP> (sb-profile:report)
  seconds  |    consed   |  calls |  sec/call  |  name  
----------------------------------------------------------
     5.369 | 101,785,848 |    761 |   0.007055 | CHECK-FOR-REFERENCED-CHUNK
     4.869 |  72,160,120 |    802 |   0.006072 | READ-IN-CHUNK
     0.378 |   2,271,408 |    207 |   0.001826 | WRITE-OUTPUT-FILE
     0.204 |   4,750,544 |    207 |   0.000985 | LOAD-PAMPHLET-FILE
     0.066 |   1,925,024 | 10,731 |   0.000006 | CHECK-FOR-TOPLEVEL-CHUNK
     0.031 |     296,264 |    802 |   0.000039 | MAKE-NEW-CHUNK
     0.019 |           0 |  1,557 |   0.000012 | GET-CHUNK-CONTENTS
     0.010 |      20,480 |    800 |   0.000013 | ADD-TO-CHUNK-CONTENTS
----------------------------------------------------------
    10.946 | 183,209,688 | 15,867 |            | Total

estimated total profiling overhead: 0.03 seconds
overhead estimation parameters:
  2.4e-8s/call, 2.088e-6s total profiling, 9.44e-7s internal profiling
; No value

Interesting things immediately jump out - the majority of the read-line
calls happen in read-in-chunk and load-pamphlet-file, but those two
account for only about half the time.  Admittedly that's already as
long as noweb takes to finish the whole job, but the real killer is
CHECK-FOR-REFERENCED-CHUNK.  So, since read-line was the initial
suspect, we take the instrumenting one step further and add some more
basic functions:

SETUP> (sb-profile:reset)
NIL
SETUP> (sb-profile:profile read-line string-equal concatenate search)
; No value

We run the identical test:

SETUP> (load "./setup/runlispnotangle.lisp")
T

and get a new report:

SETUP> (sb-profile:report)
  seconds  |    consed   |  calls  |  sec/call  |  name  
-----------------------------------------------------------
     5.533 |  89,786,440 |  11,880 |   0.000466 | CONCATENATE
     2.402 |           0 | 160,425 |   0.000015 | SEARCH
     1.971 |  49,896,872 |     761 |   0.002590 | CHECK-FOR-REFERENCED-CHUNK
     0.445 |  31,027,600 | 168,558 |   0.000003 | READ-LINE
     0.308 |   1,564,872 |     207 |   0.001487 | WRITE-OUTPUT-FILE
     0.154 |  15,594,040 |     802 |   0.000192 | READ-IN-CHUNK
     0.021 |      90,736 |  11,315 |   0.000002 | STRING-EQUAL
     0.019 |           0 |   1,557 |   0.000012 | GET-CHUNK-CONTENTS
     0.003 |      20,160 |     802 |   0.000004 | MAKE-NEW-CHUNK
     0.001 |      14,880 |     800 |   0.000002 | ADD-TO-CHUNK-CONTENTS
     0.000 |   2,523,816 |     207 |   0.000000 | LOAD-PAMPHLET-FILE
     0.000 |   2,074,864 |  10,731 |   0.000000 | CHECK-FOR-TOPLEVEL-CHUNK
-----------------------------------------------------------
    10.856 | 192,594,280 | 368,045 |            | Total

estimated total profiling overhead: 0.77 seconds
overhead estimation parameters:
  2.4e-8s/call, 2.088e-6s total profiling, 9.44e-7s internal profiling
; No value

Ah - concatenate by itself takes about as long as noweb does to do the
complete job.  search and read-line together take less than 3 seconds,
and are called a combined total of over 300k times, versus concatenate's
fewer than 12k calls.  This seems to suggest that read-line, while
slower than read-sequence, is not yet the limiting factor.
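
A side thought on the concatenate numbers: I haven't checked whether
this is what's actually happening inside CHECK-FOR-REFERENCED-CHUNK,
but if chunk contents are being grown by repeated concatenate calls,
that pattern is quadratic - every call re-copies everything accumulated
so far.  Writing into a string output stream copies each character only
once.  The function names below are made up purely for illustration:

;; Illustration only -- hypothetical helpers, not the cl-notangle code.

;; Growing a string with CONCATENATE in a loop re-copies the whole
;; accumulated string on every iteration.
(defun slurp-lines-slow (stream)
  (let ((text ""))
    (loop for line = (read-line stream nil nil)
          while line
          do (setf text (concatenate 'string text line (string #\Newline))))
    text))

;; Writing into a string output stream copies each character once.
(defun slurp-lines-fast (stream)
  (with-output-to-string (out)
    (loop for line = (read-line stream nil nil)
          while line
          do (write-line line out))))

Whether that applies to the real code I can't say without looking more
closely, but the consing figure (89MB for under 12k calls) smells like
that sort of pattern.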

Unfortunately, unless someone knows of a library that implements the
type of system Waldek is talking about, it looks like there is some
work ahead to implement the best of all possible notangles.  I may go
ahead and use the less efficient one for the moment to try other things,
like teaching asdf to read pamphlet files, and when we are ready to go
"live" we can aim for the high-end optimization targets.  Waldek (if you
made it this far), how much work do you think is involved in setting
up the full polished system?  Also, can you recommend some background
reading on the technique you're describing?

Cheers, and thanks!

CY





 
____________________________________________________________________________________
Don't get soaked.  Take a quick peak at the forecast
with the Yahoo! Search weather shortcut.
http://tools.search.yahoo.com/shortcuts/#loc_weather




reply via email to

[Prev in Thread] Current Thread [Next in Thread]