[Axiom-developer] topic modeling
From: root
Subject: [Axiom-developer] topic modeling
Date: Sat, 29 Jul 2006 13:59:22 -0400
One of the long term goals I've been thinking about is the
petamachine problem. Given a machine with a THz of cpu, a
TByte of memory, a Petabyte of storage, and an OC5 data link
how would you use it to improve computational math research?
One of the subideas is that all of the mathematics that has
ever been published would probably fit on a few terabytes of
disk space. And any new mathematics would be available in a
streaming electronic form. Part of the cpu would be dedicated
to watching the newly arriving stream and classifying the
information in ways that I personally find useful.
I've muttered about expanding the latex tags to have \idea,
\concept, etc. so that newly born papers could be more easily
classified. However, a new technology seems to make that less
interesting. It's called topic modeling. See
http://blogs.zdnet.com/emergingtech?p=304
Given a continuous process that scans incoming papers you
could add them to a semantic network which models information
that I find useful and in ways that I find useful.
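
A rough sketch of what such a scanning loop might look like, in
Python. Everything in it is made up for illustration: the
SemanticNetwork class, the seed vocabularies in TOPICS, and the
scan function are hypothetical names. Plain keyword overlap is
used as a stand-in for a real topic model, which would learn its
topics from the papers rather than be handed them.

```python
import re
from collections import defaultdict

# Hypothetical seed vocabularies; a real topic model would learn these.
TOPICS = {
    "groebner bases": {"groebner", "ideal", "polynomial", "buchberger"},
    "homological algebra": {"module", "resolution", "homology", "syzygy"},
}


class SemanticNetwork:
    """Toy network mapping each topic to the titles filed under it."""

    def __init__(self):
        self.papers_by_topic = defaultdict(set)

    def add(self, title, text, min_hits=2):
        """File a paper under every topic sharing at least min_hits words."""
        words = set(re.findall(r"[a-z]+", text.lower()))
        matched = []
        for topic, vocab in TOPICS.items():
            if len(words & vocab) >= min_hits:
                self.papers_by_topic[topic].add(title)
                matched.append(topic)
        return matched


def scan(stream, network):
    """Consume (title, text) pairs as they arrive and classify each one."""
    for title, text in stream:
        yield title, network.add(title, text)
```

Since scan is a generator it could sit on an open-ended stream
and classify papers one at a time as they show up.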
Axiom could strongly benefit from such technology if we could
find a good source of technical papers. Starting from an initial
source we could collect electronic conference papers and classify
them. Then Axiom could just look up a concept like "Groebner Basis",
follow it to "Homological Algebra", then on to "Computing P-modules",
and then on to finding an algorithm for computing a presentation
of a finitely generated P-module. Ideally the paper would be
literate and contain code that was automatically incorporated
into the system.
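
The lookup chain above could be sketched as a walk over a concept
graph. This is only an illustration: the LINKS table and the
find_path function are hypothetical, and in practice the links
would come out of the classified papers rather than be typed in
by hand.

```python
from collections import deque

# Hypothetical concept links, hand-written here for illustration.
LINKS = {
    "Groebner Basis": ["Homological Algebra"],
    "Homological Algebra": ["Computing P-modules"],
    "Computing P-modules": ["presentation of a finitely generated P-module"],
}


def find_path(start, goal, links=LINKS):
    """Breadth-first search; returns the shortest concept chain or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in links.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None
```

Calling find_path("Groebner Basis", "presentation of a finitely
generated P-module") returns the four-step chain described above,
and None when no chain exists.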
The Crystal idea has a facet that constantly watches the user
interaction, maintains an "intensional stance" of the user,
and tries to find related, relevant work. This technology would
be ideal behind such a facet.
Automatic classification algorithms are always more effective
in limited domains (e.g. math) than in unspecified domains.
Sounds like an NSF or INRIA grant idea to me.
Tim