Re: [ANN] guile-snowball-stemmer 0.1.0
From: amirouche
Subject: Re: [ANN] guile-snowball-stemmer 0.1.0
Date: Tue, 07 May 2019 22:36:21 +0200
User-agent: Roundcube Webmail/1.3.8
On 2019-05-07 20:30, address@hidden wrote:
> On 2019-05-07 15:28, address@hidden wrote:
>> I am pleased to announce the immediate availability of
>> guile-snowball-stemmer.
I made (yet another) toy search engine. It is a small command-line
tool, attached to this mail. The code can also be found at:
https://git.sr.ht/~amz3/guile-gotofish
Here is an example run:
$ mkdir ~/.gotofish # Database is stored there
$ guile -L . gotofish.scm search gnu guile # Nothing yet!
# Let's index a couple of articles
$ curl https://en.wikipedia.org/wiki/GNU_Guile | html2text | guile -L . gotofish.scm index "GNU Guile"
Done!
$ curl https://en.wikipedia.org/wiki/Scheme_%28programming_language%29 | html2text | guile -L . gotofish.scm index "Scheme"
Done!
$ curl https://en.wikipedia.org/wiki/GNU | html2text | guile -L . gotofish.scm index "GNU"
Done!
$ curl https://en.wikipedia.org/wiki/Tf%E2%80%93idf | html2text | guile -L . gotofish.scm index "tf-idf"
Done!
# Let's search
$ guile -L . gotofish.scm search gnu guile
** Scheme
** GNU Guile
$ guile -L . gotofish.scm search gnu
** GNU
** GNU Guile
** Scheme
$ guile -L . gotofish.scm search science
** GNU
** GNU Guile
** Scheme
$ guile -L . gotofish.scm search retrieval
# Even if the exact word "retrieval" is not in those pages,
# "retrieved" has the same stem as "retrieval" so all are
# matches
** GNU
** tf-idf
** GNU Guile
** Scheme
$ guile -L . gotofish.scm search idf
** tf-idf
As shown above, one can also use multiple words in a single lookup.
This is very primitive, but hopefully it will help me get going
tomorrow on building my great app!
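For readers who want to see why a search for "retrieval" can match a page that only contains "retrieved", here is a minimal sketch of stem-based indexing with multi-word lookup. It is not the gotofish code and does not use guile-snowball-stemmer: `toy_stem`, `tokenize`, `ToyIndex` and the suffix list are hypothetical names made up for illustration, the suffix stripping is far cruder than the real Snowball algorithms, and it assumes that every query word must match a document.

```python
import re
from collections import defaultdict

# Toy suffix list for illustration only; this is NOT the Snowball algorithm.
SUFFIXES = ("ing", "al", "ed", "s")


def toy_stem(word):
    """Lowercase the word and strip the first matching suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words stay intact.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Split text into alphabetic words and reduce each one to its toy stem."""
    return [toy_stem(w) for w in re.findall(r"[A-Za-z]+", text)]


class ToyIndex:
    """In-memory inverted index: stem -> set of document titles."""

    def __init__(self):
        self.postings = defaultdict(set)

    def index(self, title, text):
        for stem in tokenize(text):
            self.postings[stem].add(title)

    def search(self, query):
        stems = tokenize(query)
        if not stems:
            return []
        # Assumption: a document matches only if it contains every query stem.
        hits = set.intersection(*(self.postings.get(s, set()) for s in stems))
        return sorted(hits)


if __name__ == "__main__":
    idx = ToyIndex()
    idx.index("tf-idf", "Documents are retrieved by weighting terms.")
    idx.index("GNU Guile", "Guile is the GNU extension language.")
    print(idx.search("retrieval"))  # ['tf-idf']
    print(idx.search("gnu guile"))  # ['GNU Guile']
```

Both "retrieval" and "retrieved" reduce to the same toy stem, so the query finds the document even though the exact word never appears in it. A real stemmer handles many more cases, and this sketch returns titles in alphabetical order rather than ranking them.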
Attachments:
gotofish.scm (text document)
README.md (text document)