RE: [Koha-devel] Building zebradb

From: Tümer Garip
Subject: RE: [Koha-devel] Building zebradb
Date: Thu, 16 Mar 2006 16:33:31 +0200

Hi Paul,
I have sent the script you requested to you directly, as I don't know
whether the list accepts attachments.

Regarding your questions below:
1- Well, yes, I am feeding Zebra with 2 different kinds of records
(iso2709 and XML), and I do not see any problem with this, as Zebra
changes everything to its own format anyway. This way I can create
100K+ records in around 5 minutes in Zebra (this time excludes the
roughly 15 minutes it takes to export my whole database). I can use
ZOOM with XML to update, delete, or add new records.
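[Editor's note: the ZOOM-with-XML update path described here goes through
Z39.50 extended services. A minimal Perl sketch follows; the server
address, database name, credentials, and the record variable are
assumptions, not Tümer's actual setup.]

```perl
use strict;
use warnings;
use ZOOM;

# Connect to the Zebra server (host/port/db are placeholders).
my $conn = ZOOM::Connection->new('localhost:9999/koha');

# $marcxml holds one MARCXML record as a string.
my $marcxml = '<record>...</record>';

# Push the record via an extended-services "update" package;
# specialUpdate inserts the record or replaces an existing one.
my $pkg = $conn->package();
$pkg->option( action => 'specialUpdate' );
$pkg->option( record => $marcxml );
$pkg->send('update');
$pkg->destroy();

# Make the change visible to searches.
my $commit = $conn->package();
$commit->send('commit');
$commit->destroy();
```

Deletes work the same way with `action => 'recordDelete'`.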

2- Regarding whether we can use Perl-ZOOM with iso2709 records: at the
moment, no. We all hope that Index Data will incorporate this facility
for us at some stage.
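[Editor's note: until such a facility exists, one workaround is to
convert ISO 2709 records to MARCXML in Perl before handing them to ZOOM.
A sketch using MARC::Record; the input file name is an assumption.]

```perl
use strict;
use warnings;
use MARC::Batch;
use MARC::File::XML ( BinaryEncoding => 'utf8' );

# Read raw ISO 2709 records from a file and emit MARCXML strings,
# which Perl-ZOOM *can* submit as grs.xml updates.
my $batch = MARC::Batch->new( 'USMARC', 'records.iso2709' );
while ( my $record = $batch->next() ) {
    my $xml = $record->as_xml_record();    # ISO 2709 -> MARCXML
    # ... feed $xml to a ZOOM extended-services update package ...
}
```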

Best of luck,


-----Original Message-----
From: Paul POULAIN [mailto:address@hidden]
Sent: Wednesday, March 15, 2006 7:19 PM
To: Tümer Garip
Cc: address@hidden; address@hidden
Subject: Re: [Koha-devel] Building zebradb

Tümer Garip wrote:
> Hi,

Hello Tümer,

> We have now put Zebra into production-level systems, so here is
> some experience to share. Building the Zebra database from single
> records is a very long process (100K records, 150K items).
> The best method we found:
> 1- Change the zebra.cfg file to include
> iso2709.recordType:grs.marcxml.collection
> recordType:grs.xml.collection
If I understand correctly, you now have 2 types of records in your DB
(or 2 different representations of a record).
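[Editor's note: the prefix before the dot in zebra.cfg is a zebraidx
record group, selected at indexing time with the -g flag, which is how
one database accepts both representations. A sketch of the relevant
zebra.cfg lines, with explanatory comments added:]

```
# Default filter: records handed to zebraidx (and ZOOM updates)
# are treated as MARCXML
recordType: grs.xml.collection

# Files indexed with "zebraidx -g iso2709 ..." are instead parsed
# as raw ISO 2709 MARC
iso2709.recordType: grs.marcxml.collection
```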

> 2- Write (or hack export.pl) to export all the MARC records as one
> big chunk to the correct directory with the extension .iso2709, and
> system call "zebraidx -g iso2709 -d <dbnamehere> update records -n".

Could you send us the code for export.pl?
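[Editor's note: the script itself was sent off-list, so the following is
only a hypothetical reconstruction of the approach described above; the
DSN, credentials, table layout, and paths are all assumptions.]

```perl
use strict;
use warnings;
use DBI;

# Dump every biblio as raw ISO 2709 into one file, then bulk-index it.
my $dbh = DBI->connect( 'dbi:mysql:koha', 'kohauser', 'secret',
                        { RaiseError => 1 } );
open my $out, '>', '/var/lib/zebra/records/all.iso2709' or die $!;

my $sth = $dbh->prepare('SELECT marc FROM biblioitems');
$sth->execute();
while ( my ($marc) = $sth->fetchrow_array ) {
    print {$out} $marc;    # records are already ISO 2709 blobs
}
close $out;

# -g iso2709 selects the group configured in zebra.cfg; -n is used
# as in the command quoted above (it disables shadow registers for
# the bulk run).
system('zebraidx -g iso2709 -d koha update records -n') == 0
    or die "zebraidx failed: $?";
```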

> This ensures that Zebra knows it is reading MARC records rather than
> XML, and it builds 100K+ records at zooming speed. Your ZOOM module
> always uses the grs.xml filter, while you can at any time update or
> reindex any big chunk of the database as long as you have MARC
> records.

Great, I think I understand.

> 3- We are still using the old API, so we read the XML and use
> MARC::Record->new_from_xml( $xmldata ). A note here: we did not have
> to upgrade MARC::Record or MARC::Charset at all. Any MARC created
> within Koha is UTF-8, and any MARC imported into Koha (old
> marc_subfield_tables) was correctly decoded to UTF-8 with the
> char_decode of Biblio.

Would it be possible to use this zebra.cfg to manage iso2709 through
Perl-ZOOM?
If yes, we could avoid the marc => xml => zoom and zoom => xml => marc
conversions.

> 4- We modified circ2.pm and the items table to have an item onloan
> field and mapped it to MARC holdings data. Now our OPAC search does
> not call MySQL except for the branch name.

Could you send us/me the code too ?

> 5- Average updates per day are about 2000 (circulation + cataloguing).
> I can say that the speed of the ZOOM search, which slows down during
> a commit operation, is acceptable considering the speed gain we have
> on the search.
> 6- Zebra behaves very well with searches but is very temperamental
> with updates. A queue of updates sometimes crashes the Zebra server.
> When the database crashes we cannot save anything, even though we are
> using shadow files. I'll report on this issue once we can isolate the
> problems.
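[Editor's note: shadow registers are what let a commit be atomic —
updates land in the shadow files and only become permanent at an
explicit commit. They are enabled in zebra.cfg roughly like this; the
directory and size are assumptions:]

```
# Write updates to shadow files first; "zebraidx -d <db> commit"
# makes them permanent
shadow: /var/lib/zebra/shadow:100M
```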

You're definitely a gem too ;-)

Paul POULAIN et Henri Damien LAURENT
Independent consultants
in free software and library science (http://www.koha-fr.org)
