About the Zebra-powered Harvest Search System
This system collects data from various sources and makes it
searchable using a web browser.
The search system is built with the following components:
- Harvest
is a flexible system that collects data from different sources (http,
ftp, nntp, local files), summarizes their contents and makes them
searchable via various fulltext search engines.
Harvest comes with glimpse, which is the default indexer, and swish.
- Zebra
is a fulltext indexer which follows the Z39.50 standard. This standard
seems to be popular among librarians. Zebra allows searching
in structured fulltext. It offers very powerful search
capabilities and supports incremental indexing, which makes it easy
to manage large amounts of data.
The motivation for creating this system was to replace Harvest's default
fulltext indexer, glimpse, with a GPLed indexer.
This is still a work in progress. The modeling of the data isn't
finished yet and the following issues are unresolved:
- Find out why 1,1010 doesn't work.
- Create a query page which exposes the lower level features
of Z39.50 and other tweakable variables in zquery.pl, but is not
too complicated.
- How do I get the file name of a SOIF object? Or more generally: How do
I get the name of the data file which contains the expression I am
looking for?
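
As a possible starting point for the last issue, here is a minimal sketch in Python. It is not part of this system. It assumes the SOIF records follow the layout described in the Harvest manual: a header line of the form "@TYPE { URL", then attribute lines of the form "Name{size}:<TAB>value", where size is the length of the value in bytes, then a closing "}". The function name parse_soif and the sample record are made up for illustration. The sketch only shows that the source URL can be read from the record header; whether Zebra can return that URL with a search hit remains an open question.

```python
import re

# Header line of a SOIF record: "@TYPE { URL" (layout assumed from the Harvest manual)
HEADER_RE = re.compile(rb"^@(\S+)\s*\{\s*(\S+)\s*$")
# Attribute line prefix: "Name{size}:<TAB>", where size is the value length in bytes
ATTR_RE = re.compile(rb"([^\s{}]+)\{(\d+)\}:\t")


def parse_soif(data: bytes):
    """Parse a single SOIF record and return (template_type, url, attributes).

    parse_soif is a hypothetical helper name, not a Harvest or Zebra function.
    """
    header, _, rest = data.partition(b"\n")
    m = HEADER_RE.match(header)
    if m is None:
        raise ValueError("not a SOIF header: %r" % header)
    template_type = m.group(1).decode("latin-1")
    url = m.group(2).decode("latin-1")

    attrs = {}
    pos = 0
    while pos < len(rest):
        m = ATTR_RE.match(rest, pos)
        if m is None:
            break  # reached the closing "}" or unrecognized data
        size = int(m.group(2))
        start = m.end()
        # Read exactly `size` bytes, so values may contain newlines
        attrs[m.group(1).decode("latin-1")] = rest[start:start + size]
        pos = start + size
        if rest[pos:pos + 1] == b"\n":
            pos += 1  # skip the newline that separates attributes
    return template_type, url, attrs


# Made-up sample record for illustration
sample = (
    b"@FILE { http://www.example.org/doc.html\n"
    b"Title{11}:\tHello World\n"
    b"Type{4}:\tHTML\n"
    b"}\n"
)

template_type, url, attrs = parse_soif(sample)
print(template_type, url, sorted(attrs))
```

The URL in the header is what identifies the summarized document, so it is the closest thing to a "file name" that the record itself carries.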