Le blog du garambrogne

Looking for New York, the shingle way

Search engines work with words, but some nouns are made up of several words. New York is one noun, not two words. Given a list of such nouns, a search engine can handle them well, and Wikipedia can help build that list.
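As a rough illustration of the idea, here is a minimal Python sketch that merges adjacent tokens into a single term when they match a known compound noun. The function name `merge_compounds`, the `max_len` parameter, and the tiny noun list are all hypothetical; a real list could be built from Wikipedia article titles, and a real engine would do this inside its analyzer.

```python
def merge_compounds(tokens, compounds, max_len=3):
    """Greedily merge runs of tokens that form a known compound noun.

    `compounds` is a set of lowercase, space-joined nouns (an assumption
    made for this sketch; it stands in for a Wikipedia-derived list).
    """
    out = []
    i = 0
    while i < len(tokens):
        # Try the longest window first, down to two tokens.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate.lower() in compounds:
                out.append(candidate)
                i += n
                break
        else:
            # No compound starts here: keep the single token.
            out.append(tokens[i])
            i += 1
    return out


compounds = {"new york", "new york city"}
tokens = "Looking for New York in the rain".split()
print(merge_compounds(tokens, compounds))
# ['Looking', 'for', 'New York', 'in', 'the', 'rain']
```

Trying the longest window first means "New York City" wins over "New York" when both are in the list.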

Read more...

Indexing an MP3 database with Python and Lucene

MP3 players use a database to manage thousands of songs. Here is a Python test that indexes the XML dump of common MP3 players (Rhythmbox and iTunes) into a Lucene index, via Goniometre, a Passerelle project.
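The first half of that pipeline, reading the XML dump, can be sketched with the standard library alone. The element names below (`entry`, `title`, `artist`, `album`) are assumed from the layout of Rhythmbox's `rhythmdb.xml`, the sample data is made up, and the step that hands each document to Lucene through Goniometre is deliberately left out because its API is not shown here.

```python
import xml.etree.ElementTree as ET

# Made-up sample mimicking the assumed Rhythmbox layout.
SAMPLE = """<rhythmdb version="1.6">
  <entry type="song">
    <title>So What</title>
    <artist>Miles Davis</artist>
    <album>Kind of Blue</album>
  </entry>
  <entry type="iradio">
    <title>Some radio</title>
  </entry>
</rhythmdb>"""

# Fields to copy into each document (a choice made for this sketch).
FIELDS = ("title", "artist", "album")


def songs_to_documents(xml_text):
    """Turn every song entry into a flat dict, ready to be indexed."""
    root = ET.fromstring(xml_text)
    docs = []
    for entry in root.findall("entry"):
        if entry.get("type") != "song":
            continue  # skip radio stations, podcasts, etc.
        docs.append({f: (entry.findtext(f) or "") for f in FIELDS})
    return docs


for doc in songs_to_documents(SAMPLE):
    print(doc)
```

Each dict maps naturally onto one Lucene document with one field per key.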

Read more...

Using Compass without getting your hands dirty with Java

Compass is a nice project that uses Lucene to bring easy search to Java projects.

But sometimes a project does not need Java.

Read more...

A lexicon approach for the Lucene full-text search engine

Lucene uses an index to find documents from their words. Storing more information with each word, i.e. building a lexicon, can expand Lucene search and help refine queries.
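To make the idea concrete, here is a toy Python sketch, not Lucene itself: a plain inverted index, plus a lexicon that stores one extra piece of information per word (its document frequency) and uses it to propose query expansions. The function names and the choice of document frequency as the stored information are assumptions made for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Inverted index: word -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def build_lexicon(index):
    """Lexicon: word -> extra information (here only document frequency)."""
    return {word: {"doc_freq": len(ids)} for word, ids in index.items()}


def expand_prefix(lexicon, prefix):
    """Suggest words starting with `prefix`, most frequent first."""
    matches = [w for w in lexicon if w.startswith(prefix)]
    return sorted(matches, key=lambda w: (-lexicon[w]["doc_freq"], w))


docs = {1: "lucene index search", 2: "lucene lexicon", 3: "lexical search"}
index = build_index(docs)
lexicon = build_lexicon(index)
print(expand_prefix(lexicon, "lex"))  # ['lexical', 'lexicon']
print(expand_prefix(lexicon, "l"))    # ['lucene', 'lexical', 'lexicon']
```

The suggestions can then be offered to the user to refine a query, or OR-ed together to widen it.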

Read more...