From: Leo Galambos
To: lucene-dev
Subject: Another engine
Date: Tue, 10 Sep 2002 01:35:51 +0200

Hi.

I like Lucene (THAT'S RIGHT!), but it doesn't offer all the features I want. That's why I decided to write another Java engine. If the features (see below) interest you, and you are a developer who would like to help me with the new engine, PLEASE let me know (use my private mail, I DO NOT WANT TO START A FLAMEWAR HERE, LARBIN IS COOL. Howgh). Thank you.

I would like to contribute to the Lucene project, but I have chosen a different object model for the new engine... :-(

Demo runs here: http://somis4.ais.dundee.ac.uk/sheeef/index.jsp (the machine is indexing *.ac.uk right now, so responses may be slower if you try many concurrent queries).

Features:
- extended Boolean model with p-metrics
- index compression via Golomb, Elias-Gamma, and block coding. Better than Lucene by 20-50% or more. Each inverted list is stored with the coding method that suits it best. The method is selected by an "inverted list metadata" object; it is not hard-coded.
- highly configurable dynamization algorithm: it guarantees a good response time for query(), insert(), and delete() operations (without degradation of the index structure)
- universal stemming technique for almost any language (not used in the demo)
- on a distributed architecture, insert() would not lock the index
- the engine would be able to simulate the Harvest structure of Brokers
- ...

Speed (indexing 2000 HTML documents, without stemming):
  Larbin-latest: 1'17"
  the engine:    1'22"
  [RH73, IBMJDK131+JIT]

Regards,
Leo