lucene-java-user mailing list archives

From "Praveen Peddi" <ppe...@contextmedia.com>
Subject Re: Sorting and tokenization
Date Thu, 01 Jul 2004 14:35:15 GMT
The solution you suggested is exactly what I expected, and I had already
thought about implementing it. But the problem is memory inefficiency.
Sometimes titles are huge. And with i18n, a title can be in Japanese, Chinese,
or any other language that takes more memory than English.

OK, how about taking the first token of the title and using it just for
sorting? Does anyone see any problem with that? This solution saves at least
some memory compared to the other one.
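
To make the trade-off concrete, here is a minimal sketch in plain Java (no
Lucene calls) that compares two compact sort keys: the first
whitespace-delimited token, and a fixed-length prefix of the title. The class
TitleSortKeys, its method names, and the sample titles are hypothetical and
only for illustration. It assumes the chosen key would be indexed in a separate
untokenized field used only for sorting, while "title" itself stays tokenized
for searching.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

// Hypothetical helper, not part of Lucene: builds compact sort keys for titles.
final class TitleSortKeys {

    // Key made of the first whitespace-delimited token only, lower-cased.
    static String firstToken(String title) {
        String trimmed = title.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        return trimmed.split("\\s+")[0].toLowerCase(Locale.ROOT);
    }

    // Key made of at most maxCodePoints leading code points, lower-cased.
    static String prefix(String title, int maxCodePoints) {
        String t = title.trim().toLowerCase(Locale.ROOT);
        if (t.codePointCount(0, t.length()) <= maxCodePoints) {
            return t;
        }
        // Cut on a code point boundary so a surrogate pair is never split.
        return t.substring(0, t.offsetByCodePoints(0, maxCodePoints));
    }

    // Returns a sorted copy; List.sort is stable, so equal keys keep input order.
    static List<String> sortBy(List<String> titles, Function<String, String> key) {
        List<String> copy = new ArrayList<>(titles);
        copy.sort(Comparator.comparing(key));
        return copy;
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList(
                "Annual Report 2004", "Annual Budget 2004", "Zebra Crossing");

        // Both "Annual ..." titles share the key "annual", so they stay in input order.
        System.out.println(sortBy(titles, TitleSortKeys::firstToken));
        // prints [Annual Report 2004, Annual Budget 2004, Zebra Crossing]

        // A 12-code-point prefix is enough to tell these two titles apart.
        System.out.println(sortBy(titles, t -> prefix(t, 12)));
        // prints [Annual Budget 2004, Annual Report 2004, Zebra Crossing]
    }
}
```

With the first-token key, every title that starts with the same word gets the
same key, so the order among those titles is effectively arbitrary. A prefix
key still has a bounded size per document but breaks such ties up to the
cut-off. Also note that for titles written without spaces (Japanese, Chinese) a
whitespace split returns the whole title, so any saving there would depend on
what the analyzer actually emits as the first token.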

Praveen

----- Original Message ----- 
From: "John Moylan" <johnm@rte.ie>
To: "Lucene Users List" <lucene-user@jakarta.apache.org>
Sent: Thursday, July 01, 2004 10:24 AM
Subject: Re: Sorting and tokenization


> Hi,
>
> You just need to have another title field that is not tokenized - for
> sorting purposes.
>
> Best,
> John
>
> On Thu, 2004-07-01 at 15:15, Praveen Peddi wrote:
> > Hello all,
> > Now that Lucene 1.4 rc3 has sorting functionality built in, I am adding
> > sorting to our search. Before posting any question to this mailing list,
> > I went through most of the sorting-related responses in its archive. I
> > have found that I cannot tokenize the fields that I want to sort on.
> >
> > Let's take the example I have.
> > I use Lucene 1.3 final for searching. Sorting is in fact a very
> > important feature in our application. But we found that Lucene does not
> > support it out of the box, so we had to implement sorting by score and
> > doc id programmatically, which is kind of useless for us. So I thought
> > Lucene's new sorting feature would suit us best now. But unfortunately,
> > the field called "title" is currently tokenized. This is done on purpose,
> > because users want to search for partial matches (or rather search on
> > multiple words of the title). So if we make it untokenized, we may lose
> > an important piece of functionality.
> >
> > My question is: is there any way I can sort the objects by title while
> > keeping title tokenized?
> >
> > Thanks in advance.
> >
> > Praveen
> >
> >
> > **************************************************************
> > Praveen Peddi
> > Sr Software Engg, Context Media, Inc.
> > email:ppeddi@contextmedia.com
> > Tel:  401.854.3475
> > Fax:  401.861.3596
> > web: http://www.contextmedia.com
> > **************************************************************
> > Context Media- "The Leader in Enterprise Content Integration"
> -- 
> John Moylan
> ----------------------
> ePublishing
> Radio Telefis Eireann,
> Montrose House,
> Donnybrook,
> Dublin 4,
> Eire
> t:+353 1 2083564
> e:john.moylan@rte.ie
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
>

