lucene-java-user mailing list archives

From Glen Newton <glen.new...@gmail.com>
Subject Re: Index one huge text file
Date Fri, 22 Jul 2011 15:10:01 GMT
So to use Lucene-speak, each sentence is a document.

I don't know how you are indexing or what code and hardware you are
using, but if you are not already doing so, you should consider
multi-threading the indexing, which should give you a significant
indexing performance boost.
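
A rough sketch of what that could look like, assuming the corpus has
one sentence per line and that the 0-based line number is good enough
as the sentence id (so line N in one language pairs with line N in the
other). ParallelSentenceIndexer and SentenceSink are made-up names for
illustration, not Lucene classes; the sink stands in for whatever
actually adds a document to the index and must be safe to call from
several threads.

```java
import java.io.BufferedReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelSentenceIndexer {

    /** Hypothetical stand-in for the index; must be thread-safe. */
    public interface SentenceSink {
        void add(int sentenceId, String text) throws Exception;
    }

    /**
     * Reads one sentence per line and hands the lines to worker threads
     * in batches. The 0-based line number is used as the sentence id.
     * Returns the number of sentences read.
     */
    public static int indexAll(BufferedReader in, int threads, int batchSize,
                               SentenceSink sink) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<?>> pending = new ArrayList<>();
        int nextId = 0;
        try {
            List<String> batch = new ArrayList<>(batchSize);
            int batchStart = 0;
            String line;
            while ((line = in.readLine()) != null) {
                if (batch.isEmpty()) {
                    batchStart = nextId; // id of the first sentence in this batch
                }
                batch.add(line);
                nextId++;
                if (batch.size() == batchSize) {
                    pending.add(submit(pool, batch, batchStart, sink));
                    batch = new ArrayList<>(batchSize);
                }
            }
            if (!batch.isEmpty()) {
                pending.add(submit(pool, batch, batchStart, sink));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for the batch and rethrow any worker failure
            }
        } finally {
            pool.shutdown();
        }
        return nextId;
    }

    // Submits one batch; each sentence keeps the id of its original line.
    private static Future<?> submit(ExecutorService pool, List<String> batch,
                                    int firstId, SentenceSink sink) {
        return pool.submit(() -> {
            for (int i = 0; i < batch.size(); i++) {
                sink.add(firstId + i, batch.get(i));
            }
            return null;
        });
    }
}
```

With Lucene, the sink would typically wrap a single shared IndexWriter
(which is thread-safe) and store the id in a stored field of each
document. Note that this sketch queues batches without any limit, so
for a really huge file you would probably want to bound the queue so
the reader cannot run far ahead of the workers.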

-Glen


On Fri, Jul 22, 2011 at 11:04 AM, starz10de <farag_ahmed@yahoo.com> wrote:
> I am interested in searching at the sentence level.
> It is a parallel corpus: each sentence in the first language is
> equivalent to a sentence in the second language. I want to index each
> sentence and give it an id, so that when I retrieve a sentence I can
> easily retrieve its equivalent in the second language.
>
> I did this by splitting the file and treating each sentence as a text file.
> However, this takes a very long time for many huge text files.
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Index-one-huge-text-file-tp3191605p3191628.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>




