lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Grant Ingersoll <>
Subject Re: Problem using Lucene on Ubuntu
Date Mon, 18 Feb 2008 12:56:49 GMT
How are you loading the document into the content variable below?  My  
guess is still that you have different locales on Windows and Ubuntu.

(Btw, sorry about the java-user comment.  I should wake up before  
sending responses.  For some reason I thought the email was sent to  


On Feb 18, 2008, at 7:44 AM, kratoras wrote:

> Actually what i figured out just now is that the problem is on the  
> indexing
> part. A document with a 15MB size is transformed in a 23MB index  
> which is
> not normal since on windows for the same document the index is 3MB.  
> For the
> indexing i use:
> writer = new IndexWriter(index, new GreekAnalyzer(), !index.exists());
> and to add documents:
> doc.add(new
> Field("contents",content,Field.Store.YES,Field.Index.TOKENIZED));
> where "content" is a string with the content of the document. Should i
> convert this string to UTF-8 using getBytes before i write it to the  
> index??
> -- 
> View this message in context:
> Sent from the Lucene - Java Users mailing list archive at
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

Grant Ingersoll

Lucene Helpful Hints:

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message