lucene-java-user mailing list archives

From "Rob Staveley (Tom)" <rstave...@seseit.com>
Subject RE: Problems indexing large documents
Date Sat, 10 Jun 2006 06:21:57 GMT
I'm trying to come to terms with
http://lucene.apache.org/java/docs/api/org/apache/lucene/index/IndexWriter.html#setMaxFieldLength(int)
too. I've been indexing large text files as single Lucene documents,
passing the content as a java.io.Reader to keep memory usage down. I was
assuming (like, I suspect, manu mohedano) that an unstored field could be
of any length and that maxFieldLength applied only to stored fields. Do we
in fact need to break the document into manageable parts?
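
In case it's useful, this is roughly what I'm doing (a minimal sketch
only, assuming Lucene 1.9/2.0, with placeholder paths and field names):

    import java.io.File;
    import java.io.FileReader;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;

    public class IndexBigFile {
        public static void main(String[] args) throws Exception {
            // Placeholder paths - adjust as needed.
            File largeTextFile = new File("/path/to/big.txt");

            IndexWriter writer =
                new IndexWriter("/path/to/index", new StandardAnalyzer(), true);

            Document doc = new Document();
            // Field(String, Reader) is tokenised and indexed but never
            // stored, so the file is streamed rather than read into RAM.
            doc.add(new Field("contents", new FileReader(largeTextFile)));
            writer.addDocument(doc);
            writer.close();
        }
    }

My assumption was that an unstored field like this would be consumed to
the end of the Reader; it looks instead as though indexing simply stops
after maxFieldLength terms.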

-----Original Message-----
From: Pasha Bizhan [mailto:lucene-list@lucenedotnet.com] 
Sent: 09 June 2006 21:35
To: java-user@lucene.apache.org
Subject: RE: Problems indexing large documents

Hi, 

> From: manu mohedano [mailto:manumohedano@gmail.com] 

> Hi All! I have a problem... When I index text documents in
> English, there is no problem, but when I index Spanish text
> documents (and they're big), a lot of information from the
> document doesn't get indexed (I suppose it is due to the
> Analyzer, but if the document is less than 400kb it works
> perfectly). However, I want to index ALL the strings in the
> document with no StopWords. Is this possible?

Read the javadoc for DEFAULT_MAX_FIELD_LENGTH at
http://lucene.apache.org/java/docs/api/org/apache/lucene/index/IndexWriter.html#setMaxFieldLength(int)
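
By way of illustration, a minimal sketch (assuming Lucene 1.9/2.0):
DEFAULT_MAX_FIELD_LENGTH is 10,000 terms per field and terms beyond the
limit are silently dropped, which is why a big Spanish document loses its
tail. Raise the limit before adding documents, and use an analyzer with
no stop list (WhitespaceAnalyzer here) if you want no StopWords removed.
Paths and field names are placeholders:

    import org.apache.lucene.analysis.WhitespaceAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;

    public class IndexWithoutLimit {
        public static void main(String[] args) throws Exception {
            // WhitespaceAnalyzer has no stop list, so no words are discarded.
            IndexWriter writer =
                new IndexWriter("/path/to/index", new WhitespaceAnalyzer(), true);

            // Default is 10,000 terms per field; terms past the limit are
            // silently ignored. Raise it to index the whole document.
            writer.setMaxFieldLength(Integer.MAX_VALUE);

            Document doc = new Document();
            doc.add(new Field("contents", "...whole document text...",
                              Field.Store.NO, Field.Index.TOKENIZED));
            writer.addDocument(doc);
            writer.close();
        }
    }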

Pasha Bizhan



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

