lucene-solr-user mailing list archives

From Noble Paul നോബിള്‍ नोब्ळ् <noble.p...@corp.aol.com>
Subject Re: Use DIH with large xml file
Date Sun, 21 Jun 2009 10:09:49 GMT
DIH can read the file item by item. Did you set stream="true" on the
XPathEntityProcessor?
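
For illustration, a minimal data-config.xml sketch of a streaming setup might look like the one below. The file path, the forEach expression, and the field names are hypothetical placeholders, not taken from the original poster's configuration; only the stream="true" attribute on XPathEntityProcessor is the point here.

```xml
<dataConfig>
  <!-- FileDataSource reads the XML from the local filesystem -->
  <dataSource type="FileDataSource" encoding="UTF-8"/>
  <document>
    <!-- stream="true" asks XPathEntityProcessor to emit rows as it parses,
         rather than collecting all rows from the file first.
         The url and forEach values are assumed for this example. -->
    <entity name="item"
            processor="XPathEntityProcessor"
            url="/path/to/large-file.xml"
            forEach="/items/item"
            stream="true">
      <!-- Hypothetical fields; adjust the xpath values to the real document layout -->
      <field column="id" xpath="/items/item/id"/>
      <field column="title" xpath="/items/item/title"/>
    </entity>
  </document>
</dataConfig>
```

With streaming enabled, memory use should depend on the size of a single item rather than on the size of the whole file.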

On Sun, Jun 21, 2009 at 9:20 AM, Jianbin Dai <djianbin@yahoo.com> wrote:
>
> Can DIH read item by item instead of reading the whole file before indexing? My biggest
> file is 6GB, which is larger than the JVM max RAM value.
>
>
> --- On Sat, 6/20/09, Erik Hatcher <erik@ehatchersolutions.com> wrote:
>
> > From: Erik Hatcher <erik@ehatchersolutions.com>
> > Subject: Re: Use DIH with large xml file
> > To: solr-user@lucene.apache.org
> > Date: Saturday, June 20, 2009, 6:52 PM
> > How are you configuring DIH to read
> > those files? It is likely that you'll need to give the JVM
> > at least as much RAM as the largest file you're processing,
> > though that depends entirely on how the file is being
> > processed.
> >
> >     Erik
> >
> > On Jun 20, 2009, at 9:23 PM, Jianbin Dai wrote:
> >
> > >
> > > Hi,
> > >
> > > I have about 50GB of data to be indexed each day using
> > > DIH. Some of the files are as large as 6GB. I set the JVM
> > > Xmx to be 3GB, but the DIH crashes on those big files. Is
> > > there any way to handle it?
> > >
> > > Thanks.
> > >
> > > JB
> > >
> > >
> > >
> >
> >
>
>
>
>



--
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL | http://aol.com
