lucene-solr-user mailing list archives

From Pulkit Singhal
Subject Enabling the right logs for DIH
Date Sat, 01 Oct 2011 22:17:00 GMT
The Problem:
When using DIH with trunk 4.x, I am seeing some very odd numbers
with a particularly large XML file that I'm trying to import. There
are usually bound to be more rows than documents indexed in DIH because
of the forEach property, but my other XML files have maybe 1.5 times
as many rows as docs indexed.

This particular funky file ends up with something like:
<str name="Total Rows Fetched">25614008</str>
<str name="Total Documents Processed">1048</str>
That's 25 million rows fetched before even a measly 1000 docs are indexed!
Something has to be wrong here.
I checked the XML for well-formedness in vim by running ":!xmllint
--noout %" so I think there are no issues there.
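
One way to cross-check the document count independently of DIH might be to
stream through the file and count the elements that the forEach expression is
expected to match. The following is only a minimal sketch under assumptions:
the helper name count_elements and the tag "item" are hypothetical
placeholders, not taken from the actual data-config, and it only handles a
plain tag name rather than a full XPath.

```python
import io
import xml.etree.ElementTree as ET


def count_elements(source, tag):
    """Stream through an XML source and count elements named `tag`.

    `tag` is a hypothetical placeholder for whatever element the
    forEach expression in the data-config points at.
    """
    count = 0
    # iterparse reads incrementally, so a very large file is not loaded at once.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            count += 1
        # Drop the element's text, attributes and children once it is closed.
        elem.clear()
    return count


if __name__ == "__main__":
    # Tiny in-memory sample standing in for the real file.
    sample = io.BytesIO(
        b"<root><item><name>a</name></item><item><name>b</name></item></root>"
    )
    print(count_elements(sample, "item"))
```

If the count from a check like this is far from "Total Documents Processed",
that would suggest the forEach expression is not matching what I expect.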

The Question:
For those intimately familiar with DIH code/behaviour: What is the
appropriate log-level that will let me see the rows & docs printed out
to log as each one is fetched/created? I don't want to make the logs
explode because then I won't be able to read through them. Is there
some gentle balance here that I can leverage?

- Pulkit
