lucene-solr-user mailing list archives

From 5ton3
Subject The exact same query gets executed n times for the nth row when retrieving body (plaintext) from BLOB column with Tika Entity Processor
Date Fri, 31 Oct 2014 12:11:02 GMT

Not sure if this is a problem or if I just don't understand the debug
response, but it seems somewhat odd to me.
The "main" entity can have multiple BLOB documents. I'm using the Tika Entity
Processor to extract the body (plaintext) from these documents and put the
result in a multivalued field, "filedata". The data-config looks like this:

It seems to work properly, but when I debug the data import, the query
against the BLOB column ("FILEDATA_BIN") in TABLE2 is executed once for
document #1, which is correct, but twice for document #2, three times for
document #3, and so on.
I.e., for document #1:

And for document #2:

The result seems correct, i.e. it doesn't duplicate the filedata. But why
does it query the DB twice for document #2? Any ideas? Is something
wrong in my config?
