lucene-solr-user mailing list archives

From "Steven Anderson" <>
Subject RE: Large Data Set Suggestions
Date Thu, 06 Nov 2008 13:34:22 GMT
> The performance of DIH is likely to be faster than SolrJ
> because it does not have the overhead of an HTTP request.

Understood.  However, we may not have the option of co-locating the data
to be ingested with the Solr server.

> What is your data source? I am assuming it is xml. 

Yes. Incoming stream of xml documents to a directory.

> SolrJ cannot directly index XML. You may need to read docs from XML
> before SolrJ can index them.

Understood.  We'd like to compare the performance difference of DIH vs.
custom xml parsing + SolrJ.
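
For reference, the "custom xml parsing" half of that comparison could look
roughly like the sketch below. It is only an illustration under assumptions:
the class name XmlToFields, the method parse, and the flat
<doc><field>value</field></doc> layout are all hypothetical, since the real
document format isn't described here. It uses only the JDK's DOM parser and
stops at a field map; the SolrJ step (copying each entry into a
SolrInputDocument with addField and sending it through a SolrJ client) is
left out.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: not part of Solr, SolrJ, or DIH.
public class XmlToFields {

    /**
     * Flattens the direct child elements of the root element into a
     * field-name -> text map, preserving document order.
     * Assumes a flat layout with one element per field (an assumption).
     */
    public static Map<String, String> parse(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document dom = builder.parse(new InputSource(new StringReader(xml)));

        Map<String, String> fields = new LinkedHashMap<>();
        NodeList children = dom.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip whitespace text nodes, comments, etc.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                fields.put(child.getNodeName(), child.getTextContent().trim());
            }
        }
        return fields;
    }
}
```

Timing this parse step separately from the HTTP round trip to Solr would
show how much of any gap against DIH comes from parsing and how much from
the request overhead.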

A. Steven Anderson
410-418-9908 VSTI
443-790-4269 cell
