lucene-solr-user mailing list archives

From "Noble Paul നോബിള്‍ नोब्ळ्" <noble.p...@gmail.com>
Subject Re: Large Data Set Suggestions
Date Thu, 06 Nov 2008 03:35:17 GMT
DIH is likely to be faster than SolrJ because it does not have the
overhead of an HTTP request.
What is your data source? I am assuming it is XML. SolrJ cannot
index XML directly. You may need to read the docs from the XML before
SolrJ can index them.
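
A rough sketch of that "read docs from XML" step, using only the StAX
parser that ships with the JDK. The <docs>/<doc> layout, the field names
and the XmlDocReader class are made up for illustration; your files will
differ. The SolrJ part is left out so this runs without Solr on the
classpath: each map would be copied into a SolrInputDocument with
addField() and then sent through whatever SolrJ server object you use.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class XmlDocReader {

    // Assumed (hypothetical) input layout, flat fields only:
    // <docs><doc><id>1</id><title>first</title></doc>...</docs>
    public static List<Map<String, String>> readDocs(Reader xml) throws XMLStreamException {
        List<Map<String, String>> docs = new ArrayList<>();
        XMLStreamReader r = XMLInputFactory.newInstance().createXMLStreamReader(xml);
        Map<String, String> current = null;   // the <doc> being read, if any
        String field = null;                  // the field element being read, if any
        StringBuilder text = new StringBuilder();

        while (r.hasNext()) {
            int event = r.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                String name = r.getLocalName();
                if ("doc".equals(name)) {
                    current = new LinkedHashMap<>();
                } else if (current != null) {
                    field = name;
                    text.setLength(0);
                }
            } else if (event == XMLStreamConstants.CHARACTERS
                    || event == XMLStreamConstants.CDATA) {
                // Text may arrive in several chunks, so accumulate it.
                if (field != null) {
                    text.append(r.getText());
                }
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                String name = r.getLocalName();
                if ("doc".equals(name)) {
                    docs.add(current);
                    current = null;
                } else if (field != null && field.equals(name)) {
                    current.put(field, text.toString().trim());
                    field = null;
                }
            }
        }
        r.close();
        return docs;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<docs><doc><id>1</id><title>first</title></doc>"
                   + "<doc><id>2</id><title>second</title></doc></docs>";
        for (Map<String, String> doc : readDocs(new StringReader(xml))) {
            // With SolrJ, this is where the map would become a SolrInputDocument.
            System.out.println(doc);
        }
    }
}
```

Streaming the file this way keeps memory flat for data sets in the
10M-60M range, since only one doc is held at a time if you index each
map as it is produced instead of collecting the whole list.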



--Noble

On Wed, Nov 5, 2008 at 9:22 PM, Steven Anderson <sanderson@vsticorp.com> wrote:
> Greetings!
>
> I've been asked to do some indexing performance testing on Solr 1.3
> using large XML document data sets (10M-60M docs) with DIH versus SolrJ.
>
>
> Does anyone have any suggestions where I might find a good data set this
> size?
>
> I saw the wikipedia dump reference in the DIH wiki, but that is only in
> the 7M+ doc range.
>
> Any suggestions would be greatly appreciated.
>
> Thanks,
>
> Steve
>
>
>



-- 
--Noble Paul
