lucene-solr-user mailing list archives

From Jason Rutherglen <jason.rutherg...@gmail.com>
Subject Re: overwrite=false support with SolrJ client
Date Fri, 04 Nov 2011 17:32:04 GMT
It should be supported in SolrJ; I'm surprised it's been lopped out.
Bulk indexing is extremely common.

On Fri, Nov 4, 2011 at 1:16 PM, Ken Krugler <kkrugler_lists@transpac.com> wrote:
> Hi list,
>
> I'm working on improving the performance of the Solr scheme for Cascading.
>
> This supports generating a Solr index as the output of a Hadoop job. We use SolrJ to
> write the index locally (via EmbeddedSolrServer).
>
> There are mentions of using overwrite=false with the CSV request handler, as a way of
> improving performance.
>
> I see that https://issues.apache.org/jira/browse/SOLR-653 removed this support from SolrJ,
> because it was deemed too dangerous for mere mortals.
>
> My question is whether anyone knows just how much of a performance boost this really provides.
>
> For Hadoop-based workflows, it's straightforward to ensure that the unique key field
> is really unique. So if the performance gain is significant, I might look into figuring
> out some way (with a trigger lock) of re-enabling this support in SolrJ.
>
> Thanks,
>
> -- Ken
>
> --------------------------
> Ken Krugler
> http://www.scaleunlimited.com
> custom big data solutions & training
> Hadoop, Cascading, Mahout & Solr
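
For reference, a rough sketch of the CSV request handler usage mentioned above, where overwrite=false is passed as a request parameter. This only builds the request URL and does not go through SolrJ. The base URL (http://localhost:8983/solr) and the helper name csv_update_url are assumptions for illustration, and it presumes the CSV handler is registered at /update/csv.

```python
from urllib.parse import urlencode


def csv_update_url(base_url, overwrite=False, commit=True):
    """Build a URL for Solr's CSV update handler (hypothetical helper).

    overwrite=False asks Solr to skip the check for existing documents
    with the same unique key, so it is only safe when the keys are
    already known to be unique.
    """
    params = {
        "overwrite": "true" if overwrite else "false",
        "commit": "true" if commit else "false",
    }
    # Assumes the CSV handler is mapped to /update/csv on this server.
    return base_url.rstrip("/") + "/update/csv?" + urlencode(params)


# Assumed local Solr instance; adjust to the real host and core.
url = csv_update_url("http://localhost:8983/solr")
print(url)
```

The CSV body itself would then be POSTed to that URL with a text/csv content type, for example with curl or any HTTP client.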
