lucene-solr-user mailing list archives

From "Jason, Kim" <hialo...@gmail.com>
Subject Re: how can i use solrj binary format for indexing?
Date Fri, 22 Oct 2010 05:36:14 GMT

Hi Gora, I really appreciate it.
Your reply was a great help to me. :)
I hope everything is fine with you.

Regards,
Jason




Gora Mohanty-3 wrote:
> 
> On Mon, Oct 18, 2010 at 8:22 PM, Jason, Kim <hialooha@gmail.com> wrote:
> 
> Sorry for the delay in replying. Was caught up in various things this week.
> 
>> Thank you for the reply, Gora.
>>
>> But I still have several questions.
>> Did you use separate indexes?
>> If so, you indexed 0.7 million XML files per instance
>> and merged them. Is that right?
> 
> Yes, that is correct. We sharded the data by user ID, so that each of the
> 25 cores held approximately 0.7 million out of the 3.5 million records. We
> could have used the sharded indices directly for search, but at least for
> now have decided to go with a single, merged index.
> 
>> Please let me know how you set up multiple instances and cores in your case.
> [...]
> 
> * Multi-core Solr setup is quite easy, via configuration in solr.xml:
>   http://wiki.apache.org/solr/CoreAdmin . The configuration (schema,
>   solrconfig.xml, etc.) needs to be replicated across the cores.
> * Decide which XML files you will post to which core, and do the
>   POST with curl, as usual. You might need to write a little script
>   to do this.
> * After indexing on the cores is done, make sure to do a commit
>   on each.
> * Merge the sharded indexes (if desired) as described here:
>   http://wiki.apache.org/solr/MergingSolrIndexes . One thing to
>   watch out for here is disk space. When merging with Lucene
>   IndexMergeTool, we found that a rough rule of thumb was that
>   intermediate steps in the merge would require about twice as
>   much space as the total size of the indexes to be merged. I.e.,
>   if one is merging 40GB of data in sharded indexes, one should
>   have at least 120GB free.
> 
> Regards,
> Gora
> 
> 
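
For anyone following the steps above, the "little script" that decides which XML file goes to which core, posts it, and then commits on each core might look roughly like the sketch below. It is only an illustration under stated assumptions, not the script Gora used: the host and port, the core names (core0 to core24), the modulo rule for sharding by user ID, and all function names are hypothetical. It uses Python's urllib in place of curl and posts to each core's /update handler.

```python
import urllib.request

SOLR_BASE = "http://localhost:8983/solr"   # assumed host and port
NUM_CORES = 25                             # number of cores mentioned in the thread
CORE_NAMES = ["core%d" % i for i in range(NUM_CORES)]  # hypothetical core names


def core_for_user(user_id):
    """Pick a core for a user ID (assumed rule: user ID modulo core count)."""
    return CORE_NAMES[user_id % NUM_CORES]


def update_url(core):
    """Build the XML update URL for one core."""
    return "%s/%s/update" % (SOLR_BASE, core)


def post_xml(core, xml_bytes):
    """POST an XML payload to a core's update handler; return the HTTP status."""
    req = urllib.request.Request(
        update_url(core),
        data=xml_bytes,  # supplying data makes this a POST request
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def index_files(files_by_user):
    """Post each user's XML file to its core, then commit on every core."""
    for user_id, path in files_by_user.items():
        with open(path, "rb") as fh:
            post_xml(core_for_user(user_id), fh.read())
    for core in CORE_NAMES:
        post_xml(core, b"<commit/>")
```

Calling index_files({101: "user101.xml"}) would post that (hypothetical) file to core1 and then send a commit to all 25 cores. Merging the resulting indexes is a separate step, as described on the MergingSolrIndexes wiki page.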

-- 
View this message in context: http://lucene.472066.n3.nabble.com/how-can-i-use-solrj-binary-format-for-indexing-tp1722612p1750669.html
Sent from the Solr - User mailing list archive at Nabble.com.
