lucene-solr-user mailing list archives

From Hui Liu <>
Subject Questions regarding re-index when using Solr as a data source
Date Thu, 09 Jun 2016 15:50:11 GMT

We are porting an application currently hosted in Oracle 11g to SolrCloud 6.x, i.e. we plan to
migrate all tables in Oracle as collections in Solr, index them, and build search tools on top
of this. The goal is that we won't be using Oracle at all once this has been implemented. Every
field in Solr will have 'stored=true', and a selected subset of searchable fields will also
have 'indexed=true'.

The question is what steps we should follow if we need to re-index a collection after making
schema changes. Mostly we only add new stored fields or change a non-indexed field to indexed;
we normally do not delete or rename existing fields. According to this url: it seems we need
to set up an 'intermediate' Solr1 that only stores the data without any indexing, then set up
another Solr2 to hold the indexed data. To re-index, we would delete all the documents in Solr2
for the collection and re-import the data from Solr1 into Solr2 using SolrEntityProcessor (from
the DataImportHandler). Is this still the recommended approach?

The downside I can see is that if a collection holds a very large amount of data (some of our
collections could have several billion documents), re-importing it from Solr1 to Solr2 may take
a few hours or even days, and during this time users cannot query the data. Is there a better
way to do this that avoids this kind of downtime? Any feedback is appreciated!
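
For concreteness, here is a minimal sketch of the two requests the re-index procedure described above would involve: a delete-by-query against the Solr2 collection, followed by a DataImportHandler full-import. It only builds the URLs and payload and sends nothing. The host name, the collection name "orders", and the assumption that a DIH handler is registered at /dataimport with a SolrEntityProcessor entity pointing at Solr1 are all hypothetical, not taken from our actual setup.

```python
import json
from urllib.parse import urlencode

# Hypothetical base URL of the "Solr2" cluster that holds the indexed data.
SOLR2_BASE = "http://solr2.example.com:8983/solr"


def delete_all_request(collection):
    """Build the URL and JSON body that delete every document in a collection."""
    url = f"{SOLR2_BASE}/{collection}/update?{urlencode({'commit': 'true'})}"
    body = json.dumps({"delete": {"query": "*:*"}})
    return url, body


def full_import_request(collection):
    """Build the URL that starts a DataImportHandler full-import.

    Assumes a DIH request handler is configured at /dataimport and that its
    data-config uses SolrEntityProcessor to read from Solr1 (an assumption).
    clean=false because the documents were already removed by the delete step.
    """
    params = {"command": "full-import", "clean": "false", "commit": "true"}
    return f"{SOLR2_BASE}/{collection}/dataimport?{urlencode(params)}"


if __name__ == "__main__":
    # "orders" is a made-up collection name used only for illustration.
    url, body = delete_all_request("orders")
    print("POST", url)
    print(body)
    print("GET ", full_import_request("orders"))
```

Between the delete and the end of the import, the collection is empty or only partially populated, which is the downtime I am asking about.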

Hui Liu
Opentext, Inc.
