lucene-solr-user mailing list archives

From Ali Nazemian <alinazem...@gmail.com>
Subject Re: Solr re-indexing in case of store=false
Date Mon, 09 May 2016 13:14:34 GMT
Dear Erick,
Hi,
Thank you very much. You are right about the storage part, unless the
primary datastore compresses its data, which mine does (I am using
Cassandra as the primary datastore); I am not sure whether Solr applies
any compression of its own.
According to your reply, it seems I have to do this the hard way, that is,
use the primary datastore to build the index from scratch.
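
A minimal sketch of such a rebuild, assuming the primary datastore can be read as a plain iterable of dicts and that the new collection accepts a JSON array of documents on its /update handler. The names `batches`, `post_batch` and `reindex` are hypothetical helpers made up for illustration, not part of Solr or Cassandra:

```python
import json
import urllib.request
from itertools import islice


def batches(docs, size):
    """Yield lists of at most `size` documents from any iterable."""
    it = iter(docs)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def post_batch(update_url, chunk):
    """POST one batch to Solr's update handler as a JSON array of documents."""
    req = urllib.request.Request(
        update_url,
        data=json.dumps(chunk).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        resp.read()


def reindex(docs, update_url, size=500, send=post_batch):
    """Stream documents from the primary datastore into a new collection.

    `send` is injectable so the batching can be checked without a live Solr.
    """
    total = 0
    for chunk in batches(docs, size):
        send(update_url, chunk)
        total += len(chunk)
    return total
```

Usage would be along the lines of `reindex(rows, "http://localhost:8983/solr/new_collection/update")`, where `rows` comes from whatever reads the Cassandra table and the host and collection name are placeholders. A commit still has to be issued once at the end (or left to autoCommit).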

Sincerely,

On Sun, May 8, 2016 at 11:07 PM, Erick Erickson <erickerickson@gmail.com>
wrote:

> bq: I would be grateful if somebody could introduce other way of
> re-indexing the whole data without using another datastore
>
> Not possible currently. Consider what's _in_ the index when stored="false".
> The actual terms are the output of the entire analysis chain, including
> stemming, stopword removal, synonym substitution etc. Since the
> indexing process is lossy, you simply cannot reconstruct the original
> stream from the indexed terms.
>
> I suppose one _could_ do this in the case of docValues only index with
> the new return-values-from-docvalues functionality, but even that's lossy
> because the order of returned values may not be the original insertion
> order. And if that suits your needs, a pretty simple driver program would
> suffice.
>
> To do this from indexed-only terms you'd have to somehow store the
> original version of each term or store some codes indicating exactly
> how to reconstruct the original stream, which very possibly would take
> up as much space as if you'd just stored the values anyway. _And_ it
> would burden everyone else who didn't want to do this with a bloated
> index.
>
> Best,
> Erick
>
> On Sun, May 8, 2016 at 4:25 AM, Ali Nazemian <alinazemian@gmail.com>
> wrote:
> > Dear all,
> > Hi,
> > I was wondering, is it possible to re-index Solr 6.0 data in case of
> > store=false? I am using Solr as a secondary datastore, and for the sake
> > of space efficiency all the fields (except id) are considered as
> > store=false. Currently, due to some changes in application business,
> > Solr schema should change, and in order to see the effect of changing
> > schema on old data, I have to do the re-index process.  I know that one
> > way of re-indexing in Solr is reading data from one collection (core)
> > and inserting that to another one, but this solution is not possible for
> > store=false fields, and re-indexing the whole data through primary
> > datastore is kind of costly, so I would be grateful if somebody could
> > introduce other way of re-indexing the whole data without using another
> > datastore.
> >
> > Sincerely,
> >
> > --
> > A.Nazemian
>



-- 
A.Nazemian
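
For reference, the lossiness Erick describes can be seen with a toy analysis chain. This is only an illustration under loud assumptions: the stopword list and the suffix-stripping "stemmer" below are made up and far cruder than Solr's real analyzers, but they show the same effect, namely that different inputs can end up as identical indexed terms, so the original text cannot be recovered from the terms alone.

```python
# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "is", "are", "of"}


def toy_stem(token):
    """Very crude suffix stripping; real stemmers are far more careful."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def analyze(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords, stem."""
    tokens = [t.strip(".,!?") for t in text.lower().split()]
    return [toy_stem(t) for t in tokens if t and t not in STOPWORDS]


# Two different inputs produce the same term list.
same = analyze("The dogs are running.") == analyze("A dog running")
```

Here `same` is true even though the two sentences differ in case, number, stopwords and punctuation; all of that is gone by the time the terms reach the index.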
