lucene-solr-user mailing list archives

From David Hastings <hastings.recurs...@gmail.com>
Subject Re: storing large text fields in a database? (instead of inside index)
Date Tue, 20 Feb 2018 15:36:10 GMT
Really depends on what you consider too large, and why the size is a big
issue, since most replication will go at about 100 MB/second give or take,
and replicating a 300GB index is only an hour or two.  What I do for this
purpose is store my text in a separate index altogether, and call on that
core for highlighting.  So for my use case, the primary index with no
stored text is around 300GB and replicates as needed, and the full-text
indexes with stored text total around 500GB and are replicating non-stop.
All searching goes against the primary index, and for highlighting I call
on the full-text indexes, which have a stupid-simple schema.  This has
worked pretty well for me, at least.
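The split described above - a lean search index with no stored text, plus a separate full-text store consulted only at highlight time - can be sketched as a toy in plain Python. Everything here (class names, the `<em>` snippet format, the window size) is illustrative, not a Solr API; in practice both sides would be Solr cores.

```python
# Toy sketch of the two-index pattern: the "primary" index keeps only an
# inverted index of tokens -> doc ids (nothing stored), while a separate
# full-text store holds the raw text and is hit only when a highlight
# snippet is actually needed for a matching doc.
from collections import defaultdict


class PrimaryIndex:
    """Searchable side: inverted index over tokens, doc ids only."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for token in text.lower().split():
            self.postings[token].add(doc_id)

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))


class FulltextStore:
    """Highlighting side: full text keyed by doc id, never searched."""

    def __init__(self):
        self.docs = {}

    def add(self, doc_id, text):
        self.docs[doc_id] = text

    def highlight(self, doc_id, term, window=2):
        # Return a small snippet around the first hit, with the match wrapped.
        words = self.docs[doc_id].split()
        for i, w in enumerate(words):
            if w.lower() == term.lower():
                lo = max(0, i - window)
                snippet = words[lo:i] + ["<em>" + w + "</em>"] + words[i + 1:i + 1 + window]
                return " ".join(snippet)
        return None


# Usage: search hits come from the primary index; only those doc ids are
# looked up in the full-text store.
primary = PrimaryIndex()
fulltext = FulltextStore()
doc = "the quick brown fox jumps over the lazy dog"
primary.add(1, doc)
fulltext.add(1, doc)
hits = primary.search("fox")                    # -> [1]
snippets = [fulltext.highlight(h, "fox") for h in hits]
# -> ["quick brown <em>fox</em> jumps over"]
```

The point of the split is that the big stored payload only has to replicate on the full-text side, while the primary index stays small and fast to ship around.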

On Tue, Feb 20, 2018 at 10:27 AM, Roman Chyla <roman.chyla@gmail.com> wrote:

> Hello,
>
> We have a use case of a very large index (master-slave; for unrelated
> reasons the search cannot run in cloud mode) - one of the fields is a
> very large text field, stored mostly for highlighting. To cut down the
> index size (for purposes of replication/scaling) I thought I could try
> to save it in a database - and not in the index.
>
> Lucene has codecs - one of the methods is for 'stored fields' - so that
> seems like a natural path for me.
>
> However, I'd expect somebody else has had a similar problem before. I
> googled and couldn't find any solutions. Using the codecs seems like a
> really good fit for this particular problem - am I missing something? Is
> there a better way to cut down on index size? (besides solr
> cloud/sharding, compression)
>
> Thank you,
>
>    Roman
>
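The database idea in the quoted message can be sketched independently of any Lucene codec: at index time the big field is written to a database keyed by the document's unique id instead of being stored in the index, and at query time only the documents actually being highlighted are fetched back. This is a minimal sketch with `sqlite3` standing in for whatever database would really be used; the table and column names are made up for the example, and a real codec-level implementation would hook `StoredFieldsFormat` instead.

```python
# Minimal sketch: keep the large text field out of the index and park it
# in a database keyed by doc id, fetching it only at highlight time.
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for a real external database
conn.execute("CREATE TABLE stored_text (doc_id TEXT PRIMARY KEY, body TEXT)")


def store_body(doc_id, body):
    # Called at index time, instead of adding the field as stored="true".
    conn.execute(
        "INSERT OR REPLACE INTO stored_text VALUES (?, ?)", (doc_id, body)
    )
    conn.commit()


def fetch_body(doc_id):
    # Called at query time, only for the handful of docs being highlighted.
    row = conn.execute(
        "SELECT body FROM stored_text WHERE doc_id = ?", (doc_id,)
    ).fetchone()
    return row[0] if row else None


store_body("doc-1", "a very large body of text that would bloat the index")
print(fetch_body("doc-1"))  # the text comes back from the DB, not the index
```

The trade-off versus a codec is that this keeps the index format completely standard (nothing custom to carry across Lucene upgrades), at the cost of the application having to keep the database and the index in sync itself.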
