lucene-solr-user mailing list archives

From Jan Høydahl <jan....@cominvent.com>
Subject Re: A good KV store/plugins to go with Solr
Date Thu, 14 Jun 2018 12:10:58 GMT
You could fetch the data from your application directly :)
Also, Streaming Expressions has a jdbc() function, but then you will need to know what
to query for. It also has a fetch() function which enriches documents with fields from another
collection. It would probably be possible to write a fetchKV() function which, per result document,
fetches data from an external JDBC (or other) source and enriches on the fly.
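A fetchKV() function does not exist; the existing fetch() decorator gives the flavor of the idea. A sketch, assuming the payloads were indexed into a second collection (collection and field names below are made up):

```
fetch(kvCollection,
      search(mainCollection, q="*:*", fl="id,title", sort="id asc"),
      fl="payload",
      on="id=id")
```

Here fetch() batches up the ids from the inner search() stream and pulls the payload field from kvCollection, joining on the id field. The hypothetical fetchKV() would do the same per-document lookup against an external store instead of a collection.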

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> 5. jun. 2018 kl. 05:38 skrev Erick Erickson <erickerickson@gmail.com>:
> 
> Well, you can always throw more replicas at the problem as well.
> 
> But Andrea's comment is spot on. When Solr stores a field, it
> compresses it. So to fetch the stored info, it has to:
> 1> seek the disk
> 2> decompress at minimum 16K
> 3> assemble the response.
> 
> All the while perhaps causing memory to be consumed, adding to GC
> issues and the like.
> 
> One possibility is to implement a doc transformer. See the class
> ValueAugmenterFactory for a model. What that does is call the transform
> method for each doc returned in the result set.
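A minimal sketch of that per-document transform pattern, with a plain Map standing in for the external KV store (real transformers extend org.apache.solr.response.transform.DocTransformer and operate on SolrDocument; everything below is a dependency-free approximation, and the field names are made up):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the per-document transform pattern: for each document in the
// result set, transform() is invoked once and may augment the document with
// an extra field looked up from an external store. Real Solr transformers
// extend DocTransformer (see ValueAugmenterFactory for a model); this
// stand-alone version only mimics the shape of that hook.
public class KvAugmenterSketch {

    /** Stand-in for an external KV store (e.g. Redis); hypothetical. */
    static final Map<String, String> KV_STORE = new HashMap<>();

    /** Mirrors DocTransformer.transform(doc, docid): called once per hit. */
    static void transform(Map<String, Object> doc) {
        String id = (String) doc.get("id");
        String payload = KV_STORE.get(id);   // one KV lookup per document
        if (payload != null) {
            doc.put("_store", payload);      // augment the outgoing document
        }
    }

    public static void main(String[] args) {
        KV_STORE.put("doc1", "{\"name\":\"Nirvana\"}");
        Map<String, Object> doc = new HashMap<>();
        doc.put("id", "doc1");
        transform(doc);
        System.out.println(doc.get("_store"));
    }
}
```

Note the lookup happens once per returned document, so this adds one KV round trip per hit unless the store client batches requests.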
> 
> Another approach would be to only index the first, say, 1K characters
> and just return _that_, along with a link for the full doc that you
> get from another store. Or, indeed from Solr itself since that would
> only be one doc at a time. If you put this in as a string type with
> docValues=true you would avoid most of the disk seek/decompression
> issues.
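In the schema, that could look something like the following (field names are made up; string fields with docValues=true can be returned in results without a stored-field disk seek and decompression):

```xml
<!-- hypothetical fields: a short indexed snippet plus a docValues-backed
     pointer to where the full document lives -->
<field name="snippet"      type="string" indexed="true"  stored="false" docValues="true"/>
<field name="full_doc_url" type="string" indexed="false" stored="false" docValues="true"/>
```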
> 
> Best,
> Erick
> 
> On Mon, Jun 4, 2018 at 12:27 PM, Andrea Gazzarini <a.gazzarini@sease.io> wrote:
>> Hi Sam, I have been in a similar scenario (not recently, so my answer could
>> be outdated). As far as I remember, caching, at least in that scenario,
>> didn't help much, probably because of the field size.
>> 
>> So we went with the second option: a custom SearchComponent connected to
>> Redis. I'm not aware of such a component being available anywhere but, trust
>> me, it's a very easy thing to write.
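A rough sketch of that idea, with a Map standing in for Redis (a real component extends org.apache.solr.handler.component.SearchComponent and does this work in process(ResponseBuilder); the batch lookup below mimics a Redis MGET so all hits are resolved in one round trip, and all names are made up):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of a SearchComponent-style enrichment step: after the query
// component has produced the hit list, collect the ids, resolve their
// payloads from the KV store in one batch call (Redis MGET style), and
// attach each payload to its document before the response is written.
public class RedisEnrichSketch {

    /** Stand-in for a Redis connection; hypothetical. */
    static final Map<String, String> REDIS = new HashMap<>();

    /** MGET-style batch lookup: one round trip for all ids. */
    static List<String> mget(List<String> ids) {
        List<String> out = new ArrayList<>();
        for (String id : ids) out.add(REDIS.get(id));
        return out;
    }

    /** Enrich every hit in the result page with its stored payload. */
    static void enrich(List<Map<String, Object>> hits) {
        List<String> ids = new ArrayList<>();
        for (Map<String, Object> hit : hits) ids.add((String) hit.get("id"));
        List<String> payloads = mget(ids);
        for (int i = 0; i < hits.size(); i++) {
            if (payloads.get(i) != null) {
                hits.get(i).put("_store", payloads.get(i));
            }
        }
    }

    public static void main(String[] args) {
        REDIS.put("a", "{\"artist\":\"Queen\"}");
        Map<String, Object> hit = new HashMap<>();
        hit.put("id", "a");
        List<Map<String, Object>> hits = new ArrayList<>();
        hits.add(hit);
        enrich(hits);
        System.out.println(hit.get("_store"));
    }
}
```

Batching matters here: with rows=10 this is one KV round trip per query rather than ten, which is what keeps the component cheap at high request rates.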
>> 
>> Best,
>> Andrea
>> 
>> On Mon, 4 Jun 2018, 20:45 Sambhav Kothari, <sambhav@metabrainz.org> wrote:
>> 
>>> Hi everyone,
>>> 
>>> We at MetaBrainz are trying to scale our SolrCloud instance but are
>>> hitting a bottleneck.
>>> 
>>> Each of the documents in our Solr index is accompanied by a '_store' field
>>> that stores our API-compatible response for that document (which is
>>> basically parsed and displayed by our custom response writer).
>>> 
>>> The main problem is that this field is very large (it takes up 60-70% of
>>> our index) and, because of this, Solr is struggling to keep up with our
>>> required requests/s.
>>> 
>>> Any ideas on how to improve upon this?
>>> 
>>> I have a couple of options in mind -
>>> 
>>> 1. Use caches extensively.
>>> 2. Have solr return only a doc id and fetch the response string from a KV
>>> store/fast db.
>>> 
>>> About 2 - are there any Solr plugins that will allow me to do this?
>>> 
>>> Thanks,
>>> Sam
>>> 

