lucene-dev mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: SolrCloud - analyze doc just once before sending to replicas
Date Tue, 12 Feb 2013 02:58:58 GMT
It'd make it interesting to put things in the transaction log, I suspect,
but I also suspect that's a solvable problem.

So the idea would be that the doc gets analyzed on the node that receives
it? Otherwise all the analysis would go on on the leader....


On Mon, Feb 11, 2013 at 2:14 AM, Otis Gospodnetic <
otis.gospodnetic@gmail.com> wrote:

> I don't have a problem I'm looking to fix with that. It just occurred to me
> that avoiding double/triple work might be nice.
>
> Otis
> --
> http://sematext.com/
> On Feb 11, 2013 2:12 AM, "Mikhail Khludnev" <mkhludnev@griddynamics.com>
> wrote:
>
>> Nope, not as far as I know.
>> Even if that field type works on its own, I suppose it's not a piece of
>> cake to marry it with SolrCloud.
>> How much additional software complexity and development effort would you
>> spend for that CPU gain? Is it really worth it for you? How big is your
>> replication factor (or quantor)?
>>
>>
>> On Mon, Feb 11, 2013 at 11:02 AM, Otis Gospodnetic <
>> otis.gospodnetic@gmail.com> wrote:
>>
>>> Yeah, something like that :)
>>>
>>> Is that used in SolrCloud when sending docs to replicas to avoid all
>>> replicas having to do the exact same analysis?
>>>
>>> Otis
>>> --
>>> http://sematext.com/
>>> On Feb 11, 2013 1:51 AM, "Mikhail Khludnev" <mkhludnev@griddynamics.com>
>>> wrote:
>>>
>>>> Otis,
>>>>
>>>> It reminds me of https://issues.apache.org/jira/browse/SOLR-1535. How do
>>>> they match?
>>>>
>>>>
>>>> On Mon, Feb 11, 2013 at 8:23 AM, Otis Gospodnetic <
>>>> otis.gospodnetic@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> When a doc is pushed into SolrCloud, it is sent to all of the shard's
>>>>> replicas for analysis and indexing, right?
>>>>>
>>>>> Wouldn't it make sense to do the analysis just once, send the
>>>>> already analyzed doc over the wire, and have the receiving servers do
>>>>> just the indexing (no analysis)?  For index-heavy apps this could be a
>>>>> big CPU saver.  Or is this already being done?
>>>>>
>>>>> Otis
>>>>> --
>>>>> http://sematext.com/
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Sincerely yours
>>>> Mikhail Khludnev
>>>> Principal Engineer,
>>>> Grid Dynamics
>>>>
>>>> <http://www.griddynamics.com>
>>>>  <mkhludnev@griddynamics.com>
>>>>
>>>
>>
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>> Principal Engineer,
>> Grid Dynamics
>>
>> <http://www.griddynamics.com>
>>  <mkhludnev@griddynamics.com>
>>
>
