lucene-dev mailing list archives

From "Per Steffensen (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SOLR-6816) Review SolrCloud Indexing Performance.
Date Wed, 17 Dec 2014 10:17:14 GMT

    [ https://issues.apache.org/jira/browse/SOLR-6816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14249673#comment-14249673 ]

Per Steffensen edited comment on SOLR-6816 at 12/17/14 10:17 AM:
-----------------------------------------------------------------

Those of you who have been following my comments on various issues will know that I like "separation
of concerns". In our version of Solr, all the decision-making on when to do a document-version check,
when to delete existing documents with the same id value, etc. is isolated in {{enum UpdateSemanticsMode}}
- see https://issues.apache.org/jira/secure/attachment/12553312/SOLR-3173_3178_3382_3428_plus.patch.
We support different modes that make slightly different decisions on these topics, which
is the reason for using an enum. You do not need an enum, because you only have "one mode", but
that should not prevent you from separating out the decision-making concern.

The patch is not entirely up to date with what we do today, but at least it illustrates the
"separation of concerns". {{DistributedUpdateHandler}} deals with a million concerns, so maybe
you want to adopt that idea and move the code making the decisions out of {{DistributedUpdateHandler}}.
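
A minimal sketch of the idea, not the code from the patch: the mode names, the two decision methods, and the {{planUpdate}} helper below are all hypothetical, chosen only to show the decisions living in one enum while the handler merely asks and acts.

```java
// Hypothetical sketch: none of these names are taken from Solr or from the patch.
final class UpdateSemanticsSketch {

    /** Each constant answers the update-path questions for one mode. */
    enum UpdateSemanticsMode {
        /** Always overwrite: no version check, delete any existing document with the same id. */
        OVERWRITE {
            @Override boolean requiresVersionCheck(boolean clientSentVersion) { return false; }
            @Override boolean deletesExistingDocument() { return true; }
        },
        /** Check the document version only when the client supplied one. */
        VERSION_CHECK_IF_SUPPLIED {
            @Override boolean requiresVersionCheck(boolean clientSentVersion) { return clientSentVersion; }
            @Override boolean deletesExistingDocument() { return true; }
        },
        /** Append only: no check and no delete (assumes the caller guarantees unique ids). */
        APPEND_ONLY {
            @Override boolean requiresVersionCheck(boolean clientSentVersion) { return false; }
            @Override boolean deletesExistingDocument() { return false; }
        };

        abstract boolean requiresVersionCheck(boolean clientSentVersion);
        abstract boolean deletesExistingDocument();
    }

    /** Stand-in for the update handler: it asks the mode what to do and makes no decisions itself. */
    static String planUpdate(UpdateSemanticsMode mode, boolean clientSentVersion) {
        StringBuilder plan = new StringBuilder();
        if (mode.requiresVersionCheck(clientSentVersion)) {
            plan.append("check-version,");
        }
        if (mode.deletesExistingDocument()) {
            plan.append("delete-existing,");
        }
        plan.append("add");
        return plan.toString();
    }

    public static void main(String[] args) {
        // Print the planned steps for every mode when the client sent a version.
        for (UpdateSemanticsMode mode : UpdateSemanticsMode.values()) {
            System.out.println(mode + " -> " + planUpdate(mode, true));
        }
    }
}
```

With only "one mode" the enum could shrink to a single small class holding the same two methods; the point is that the handler no longer contains the conditions themselves.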

I only mention this because I sense that at least [~shalinmangar] agrees that some cleanup (among other
things, of {{DistributedUpdateHandler}}) is required: https://twitter.com/shalinmangar/status/543874893549277184



> Review SolrCloud Indexing Performance.
> --------------------------------------
>
>                 Key: SOLR-6816
>                 URL: https://issues.apache.org/jira/browse/SOLR-6816
>             Project: Solr
>          Issue Type: Task
>          Components: SolrCloud
>            Reporter: Mark Miller
>            Priority: Critical
>         Attachments: SolrBench.pdf
>
>
> We have never really focused on indexing performance, just correctness and low hanging
> fruit. We need to vet the performance and try to address any holes.
> Note: A common report is that adding any replication is very slow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


