Date: Thu, 8 Dec 2016 17:43:58 +0000 (UTC)
From: "Mark Miller (JIRA)"
To: dev@lucene.apache.org
Subject: [jira] [Commented] (SOLR-9824) Documents indexed in bulk are replicated using too many HTTP requests

    [ https://issues.apache.org/jira/browse/SOLR-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732865#comment-15732865 ]

Mark Miller commented on SOLR-9824:
-----------------------------------

Another idea is to look at the request size and use the wait when the size is large
enough: streaming and chunked-encoded requests have no known size and would wait, small docs or a few docs would not wait, and lots of docs or a really large doc would wait. Given that a really large doc will take a while anyway, the additional wait should not be that bad.

Just another idea; I will keep poking around this.

> Documents indexed in bulk are replicated using too many HTTP requests
> ---------------------------------------------------------------------
>
>                 Key: SOLR-9824
>                 URL: https://issues.apache.org/jira/browse/SOLR-9824
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: SolrCloud
>    Affects Versions: 6.3
>            Reporter: David Smiley
>
> This takes a while to explain; bear with me. While working on bulk indexing small documents, I looked at the logs of my SolrCloud nodes. I noticed that shards would see an /update log message every ~6ms, which is *way* too often. These are requests from one shard (that isn't a leader/replica for these docs but the recipient from my client) to the target shard leader (no additional replicas). One might ask why I'm not sending docs to the right shard in the first place; I have a reason but it's beside the point -- there's a real Solr perf problem here and this probably applies equally to replicationFactor>1 situations too. I could turn off the logs but that would hide useful stuff, and it's disconcerting to me that so many short-lived HTTP requests are happening, somehow at the behest of DistributedUpdateProcessor. After lots of analysis and debugging and hair pulling, I finally figured it out.
>
> In SOLR-7333, [~tpot] introduced an optimization called {{UpdateRequest.isLastDocInBatch()}} in which ConcurrentUpdateSolrClient will poll the internal queue with a '0' timeout, so that it can close the connection without it hanging around any longer than needed. This part makes sense to me.
Currently the only spot that has the smarts to set this flag is {{JavaBinUpdateRequestCodec.unmarshal.readOuterMostDocIterator()}} at the last document. So if a shard received docs in a javabin stream (but not other formats), one would expect the _last_ document to have this flag. There's even a test. Docs without this flag get the default poll time; for javabin it's 25ms. Okay.
>
> I _suspect_ that if someone used CloudSolrClient or HttpSolrClient to send javabin data in a batch, the intended efficiencies of SOLR-7333 would apply. I didn't try. In my case, I'm using ConcurrentUpdateSolrClient (and BTW DistributedUpdateProcessor uses CUSC too). CUSC uses the RequestWriter (defaulting to javabin) to send each document separately, without any leading or trailing marker. For the XML format, by comparison, there is a leading and trailing marker ( ... ). Since there's no outer container that lets the javabin unmarshalling detect the last document, it marks _every_ document as {{req.lastDocInBatch()}}! Ouch!
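
To make the effect described above concrete, here is a rough toy model in plain Java; it is a sketch, not Solr code. It assumes a sender that, after each document, waits up to a poll timeout for the next one and closes the HTTP request if nothing arrives in time. The class and method names, the 6 ms arrival spacing (borrowed from the ~6ms log interval), and the 64 KB size threshold are all hypothetical; only the 25 ms default and the zero timeout for flagged documents come from the issue text. The size-based method is one possible reading of the idea in the comment, where an unknown size (streaming or chunked) is modeled as -1.

```java
import java.util.Arrays;

/** Toy model only: none of these names are Solr APIs. */
final class PollTimeoutSketch {
    static final long DEFAULT_POLL_MS = 25;            // default poll time quoted in the issue
    static final long LARGE_REQUEST_BYTES = 64 * 1024; // hypothetical threshold, not from Solr

    /** Behavior described in the issue: a doc flagged as last in its batch gets a zero-timeout poll. */
    static long pollTimeoutMs(boolean lastDocInBatch) {
        return lastDocInBatch ? 0 : DEFAULT_POLL_MS;
    }

    /** Size-based idea from the comment: unknown size (modeled as -1) or a large request waits. */
    static long sizeBasedPollTimeoutMs(long contentLengthBytes) {
        boolean unknownSize = contentLengthBytes < 0;
        return (unknownSize || contentLengthBytes >= LARGE_REQUEST_BYTES) ? DEFAULT_POLL_MS : 0;
    }

    /**
     * Counts simulated HTTP requests. After sending a doc, the sender waits up to the
     * poll timeout for the next one; a doc that arrives in time joins the open request,
     * otherwise the request is closed and the next doc opens a new one.
     */
    static int countRequests(long[] arrivalMs, boolean[] lastDocInBatch) {
        int requests = 0;
        int i = 0;
        while (i < arrivalMs.length) {
            requests++;                        // open a new request for doc i
            long now = arrivalMs[i];
            while (true) {
                long deadline = now + pollTimeoutMs(lastDocInBatch[i]);
                i++;
                if (i >= arrivalMs.length || arrivalMs[i] > deadline) {
                    break;                     // nothing arrived in time: close the request
                }
                now = Math.max(now, arrivalMs[i]); // next doc joins the open request
            }
        }
        return requests;
    }

    public static void main(String[] args) {
        int n = 10;
        long[] arrivals = new long[n];
        for (int k = 0; k < n; k++) {
            arrivals[k] = 6L * k;              // assumed 6 ms spacing between docs
        }

        boolean[] everyDocFlagged = new boolean[n];
        Arrays.fill(everyDocFlagged, true);    // the situation described in the issue
        boolean[] onlyLastFlagged = new boolean[n];
        onlyLastFlagged[n - 1] = true;         // the presumably intended behavior

        System.out.println("every doc flagged: " + countRequests(arrivals, everyDocFlagged) + " requests");
        System.out.println("only last flagged: " + countRequests(arrivals, onlyLastFlagged) + " requests");
    }
}
```

Under these assumptions, ten documents arriving 6 ms apart produce ten requests when every document carries the flag and one request when only the final document does, which matches the shape of the problem in the logs.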