From: "Tim Owen (JIRA)"
To: dev@lucene.apache.org
Date: Wed, 31 Aug 2016 15:27:20 +0000 (UTC)
Subject: [jira] [Commented] (SOLR-9389) HDFS Transaction logs stay open for writes which leaks Xceivers

    [ https://issues.apache.org/jira/browse/SOLR-9389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15452531#comment-15452531 ]

Tim Owen commented on SOLR-9389:
--------------------------------

We're using Solr 6.1 (on local disk now, as mentioned). The first production cluster we had hoped to get stable was 40 boxes, each running 5 or 6 Solr JVMs, with a dedicated ZooKeeper cluster on 3 other boxes, and 100 shards per collection. That was problematic: we saw a lot of ZooKeeper traffic during normal writes, and especially whenever one or more boxes were deliberately killed, because many Solr instances restarted all at once, leading to a large overseer queue and shards stuck in recovery for a long time.

Right now we're testing two scaled-down clusters, of 24 boxes and 12 boxes, with a correspondingly reduced number of shards, to see at what point a cluster stays stable when we do destructive testing by killing machines and whole racks. The 12-box cluster is looking a lot more stable so far. We'll have to consider running several of these smaller clusters instead of one large one - is that best practice? There was some discussion on SOLR-5872 and SOLR-5475 about scaling the overseer with large numbers of collections and shards, although it's clearly a tricky problem.
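As a side note on the overseer backlog mentioned above: the queue depth can be watched directly in ZooKeeper by counting the children of the /overseer/queue znode. Below is a minimal sketch using the plain ZooKeeper Java client; the connection string and timeout are placeholders rather than values from this cluster, and the Collections API's OVERSEERSTATUS action exposes related overseer statistics as well.

{code:java}
import java.util.List;
import java.util.concurrent.CountDownLatch;

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class OverseerQueueDepth {
  public static void main(String[] args) throws Exception {
    // Placeholder: use the same ZK ensemble (and chroot, if any) as the SolrCloud nodes.
    String zkHost = args.length > 0 ? args[0] : "localhost:2181";

    // Wait until the ZooKeeper session is actually connected before querying.
    CountDownLatch connected = new CountDownLatch(1);
    ZooKeeper zk = new ZooKeeper(zkHost, 15000, (WatchedEvent event) -> {
      if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
        connected.countDown();
      }
    });
    connected.await();

    // SolrCloud enqueues overseer work items as children of /overseer/queue;
    // a persistently large child count corresponds to the backlog described above.
    List<String> pending = zk.getChildren("/overseer/queue", false);
    System.out.println("overseer queue depth: " + pending.size());

    zk.close();
  }
}
{code}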
> HDFS Transaction logs stay open for writes which leaks Xceivers
> ---------------------------------------------------------------
>
>                 Key: SOLR-9389
>                 URL: https://issues.apache.org/jira/browse/SOLR-9389
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: Hadoop Integration, hdfs
>    Affects Versions: 6.1, master (7.0)
>            Reporter: Tim Owen
>            Assignee: Mark Miller
>             Fix For: master (7.0), 6.3
>
>         Attachments: SOLR-9389.patch
>
>
> The HdfsTransactionLog implementation keeps a Hadoop FSDataOutputStream open for its whole lifetime, which consumes two threads on the HDFS data node (DataXceiver and PacketResponder) even once the Solr tlog has finished being written to.
> This means that, for a cluster with many indexes on HDFS, the number of Xceivers can keep growing and eventually hit the limit of 4096 on the data nodes. It's especially likely for indexes with low write rates, because Solr keeps enough tlogs around to contain 100 documents (up to a limit of 10 tlogs). There's also the issue that attempting to write to a finished tlog would be a major bug, so closing it for writes helps catch that.
> Our cluster during testing had 100+ collections with 100 shards each, spread across 8 boxes (each running 4 Solr nodes and 1 HDFS data node), with 3x replication for the tlog files. This meant we hit the Xceiver limit fairly easily and had to use the attached patch to ensure tlogs were closed for writes once finished.
> The patch introduces an extra lifecycle state for the tlog, so it can be closed for writes and free up the HDFS resources while still being available for reading. I've tried to make it as unobtrusive as I could, but there's probably a better way. I have not changed the behaviour of the local-disk tlog implementation, because it only consumes a file descriptor whether open for reads or writes.
> NB: We have decided not to use Solr-on-HDFS now; we're using local disk (for various reasons). So I don't have an HDFS cluster to do further testing on this; I'm just contributing the patch, which worked for us.
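For illustration of the lifecycle change described in the issue: below is a minimal sketch of the close-for-writes idea against the standard Hadoop FileSystem API. It is not the attached SOLR-9389.patch; the class and method names are hypothetical, and the real change presumably lives inside HdfsTransactionLog itself.

{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Hypothetical sketch: a log file that accepts writes only while open for writes,
 * and can be closed for writes (releasing the DataNode DataXceiver/PacketResponder
 * threads held by the open output stream) while remaining readable.
 */
public class CloseableForWritesLog {

  private enum State { OPEN_FOR_WRITES, CLOSED_FOR_WRITES }

  private final FileSystem fs;
  private final Path path;
  private FSDataOutputStream out;          // non-null only while OPEN_FOR_WRITES
  private State state = State.OPEN_FOR_WRITES;

  public CloseableForWritesLog(Configuration conf, Path path) throws IOException {
    this.fs = FileSystem.get(conf);
    this.path = path;
    this.out = fs.create(path, true);      // holds DataNode threads while open
  }

  public synchronized void append(byte[] record) throws IOException {
    if (state != State.OPEN_FOR_WRITES) {
      // Writing to a finished log would be a bug, so fail loudly rather than reopen.
      throw new IllegalStateException("Log is closed for writes: " + path);
    }
    out.write(record);
    out.hflush();                          // make data visible to readers without closing
  }

  /** Finish writing: close the output stream so the DataNode threads are freed. */
  public synchronized void closeForWrites() throws IOException {
    if (state == State.OPEN_FOR_WRITES) {
      out.close();
      out = null;
      state = State.CLOSED_FOR_WRITES;
    }
  }

  /** Reads remain possible in either state by opening a fresh input stream. */
  public FSDataInputStream openReader() throws IOException {
    return fs.open(path);
  }
}
{code}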