Date: Fri, 6 Jan 2017 12:59:58 +0000 (UTC)
From: "Karl Wright (JIRA)"
To: dev@manifoldcf.apache.org
Reply-To: dev@manifoldcf.apache.org
Subject: [jira] [Commented] (CONNECTORS-1364) Better bin naming in the Shared Drive Connector

    [ https://issues.apache.org/jira/browse/CONNECTORS-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15804499#comment-15804499 ]

Karl Wright commented on CONNECTORS-1364:
-----------------------------------------

I think it's reasonable to implement the
workaround as described. Are you planning to come up with a patch?

> Better bin naming in the Shared Drive Connector
> -----------------------------------------------
>
>                 Key: CONNECTORS-1364
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1364
>             Project: ManifoldCF
>          Issue Type: Improvement
>          Components: JCIFS connector
>    Affects Versions: ManifoldCF 1.9
>            Reporter: Aeham Abushwashi
>
> Hello and happy new year!
> Bin naming in the Shared Drive Connector makes assumptions that are not always valid.
> As I understand it, Manifold uses bins to prevent overloading data sources. In the SDC, server name is designated as bin name. All jobs created against a particular server will be treated as one unit when documents are prioritised, which can severely disadvantage some jobs (e.g. late starters).
> Moreover, this is incompatible with some common enterprise server topologies. In Windows DFS, which is widely used in large enterprises, what the SDC thinks of as a server name, isn’t actually a physical resource. It’s a namespace that can span many servers and shares. In this case, it doesn’t make sense to throttle simply on the root ‘server’ name. In other environments, a powerful storage server can be more than capable of handling high crawl load; overzealous throttling can end up limiting/hurting Manifold’s performance there.
> I’m struggling to find a single solution that fits all so I’m leaning towards passing in to the repo connection config some sort of server topology flag or throttling depth flag as a hint that ShareDriveConnector#getBinNames can use to decide whether the bin name should be server, server+share or server+share+root_folder. Share and root_folder would need to be explicitly passed in the repo config too or extracted from the documentIdentifier arg in getBinNames (assuming it's reliable).
> Thoughts?
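
For illustration only, the "throttling depth" idea in the quoted proposal could be sketched roughly as below. This is not the connector's actual code. It assumes document identifiers look like smb://server/share/folder/file and that directory identifiers end with a slash. The BinDepth enum and the binName helper are hypothetical names invented for this sketch, not existing ManifoldCF settings or methods.

```java
import java.util.ArrayList;
import java.util.List;

public class BinNameSketch {

    /** Hypothetical throttling-depth hint; not an existing ManifoldCF option. */
    public enum BinDepth { SERVER, SHARE, ROOT_FOLDER }

    /**
     * Derives a bin name from a document identifier.
     * Assumes identifiers of the form smb://server/share/folder/file,
     * with directory identifiers ending in "/".
     */
    public static String binName(String documentIdentifier, BinDepth depth) {
        String path = documentIdentifier;
        if (path.startsWith("smb://")) {
            path = path.substring("smb://".length());
        }

        // Collect the non-empty path segments: server, share, folders, file.
        List<String> segments = new ArrayList<>();
        for (String segment : path.split("/")) {
            if (!segment.isEmpty()) {
                segments.add(segment);
            }
        }
        if (segments.isEmpty()) {
            throw new IllegalArgumentException("No server in: " + documentIdentifier);
        }

        // Without a trailing slash the last segment is treated as a file,
        // so it never becomes part of the bin name.
        int containers = documentIdentifier.endsWith("/")
                ? segments.size()
                : segments.size() - 1;

        // SERVER wants 1 segment, SHARE 2, ROOT_FOLDER 3; fall back to a
        // shallower bin when the identifier is not deep enough.
        int wanted = depth.ordinal() + 1;
        int used = Math.min(wanted, Math.max(1, containers));

        return String.join("/", segments.subList(0, used));
    }
}
```

With this sketch, a file directly under a share falls back to the server+share bin even when ROOT_FOLDER is requested, which avoids creating one bin per file. Whether the documentIdentifier is reliable enough to parse this way is the open question raised in the issue.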
-- This message was sent by Atlassian JIRA (v6.3.4#6332)