From dev-return-19116-archive-asf-public=cust-asf.ponee.io@manifoldcf.apache.org Mon Jan 28 09:14:12 2019
Mailing-List: contact dev-help@manifoldcf.apache.org; run by ezmlm
Reply-To: dev@manifoldcf.apache.org
Date: Mon, 28 Jan 2019 08:04:00 +0000 (UTC)
From: "balaji (JIRA)"
To: dev@manifoldcf.apache.org
Subject: [jira] [Created] (CONNECTORS-1574) Performance tuning of manifold

balaji created CONNECTORS-1574:
----------------------------------

             Summary: Performance tuning of manifold
                 Key: CONNECTORS-1574
                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1574
             Project: ManifoldCF
          Issue Type: Bug
          Components: File system connector, JCIFS connector, Solr 6.x component
    Affects Versions: ManifoldCF 2.5
         Environment: Apache ManifoldCF installed on a Linux machine
                      Linux version 3.10.0-327.el7.ppc64le
                      Red Hat Enterprise Linux Server release 7.2 (Maipo)
            Reporter: balaji


My team is using *Apache ManifoldCF 2.5 with SOLR Cloud* for indexing data. We currently have 450-500 jobs that need to run simultaneously. We need to index JSON data, and we are using the *file system* connector type with *postgres* as the backend database.

We are facing several issues:

1. Scheduling works for some jobs but not for others.
2. Some jobs complete, while others hang and never complete.
3. With one job, 60000 documents used to be indexed in 15 minutes, but now even a directory path containing 5 documents takes 20 minutes or sometimes never completes.
4. The "list all jobs" or "status and job management" page sometimes doesn't load. Looking at pg_stat_activity, we observe that 2 queries are in a waiting state, because of which the page doesn't load.
If we kill those queries or restart ManifoldCF, the issue is resolved and the page loads properly.

Queries getting stuck:

1. SELECT ID,FAILTIME, FAILCOUNT, SEEDINGVERSION, STATUS FROM JOBS WHERE (STATUS=$1 OR STATUS=$2) FOR UPDATE
2. UPDATE JOBS SET ERRORTEXT=NULL, ENDTIME=NULL, WINDOWEND=NULL, STATUS=$1 WHERE ID=$2

Note: We have deployed ManifoldCF on *linux*. Our major requirement is scheduling jobs to run every 15 minutes.

Please help us fine-tune ManifoldCF so that it runs smoothly and acts as a robust system.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)