Date: Sat, 12 Jan 2019 05:09:02 +0000 (UTC)
From: "Karl Wright (JIRA)"
To: dev@manifoldcf.apache.org
Subject: [jira] [Commented] (CONNECTORS-1563) SolrException: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes

    [ https://issues.apache.org/jira/browse/CONNECTORS-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16741004#comment-16741004 ]

Karl Wright commented on CONNECTORS-1563:
-----------------------------------------

{quote}
I need to pass from manifold one custom field and value which I want to see in Solr index. That is the reason why I used metadata transformer where I can pass the custom field in job - tab metadata adjuster.
{quote}

Yes, people do that all the time. Just add the Metadata Adjuster any place in your pipeline and have it inject the field value you want. It will be faithfully transmitted to Solr.

> SolrException: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes
> -----------------------------------------------------------------------------------------------
>
>                 Key: CONNECTORS-1563
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1563
>             Project: ManifoldCF
>          Issue Type: Task
>          Components: Lucene/SOLR connector
>            Reporter: Sneha
>            Assignee: Karl Wright
>            Priority: Major
>         Attachments: managed-schema, solrconfig.xml
>
>
> I am encountering this problem:
> I have checked "Use the Extract Update Handler:" param then I am getting an error on Solr i.e. null:org.apache.solr.common.SolrException: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes
> If I ignore tika exception, my documents get indexed but dont have content field on Solr.
> I am using Solr 7.3.1 and manifoldCF 2.8.1
> I am using solr cell and hence not configured external tika extractor in manifoldCF pipeline
> Please help me with this problem
> Thanks in advance

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)