From: "Jukka Zitting"
To: users@jackrabbit.apache.org
Subject: Re: FullText Search
Date: Tue, 24 Jul 2007 10:56:29 +0300

Hi,

On 7/24/07, Ishai Borovoy wrote:
> Is it possible to perform full text search on unstructured node type that
> contains binary file/s (e.g.: word,pdf,excel)?

See https://issues.apache.org/jira/browse/JCR-729 for a related feature
request. Currently Jackrabbit only indexes binary "jcr:data" properties
that have a sibling "jcr:mimeType" property that indicates the relevant
MIME type.

There is currently no active effort to implement JCR-729. I guess we
will do that once the incubating Tika project
(http://incubator.apache.org/tika/) or some other project comes up with
a generic library that allows us to avoid having to deal with all the
complexities of automatic MIME type detection and various different
parser libraries.

BR,

Jukka Zitting
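
The indexing rule described in this message can be illustrated with a small toy model. This is only a sketch and not Jackrabbit code: the class name IndexRuleSketch and the method isBinaryIndexed are hypothetical, and a plain Map stands in for a node's properties. It only encodes the condition stated above, namely that a binary "jcr:data" property is picked up for full-text indexing when the same node also has a "jcr:mimeType" property. In JCR this pair of properties is what the standard nt:resource node type carries.

```java
import java.util.Map;

// Toy model only: this is NOT Jackrabbit's implementation. It mirrors the rule
// stated in the message: a binary "jcr:data" property is full-text indexed
// only when a sibling "jcr:mimeType" property is present on the same node.
final class IndexRuleSketch {

    // Returns true if the given node properties satisfy the rule.
    // The Map is a hypothetical stand-in for a JCR node's property set.
    static boolean isBinaryIndexed(Map<String, Object> properties) {
        Object data = properties.get("jcr:data");
        Object mimeType = properties.get("jcr:mimeType");
        return data instanceof byte[]
                && mimeType instanceof String
                && !((String) mimeType).isEmpty();
    }

    public static void main(String[] args) {
        byte[] pdfBytes = {0x25, 0x50, 0x44, 0x46}; // the bytes of "%PDF"

        // Binary data plus a sibling MIME type: eligible for indexing.
        Map<String, Object> resource =
                Map.of("jcr:data", pdfBytes, "jcr:mimeType", "application/pdf");

        // Binary data alone, as on a bare unstructured node: not eligible.
        Map<String, Object> unstructured = Map.of("jcr:data", pdfBytes);

        System.out.println(isBinaryIndexed(resource));     // true
        System.out.println(isBinaryIndexed(unstructured)); // false
    }
}
```

For the original question this suggests that binaries on an unstructured node are only searchable if the "jcr:mimeType" property is set next to "jcr:data". Whether text is then actually extracted for a given file format depends on the repository setup, which this sketch does not model.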