From: Alexandre Rafalovitch
Date: Fri, 12 Sep 2014 08:12:27 -0400
Subject: Re: SolrJ : fieldcontent from (multiple) file(s)
To: solr-user

Do you just care about document content? Not metadata, such as file name, date, author, etc.?

Does it have to be pushed into Solr, or can it be pulled? If pull, DataImportHandler should be able to do what you want with a nested-entities design.

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853

On 12 September 2014 06:53, Clemens Wyss DEV wrote:
> Looks like I haven't finished "I know"
> I know I could extract the content on our server's side, but I'd really like to take that burden off it.
> That said:
> Can I hand in the path-to-the-file in a "specific field" which would yield an extraction in Solr?
>
> -----Original Message-----
> From: Clemens Wyss DEV [mailto:clemensdev@mysign.ch]
> Sent: Friday, 12 September 2014 11:30
> To: 'solr-user@lucene.apache.org'
> Subject: SolrJ : fieldcontent from (multiple) file(s)
>
> First of all I'd like to say hello to the Solr world/community ;) So far we have been using Lucene as-is and now intend to go for Solr.
>
> Say I have a document which in one field should have the content of a file (indexed only, not stored), in order to make the document searchable by the file's content. I know
>
> How is this achieved using SolrJ, i.e. how do I hand in this document?
>
> Thx
> Clemens
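
For the push variant asked about above (handing Solr a path and letting it do the extraction), one possible route is the ExtractingRequestHandler at /update/extract. The following is only a rough sketch, not something taken from this thread: it builds the request URL with plain JDK classes. It assumes that the handler is configured, that remote streaming is enabled so `stream.file` is accepted, and that the path is readable by the Solr server itself. The core URL, the document id, the file path and the target field name `file_content` are all made-up placeholders.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class ExtractUrlSketch {

    // Builds a URL for Solr's extracting handler.
    // All concrete values passed in are hypothetical examples.
    static String buildExtractUrl(String coreUrl, String docId, String serverSidePath) {
        Map<String, String> params = new LinkedHashMap<>();
        // literal.<field> sets a field value directly on the resulting document.
        params.put("literal.id", docId);
        // fmap.content maps the extracted body text to a field of our choosing;
        // "file_content" is an assumed field (indexed, not stored) in the schema.
        params.put("fmap.content", "file_content");
        // stream.file asks Solr to read the file from its own file system;
        // this only works if remote streaming is enabled on the server.
        params.put("stream.file", serverSidePath);
        params.put("commit", "true");

        StringBuilder url = new StringBuilder(coreUrl).append("/update/extract");
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator)
               .append(encode(entry.getKey()))
               .append('=')
               .append(encode(entry.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    // URL-encodes a single parameter name or value as UTF-8.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildExtractUrl(
                "http://localhost:8983/solr/collection1",
                "doc-1",
                "/data/files/report.pdf"));
    }
}
```

With SolrJ, the same parameters could presumably be set on a `ContentStreamUpdateRequest` pointed at `/update/extract` instead of building the URL by hand. If the files are not visible to the Solr server, the file would have to be uploaded as the request body rather than referenced by path.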