From: Walter Underwood
To: solr-user@lucene.apache.org
Subject: Re: Unstructured/Structured data for indexing
Date: Wed, 9 Dec 2015 08:05:09 -0800

Often Solr documents are “semi-structured”. They have some structured fields and some free-text fields. e-mail messages are like that, with structured headers and an unstructured body.

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Dec 9, 2015, at 4:13 AM, Alexandre Rafalovitch wrote:
>
> Don't think about indexing so much, think about searching.
>
> Say you are searching a video? What does that mean? Do you want to
> match random sequence of binary values that represent inter-frame
> change? Probably not. When you answer what you want to actually search
> (title? length? subscripts?), you will discover that structure. What
> do you want to return? A whole video, a segment, a description with a
> link?
>
> So, you pre-process/index your data to give you the things you want to
> search for and in the form you want them to receive.
>
> Regards,
>    Alex.
> ----
> Newsletter and resources for Solr beginners and intermediates:
> http://www.solr-start.com/
>
>
> On 9 December 2015 at 03:09, subinalex wrote:
>> Hi,
>>
>> I am a solr newbie,just got a quick question.
>>
>> SOLR is designed for querying unstructured data,but then why we have to send
>> it in a structured form(json,xml) for indexing?.
>>
>> Thanks & Regards,S
>> Subin
>>
>>
>>
>> --
>> View this message in context: http://lucene.472066.n3.nabble.com/Unstructured-Structured-data-for-indexing-tp4244406.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
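
To make the "semi-structured" point in the reply above concrete, here is a minimal sketch of the pre-processing step for an e-mail message. It uses only the Python standard library to split a message into structured header fields and a free-text body. The field names (sender, subject, date, body) and the helper to_search_document are hypothetical, chosen for illustration only; they are not part of any Solr schema or API, and sending the result to Solr is left out.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# A small sample message: structured headers, then an unstructured body.
RAW_MESSAGE = """\
From: Walter Underwood
To: solr-user@lucene.apache.org
Subject: Re: Unstructured/Structured data for indexing
Date: Wed, 9 Dec 2015 08:05:09 -0800

Often Solr documents are semi-structured. They have some structured
fields and some free-text fields.
"""


def to_search_document(raw: str) -> dict:
    """Split a raw e-mail into structured fields plus a free-text body.

    The keys are hypothetical field names, not a real schema.
    """
    msg = message_from_string(raw)
    return {
        # Structured fields: exact values you might filter or sort on.
        "sender": msg["From"],
        "subject": msg["Subject"],
        "date": parsedate_to_datetime(msg["Date"]).isoformat(),
        # Free-text field: the part you would run full-text search over.
        "body": msg.get_payload().strip(),
    }


doc = to_search_document(RAW_MESSAGE)
for field, value in doc.items():
    print(f"{field}: {value!r}")
```

The split follows the advice quoted above: decide what you want to search on first (here sender, subject, date, and body text), and the structure you need to send for indexing falls out of that.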