Subject: Re: Upgrade to 5.2 from 4.6, no storing of text
From: Alessandro Benedetti
To: "solr-user@lucene.apache.org"
Date: Tue, 30 Jun 2015 17:11:23 +0100

No worries, it is not a big deal that you shared the schema.xml; I only
mentioned it because it made the mail a little hard to read.
In my opinion the query is correct, so the problem should reside elsewhere.
Can you share the part of solrconfig.xml for your select request handler?
It is probably not the problem, but it can give us more info.
I see that text is stored, so highlighting should work.

From the official documentation:
"The standard highlighter (AKA the default highlighter) doesn't require any
special indexing parameters on the fields to highlight. However you can
optionally turn on termVectors, termPositions, and termOffsets for any field
to be highlighted. This will avoid having to run documents through the
analysis chain at query-time and will make highlighting significantly faster
and use less memory, particularly for large text fields, and even more so
when hl.usePhraseHighlighter is enabled."

So you should be OK.
Keep us posted.

2015-06-30 16:00 GMT+01:00 Mark Ehle:

> Alessandro -
>
> Someone asked to see the schema, I posted it. Should I have just attached
> it? Does this mailing list support that?
>
> I am by no means a SOLR expert.
> I am a PHP coder who wrote a
> (very-much-loved by our library staff and patrons) newspaper indexing tool
> that I am trying to update. I only know enough about SOLR to install it,
> and index and query. All I did to the 5.2 schema was add the
> newspaper-specific fields that were in the old schema.
>
> I cannot answer most of your questions. I just know that this url:
>
> http://127.0.0.1:8080/solr/newspapers/select?q=%22JOHN+GRAP%22&fl=year&wt=json&indent=true&hl=true&hl.fl=text&hl.simple.pre=%3Cem%3E&hl.simple.post=%3C%2Fem%3E
>
> used to produce snippets of highlighted text in 4.6. In 5.2 it does not.
>
>
> Thanks -
>
> Mark Ehle
> Computer Support Librarian
> Willard Library
> Battle Creek, MI
>
>
> On Tue, Jun 30, 2015 at 10:50 AM, Alessandro Benedetti <
> benedetti.alex85@gmail.com> wrote:
>
> > Instead of your immense schema, can you give us the details of the
> > highlight you are trying to use?
> > And how you are trying to use it?
> > Which client? Direct API calls?
> >
> > Let us know!
> >
> > Cheers
> >
> > 2015-06-30 15:10 GMT+01:00 Mark Ehle:
> >
> > > Thanks to all for the help - it's now storing text and I can search and
> > > get results just as before in 4.6, but I cannot get snippets to appear
> > > when I ask for highlighting.
> > >
> > > When I add documents, here is the URL my script generates:
> > >
> > > http://localhost:8080/solr/newspapers/update/extract?literal.id=2015_01_01_battlecreekenquirer-004&literal.publication_date=2015-01-01T00:00:00Z&literal.year=2015&literal.yearstr=2015&literal.day=1&literal.month_num=1&literal.month=01_January&literal.publication_name=Battle%20Creek%20Enquirer&literal.publication_type=newspaper&literal.short_name=battlecreekenquirer&literal.image_number=4&literal.filename=2015_01_01_battlecreekenquirer-004.pdf&literal.copyright_year=1923&literal.copyright_restricted=y&fmap.content=publication_text&stream.contentType=application%2Ftxt&stream.file=%2Farchive_data%2Fnewspapers%2FBattle%20Creek%20Enquirer%2F2015%2F01_January%2F2015_01_01_battlecreekenquirer%2Ftxt%2F2015_01_01_battlecreekenquirer-004.txt
> > >
> > > And here is my schema:
> > >
> > > [schema.xml: the XML markup did not survive in the archive. Only
> > > attribute fragments remain, among them one declaration ending in
> > > stored="true" termVectors="true" termPositions="true" termOffsets="true"
> > > and the bare values "id" and "text" near the end of the file.]
> > >
> > > On Sat, Jun 27, 2015 at 11:27 AM, Erick Erickson <erickerickson@gmail.com>
> > > wrote:
> > >
> > > > This should be no different in 5.2 than 4.6.
> > > >
> > > > My first guess is a typo somewhere or some similar forehead-slapper.
> > > > Are you sure you're specifying the field in the "fl" list?
> > > >
> > > > Take a look at the index files, the *.fdt files are where the stored
> > > > data goes. You can't look into them, but for the same documents they
> > > > should be roughly the same aggregate size as they are in 4.6.
> > > > 'du -hc *.fdt' will sum them all up for you (*nix).
> > > >
> > > > Second thing I'd do for sanity check is tail out the Solr log while
> > > > indexing and querying, just to see "stuff" go by and see if any
> > > > errors are thrown, although it sounds like you wouldn't see
> > > > any search results at all if there was something wrong with
> > > > indexing.
> > > >
> > > > And if none of that sheds any light, let's see the schema file?
> > > > Maybe the results of adding &debug=all to the query?
> > > >
> > > > Best,
> > > > Erick
> > > >
> > > > On Fri, Jun 26, 2015 at 8:05 AM, Mark Ehle wrote:
> > > > > In my schema from 4.6, the text was in the 'text' field, and the
> > > > > "stored" attrib was set to "true" as it is in the 5.2 schema. I am
> > > > > ingesting the text from files on the server, and it used to work
> > > > > just fine with 4.6. I am using the same schema except I had to get
> > > > > rid of the field types pint, plong, pfloat, pdouble and pdate.
> > > > > Otherwise, the schema is identical.
> > > > >
> > > > > How do I tell SOLR 5.2 to store the text from a file to a certain
> > > > > field?
> > > > >
> > > > > Thanks!
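
On the question of sending the text of a file to a certain field: the /update/extract URL quoted earlier in this thread already does this with fmap.content. Below is a minimal sketch, assuming Python 3, of how such a URL could be assembled. The helper name build_extract_url is hypothetical; only the parameter names (literal.*, fmap.content, stream.contentType, stream.file) are taken from Mark's URL. Whether the mapped text can be returned or highlighted later still depends on the target field being declared stored="true" in the schema.

```python
from urllib.parse import urlencode, quote


def build_extract_url(base, doc_id, text_path, target_field, literals=None):
    """Assemble an /update/extract URL like the one quoted in this thread.

    Hypothetical helper: only the parameter names come from the thread.
    """
    params = {"literal.id": doc_id}
    # Extra literal.<field> values are passed through unchanged.
    for name, value in (literals or {}).items():
        params["literal." + name] = value
    # fmap.content routes the extracted body text into target_field.
    params["fmap.content"] = target_field
    params["stream.contentType"] = "application/txt"
    params["stream.file"] = text_path
    # quote_via=quote with safe="" encodes spaces as %20 and "/" as %2F,
    # matching the encoding seen in the thread.
    return base + "/update/extract?" + urlencode(params, quote_via=quote, safe="")


url = build_extract_url(
    "http://localhost:8080/solr/newspapers",
    "2015_01_01_battlecreekenquirer-004",
    "/archive_data/newspapers/Battle Creek Enquirer/2015/01_January/"
    "2015_01_01_battlecreekenquirer/txt/2015_01_01_battlecreekenquirer-004.txt",
    "publication_text",
    {"year": "2015"},
)
print(url)
```

Building the query string from a dictionary keeps each field mapping on its own line, which makes it easier to see which field receives the file content.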
> > > > >
> > > > > On Fri, Jun 26, 2015 at 7:29 AM, Alessandro Benedetti <
> > > > > benedetti.alex85@gmail.com> wrote:
> > > > >
> > > > >> Actually storing or not storing a field is a simple schema.xml
> > > > >> configuration.
> > > > >> This suggestion may be obvious, but … have you checked that you
> > > > >> have the "stored" attribute set to "true" for the field you are
> > > > >> interested in?
> > > > >>
> > > > >> I am talking about the 5.2 schema.
> > > > >>
> > > > >> Cheers
> > > > >>
> > > > >> 2015-06-26 12:24 GMT+01:00 Mark Ehle:
> > > > >>
> > > > >> > Folks -
> > > > >> >
> > > > >> > I am using SOLR 4.6 to run a newspaper indexing site we have at
> > > > >> > the library I work at. I would like to update to 5.2, and I have
> > > > >> > an instance of it running. When I go to index the txt files of
> > > > >> > each newspaper page, I can search and find stuff, but there is no
> > > > >> > text stored any more. I do use highlighting so I need the text
> > > > >> > there.
> > > > >> >
> > > > >> > What would be different about 5.2 that would account for this?
> > > > >> >
> > > > >> > Thanks!
> > > > >> >
> > > > >> > Mark Ehle
> > > > >> > Computer Support Librarian
> > > > >> > Willard Library
> > > > >> > Battle Creek, MI
> > > > >>
> > > > >> --
> > > > >> --------------------------
> > > > >>
> > > > >> Benedetti Alessandro
> > > > >> Visiting card : http://about.me/alessandro_benedetti
> > > > >>
> > > > >> "Tyger, tyger burning bright
> > > > >> In the forests of the night,
> > > > >> What immortal hand or eye
> > > > >> Could frame thy fearful symmetry?"
> > > > >>
> > > > >> William Blake - Songs of Experience -1794 England
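
The highlighting request discussed in this thread can also be assembled and checked from a script. Below is a minimal sketch, assuming Python 3 and Solr's JSON response layout, in which snippets come back under a top-level "highlighting" key, keyed by document id and then by field name. The function names and the sample response are made up for illustration and are not taken from Mark's index.

```python
from urllib.parse import urlencode


def build_highlight_query(base, phrase, hl_field="text"):
    """Build a select URL with the highlighting parameters used in the thread."""
    params = {
        "q": '"%s"' % phrase,        # phrase query, quoted as in the thread
        "fl": "year",
        "wt": "json",
        "indent": "true",
        "hl": "true",
        "hl.fl": hl_field,           # field to take snippets from
        "hl.simple.pre": "<em>",
        "hl.simple.post": "</em>",
    }
    return base + "/select?" + urlencode(params)


def snippets(response, field="text"):
    """Return {doc_id: [snippet, ...]} for documents with snippets in `field`."""
    found = {}
    for doc_id, fields in response.get("highlighting", {}).items():
        if fields.get(field):
            found[doc_id] = fields[field]
    return found


# Illustrative response only: one document with a snippet, one without.
sample = {
    "highlighting": {
        "doc-1": {"text": ["... <em>JOHN</em> <em>GRAP</em> ..."]},
        "doc-2": {},
    }
}
print(build_highlight_query("http://127.0.0.1:8080/solr/newspapers", "JOHN GRAP"))
print(snippets(sample))
```

If the search returns matching documents but snippets() comes back empty for the requested field, that would correspond to the symptom described in this thread.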