lucene-solr-user mailing list archives

From "Evert R." <evert.ra...@gmail.com>
Subject Re: Solr Basic Configuration - Highlight - Begginer
Date Wed, 16 Dec 2015 16:21:18 GMT
Hi Andrea,

OK, let's do it:

1. Yes, the documents do contain the term 'nietava'. The query returns the
only book (PDF file) that has this word, along with all of its content, as
shown in my previous message to Erick, so the content field is there.

2. Using content:nietava does not return any results, as shown below:

{
  "responseHeader": {
    "status": 400,
    "QTime": 12,
    "params": {
      "q": "contents:nietava",
      "indent": "true",
      "fl": "id",
      "wt": "json",
      "_": "1450282631352"
    }
  },
  "error": {
    "msg": "undefined field contents",
    "code": 400
  }
}
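
Note that the params echoed in the response above show the query was sent as contents:nietava (plural), which is the field name Solr reports as undefined. As a minimal sketch of how a client could surface that message instead of treating the reply as an empty result, the helper below (the function name is hypothetical, not part of Solr) reads the status and error message from the JSON body:

```python
import json


def solr_error_message(raw_response):
    """Return the error message from a Solr JSON response, or None on success."""
    body = json.loads(raw_response)
    status = body.get("responseHeader", {}).get("status", 0)
    if status == 0:
        return None
    # Fall back to a generic message if the "error" section is missing.
    return body.get("error", {}).get("msg", "Solr returned status %d" % status)


# The response pasted above, verbatim.
raw = (
    '{ "responseHeader": { "status": 400, "QTime": 12, "params": { "q": '
    '"contents:nietava", "indent": "true", "fl": "id", "wt": "json", "_": '
    '"1450282631352" } }, "error": { "msg": "undefined field contents", "code": '
    '400 } }'
)

print(solr_error_message(raw))  # prints: undefined field contents
```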

3. Here is what I found when grepping 'content' from the techproducts conf
folder:

schema.xml:   <field name="content_type" type="string" indexed="true"
stored="true" multiValued="true"/>
schema.xml:   <field name="content" type="text_general" indexed="false"
stored="true" multiValued="true"/>
schema.xml:   <copyField source="content" dest="text"/>
schema.xml:   <copyField source="content_type" dest="text"/>
solrconfig.xml:       <str name="facet.field">content_type</str>
solrconfig.xml:       <str name="hl.fl">content features title name</str>
solrconfig.xml:       <str name="f.content.hl.snippets">3</str>
solrconfig.xml:       <str name="f.content.hl.fragsize">200</str>
solrconfig.xml:       <str name="f.content.hl.alternateField">content</str>
solrconfig.xml:       <str name="f.content.hl.maxAlternateFieldLength">750</str>
solrconfig.xml:       <str name="stream.contentType">application/json</str>
solrconfig.xml:       <str name="stream.contentType">application/csv</str>
solrconfig.xml:       <str name="content-type">text/plain; charset=UTF-8</str>

and the grep on 'content_type':

schema.xml:   <field name="content_type" type="string" indexed="true"
stored="true" multiValued="true"/>
schema.xml:   <copyField source="content_type" dest="text"/>
solrconfig.xml:       <str name="facet.field">content_type</str>
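
For reference, hl.fl takes plain field names (as in the solrconfig.xml entry in the first grep output), not a per-field override such as f.content.hl.content. A rough sketch of building a request that passes the highlighting parameters explicitly is below. The snippet count and fragment size are copied from the grep output, the function name is hypothetical, and whether snippets actually come back for this schema has not been verified:

```python
from urllib.parse import urlencode

# Same host and core as used in this thread.
SOLR_SELECT = "http://localhost:8983/solr/techproducts/select"


def highlight_url(term, field="content"):
    """Build a select URL that asks Solr to highlight `term` in `field`."""
    params = {
        "q": term,
        "fl": "id",                 # highlighted fields need not be listed in fl
        "wt": "json",
        "indent": "true",
        "hl": "true",
        "hl.fl": field,             # field name(s) only
        "hl.snippets": "3",         # value taken from f.content.hl.snippets
        "hl.fragsize": "200",       # value taken from f.content.hl.fragsize
        "hl.simple.pre": "<em>",
        "hl.simple.post": "</em>",
    }
    return SOLR_SELECT + "?" + urlencode(params)


print(highlight_url("nietava"))
```

Passing the parameters in the request keeps the test independent of whatever defaults the request handler declares in solrconfig.xml.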

=)

Thanks for checking out.



*Evert*

2015-12-16 12:59 GMT-02:00 Andrea Gazzarini <a.gazzarini@gmail.com>:

> hl=f.content.hl.content (I guess) is definitely wrong. Some questions:
>
>    - First, sorry, the obvious question: are you sure the documents contain
>    the "nietava" term?
>    - Could you try to use q=content:nietava?
>    - Could you paste the definition (field & fieldtype) of the content
>    field?
>
> > Should I have this configuration in the XML file?
>
> You could, but it's up to you and it strongly depends on your context. The
> simple thing is that if you have those parameters in the configuration,
> you can avoid passing them as part of the requests. But in this phase,
> where you are testing, it's probably better to have them in the request.
>
> Andrea
>
> 2015-12-16 15:28 GMT+01:00 Evert R. <evert.ramos@gmail.com>:
>
> > Hi Andrea,
> >
> > Thanks for the reply!
> >
> > I tried with the hl.fl parameter as well, using as below:
> >
> >
> > http://localhost:8983/solr/techproducts/select?q=nietava&fl=id%2C+content&wt=json&indent=true&hl=true&hl.fl=f.content.hl.content%3D4&hl.simple.pre=%3Cem%3E&hl.simple.post=%3C%2Fem%3E
> >
> > with the parameter under the hl field in the solr ui:
> >
> > 1. f.content.hl.snnipets=2
> > 2. f.content.hl.content=4
> > 3. content
> >
> > with no success...
> >
> > Should I have this configuration in the XML file?
> >
> > Regards,
> >
> > *Evert *
> >
> > 2015-12-16 11:23 GMT-02:00 Andrea Gazzarini <a.gazzarini@gmail.com>:
> >
> > > Hi Evert,
> > > what is the configuration of the default request handler? Did you set the
> > > hl.fl parameter?
> > >
> > > Please check here [1] the parameters that the highlighting component
> > > expects. Required parameters should be in the query string or declared
> > > within the request handler which answers to your query.
> > >
> > > Andrea
> > >
> > > [1] https://wiki.apache.org/solr/HighlightingParameters
> > >
> > >
> > >
> > >
> > > 2015-12-16 12:51 GMT+01:00 Evert R. <evert.ramos@gmail.com>:
> > >
> > > > Hi everyone!
> > > >
> > > > I think I should not have posted my server name... never had that many
> > > > access attempts...
> > > >
> > > >
> > > >
> > > > 2015-12-16 9:03 GMT-02:00 Evert R. <evert.ramos@gmail.com>:
> > > >
> > > > > Hello Erick,
> > > > >
> > > > > Thanks again for your time.
> > > > >
> > > > > Here is as far as I have gone:
> > > > >
> > > > > 1. I started a fresh install and did the following:
> > > > >
> > > > > [evert@nix]$ bin/solr start -e techproducts
> > > > > [evert@nix]$ curl 'http://localhost:8983/solr/techproducts/update/extract?literal.id=pdf1&commit=true' \
> > > > >   -F "Emmanuel=@/home/solr/dados/teste/Emmanuel.pdf"
> > > > >
> > > > > 2. I am using only the Solr Admin UI to check the query response; here is
> > > > > an example:
> > > > >
> > > > > Query: http://localhost:8983/solr/techproducts/select?q=nietava&fl=id%2C+author%2C+content&wt=json&indent=true&hl=true&hl.simple.pre=%3Cem%3E&hl.simple.post=%3C%2Fem%3E
> > > > >
> > > > > Result: {
> > > > >   "responseHeader": {
> > > > >     "status": 0,
> > > > >     "QTime": 14,
> > > > >     "params": {
> > > > >       "q": "nietava",
> > > > >       "hl": "true",
> > > > >       "hl.simple.post": "</em>",
> > > > >       "indent": "true",
> > > > >       "fl": "id, author, content",
> > > > >       "wt": "json",
> > > > >       "hl.simple.pre": "<em>",
> > > > >       "_": "1450262674102"
> > > > >     }
> > > > >   },
> > > > >   "response": {
> > > > >     "numFound": 1,
> > > > >     "start": 0,
> > > > >     "docs": [
> > > > >       {
> > > > >         "id": "pdf1",
> > > > >         "author": "Wander",
> > > > >         "content": [
> > > > >           "André Luiz - Sexo e Destino _Chico e Waldo_.doc \n \n \n
> > > > > Francisco Cândido Xavier \ne \n \n Waldo Vieira \n \n \n \n \n Sexo e
> > > > > Destino \n \n \n \n 12o livro da Coleção \n“A Vida no Mundo Espiritual”
> > > > > \n \n  \n \n \n \n Ditado pelo Espírito \nAndré Luiz \n \n  \n \n \n
> > > > > \n \n \n \n FEDERAÇÃO ESPÍRITA BRASILEIRA \nDEPARTAMENTO EDITORIAL \n \n
> > > > > Rua Souza Valente, 17 \n20941-040 - Rio - RJ - Brasil \n \n  \nhttp://
> > > > > www.febnet.org.br/  \n  \n \n   \n Francisco Cândido Xavier - Sexo e
> > > > > Destino - pelo Espírito André Luiz \n \n  \n2 \n \n  \n \n \n \n Coleção
> > > > > \n“A Vida no Mundo Espiritual” \n"
> > > > >         ]
> > > > >       }
> > > > >     ]
> > > > >   },
> > > > >   "highlighting": {
> > > > >     "pdf1": {}
> > > > >   }
> > > > > }
> > > > >
> > > > > **In the content field it returns the whole PDF content (the book);
> > > > > notice that the highlighting section is empty.
> > > > >
> > > > > I tried creating a new core with bin/solr create -c test, using the
> > > > > schema.xml and solrconfig.xml standard found in
> > > > > /solr/server/solr/configsets/basic_configs/conf
> > > > >
> > > > > But even though... not working as expected (I think).
> > > > >
> > > > >
> > > > > Would you know how to set up this techproducts example to return the
> > > > > snippets of text?
> > > > >
> > > > > The server only allows specific IP addresses on this port; if you
> > > > > like, I could open it up for you to check.
> > > > >
> > > > >
> > > > > Thanks again and best regards!
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > *Evert
> > > > >
> > > > >
> > > > > 2015-12-15 18:14 GMT-02:00 Erick Erickson <erickerickson@gmail.com>:
> > > > >
> > > > > >> No, that's not what I meant. The highlight component adds a special
> > > > > >> section to the return packet that will contain "snippets" of text with
> > > > > >> highlights. You control how big those snippets are via various
> > > > > >> parameters in the highlight component and they'll have the tags you
> > > > > >> specify for highlighting.
> > > > > >>
> > > > > >> Your app needs to pull the information from the highlight portion of
> > > > > >> the response packet rather than the document list. Just execute your
> > > > > >> queries via cURL or a browser to see the structure of a response to
> > > > > >> see what I mean.
> > > > > >>
> > > > > >> And note that you do _not_ need to return the fields you're
> > > > > >> highlighting in the "fl" list so you do _not_ need to return the
> > > > > >> entire document contents.
> > > > >>
> > > > >> What are you using to display the results anyway?
> > > > >>
> > > > >> Best,
> > > > >> Erick
> > > > >>
> > > > > >> On Tue, Dec 15, 2015 at 10:02 AM, Evert R. <evert.ramos@gmail.com> wrote:
> > > > >> > Hi Erick,
> > > > >> >
> > > > >> > Thank you very much for the reply!!
> > > > >> >
> > > > > >> > I do get back the full text, author, and a whole lot of stuff which
> > > > > >> > doesn't really matter for my project.
> > > > > >> >
> > > > > >> > So, what you are saying is that Solr gets me back the full content and
> > > > > >> > my application will handle the rest? That means that when I search all
> > > > > >> > my books (PDF files) for a specific word, Solr will return the whole
> > > > > >> > content of each book that matches the query, and my application (PHP)
> > > > > >> > will then take care of showing only part of the text (as in a
> > > > > >> > highlight, as I understood it) and highlighting the keyword I was
> > > > > >> > looking for?
> > > > > >> >
> > > > > >> > If so, Erick, you gave me a big help clearing that up... I thought I
> > > > > >> > would do that with Solr in an easy way. =)
> > > > > >> >
> > > > > >> > Thanks for the attachments tip!
> > > > >> >
> > > > >> > Best regards,
> > > > >> >
> > > > >> > Evert
> > > > >> >
> > > > >> > 2015-12-15 14:56 GMT-02:00 Erick Erickson <
> > erickerickson@gmail.com
> > > >:
> > > > >> >
> > > > > >> >> How are you trying to display the results? Highlighting is a bit of an
> > > > > >> >> odd beast. Assuming it's correctly configured, the response packet
> > > > > >> >> will have a separate highlight section; it's the application's
> > > > > >> >> responsibility to present that pleasingly.
> > > > > >> >>
> > > > > >> >> What _do_ you get back in the response?
> > > > > >> >>
> > > > > >> >> BTW, the mail server pretty aggressively strips attachments;
> > > > > >> >> yours didn't come through.
> > > > >> >>
> > > > >> >> Best,
> > > > >> >> Erick
> > > > >> >>
> > > > > >> >> On Tue, Dec 15, 2015 at 3:25 AM, Evert R. <evert.ramos@gmail.com> wrote:
> > > > >> >> > Hi there!
> > > > >> >> >
> > > > > >> >> > It's my first installation, not sure if this is the right channel...
> > > > > >> >> >
> > > > > >> >> > Here are my steps:
> > > > > >> >> >
> > > > > >> >> > 1. Set up a basic install of Solr 5.4.0
> > > > > >> >> >
> > > > > >> >> > 2. Create a new core through the command line (bin/solr create -c test)
> > > > > >> >> >
> > > > > >> >> > 3. Post 2 files: 1 .docx and 2 .pdf (bin/post -c test /docs/test/)
> > > > > >> >> >
> > > > > >> >> > 4. Query over the browser and it brings the correct search result, but
> > > > > >> >> > it does not show the part of the text I am querying for, the highlight.
> > > > > >> >> >
> > > > > >> >> >   I have already flagged the 'hl' option. But still it does not work...
> > > > > >> >> >
> > > > > >> >> > Example: I am looking for the word 'peace' in my PDF file (book). I
> > > > > >> >> > have 4 matches for this word; it shows me the book name (PDF file) but
> > > > > >> >> > does not show which part of the text contains the word 'peace'.
> > > > > >> >> >
> > > > > >> >> >
> > > > > >> >> > I am probably missing some configuration in schema.xml, which is
> > > > > >> >> > missing from my folder.... /solr/server/solr/test/conf/
> > > > > >> >> >
> > > > > >> >> > Or even in solrconfig.xml...
> > > > > >> >> >
> > > > > >> >> > I have read a bunch of things about highlighting, checked these files,
> > > > > >> >> > and copied the standard schema.xml to my core/conf folder, but it still
> > > > > >> >> > does not bring the highlight.
> > > > > >> >> >
> > > > > >> >> >
> > > > > >> >> > Attached is a copy of my solrconfig.xml file.
> > > > > >> >> >
> > > > > >> >> >
> > > > > >> >> > I am very sorry for this probably dumb and too basic question... It is
> > > > > >> >> > the first time I have seen Solr live.
> > > > >> >> >
> > > > >> >> >
> > > > >> >> > Any help will be appreciated.
> > > > >> >> >
> > > > >> >> >
> > > > >> >> >
> > > > >> >> > Best regards,
> > > > >> >> >
> > > > >> >> >
> > > > >> >> > Evert Ramos
> > > > >> >> >
> > > > >> >> > evert.ramos@gmail.com
> > > > >> >> >
> > > > >> >>
> > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>
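
To illustrate Erick's point in the quoted thread, that the application should read the separate "highlighting" section rather than the document list, here is a minimal sketch. The function name and the populated sample are hypothetical; only the empty `"pdf1": {}` case comes from the response quoted above.

```python
def snippets_by_doc(response, field):
    """Map each document id to its highlight snippets for `field`.

    `response` is the parsed JSON body of a select request made with hl=true.
    Documents with no highlight for the field map to an empty list.
    """
    highlighting = response.get("highlighting", {})
    return {doc_id: fields.get(field, []) for doc_id, fields in highlighting.items()}


# The highlighting section from the response quoted in this thread.
observed = {"highlighting": {"pdf1": {}}}
print(snippets_by_doc(observed, "content"))  # prints: {'pdf1': []}

# Hypothetical shape once highlighting returns snippets for the field.
hypothetical = {"highlighting": {"pdf1": {"content": ["... <em>nietava</em> ..."]}}}
print(snippets_by_doc(hypothetical, "content"))
```

An empty list for a document that matched the query is the situation described in this thread: the search hit is found, but no snippet was produced for the requested field.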
