lucene-solr-user mailing list archives

From Mirko Torrisi <mirko.torr...@ucdconnect.ie>
Subject Re: how to store _text field
Date Tue, 28 Apr 2015 11:29:05 GMT
Hi guys,

I used Erick's suggestions (thanks again!) to create a new field and
copy the _text content into it.

curl -X POST -H 'Content-type:application/json' --data-binary '{
  "add-field" : { "name":"content", "type":"string", "indexed":true, "stored":true },
  "add-copy-field" : { "source":"_text", "dest":[ "content" ] }
}' http://localhost:8983/solr/Test/schema

That seems like a good approach, but I discovered that every content field
begins with unwanted metadata. They all start with a string like this:

 \n \n stream_content_type text/plain  \n stream_size 1556  \n
Content-Encoding UTF-8  \n X-Parsed-By
org.apache.tika.parser.DefaultParser  \n X-Parsed-By
org.apache.tika.parser.txt.TXTParser  \n Content-Type text/plain;
charset=UTF-8  \n resourceName /home/mirko/Desktop/data
sample/sample1/TEXT_CRE_20110608_3-114-500.txt

Now I need to cut this part off, but I have no idea how, especially because
the path (at the end) has a variable length.

For some people, having two fields with the same content could be a problem
(double the space). That is not an issue for me because I use SolrJ to
import, modify and export each document. Maybe I could use it for this
too, but hopefully you know a cleaner method.
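
One possible client-side way to cut the prefix is sketched below in Python. It rests on assumptions not confirmed in this thread: that the importing client already knows the file path it sent, that this path appears verbatim after "resourceName" in the field value, and that the real text follows it separated only by whitespace. The function name strip_tika_header is hypothetical, not part of Solr or Tika.

```python
def strip_tika_header(text: str, resource_name: str) -> str:
    """Return the part of `text` after the resourceName entry.

    Assumes (not confirmed by the thread) that the metadata prefix ends
    with "resourceName <path>" and that the caller knows <path>.
    If the marker or the path is not found, the text is returned unchanged.
    """
    marker = "resourceName"
    start = text.find(marker)
    if start == -1:
        return text
    # Search for the known path only after the marker, so an identical
    # string earlier in the metadata cannot cause a wrong cut.
    end = text.find(resource_name, start + len(marker))
    if end == -1:
        return text
    # Drop the whitespace separating the prefix from the body.
    return text[end + len(resource_name):].lstrip()


if __name__ == "__main__":
    # Path taken from the sample above; the body text is made up.
    path = "/home/mirko/Desktop/data sample/sample1/TEXT_CRE_20110608_3-114-500.txt"
    value = " \n stream_size 1556  \n resourceName " + path + "  \n Body text here."
    print(strip_tika_header(value, path))
```

Because the cut position comes from the known path rather than from a fixed offset, the variable path length does not matter. The same logic could be written with indexOf and substring in a SolrJ client.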

Cheers,
Mirko


On 19 March 2015 at 20:11, Erick Erickson <erickerickson@gmail.com> wrote:

> Hmm, not all that sure. That's one thing about schemaless indexing, it
> has to guess. It does the best it can, but it's quite possible that it
> guesses wrong.
>
> If this is a "managed schema", you can use the REST API commands to
> make whatever field you want. Or you can start over with a concrete
> schema.xml and use _that_. Otherwise, I'm not sure what to say without
> actually being on your system.
>
> Wish I could help more.
> Erick
>
> On Thu, Mar 19, 2015 at 5:39 AM, Mirko Torrisi
> <mirko.torrisi@ucdconnect.ie> wrote:
> > Hi Erick,
> >
> > I'm sorry for this delay but I've just seen this reply.
> >
> > I'm using the latest version of Solr, and the default setting is to use the
> > new kind of indexing. It doesn't use schema.xml, so I have no idea
> > how to set "stored" for this field.
> > The content is indexed, because I get results from the search
> > function, but it is not shown because the field is not set to "stored".
> >
> > I hope that is clear.
> > Thanks very much.
> >
> > All the best,
> >
> > Mirko
> >
> >
> > On 14/03/15 17:58, Erick Erickson wrote:
> >>
> >> Right, your schema.xml file will define, perhaps, some "dynamic
> >> fields". First ensure that stored="true" is specified. If you change
> >> this, you have to re-index the docs.
> >>
> >> Second, ensure that your "fl" parameter with the field is specified on
> >> the requests, something like q=*:*&fl=eoe_txt.
> >>
> >> Third, ensure that you are actually sending content to that field when
> >> you index docs.
> >>
> >> If none of this helps, show us the definition from schema.xml and a
> >> sample input document and a query that illustrate the problem please.
> >>
> >> Best,
> >> Erick
> >>
> >> On Fri, Mar 13, 2015 at 1:20 AM, Mirko Torrisi
> >> <mirko.torrisi@ucdconnect.ie> wrote:
> >>>
> >>> Hi Alexandre,
> >>>
> >>> I need to display the content of _txt. For some reason, it is currently
> >>> not shown in the results (the "response").
> >>> I guess that is because it isn't stored (due to some default
> >>> setting that I'd like to change).
> >>>
> >>> Thanks for your help,
> >>>
> >>> Mirko
> >>>
> >>>
> >>> On 13/03/15 00:27, Alexandre Rafalovitch wrote:
> >>>>
> >>>> Wait, step back. This is confusing. What's your real problem you are
> >>>> trying to solve?
> >>>>
> >>>> Regards,
> >>>>      Alex.
> >>>> ----
> >>>> Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
> >>>> http://www.solr-start.com/
> >>>>
> >>>>
> >>>> On 12 March 2015 at 19:50, Mirko Torrisi <mirko.torrisi@ucdconnect.ie>
> >>>> wrote:
> >>>>>
> >>>>> Hi folks,
> >>>>>
> >>>>> I googled and tried without success, so I ask you: how can I modify
> >>>>> the setting of a field to store it?
> >>>>>
> >>>>> It is interesting to note that I did not add the _text field, so I guess
> >>>>> it is a default one. Maybe it is normal that it is not shown in the
> >>>>> results, but this is my real problem. It would also be fine to copy it
> >>>>> into a new field, but I do not know how to do it with the latest Solr (5)
> >>>>> and the new kind of schema. I know that I have to use curl, but I do not
> >>>>> know how to use it to copy a field.
> >>>>>
> >>>>> Thank you in advance!
> >>>>> Cheers,
> >>>>>
> >>>>>    Mirko
> >>>
> >>>
> >
>
