lucene-solr-user mailing list archives

From Yogendra Kumar Soni <yogendra.ku...@dolcera.com>
Subject Re: Require searching only for file content and not metadata
Date Tue, 27 Aug 2019 22:38:18 GMT
It will be easier to parse the documents and create the content, metadata,
and other required fields yourself instead of using the default post tool.
You will have better control over what goes into which field.
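
A minimal sketch of that approach in Python, standard library only. It
assumes you have already extracted the body text and the metadata yourself
(the extraction step is not shown), that the core is reachable at the URL
below, and that the schema has a "content" field plus "meta_*" fields. The
core name, the field names, and the helper function names are all
hypothetical, not something Solr creates for you.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a local Solr with a core named "mycore".
SOLR_CORE_URL = "http://localhost:8983/solr/mycore"


def build_solr_doc(doc_id, body_text, metadata):
    """Put the body text in its own field and keep metadata in separate
    "meta_"-prefixed fields so the two are never mixed together."""
    doc = {"id": doc_id, "content": body_text}
    for key, value in metadata.items():
        doc["meta_" + key] = value
    return doc


def build_update_request(docs):
    """Build (but do not send) a JSON update request for a list of docs."""
    url = SOLR_CORE_URL + "/update?commit=true"
    data = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_content_query_url(term):
    """Build a select URL whose default search field is the body text only."""
    params = urllib.parse.urlencode({"q": term, "df": "content"})
    return SOLR_CORE_URL + "/select?" + params
```

To index, pass the request to urllib.request.urlopen(). With the documents
built this way, a search for 'Kushal' through build_content_query_url()
looks only at the "content" field, so a document whose author is Kushal
does not match unless the word also appears in its body.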


On Tue 27 Aug, 2019, 6:48 PM Khare, Kushal (MIND), <
Kushal.Khare@mind-infotech.com> wrote:

> Basically, the problem I am facing is that I am getting the textual content
> plus other metadata in my _text_ field, but I want only the textual content
> written inside the document.
> I tried various Request Handler Update Extract configurations, but none of
> them worked for me.
> Please help me resolve this, as I am badly stuck.
>
> -----Original Message-----
> From: Khare, Kushal (MIND) [mailto:Kushal.Khare@mind-infotech.com]
> Sent: 27 August 2019 12:59
> To: solr-user@lucene.apache.org; chris@christopherschultz.net
> Subject: RE: Require searching only for file content and not metadata
>
> Chris,
> What I have done is create a core, use the POST tool to index the
> documents from my file system, and then move to Solr Admin for querying.
> By 'metadata' vs 'content', I mean that I want only the field '_text_'
> to be searched, instead of all the fields that Solr creates by itself,
> such as author name, last modified, creator, id, etc.
> I simply want Solr to search only the content inside the document (the
> body of the document) and not all the fields. For example, if I search
> for 'Kushal', it should return the document only if it has the word in
> its content, not because its author name or owner is Kushal.
> I hope it is clearer now than before. Please help me with this!
>
> Thank you!
> Kushal Khare
>
> -----Original Message-----
> From: Christopher Schultz [mailto:chris@christopherschultz.net]
> Sent: 26 August 2019 18:47
> To: solr-user@lucene.apache.org
> Subject: Re: Require searching only for file content and not metadata
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
>
> Kushal,
>
> On 8/26/19 07:52, Khare, Kushal (MIND) wrote:
> > This is Kushal Khare, a new addition to the user-list. I started
> > working with Solr a few days ago to implement it in my project.
> >
> > Now I have the basics done and have reached the query stage.
> >
> > My problem is: I need to restrict Solr to search only the
> > file content and not the metadata. I have gone through various
> > articles on the internet, but could not get any help.
> >
> > Therefore, I hope I could get some solutions here.
>
> How are you querying Solr? Are you querying from a web application? From a
> thick-client application? Directly from a web browser?
>
> What do you consider "metadata" versus "content"? To Solr, everything is
> the same...
>
> - -chris
> -----BEGIN PGP SIGNATURE-----
> Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/
>
> iQIzBAEBCAAdFiEEMmKgYcQvxMe7tcJcHPApP6U8pFgFAl1j268ACgkQHPApP6U8
> pFi6GA//VY8SU6H5T3G6fpUqQrVp05E9g7f0oGGVW1eaRY3NjgQzfbwJQmJqg16Y
> MyUKpp0/P6EpR/dMPmiKBPvLppSqjT1SUNgrFi2btwtBaTibxWXd0WtEqNdinWCo
> DFyJaPQaIT20IR887SPWrQSYc4oC8aKNAEDAXxlyWDzEgImE23AyCeWs++gJsaKm
> RphkleBeIKCX6SkRzDFeEzx4VyKBZKcjI+Ks/9z2s9tcGmElxyMDPHYf5VXJQgcz
> A1D3jPVPqm2OMvThXd2ll4NlnXe2PWV5eYfZQt/6YMwx4jF+rqG66jDXEhTHzDro
> jmiZVj1VbQ0RlFLqP6OHu2YRj+01a0OtE8l4mWiGSNIrKymp+ycT9E+L0eC9yGIT
> hLUfo7a3ONfOTTNAbuI/363+2WA1wBxSHm2m3kQT8Ho8ydjd7w/umR1L6/wr+q9B
> jEZfAHs1TLFXd6lgqLtmIyf6Ya5bloWM+yjwnjfpniOuHCcXTiJi+5GvxLwih8yE
> 6CQ32kIUuspJ7N5hyiJvM4AcuWWMldDlZaYoHuUwhVbWCCT+Y4X6R1+IZfyXZnvn
> wFEMD3+3r382M3G0uyh2MJk899l1kSPcX+BtRg3pOqDZh0WR+2xWpTndeiMxsmGj
> UC1J1PssKUa1P0dMk7wLvgOl0BiiGC+WwgD7ZfHjF7NPL1jPtW8=
> =LWwW
> -----END PGP SIGNATURE-----
>
> ________________________________
>
> The information contained in this electronic message and any attachments
> to this message are intended for the exclusive use of the addressee(s) and
> may contain proprietary, confidential or privileged information. If you are
> not the intended recipient, you should not disseminate, distribute or copy
> this e-mail. Please notify the sender immediately and destroy all copies of
> this message and any attachments. WARNING: Computer viruses can be
> transmitted via email. The recipient should check this email and any
> attachments for the presence of viruses. The company accepts no liability
> for any damage caused by any virus/trojan/worms/malicious code transmitted
> by this email. www.motherson.com
>
