lucene-dev mailing list archives

From Dmitry Kan <dmitry.luc...@gmail.com>
Subject Re: [Result Query Solr] How to retrieve the content of pdfs
Date Wed, 21 Sep 2016 05:43:28 GMT
Hi Alexandre,

Could you add fl=* to your query and check the output? Alternatively, have
a look at your schema file and check which field looks like the content
field: text or similar.
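
For illustration, here is a minimal Python sketch of the first suggestion: it builds a select URL that repeats the query from the quoted message with fl=* added, so that all stored fields are requested. The host, port and the collection name "gettingstarted" are assumptions and are not taken from the thread; substitute the values of your own setup.

```python
from urllib.parse import urlencode


def build_select_url(base_url, collection, query, fields="*"):
    """Build a Solr /select URL that asks for the given fields (fl)."""
    params = {
        "q": query,       # same query term as in the quoted request
        "fl": fields,     # "*" requests every stored field
        "wt": "json",
        "indent": "on",
    }
    return f"{base_url.rstrip('/')}/{collection}/select?{urlencode(params)}"


# Hypothetical host and collection name; adjust to the local installation.
url = build_select_url("http://localhost:8983/solr", "gettingstarted", "ABC")
print(url)
```

The printed URL can be opened in a browser or fetched with any HTTP client. If no content-like field shows up even with fl=*, the schema check mentioned above is the next thing to look at, since only stored fields can be returned.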

Dmitry

On 14 Sep 2016, 1:27 AM, "Alexandre Martins" <
alexandremartins@gmail.com> wrote:

> Hi Guys,
>
> I'm trying to use the latest version of Solr. I used the post tool
> to upload 28 PDF files and it works fine. However, I don't know how to show
> the content of the files in the resulting JSON. Does anybody know how to
> include this field?
>
> "responseHeader":{
>   "zkConnected":true,
>   "status":0,
>   "QTime":43,
>   "params":{
>     "q":"ABC",
>     "indent":"on",
>     "wt":"json",
>     "_":"1473804420750"}},
> "response":{"numFound":40,"start":0,"maxScore":9.1066065,"docs":[
>   {
>     "id":"/home/alexandre/desenvolvimento/workspace/solr-6.2.0/pdfs_hack/abc.pdf",
>     "date":["2016-09-13T14:44:17Z"],
>     "pdf_pdfversion":[1.5],
>     "xmp_creatortool":["PDFCreator Version 1.7.3"],
>     "stream_content_type":["application/pdf"],
>     "access_permission_modify_annotations":[false],
>     "access_permission_can_print_degraded":[false],
>     "dc_creator":["abc"],
>     "dcterms_created":["2016-09-13T14:44:17Z"],
>     "last_modified":["2016-09-13T14:44:17Z"],
>     "dcterms_modified":["2016-09-13T14:44:17Z"],
>     "dc_format":["application/pdf; version=1.5"],
>     "title":["ABC tittle"],
>     "xmpmm_documentid":["uuid:100ccff2-7c1c-11e6-0000-ab7b62fc46ae"],
>     "last_save_date":["2016-09-13T14:44:17Z"],
>     "access_permission_fill_in_form":[false],
>     "meta_save_date":["2016-09-13T14:44:17Z"],
>     "pdf_encrypted":[false],
>     "dc_title":["Tittle abc"],
>     "modified":["2016-09-13T14:44:17Z"],
>     "content_type":["application/pdf"],
>     "stream_size":[101948],
>     "x_parsed_by":["org.apache.tika.parser.DefaultParser",
>       "org.apache.tika.parser.pdf.PDFParser"],
>     "creator":["mauricio.tostes"],
>     "meta_author":["mauricio.tostes"],
>     "meta_creation_date":["2016-09-13T14:44:17Z"],
>     "created":["Tue Sep 13 14:44:17 UTC 2016"],
>     "access_permission_extract_for_accessibility":[false],
>     "access_permission_assemble_document":[false],
>     "xmptpg_npages":[3],
>     "creation_date":["2016-09-13T14:44:17Z"],
>     "resourcename":["/home/alexandre/desenvolvimento/workspace/solr-6.2.0/pdfs_hack/abc.pdf"],
>     "access_permission_extract_content":[false],
>     "access_permission_can_print":[false],
>     "author":["abc.add"],
>     "producer":["GPL Ghostscript 9.10"],
>     "access_permission_can_modify":[false],
>     "_version_":1545395897488113664},
>
> Alexandre Costa Martins
> DATAPREV - IT Analyst
> Software Reuse Researcher
> MSc Federal University of Pernambuco
> RiSE Member - http://www.rise.com.br
> Sun Certified Programmer for Java 5.0 (SCPJ5.0)
>
> MSN: xandecmartins@hotmail.com
> GTalk: alexandremartins@gmail.com
> Skype: xandecmartins
> Mobile: +55 (85) 9626-3631
>
