lucene-solr-user mailing list archives

From Andrea Gazzarini <a.gazzar...@gmail.com>
Subject Re: AW: Nested entities not imported / do not show up in search?
Date Mon, 19 Oct 2015 10:04:34 GMT
My answer may not make sense because I don't know the overall
context, but why don't you import branches and companies as flat documents
with a "type" field ("company" or "branch") and an "owner" field that is
populated only for branches and holds the company id? Then you could
autocomplete on the company name (fq=type:"company"). Once a company is
selected, it is just a matter of another query with two fq parameters:
type:"branch" and owner:<selected company id>
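
A minimal sketch of the two requests, assuming the "type" and "owner" fields proposed above plus the jcg core and the firma_namenszeile_1 field from this thread. The helper name select_url and the company id 123123 are only for illustration:

```python
from urllib.parse import urlencode

# Core URL taken from the thread; adjust to your own setup.
SOLR_SELECT = "http://localhost:8983/solr/jcg/select"

def select_url(q, fqs):
    """Build a /select URL with one q and any number of fq parameters."""
    params = [("q", q)] + [("fq", fq) for fq in fqs] + [("wt", "json")]
    return SOLR_SELECT + "?" + urlencode(params)

# Step 1: autocomplete on the company name, restricted to company documents.
companies_url = select_url("firma_namenszeile_1:Mu*", ['type:"company"'])

# Step 2: once a company is selected, fetch its branches via the owner field.
branches_url = select_url("*:*", ['type:"branch"', "owner:123123"])

print(companies_url)
print(branches_url)
```

Each fq is cached independently by Solr, so the type filters should be cheap to reuse across requests.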

Andrea
On 19 Oct 2015 11:48, "Matthias Fischer" <Matthias.Fischer@doubleslash.de>
wrote:

> Ok, thanks for your advice so far. I can import companies with their
> nested entities (business branches) now. But I wonder whether there is a
> way to query for company name patterns and get the business branches nested
> inside the respective companies. Using the following query I only get the
> companies without their nested entities:
>
> http://localhost:8983/solr/jcg/select?q=firma_namenszeile_1%3AMu*&wt=xml&indent=true
>
> I can use the firma_ebi_nr (the company id) and get the associated
> branches by issuing the following query:
>
> http://localhost:8983/solr/jcg/select?q={!child%20of=%22firma_ebi_nr:123123%22}firma_ebi_nr:123123
> This results in a flat list of associated business branches. However I
> would like to search a company by name and in the result I would like to
> see all associated business branches nested inside the respective company.
> Is this possible or do I need to issue the second query above for each
> company search result in order to get the nested entities?
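
One option that may avoid the second query is Solr's [child] document transformer, which attaches the children of each returned parent to the result. A sketch only: it assumes a hypothetical doc_type field set to "company" on every parent document, which the config in this thread does not have yet, because parentFilter has to match all parents and not just the hits. The helper name nested_search_url is also made up:

```python
from urllib.parse import urlencode

SOLR_SELECT = "http://localhost:8983/solr/jcg/select"

def nested_search_url(name_pattern, child_limit=100):
    """Build a query that returns companies with their branches nested inside."""
    # parentFilter must match ALL parent documents in the index.
    # doc_type:company is a hypothetical discriminator field (an assumption).
    fl = "*,[child parentFilter=doc_type:company limit=%d]" % child_limit
    params = {
        "q": "firma_namenszeile_1:" + name_pattern,
        "fl": fl,
        "wt": "xml",
        "indent": "true",
    }
    return SOLR_SELECT + "?" + urlencode(params)

url = nested_search_url("Must*")
print(url)
```

The transformer caps the number of children returned per parent, so limit is set explicitly here.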
>
> Example of what I would like to achieve:
>
>
> http://localhost:8983/solr/jcg/select?q=firma_namenszeile_1%3AMust*&wt=xml&indent=true
>
> <response>
>     <lst name="responseHeader">
>         <int name="status">0</int>
>         <int name="QTime">1</int>
>         <lst name="params">
>             <str name="q">firma_namenszeile_1:Must*</str>
>             <str name="indent">true</str>
>             <str name="wt">xml</str>
>         </lst>
>     </lst>
>     <result name="response" numFound="2" start="0">
>     <doc>
>         <long name="firma_ebi_nr">123123</long>
>         <str name="firma_namenszeile_1">Musterfirma</str>
>         <str name="id">ac8d5627-b17a-bbbb-8926-8d5a80680ee4</str>
>         <long name="_version_">1515205299087081472</long>
>
>         <!-- nested branches -->
>         <doc>
>             <long name="branche_ebc_code">6</long>
>         </doc>
>         <doc>
>             <long name="branche_ebc_code">43000</long>
>         </doc>
>         <doc>
>             <long name="branche_ebc_code">43900</long>
>         </doc>
>
>      </doc>
>      ....
> </response>
>
>
> Is this possible? Or maybe there is a better way than nested entities? An
> alternative I could think of is to join companies and branches in the JDBC
> import. But this would result in duplicate companies in the search result
> (one for each associated branch). My goal is to have a suggest field where
> the user can type a company name pattern and gets a list of matching
> companies including the associated branches. Any suggestions?
>
> Kind regards,
> Matthias
>
> -----Original Message-----
> From: Andrea Gazzarini [mailto:a.gazzarini@gmail.com]
> Sent: Friday, 16 October 2015 17:24
> To: solr-user@lucene.apache.org
> Subject: Re: Nested entities not imported / do not show up in search?
>
> Hi Matthias,
> I guess the company.id field is not unique, so you would need a "compound"
> uniqueKey in Solr, which is not strictly possible. As a consequence, the
> (company) UUID is probably created before the index phase by an
> UpdateRequestProcessor [1], so you should check your solrconfig.xml and, if
> I'm right, check whether the same strategy could be used for the nested
> entities.
>
> Andrea
>
> [1]
>
> http://lucene.apache.org/solr/5_2_1/solr-core/org/apache/solr/update/processor/UUIDUpdateProcessorFactory.html
>
> 2015-10-16 17:11 GMT+02:00 Matthias Fischer <Matthias.Fischer@doubleslash.de>:
>
> > Thank you, Andrea, for answering so quickly.
> >
> > However I got further errors. I also had to change
> > "<uniqueKey>firma_ebi_nr</uniqueKey>" to "<uniqueKey>id</uniqueKey>".
> > But it still does not work properly. It seems that an id is auto
> > generated for the company documents but not for the nested ones (the
> > business branches).
> > Any ideas how to fix this?
> >
> > 2015-10-16 12:49:29.650 WARN  (Thread-17) [   x:jcg] o.a.s.h.d.SolrWriter Error creating document :
> > SolrInputDocument(
> >     fields: [firma_ebi_nr=317709682, firma_namenszeile_1=Example Company, id=3c7f7421-9d51-4056-a2a0-eebab87a546a, _version_=1515192078460518400, _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a],
> >     children: [
> >            SolrInputDocument(fields: [branche_ebc_code=7, _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a]),
> >            SolrInputDocument(fields: [branche_ebc_code=47000, _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a]),
> >            SolrInputDocument(fields: [branche_ebc_code=47700, _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a]),
> >            SolrInputDocument(fields: [branche_ebc_code=47790, _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a]),
> >            SolrInputDocument(fields: [branche_ebc_code=47791, _root_=3c7f7421-9d51-4056-a2a0-eebab87a546a])])
> > org.apache.solr.common.SolrException: [doc=null] missing required field: id
> >         at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:198)
> >         at org.apache.solr.update.AddUpdateCommand$1.next(AddUpdateCommand.java:191)
> >         at org.apache.solr.update.AddUpdateCommand$1.next(AddUpdateCommand.java:166)
> >         at org.apache.lucene.index.DocumentsWriterPerThread.updateDocuments(DocumentsWriterPerThread.java:259)
> >         at org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:413)
> >         at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1316)
> >         at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:235)
> >         at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:163)
> >         at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
> >         at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
> >         at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:955)
> >         at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1110)
> >         at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:706)
> >         at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:104)
> >         at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
> >         at org.apache.solr.update.processor.AbstractDefaultValueUpdateProcessorFactory$DefaultValueUpdateProcessor.processAdd(AbstractDefaultValueUpdateProcessorFactory.java:94)
> >         at org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:71)
> >         at org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:259)
> >         at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:524)
> >         at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
> >         at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
> >         at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
> >         at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
> >         at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
> >         at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)
> >
> > Kind regards,
> > Matthias
> >
> > -----Original Message-----
> > From: Andrea Gazzarini [mailto:a.gazzarini@gmail.com]
> > Sent: Friday, 16 October 2015 13:59
> > To: solr-user@lucene.apache.org
> > Subject: Re: Nested entities not imported / do not show up in search?
> >
> > Hi Matthias,
> > you should use <entity-name>.<column-name> in your expressions. So for
> > example, here
> >
> > WHERE fb.EBI_NR='${firma.firma_ebi_nr}'
> >
> > should be
> >
> > WHERE fb.EBI_NR='${firma.EBI_NR}'
> >
> > Best,
> > Andrea
> >
> > 2015-10-16 13:40 GMT+02:00 Matthias Fischer <Matthias.Fischer@doubleslash.de>:
> >
> > > Hello everybody,
> > >
> > > I am trying to import from an Oracle DB 11g2 via DIH using SOLR 5.3.1.
> > > In my relational DB there are company addresses (table
> > > tb_firmen_adressen) and branches (table tb_branchen). They have an
> > > n:m relationship using the join table tb_firmen_branchen.
> > > Now I would like to find companies by their name and in each company
> > > result I would like to see the associated branches.
> > > However I only get the companies without the nested entries. As a
> > > newbie I'd highly appreciate some help as there are no errors or
> > > warnings in the log file and I could not find any helpful hints in
> > > the documentation or elsewhere on the internet concerning my problem.
> > >
> > > Here is my data config:
> > >
> > >     <dataConfig>
> > >         <dataSource name="jdbc" driver="oracle.jdbc.driver.OracleDriver"
> > >             url="jdbc:oracle:thin:@//xxxxx.xxxxxxx:1521/pde11" user="myuser"
> > >             password="mysecret"/>
> > >     <document>
> > >         <entity name="firma" pk="fa.EBI_NR" query="
> > >             SELECT fa.EBI_NR, fa.NAMENSZEILE_1, fa.NAMENSZEILE_2, fa.NAMENSZEILE_3
> > >             FROM tb_firmen_adressen fa
> > >             WHERE rownum &lt; 10000
> > >         ">
> > >
> > >             <field name="firma_ebi_nr"          column="EBI_NR" />
> > >             <field name="firma_namenszeile_1"   column="NAMENSZEILE_1" />
> > >             <field name="firma_namenszeile_2"   column="NAMENSZEILE_2" />
> > >             <field name="firma_namenszeile_3"   column="NAMENSZEILE_3" />
> > >
> > >             <entity name="firma_branche" child="true" query="
> > >                 SELECT b.EBC_CODE AS EBC_CODE
> > >                 FROM
> > >                     tb_firmen_branchen fb
> > >                         JOIN tb_branchen b ON fb.EBC_CODE = b.EBC_CODE
> > >                 WHERE fb.EBI_NR='${firma.firma_ebi_nr}'
> > >             ">
> > >                 <field name="branche_ebc_code" column="EBC_CODE" />
> > >                 <!-- I would like to add more fields later here once I get it to work -->
> > >             </entity>
> > >
> > >         </entity>
> > >     </document>
> > >     </dataConfig>
> > >
> > >
> > > And here are the relevant lines from my schema file:
> > >
> > >     <uniqueKey>firma_ebi_nr</uniqueKey>
> > >
> > >      <field name="firma_ebi_nr"         type="long"         required="true" indexed="true" stored="true"/>
> > >      <field name="firma_namenszeile_1"  type="text_general" indexed="true" stored="true"/>
> > >      <field name="firma_namenszeile_2"  type="text_general" indexed="true" stored="true"/>
> > >      <field name="firma_namenszeile_3"  type="text_general" indexed="true" stored="true"/>
> > >      <field name="branche_ebc_code"     type="long"         indexed="true" stored="true"/>
> > >
> > >
> > >
> > > After restarting solr and calling
> > > http://localhost:8983/solr/jcg/dataimport?command=full-import I get
> > > "Indexing completed. Added/Updated: 9999 documents. Deleted 0 documents."
> > > So basically it seems to work, but my search results look like this:
> > >
> > > {
> > >   "responseHeader":{
> > >     "status":0,
> > >     "QTime":71,
> > >     "params":{
> > >       "q":"Der Bunte",
> > >       "defType":"edismax",
> > >       "indent":"true",
> > >       "qf":"firma_namenszeile_1",
> > >       "wt":"json"}},
> > >   "response":{"numFound":85,"start":0,"docs":[
> > >       {
> > >         "firma_ebi_nr":123123123,
> > >         "firma_namenszeile_1":"Der Bunte Laden",
> > >         "_version_":1515185579421073408},
> > >       {
> > >      ...
> > > }
> > >
> > > Why are there no company branches inside the company records? What's
> > > wrong with my configuration? Any help is appreciated!
> > >
> > > Kind regards
> > > Matthias Fischer
> > >
> > >
> >
>
