lucene-solr-user mailing list archives

From Kai Gülzau <kguel...@novomind.com>
Subject RE: document update / nested documents / document join
Date Mon, 17 Oct 2011 09:46:52 GMT
Nobody?

SOLR-139 seems to be the most popular issue, but I don't think it will be resolved in the near
future (this year). Right?

So I will try SOLR-2272 as a workaround: split up my documents into "static" and "frequently
updated" parts and join them at query time.

What is the exact join query for something like "category:bugfixes AND body:answer"
  matching "category:bugfixes" in doc 1 and
  matching "body:answer" in doc 3,
  while returning only doc 1?

I adapted the field names of doc 3 as follows:
doc 3:
type: out
out_ticketid: 1001
out_body: this is my answer
out_category: other

q={!join+from=out_ticketid+to=ticketid}(category:bugfixes+OR+out_category:bugfixes)+AND+(body:answer+OR+out_body:answer)


Writing this, I doubt this syntax is even possible!?
Additionally I'm not sure if trunk with SOLR-2272 is "production ready".
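
One way the query above might be restructured, as a minimal sketch only: it assumes the
`{!join from=... to=...}` local-params syntax from SOLR-2272 and that the join can be embedded
in a boolean query through the `_query_` nested-query hook. It has not been verified against
trunk. The helper name `build_join_query` is hypothetical; the field names are the ones used in
this message.

```python
from urllib.parse import urlencode


def build_join_query(main_clause, from_field, to_field, sub_clause):
    """Combine a plain clause with a join sub-query (hypothetical helper).

    The join part selects documents matching sub_clause, collects their
    from_field values, and returns documents whose to_field has one of them.
    """
    join = "{!join from=%s to=%s}%s" % (from_field, to_field, sub_clause)
    # Assumption: the join is nested into the main query via _query_:"...".
    nested = '_query_:"%s"' % join.replace('"', '\\"')
    return "%s AND %s" % (main_clause, nested)


# "category:bugfixes" is matched on doc 1 itself; "out_body:answer" is matched
# on doc 3 and mapped back to doc 1 through out_ticketid -> ticketid.
q = build_join_query("category:bugfixes", "out_ticketid", "ticketid", "out_body:answer")
params = urlencode({"q": q})  # URL-encoded form for the request
print(q)
print(params)
```

Because only the "in" document carries a `ticketid` field after the renaming, the join side
should map back to doc 1 only, and the plain clause is then evaluated on doc 1.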

The only way to do what I want in a released 3.x version is to run several searches and join
the results manually, e.g.
q=category:bugfixes -> doc 1 -> ticketid: 1001
q=body:answer -> doc 3 -> ticketid: 1001
-> result ticketid: 1001
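
The manual join described above can be sketched as follows. This is an illustration only: the
`search` function is a stand-in for one Solr request and simply filters an in-memory list, and
ticket 1002 is a hypothetical extra ticket added to show a non-match.

```python
# In-memory stand-ins for the indexed documents (ticket 1002 is hypothetical).
DOCS = [
    {"uid": "1001_in", "ticketid": 1001, "type": "in",
     "body": "I have a problem", "category": "bugfixes"},
    {"uid": "1001_out", "ticketid": 1001, "type": "out",
     "body": "this is my answer", "category": "other"},
    {"uid": "1002_in", "ticketid": 1002, "type": "in",
     "body": "still waiting", "category": "bugfixes"},
]


def search(docs, field, term):
    """Stand-in for one search request: docs whose field contains the term."""
    return [d for d in docs if term in d[field].lower().split()]


def ticket_ids(hits):
    """Collect the join key (ticketid) from a result list."""
    return {d["ticketid"] for d in hits}


def manual_join(docs):
    """Run both searches separately and intersect the ticket ids."""
    by_category = ticket_ids(search(docs, "category", "bugfixes"))
    by_body = ticket_ids(search(docs, "body", "answer"))
    return sorted(by_category & by_body)


result = manual_join(DOCS)
print(result)
```

The intersection happens outside the search engine, which is why facet counts and paging from
the individual searches no longer describe the final result.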

This way I would lose benefits like faceted search etc. :-\

Any suggestions?


Regards,

Kai Gülzau

-----Original Message-----
From: Kai Gülzau [mailto:kguelzau@novomind.com] 
Sent: Thursday, October 13, 2011 4:52 PM
To: solr-user@lucene.apache.org
Subject: document update / nested documents / document join

Hi *,

I am a bit confused about the best way to meet my requirements.

We have a mail ticket system. A ticket is created when a mail is received by the system:

doc 1:
uid: 1001_in
ticketid: 1001
type: in
body: I have a problem
category: bugfixes
date: 201110131955

This incoming document is static. While the ticket is in progress there is another document
representing the current/last state of the ticket. Some fields of this document are updated
frequently:

doc 2:
uid: 1001_out
ticketid: 1001
type: out
body:
category: bugfixes
date: 201110132015

A bit later (doc 2 is deleted/updated):
doc 3:
uid: 1001_out
ticketid: 1001
type: out
body: this is my answer
category: other
date: 201110140915

I would like to do a boolean search spanning multiple documents like "category:bugfixes AND
body:answer".

I think this is the same as what was proposed in:
http://www.slideshare.net/MarkHarwood/proposal-for-nested-document-support-in-lucene

So I dug into the depths of the Lucene and Solr tickets and now I am stuck choosing the "right"
way:

https://issues.apache.org/jira/browse/LUCENE-2454 Nested Document query support
https://issues.apache.org/jira/browse/LUCENE-3171 BlockJoinQuery/Collector
https://issues.apache.org/jira/browse/LUCENE-1879 Parallel incremental indexing
https://issues.apache.org/jira/browse/SOLR-139 Support updateable/modifiable documents
https://issues.apache.org/jira/browse/SOLR-2272 Join


If it were easily possible to update one field in a document, I would just merge the two logical
documents into one representing the whole ticket. But I can't see that this is possible yet.

SOLR-2272 seems to be the best solution for now, but it feels like a workaround:
"I can't update a document field, so I split it up into static and dynamic content and join
both at query time."

SOLR-2272 is committed to trunk/Solr 4.
Are there any planned release dates for Solr 4, or a possible backport of SOLR-2272 to 3.x?


I would appreciate any suggestions.

Regards,

Kai Gülzau




