lucene-solr-user mailing list archives

From Pratik Patel <pra...@semandex.net>
Subject Re: Using fetch function with streaming expression
Date Tue, 14 Mar 2017 23:53:43 GMT
Wow, this is interesting! Is it going to be a new addition to Solr, or is it
already available? I cannot find it in the documentation. I am using Solr
version 6.4.1.

On Tue, Mar 14, 2017 at 7:41 PM, Joel Bernstein <joelsolr@gmail.com> wrote:

> I'm going to add a "cartesian" function that creates a cartesian product
> from a multi-value field. This will turn a single tuple with a multi-value
> field into multiple tuples with a single-value field. This will allow the
> fetch operation to work on ancestors. It also has many other use cases.
> Sample syntax:
>
> fetch(collection1,
>       cartesian(field=ancestors,
>                 having(gatherNodes(collection1,
>                                    search(collection1,
>                                           q="*:*",
>                                           fl="conceptid",
>                                           sort="conceptid asc",
>                                           fq=storeid:"524efcfd505637004b1f6f24",
>                                           fq=tags:"Company",
>                                           fq=tags:"Prospects2",
>                                           qt="/export"),
>                                    walk=conceptid->eventParticipantID,
>                                    gather="eventID",
>                                    trackTraversal="true",
>                                    scatter="leaves",
>                                    count(*)),
>                        gt(count(*),1))),
>       fl="concept_name",
>       on="ancestors=conceptid")
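
The behaviour described for the proposed "cartesian" function can be modelled in a few lines of plain Python. This is only a sketch of the idea (one tuple with a multi-value field becomes several tuples with a single value each), not Solr code. The function name and signature here are assumptions made for illustration, and the sample tuple reuses the ids from the response shown later in the thread.

```python
def cartesian(tuples, field):
    """Yield one copy of each tuple per value of a multi-valued field.

    Illustrative model only; not Solr's implementation or API.
    """
    for t in tuples:
        values = t.get(field)
        if not isinstance(values, list):
            # Single-valued or missing field: pass the tuple through unchanged.
            yield dict(t)
            continue
        # In this sketch an empty list produces no output tuples.
        for v in values:
            out = dict(t)
            out[field] = v
            yield out


node = {
    "node": "524f03355056c8b53b4ed199",
    "count(*)": 2,
    "ancestors": ["524f02845056c8b53b4e9871", "524f02755056c8b53b4e9269"],
}
expanded = list(cartesian([node], "ancestors"))
for t in expanded:
    print(t["node"], t["ancestors"])
```

Each expanded tuple now carries a single ancestors value, which is the shape a join on a single-valued key needs.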
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Tue, Mar 14, 2017 at 11:51 AM, Pratik Patel <pratik@semandex.net>
> wrote:
>
> > Hi, Joel. Thanks for the reply.
> >
> > So, I need to do some graph traversal queries for my use case. In my data
> > set, I have concepts and events.
> >
> > concept : {name, address, bio ......},
> > > event: {name, date, participantIds:[concept1, concept2...] .....}
> >
> >
> > Events connect two or more concepts, so this is graph data in which
> > concepts are connected to each other via events. Each event stores links to
> > the concepts that it connects, so the field that stores those links is
> > multivalued. This is a natural structure for my data, on which I wanted to
> > run some advanced graph traversal queries with streaming expressions.
> > However, the gatherNodes() function does not support multivalued fields yet.
> > So, I changed my index structure to something like this.
> >
> > concept : {conceptId, name, address, bio ......},
> > > event: {eventId, name, date, participantIds:[concept1, concept2...] .....}
> > > *****create eventLink documents for each participantId in each event********
> > > eventLink:{eventid, conceptid, id}
> >
> >
> >
> > I created eventLink documents from each event so that I can traverse the
> > data using the gatherNodes() function. With this change, I was able to run
> > the graph query and get the ids of the concepts I wanted. However, I only
> > have the ids of the concepts. Now, using these ids, I want additional data
> > from the concept documents, such as concept_name, address, or bio. This is
> > what I was trying to achieve with the fetch() function, but it seems I hit
> > the multivalued limitation again :) The reason I store only the ids in the
> > eventLink documents is that I don't want to duplicate data unnecessarily.
> > Duplication would complicate keeping the index consistent when deletes or
> > updates happen. Is there any way I can achieve this?
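
The flattening step described above (one eventLink document per participant id of each event) could look roughly like this at indexing time. This is a minimal sketch in Python, not the poster's actual code: the function name, the sample event, and the way the eventLink id is built are all assumptions; only the field names eventid, conceptid and id come from the message.

```python
def build_event_links(event):
    """Return one eventLink document per participant of an event."""
    return [
        {
            # Hypothetical id scheme: event id and concept id joined by "_".
            "id": f'{event["eventId"]}_{concept_id}',
            "eventid": event["eventId"],
            "conceptid": concept_id,
        }
        for concept_id in event["participantIds"]
    ]


# Made-up sample event with a multivalued participantIds field.
event = {"eventId": "e1", "name": "meeting", "participantIds": ["c1", "c2"]}
links = build_event_links(event)
for link in links:
    print(link)
```

Because every eventLink holds exactly one conceptid, the link documents can be walked with a single-valued field, and no concept data is duplicated in them.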
> >
> > Thanks!
> > Pratik
> >
> >
> >
> >
> >
> > On Tue, Mar 14, 2017 at 11:24 AM, Joel Bernstein <joelsolr@gmail.com>
> > wrote:
> >
> > > Wow that's an interesting expression!
> > >
> > > The problem is that you are trying to fetch using the ancestors field,
> > > which is multi-valued. fetch doesn't support multi-value join keys. I never
> > > thought someone might try to do that.
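
To make the limitation concrete, here is a toy Python model of a fetch-style lookup join. It is a sketch under stated assumptions, not how Solr implements fetch: the function signature is invented for illustration, the lookup is an in-memory dictionary, and the concept names are made-up sample data. The point it shows is that a list-valued join key never matches a single conceptid, so the tuple comes back without the extra field.

```python
def fetch(tuples, docs, fl, left_key, right_key):
    """Copy the fields in fl from the matching doc onto each tuple.

    Toy model of a lookup join; not Solr's implementation.
    """
    index = {d[right_key]: d for d in docs}
    for t in tuples:
        out = dict(t)
        key = t.get(left_key)
        # Only a single string value is looked up; a list never matches.
        match = index.get(key) if isinstance(key, str) else None
        if match is not None:
            for f in fl:
                if f in match:
                    out[f] = match[f]
        yield out


# Made-up concept documents.
concepts = [
    {"conceptid": "id1", "concept_name": "Acme"},
    {"conceptid": "id2", "concept_name": "Globex"},
]
multi = {"node": "n1", "ancestors": ["id1", "id2"]}
single = {"node": "n1", "ancestors": "id1"}

unchanged = list(fetch([multi], concepts, ["concept_name"], "ancestors", "conceptid"))
joined = list(fetch([single], concepts, ["concept_name"], "ancestors", "conceptid"))
print(unchanged)
print(joined)
```

In this model the multi-valued tuple passes through untouched, while the single-valued one gains concept_name.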
> > >
> > > So, you're attempting to get the concept names for ancestors?
> > >
> > > Can you explain a little more about the use case?
> > >
> > >
> > > Joel Bernstein
> > > http://joelsolr.blogspot.com/
> > >
> > > On Tue, Mar 14, 2017 at 11:08 AM, Pratik Patel <pratik@semandex.net>
> > > wrote:
> > >
> > > > I have two types of documents in my index: eventLink and conceptData.
> > > >
> > > > eventLink ---- { ancestors:[<id1>,<id2>] }
> > > > conceptData-----{ id:id1, conceptid, concept_name .....<some more data> }
> > > >
> > > > Both are in the same collection.
> > > > In my query, I am doing a gatherNodes query wrapped in some other function,
> > > > and ultimately I am getting a bunch of eventLink documents. Now, I am
> > > > trying to get the conceptData document for each id specified in eventLink's
> > > > ancestors field. I am trying to do that using the fetch() function. Here is
> > > > a simplified form of my query.
> > > >
> > > > fetch(collection1,
> > > > >  function to get eventLinks,
> > > > >   fl="concept_name",
> > > > >   on="ancestors=conceptid"
> > > > > )
> > > >
> > > >
> > > > On executing this query, I get back the same set of documents that my
> > > > streaming expression containing the gatherNodes() function returns. No
> > > > fields are added to the tuples. From the documentation, it seems that fetch
> > > > should fetch additional data and add it to the tuples. However, that is not
> > > > happening: the resulting tuples do not have the concept_name field in them.
> > > > What am I missing here? I really need to get this additional data in one
> > > > Solr query so that I don't have to iterate over the eventLinks and get the
> > > > additional data with individual queries. That would badly impact
> > > > performance. Any suggestions?
> > > >
> > > > Here is my actual query and the response.
> > > >
> > > >
> > > > fetch(collection1,
> > > > >  having(
> > > > > gatherNodes(collection1,
> > > > > search(collection1,q="*:*",fl="conceptid",sort="conceptid asc",
> > > > > fq=storeid:"524efcfd505637004b1f6f24",fq=tags:"Company",fq=tags:"Prospects2",
> > > > > qt="/export"),
> > > > > walk=conceptid->eventParticipantID,
> > > > > gather="eventID",
> > > > > trackTraversal="true", scatter="leaves",
> > > > > count(*)
> > > > > ),
> > > > > gt(count(*),1)
> > > > > ),
> > > > > fl="concept_name",
> > > > > on="ancestors=conceptid"
> > > > > )
> > > >
> > > >
> > > >
> > > > Response :
> > > >
> > > > {
> > > > > "result-set": {
> > > > > "docs": [
> > > > > {
> > > > > "node": "524f03355056c8b53b4ed199",
> > > > > "field": "eventID",
> > > > > "level": 1,
> > > > > "count(*)": 2,
> > > > > "collection": "collection1",
> > > > > "ancestors": [
> > > > > "524f02845056c8b53b4e9871",
> > > > > "524f02755056c8b53b4e9269"
> > > > > ]
> > > > > },
> > > > > .........
> > > > > }
> > > >
> > > >
> > > > Thanks,
> > > > Pratik
> > > >
> > >
> >
>
