lucene-solr-user mailing list archives

From Steven White <swhite4...@gmail.com>
Subject Re: When is too many fields in "qf" is too many?
Date Wed, 20 May 2015 16:20:35 GMT
> Also, is this 1500 fields that are always populated, or are there really a
> larger number of different record types, each with a relatively small
> number of fields populated in a particular document?

Answer: There is a large number of different record types, each with a
relatively small number of fields populated in any given document.  Some
documents will have 5 fields, others may have 50 (that's the average).

> Could you try to point to a real-world example of where your use case might
> apply, so we can relate to it?

I'm indexing data from a DB; all the fields of each record are indexed.  The
application is complex: it has "views", and each user belongs to one or
more views.  Users can move between views, and views can change over time.
A user in view-A can see certain fields, while a user in view-B can see
some other fields.  So, when a user issues a search, I have to limit the
fields that the search is executed against.  And like I said, because
users can move between views, and views can change over time, the list of
fields isn't static.  This is why I have to pass the list of fields with
each search, based on the user's current "view".
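
To make the per-search field list concrete, here is a minimal sketch in Python of how the parameters could be assembled.  The view names, field names, and helper functions are hypothetical and stand in for whatever the application's own view definitions provide; only "defType", "q", "qf", and "uf" are actual edismax parameters.

```python
from urllib.parse import urlencode

# Hypothetical view-to-field mapping; a real application would load this
# from its own view definitions, which can change over time.
VIEW_FIELDS = {
    "view-A": ["title", "author", "summary"],
    "view-B": ["summary", "price", "sku"],
}


def fields_for_user(user_views, view_fields=VIEW_FIELDS):
    """Return the sorted union of fields visible through any of the user's views."""
    allowed = set()
    for view in user_views:
        allowed.update(view_fields.get(view, ()))
    return sorted(allowed)


def build_search_params(query, user_views):
    """Build edismax parameters restricted to the user's current fields."""
    fields = fields_for_user(user_views)
    if not fields:
        # With an empty qf the parser would fall back to its default field,
        # which is not what a permission-restricted search wants.
        raise ValueError("user has no searchable fields")
    field_list = " ".join(fields)
    return {
        "defType": "edismax",
        "q": query,
        "qf": field_list,  # fields that unfielded terms are searched in
        "uf": field_list,  # fields the user may name explicitly (field:term)
    }


params = build_search_params("apple", ["view-A", "view-B"])
body = urlencode(params)  # form-encoded body, suitable for an HTTP POST
```

Setting "uf" to the same list is an assumption about the requirement: it keeps a user from reaching other fields by typing a fielded term such as "field1:term1".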

I hope this gives context to the problem I'm trying to solve and describes
why I'm using "qf" and why the list of fields may be long: there is a
case in which a user may belong to N - 1 views.

Steve


On Wed, May 20, 2015 at 11:14 AM, Jack Krupansky <jack.krupansky@gmail.com>
wrote:

> The uf parameter is used to specify which fields a user "may" query against
> - the "qf" parameter specifies the set of fields that an unfielded query
> term "must" be queried against. The user is free to specify fielded query
> terms, like "field1:term1 OR field2:term2". So, which use case are you
> really talking about?
>
> Could you try to point to a real-world example of where your use case might
> apply, so we can relate to it?
>
> Generally, I would say that a Solr document/collection should have no more
> than "low hundreds" of fields. It's not that you absolutely can't have more
> or absolutely can't have 5,000 or more, but simply that you will be asking
> for trouble, for example, with the cost of comprehending and maintaining
> and communicating your solution with others, including this mailing list
> for support.
>
> What specifically pushed you to have documents with 1500 fields?
>
> Also, is this 1500 fields that are always populated, or are there really a
> larger number of different record types, each with a relatively small
> number of fields populated in a particular document?
>
>
> -- Jack Krupansky
>
> On Wed, May 20, 2015 at 8:27 AM, Steven White <swhite4141@gmail.com>
> wrote:
>
> > Hi everyone,
> >
> > My solution requires that users in group-A can only search against a set of
> > fields-A and users in group-B can only search against a set of fields-B,
> > etc.  There can be several groups, as many as 100 or even more.  To meet this
> > need, I build my search by passing in the list of fields via "qf".  What
> > goes into "qf" can be large: as many as 1500 fields, and each field name
> > averages 15 characters long, so in effect the data passed via "qf" will be
> > over 20K characters.
> >
> > Given the above, besides the fact that a search for "apple" translates to
> > 20K characters passing over the network, what else within Solr and Lucene
> > should I be worried about, if anything?  Will I hit some kind of a limit?
> > Will each search now require more CPU cycles?  Memory?  Etc.
> >
> > If the network traffic becomes an issue, my alternative solution is to
> > create a /select handler for each group and in that handler list the fields
> > under "qf".
> >
> > I have considered creating a pseudo-field for each group and then using
> > copyField to copy into that group field.  During search, I can then "qf"
> > against that one field.  Unfortunately, this is not ideal for my solution
> > because the fields that go into each group change dynamically (at least
> > once a month), and when they do change, I have to re-index everything
> > (which I have to avoid) to sync that group field.
> >
> > I'm using "qf" with edismax and my Solr version is 5.1.
> >
> > Thanks
> >
> > Steve
> >
>
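
The "over 20K characters" estimate in the original question can be checked with simple arithmetic.  The sketch below assumes the field names are joined by single spaces and carry no per-field boosts; the function name is hypothetical.

```python
def estimate_qf_chars(num_fields, avg_name_len):
    """Approximate length of a space-separated qf value.

    Assumes one space between consecutive names and no boost suffixes.
    """
    if num_fields <= 0:
        return 0
    separators = num_fields - 1
    return num_fields * avg_name_len + separators


# Figures taken from the question: 1500 fields, 15 characters on average.
qf_chars = estimate_qf_chars(1500, 15)
```

With those figures the value comes to 22,500 characters of names plus 1,499 separators.  A parameter of that size is usually safer in the body of an HTTP POST than in a GET URL, since servlet containers commonly cap the size of the request line and headers.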
