lucene-solr-user mailing list archives

From Sundeep T <sundeep....@gmail.com>
Subject Re: Trailing wild card searches very slow in Solr
Date Tue, 21 Nov 2017 01:26:15 GMT
Hi Erick,

Thanks for the reply. Here are more details on our setup.

*Setup/schema details*

Solr core with 100 million documents

String field (not tokenized) with docValues=true, indexed=true and stored=true

Field values are almost unique in the index: around 80 million are unique

No commits on the index

All caches disabled in solrconfig.xml

Solr JVM heap is 1GB

Single Solr core in the JVM

Solr core is not optimized and has about 50 segment files, some up to 5GB

Index size on disk is around 150GB

Solr v6.5.0

*Performance*

q=myfield:abc* has a QTime of 30+ seconds the first time

q=myfield:abc* has a QTime of 17-20 seconds after the OS file cache is primed
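
For anyone wanting to reproduce these timings together with the parsed query asked about in the quoted reply, here is a minimal Python sketch. It assumes a Solr instance at localhost:8983 and a core named mycore (both hypothetical, not from our setup), and it only uses the standard q, rows, wt and debug=query request parameters.

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical base URL and core name; adjust to the real deployment.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycore/select"


def build_debug_url(query, base_url=SOLR_SELECT_URL):
    """Build a select URL that asks Solr to include the parsed query."""
    # rows=0 keeps the response small; only timing and debug output matter here.
    params = {"q": query, "rows": 0, "wt": "json", "debug": "query"}
    return base_url + "?" + urlencode(params)


def timed_query(query, base_url=SOLR_SELECT_URL):
    """Run the query; return (wall-clock seconds, QTime in ms, parsed query)."""
    start = time.monotonic()
    with urlopen(build_debug_url(query, base_url)) as response:
        body = json.load(response)
    elapsed = time.monotonic() - start
    qtime_ms = body["responseHeader"]["QTime"]
    # The debug section is only present when debug=query is honored.
    parsed = body.get("debug", {}).get("parsedquery")
    return elapsed, qtime_ms, parsed
```

Calling timed_query("myfield:abc*") twice in a row would show the cold and warm numbers side by side, along with how the wildcard was parsed.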


Thanks
Sundeep


On Mon, Nov 20, 2017 at 12:16 PM, Erick Erickson <erickerickson@gmail.com>
wrote:

> Well, define "slow". Conceptually a large OR clause is created that
> contains all the terms that start with the indicated text. (actually a
> PrefixQuery should be formed).
>
> That said, I'd expect hello* to be reasonably fast as not many terms
> _probably_ start with 'hello'. Not the same at all for, say, h*.
>
> You might review https://wiki.apache.org/solr/UsingMailingLists;
> you're not really providing much information to go on here.
>
> What is the result of adding &debug=query? Particularly it would be
> useful to see the parsed query.
>
> Are all such queries slow? What happens if you submit hel* followed by
> hello*? The first one will bring the underlying index structures into
> memory; for all we know, this could simply be an autowarming issue.
>
> Are you indexing at the same time? Do you have a short autocommit interval?
>
> What version of Solr?
>
> Details matter.
> Best,
> Erick
>
> On Mon, Nov 20, 2017 at 11:50 AM, Sundeep T <sundeep.t16@gmail.com> wrote:
> > Hi Erick.
> >
> > I initially asked this question regarding leading wildcards. This was a
> > typo, and what I meant was that trailing wildcard queries were slow. So
> > queries like text:hello* are slow. We were expecting that, since the
> > string field is already indexed, the searches should be fast, but that
> > seems not to be the case.
> >
> > Thanks
> > Sundeep
> >
> > On Mon, Nov 20, 2017 at 9:39 AM, Erick Erickson <erickerickson@gmail.com>
> > wrote:
> >
> >> You already asked that question and got several answers, did you not
> >> see them? If you did see them, what is unclear?
> >>
> >> Best,
> >> Erick
> >>
> >> On Mon, Nov 20, 2017 at 9:33 AM, Sundeep T <sundeep.t16@gmail.com>
> >> wrote:
> >> > Hi,
> >> >
> >> > We have several indexed string fields which are not tokenized and do
> >> > not have docValues enabled.
> >> >
> >> > When we do trailing wildcard searches on these fields, they run very
> >> > slowly. We were thinking that since these fields are indexed, such
> >> > queries should run pretty quickly. We are using Solr 6.6.1. Does anyone
> >> > have ideas on why these queries are running slowly and whether there
> >> > are any ways to speed them up?
> >> >
> >> > Thanks
> >> > Sundeep
> >>
>
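
The point in the quoted reply about a prefix query conceptually expanding to every term that starts with the given text can be illustrated with a toy model. This is a sketch in plain Python over a sorted list of made-up terms; it is not Lucene's actual term dictionary or PrefixQuery code, and the function name is hypothetical.

```python
from bisect import bisect_left


def terms_with_prefix(sorted_terms, prefix):
    """Return every term in a sorted list that starts with the prefix."""
    # Binary search jumps to the first term that is >= the prefix.
    index = bisect_left(sorted_terms, prefix)
    matches = []
    # Matching terms are contiguous in sorted order, so stop at the first miss.
    while index < len(sorted_terms) and sorted_terms[index].startswith(prefix):
        matches.append(sorted_terms[index])
        index += 1
    return matches


# Made-up terms standing in for the unique values of a string field.
terms = sorted(["abc1", "abc2", "abd", "hat", "hello", "help", "xyz"])

few = terms_with_prefix(terms, "hello")  # narrow prefix, one term
many = terms_with_prefix(terms, "h")     # broad prefix, several terms
```

Finding the first matching term is cheap, but the work after that grows with the number of matching terms, each of which has its own list of documents to visit. With around 80 million unique values in a field, a short prefix may match a very large number of terms, which is consistent with the remark that h* behaves very differently from hello*.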
