lucene-solr-user mailing list archives

From Jak Akdemir <jakde...@gmail.com>
Subject Re: Solr Near Real-Time Search, Soft Commit Problem
Date Thu, 17 Nov 2011 16:48:52 GMT
Eric,

Thank you for your response,

1) I also tried inserting 2 new records per second (each record has only 5
fields in one table) at a 6-second interval. That should be quite easy for
MySQL, but I will check query responses per second as you suggested.

2) I am sure the delta queries are configured correctly. A full import
completes in 40 seconds for 400000 docs, and a delta import takes 1 second
for 15 new records. I also checked it; there is no problem there.
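
For reference, one way to confirm that a delta-import is not requested while the previous one is still running is to ask the handler for its state first. This is only a sketch: it assumes the DIH `command=status` response contains an element like `<str name="status">idle</str>` (or `busy`), and the names `DIH_URL`, `dih_state`, and `delta_import_if_idle` are made up for illustration. The base URL is the one used in the scripts quoted below.

```shell
#!/bin/sh
# Assumed handler URL, copied from the scripts quoted later in this thread.
DIH_URL='http://localhost:8080/solr-jak/dataimport'

# Read a DIH status response on stdin and print the handler state.
# Assumption: the XML contains <str name="status">idle</str> or ...busy...
dih_state() {
    sed -n 's/.*<str name="status">\([a-z]*\)<\/str>.*/\1/p' | head -n 1
}

# Request a delta-import only when the handler reports "idle".
# Returns 0 if the import was requested, 1 if it was skipped.
delta_import_if_idle() {
    state=$(curl -s "${DIH_URL}?command=status" | dih_state)
    if [ "$state" = "idle" ]; then
        curl -s -o /dev/null "${DIH_URL}?command=delta-import&commit=false"
        return 0
    fi
    echo "skipped: handler state is '${state:-unknown}'" >&2
    return 1
}
```

Calling `delta_import_if_idle` inside the polling loop, in place of the bare `wget`, would show on stderr how often a request arrives while an import is still in progress.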

A couple of observations led me to think this is a configuration problem:
1- Index files are changing every second.
2- After a server restart, the last query results are preserved. (With NRT
they would disappear, right?)
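
The gap between the database and the index can be put into numbers rather than judged from file timestamps. The sketch below assumes the standard XML response of the select handler, which carries a `numFound="..."` attribute on the result element; the helper names `num_found` and `index_gap` are hypothetical.

```shell
#!/bin/sh
# Read a Solr XML query response on stdin and print its numFound value.
num_found() {
    sed -n 's/.*numFound="\([0-9][0-9]*\)".*/\1/p' | head -n 1
}

# Print how many rows are in the database but not yet in the index.
# $1 = row count in the database, $2 = document count in Solr.
index_gap() {
    echo $(( $1 - $2 ))
}
```

A possible use, with the Solr URL from this thread: `solr_count=$(curl -s 'http://localhost:8080/solr-jak/select?q=*:*&rows=0' | num_found)`, followed by `index_gap "$db_count" "$solr_count"`, where `db_count` would come from a `SELECT COUNT(*)` on the source table. With the figures reported in the original message (900 rows in the database, 600 documents in the index) the gap is 300.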

Please correct me if you see any problem in the steps I applied for NRT.

Additional specs,
32 bit OS
4 core i7-2630QM CPU @ 2.00GHz
6 GB memory

Best,

Jak

On Thu, Nov 17, 2011 at 10:44 AM, Erick Erickson <erickerickson@gmail.com> wrote:

> I guess my first question is what evidence you have that Solr is
> unable to index fast enough? It's quite possible that your
> database connection is the thing that's unable to process fast
> enough.
>
> That's certainly a guess, but unless your documents are
> quite complex, 15 records/second isn't likely to cause Solr
> problems. You might try to run a small Java program that
> executes your database queries and see.
>
> The other question I'd ask is if you're absolutely sure that
> your delta-import query is correct? Is it possible that you're
> re-indexing *everything* every time? There's an interactive
> debugging console you can use that may help, try:
> http://localhost:8983/solr/admin/dataimport.jsp
>
> Best
> Erick
>
> On Thu, Nov 17, 2011 at 3:19 AM, Jak Akdemir <jakdemir@gmail.com> wrote:
> > Hi,
> >
> > I was trying to configure a Solr instance with near real-time search
> > and auto-complete capabilities. I got stuck on the NRT feature. About
> > 15 new records per second are inserted into the database (MySQL), and I
> > index them with DIH. First, I tried to manage auto commits from
> > solrconfig.xml with the configuration below.
> >
> > <autoCommit>
> >   <maxDocs>10000</maxDocs>
> >   <maxTime>100000</maxTime>
> > </autoCommit>
> >
> > <autoSoftCommit>
> >   <maxDocs>15</maxDocs>
> >   <maxTime>1000</maxTime>
> > </autoSoftCommit>
> >
> > The bash script below is responsible for fetching the deltas without
> > committing.
> >
> > while [ 1 ]; do
> >   wget -O /dev/null 'http://localhost:8080/solr-jak/dataimport?command=delta-import&commit=false' 2>/dev/null
> >   sleep 1
> > done
> >
> > Then I ran my query from the browser:
> > http://localhost:8080/solr-jak/select?q=movie_name_prefix_full:"dogville"&defType=lucene&q.op=OR
> > <http://localhost:8080/solr-sprongo/select?q=movie_name_prefix_full:%221398%22&defType=lucene&q.op=OR>
> >
> > But I realized that, with this configuration, the index files change every
> > second, and after a minute there are only 600 new records in the Solr index
> > while there are 900 new records in the database.
> > After seeing that, I removed the autoCommit and autoSoftCommit elements from
> > solrconfig.xml and updated my bash script as follows. But the index files
> > still change, and Solr cannot stay synchronized with the database.
> >
> > while [ 1 ]; do
> >   echo "Soft commit applied!"
> >   wget -O /dev/null 'http://localhost:8080/solr-jak/dataimport?command=delta-import&commit=false' 2>/dev/null
> >   curl http://localhost:8080/solr-jak/update -H "Content-Type: text/xml" \
> >     --data-binary '<commit softCommit="true" waitFlush="false" waitSearcher="false"/>' 2>/dev/null
> >   sleep 3
> > done
> >
> > Even when I decreased the pressure on Solr to 1 new record per second, with
> > soft commits every 6 seconds, there was still a gap between the index and the
> > database. Is there anything that I missed? I also took a look at "/get", but
> > it works only for the primary key. If there is an example configuration (like
> > 1 second for soft commit and 10 minutes for hard commit) as a best practice,
> > it would be great.
> >
> > Finally, here is my configuration.
> > Ubuntu 11.04
> > JDK 1.6.0_27
> > Tomcat 7.0.21
> > Solr 4.0 2011-10-24_08-53-02
> >
> > All advice is appreciated,
> >
> > Best Regards,
> >
> > Jak
> >
>
