lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: check softCommit , autocommit and hard commit count
Date Mon, 04 Dec 2017 16:25:33 GMT
Neither commit does anything if no updates have been received.

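But you don't need to wait for the devs to STOP DOING THAT ;). In
solrconfig.xml you can add IgnoreCommitOptimizeUpdateProcessorFactory to
your update processor chain, which makes Solr ignore explicit commit and
optimize requests sent by clients. See the ref guide for details.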

Best,
Erick

On Mon, Dec 4, 2017 at 12:53 AM, Puppy Linux Distros <vivekcv1@gmail.com> wrote:
> Hi,
>
> Thanks Shawn for the help.
>
> I think I should have added a few more details to my previous mail.
>
> I know it's bad practice, but for various reasons our application issues
> hard commits from code: most /update requests are sent with commit=true,
> and the application rarely uses soft commits. I will recommend that the
> devs move toward soft commits and make use of realtime searchers in
> future.
>
> However, my current task is to upgrade Solr to the latest release, 7.1.0,
> so I need to measure the current traffic in Solr in order to tune the
> trade-offs for the new stack. The current stack is rather old (4.10), so
> I have to parse the current Solr logs.
>
> I have my own log storage mechanism, so I have one month of solr.log
> stored, and rotation/archiving isn't an issue here. Once I know the
> unique phrases that each type of commit (soft commit, auto hard commit,
> explicit hard commit) writes to the log, I can build some metrics of
> current traffic.
>
> Our current stack still uses the default autoCommit config, with
> openSearcher=false and a 15s period. autoSoftCommit is not enabled;
> however, both soft commits and hard commits are invoked explicitly by the
> application, so it is hard to separate them in solr.log unless I know
> some unique phrase/regex/word in the log lines that each type of commit
> writes. Any input in this area would be really helpful.
>
> In addition, I just wanted to confirm: if there are no pending updates
> to write to disk, does autocommit really fire at its interval, or does it
> stay idle when there is nothing to write? In other words, suppose I make
> a soft commit at the 5th second and an explicit hard commit at the 10th
> second: will an autocommit still happen at the 15th second for no reason,
> given that the hard commit at the 10th second has already written the
> changes to disk and re-built the index? If autocommit does stay idle in
> that case, it makes sense that I see very few autocommit log lines, since
> the application fires hard commits very frequently.
>
> Any help is appreciated.
> Thanks in advance,
>
> On Mon, Dec 4, 2017 at 10:51 AM, Puppy Linux Distros <vivekcv1@gmail.com>
> wrote:
>
>> Hello,
>>
>> Thanks Shawn. Can you provide a command to find the total number of
>> autocommits in the solr.log?
>>
>> On Thu, Nov 30, 2017 at 7:20 PM, Shawn Heisey <apache@elyograg.org> wrote:
>>
>>> On 11/30/2017 4:36 AM, Puppy Linux Distros wrote:
>>>
>>>> I am trying to calculate the total number of soft commits, autocommits and
>>>> hard commits from the Solr logs. Can you please check whether the commands
>>>> below are correct?
>>>>
>>>> Let me know how to find the total softcommit, hardcommit and autocommit
>>>> from the logs.
>>>>
>>>>
>>>> 1. totalcommit=`cat $solrlogfile | grep "start commit" | wc -l`
>>>>
>>>> totalcommit = 41906
>>>>
>>>>
>>>> 2. totalsoftcommit=`cat $solrlogfile | grep "start commit" | grep "softCommit=true" | wc -l`
>>>>
>>>> totalsoftcommit = 921
>>>>
>>>
>>> These look reasonable ... but be aware that the default logging config
>>> will roll the solr.log file to a new empty file when it reaches 4
>>> megabytes, which doesn't really take that long on a busy server, so if
>>> you're only looking at "solr.log" you may have an incomplete picture.  I
>>> personally change the roll size limit to 4 gigabytes so solr.log covers a
>>> lot more time.
>>>
>>> Solr restarts will *also* roll/archive logfiles, so you probably can't
>>> just look through every file in the logs directory that starts with
>>> "solr.log" -- it may be difficult to figure out exactly which files apply
>>> to the current running instance.  It might turn out that I'm completely
>>> wrong in that statement -- I haven't confirmed exactly what a Solr restart
>>> actually does with the logfiles.
>>>
>>>> 3. totalhardcommits=`cat $solrlogfile | grep "start commit" | grep "softCommit=false" | grep "openSearcher=true" | wc -l`
>>>>
>>>> totalhardcommits = 40982
>>>>
>>>
>>> If you have configured autoCommit in solrconfig.xml and have set
>>> openSearcher to false in that config, then there will be hard commits that
>>> *don't* open a new searcher, so the "openSearcher=true" part will not catch
>>> those commits.  Example configs in recent versions have autoCommit set up
>>> this way, and this is the recommended config for *everybody*.  The default
>>> autoCommit interval in the example configs is 15 seconds, which I think is
>>> a little too aggressive, but this kind of commit is typically very fast, so
>>> I've never seen that config cause problems.
>>>
>>> The example configs do not have autoSoftCommit configured.  If users want
>>> to automatically do commits for visibility, we recommend that they use
>>> autoSoftCommit.
>>>
>>>> 4. totalautocommit=`cat $solrlogfile | grep "realtime" | wc -l`
>>>>
>>>> totalautocommit = 3
>>>>
>>>
>>> These aren't autoCommits.  They are new searchers for the realtime get
>>> handler, which is capable of accessing documents that haven't been
>>> committed yet.  In addition to the index on disk, it searches the
>>> transaction logs.  Opening a new realtime searcher should be very fast, and
>>> they happen without any configuration. I'm not sure why you're only seeing
>>> this happen three times here. Presumably in a log where there are 40000
>>> total commits, you are doing a fair amount of indexing, so I would have
>>> expected a new realtime searcher to have been created much more frequently,
>>> even if there were no commits done at all.
>>>
>>> Maybe the realtime get handler can use the standard searcher, and only
>>> opens a new realtime searcher in cases where new documents have been
>>> indexed but there hasn't been a recent commit that opens a new searcher.
>>> If that's the case, then I have no idea how long it would wait before
>>> firing up a new realtime searcher.  I wouldn't expect that to be very long
>>> ... so if your indexing/committing cycles are normally very fast, maybe
>>> Solr doesn't feel it's necessary to open realtime searchers very often.
>>>
>>> Thanks,
>>> Shawn
>>>
>>>
>>
>>
>> --
>> Regards,
>>
>> Vivek CV
>>
>>
>>
>
>
> --
> Regards,
>
> Vivek CV
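
The counting commands quoted in this thread can be folded into one pass that
also covers the case Shawn points out, hard commits with openSearcher=false.
What follows is a minimal sketch, not a definitive tool. The function name
count_commits is hypothetical, and it assumes every commit is logged as a
"start commit" line carrying softCommit=... and openSearcher=... flags, which
is what the quoted grep commands already rely on. Note that the
openSearcher=false count is only a proxy for autoCommits: an explicit client
commit sent with openSearcher=false would land in the same bucket.

```shell
#!/bin/sh
# count_commits FILE... : classify "start commit" log lines by their flags.
# Assumption: each commit line contains softCommit=true|false and
# openSearcher=true|false, as in the grep commands quoted in the thread.
count_commits() {
    # Collect all commit lines once; -h drops filename prefixes when
    # several (for example rolled) log files are passed.
    commits=$(grep -h "start commit" "$@" || true)

    # grep -c prints 0 but exits non-zero when nothing matches,
    # so "|| true" keeps the function usable under "set -e".
    total=$(printf '%s\n' "$commits" | grep -c "start commit" || true)
    soft=$(printf '%s\n' "$commits" | grep -c "softCommit=true" || true)
    hard_open=$(printf '%s\n' "$commits" | grep "softCommit=false" \
        | grep -c "openSearcher=true" || true)
    hard_noopen=$(printf '%s\n' "$commits" | grep "softCommit=false" \
        | grep -c "openSearcher=false" || true)

    echo "total=$total soft=$soft hard_open=$hard_open hard_noopen=$hard_noopen"
}
```

It could be called as count_commits solr.log, or with several archived log
files as arguments when the logs have been rolled.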
