spamassassin-sysadmins mailing list archives

From Dave Jones <da...@apache.org>
Subject Re: ruleqa user llanga
Date Sat, 11 Nov 2017 03:23:37 GMT
On 11/10/2017 09:01 AM, Merijn van den Kroonenberg wrote:
>>
>>>> Day 2 doesn't have that table with "mcviewing".  The next question is
>>>> what is causing this problem.  Is it related to new commits that throw
>>>> off the masscheck processing?
>>>
>>> The "2 days ago" view doesn't highlight a current masscheck, but it
>>> still shows a result at the bottom, so it's showing *something*. I
>>> think it is likely the masscheck present in the daterev input field:
>>> 20171108-r1814560-n
>>> But that one isn't in any daterev listing, not even in the full listing.
>>>
>>> So I think something in the ruleqa.cgi code that builds the daterev
>>> list is broken and leaves out some masschecks.
>>> If I get the cachefile and the directory listings I can go debug where
>>> things go pear-shaped.
>>>
>>
>> I have found one dubious piece of code where the masschecks are indexed
>> based on their svn rev number. But that is not a unique value, as the
>> same revision can be masschecked multiple times (by a different
>> submitter/date).
> 
> I think this is in fact the case.
> There is something weird with masscheck user llanga.
> Either something is off with the timing of masscheck result submission or
> that user submits the masscheck result twice (once more the next day for
> the same revision).
> I think that's what triggers the bug in the ruleqa page.
> 
> ls -l html/20171108/r1814560-n/LOGS.all-*-llanga*
> 5356811 Nov 10 01:05 html/20171108/r1814560-n/LOGS.all-ham-llanga.log.gz
> 521798 Nov 10 01:06 html/20171108/r1814560-n/LOGS.all-spam-llanga.log.gz
> 
> ls -l html/20171109/r1814560-n/LOGS.all-*-llanga*
> 5356811 Nov 10 08:12 html/20171109/r1814560-n/LOGS.all-ham-llanga.log.gz
> 521798 Nov 10 08:12 html/20171109/r1814560-n/LOGS.all-spam-llanga.log.gz
> 
> b14039f7b3ef3329d6bbd80e8a2eb5e04eb62129
> html/20171108/r1814560-n/LOGS.all-ham-llanga.log.gz
> b14039f7b3ef3329d6bbd80e8a2eb5e04eb62129
> html/20171109/r1814560-n/LOGS.all-ham-llanga.log.gz
> 
> same checksum so same files.
> The question is, does the user do something wrong or is some scripting
> messed up (maybe related to bad timing or timezone issues).
> 
>>
>> Please see attached patch for masses/rulequa/ruleqa.cgi
> 
> I think I failed to attach the patch correctly, but I sent it directly to
> Dave.
> 
>>
>> If this is not it, then I suspect the code around line 453, which trims
>> some revisions away. But that code is very hard to read.
> 
> 

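The duplicate check quoted above (same size, same checksum under two 
different date directories) can be sketched as a small shell helper.  
This is only an illustration: find_duplicate_logs is a made-up name, 
and the LOGS.all-*.log.gz layout is taken from the listing above.

```shell
#!/bin/sh
# Hypothetical helper: list log files that are byte-identical in two
# daterev directories, e.g. html/20171108/r1814560-n and
# html/20171109/r1814560-n from the listing above.
find_duplicate_logs() {
    old_dir=$1
    new_dir=$2
    for old in "$old_dir"/LOGS.all-*.log.gz; do
        # Skip the unexpanded pattern when the directory has no logs.
        [ -e "$old" ] || continue
        name=$(basename "$old")
        # cmp -s is silent and returns 0 only for identical contents.
        if [ -e "$new_dir/$name" ] && cmp -s "$old" "$new_dir/$name"; then
            printf '%s\n' "$name"
        fi
    done
}
```

Any file name it prints was submitted with identical contents for both 
dates.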
I think I figured out what was causing the masscheck SVN revision to 
get thrown off by commits and by llanga's submissions.  I was 
determining $REVISION in 
masses/rule-update-score-gen/generate-new-scores.sh around line 123 by 
finding the newest SVN revision.  I had assumed that the staging of the 
rsync dir and the SVN tagged versions would keep that SVN revision 
locked in for a 24 hour period, but that assumption doesn't hold.  I 
have now updated the logic to pick the SVN revision with the most 
occurrences across all of the corpus for that particular scoreset type.
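
As a rough sketch of the "most occurrences" idea (not the actual code 
in generate-new-scores.sh): the function name is made up, and it 
assumes each uncompressed log has a header line of the form 
"# SVN revision: NNNNNNN".  Adjust the pattern if the logs differ.

```shell
#!/bin/sh
# Hypothetical helper: print the SVN revision that occurs in the most
# log files.  ASSUMPTION: every log carries a header line such as
#   # SVN revision: 1814560
most_common_revision() {
    for log in "$@"; do
        # Emit the revision from the first matching header line only.
        awk '/^# SVN revision: [0-9]+/ { print $4; exit }' "$log"
    done |
        sort |
        uniq -c |                   # count how many logs name each revision
        sort -k1,1nr -k2,2nr |      # highest count first; ties go to the newer rev
        head -n 1 |
        awk '{ print $2 }'
}

# Usage (path is illustrative only):
#   REVISION=$(most_common_revision /path/to/corpus/*.log)
```

With this approach a single late or repeated submission for another 
revision no longer changes which revision is picked.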

It might work best if the run_nightly script that stages the masscheck 
area dropped a tag file containing the SVN revision, so that 
generate-new-scores.sh could be matched exactly to that SVN revision.  
If an SVN command could be used to find the latest sa-update tagged 
version, then that could be used instead of a tag file.
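
The tag file idea could look something like the following.  This is 
only a sketch: the file name masscheck-revision.tag and both function 
names are invented here, and nothing like this exists in run_nightly or 
generate-new-scores.sh today.

```shell
#!/bin/sh
# Hypothetical tag file name; not an existing SpamAssassin convention.
TAG_NAME="masscheck-revision.tag"

# Staging side (would be called from run_nightly): record the revision.
write_revision_tag() {
    dir=$1
    rev=$2
    # Accept only a plain decimal revision number.
    case $rev in
        ''|*[!0-9]*) echo "invalid revision: $rev" >&2; return 1 ;;
    esac
    # Write to a temp name and rename, so readers never see a partial file.
    printf '%s\n' "$rev" > "$dir/$TAG_NAME.tmp" &&
        mv "$dir/$TAG_NAME.tmp" "$dir/$TAG_NAME"
}

# Scoring side (would be called from generate-new-scores.sh): print the
# staged revision, or fail if the tag is missing or malformed.
read_revision_tag() {
    dir=$1
    [ -r "$dir/$TAG_NAME" ] || return 1
    rev=$(head -n 1 "$dir/$TAG_NAME")
    case $rev in
        ''|*[!0-9]*) return 1 ;;
    esac
    printf '%s\n' "$rev"
}
```

generate-new-scores.sh could then fall back to the "most occurrences" 
logic whenever read_revision_tag fails.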

--
Dave
