spamassassin-users mailing list archives

From David Gessel <ges...@blackrosetech.com>
Subject Re: very basic SA-Learn performance question: is 90 seconds or so per token really, really slow or roughly normal?
Date Sat, 04 Nov 2017 13:09:02 GMT
So, days later, it's still chugging away and not making much progress.

If I kill the process (this doesn't stop sa-learn, it just kills the current script), it always returns

 ^Cplugin: eval failed: interrupted at /usr/local/bin/sa-learn line 511.

which is

0509 sub killed {
0510   $spamtest->finish_learner();
0511   die "interrupted";
0512 }

The only difference between the sa-learn I'm running and the 3.4.1 release at https://svn.apache.org/repos/asf/spamassassin/tags/spamassassin_release_3_4_1/
is line 50:

0050   $searchrelative = 1;    # disabled during "make install": REMOVEFORINST

(which I assume is removed at install time, given "REMOVEFORINST")

So I assume, given the changes in lines 19-21, that my server is running the 3.4.1 release.

I note that 3.4.2p3 has one difference from 3.4.1, which is that use bytes; is commented out
at line 21 (this has been there or not there a few times over various versions, and so may be
slightly meaningful to something):

0021 # use bytes;

I'm not sufficiently Perl-savvy to have any idea whether that's relevant to my performance issues
or not, but it's an easy enough mod to try.

Any thoughts?

-David

-------- Original Message --------
Subject: Re: very basic SA-Learn performance question: is 90 seconds or so per token really, really slow or roughly normal?
From: David Gessel <gessel@blackrosetech.com>
To: David Jones <djones@ena.com>, users@spamassassin.apache.org
Date: Thu Nov 02 2017 01:29:42 GMT+0300 (AST)

> Oh, I wiped the bayes data and started over already once, it isn't (or shouldn't be)
> that big a deal.
> 
> Disk performance:  seems OK to me.  
> 
> # diskinfo -t /dev/aacd0
> /dev/aacd0
> 	512         	# sectorsize
> 	73295462400 	# mediasize in bytes (68G)
> 	143155200   	# mediasize in sectors
> 	0           	# stripesize
> 	0           	# stripeoffset
> 	8910        	# Cylinders according to firmware.
> 	255         	# Heads according to firmware.
> 	63          	# Sectors according to firmware.
> 	            	# Disk ident.
> 
> Seek times:
> 	Full stroke:	  250 iter in   2.966242 sec =   11.865 msec
> 	Half stroke:	  250 iter in   2.126653 sec =    8.507 msec
> 	Quarter stroke:	  500 iter in   3.616484 sec =    7.233 msec
> 	Short forward:	  400 iter in   1.540087 sec =    3.850 msec
> 	Short backward:	  400 iter in   1.104617 sec =    2.762 msec
> 	Seq outer:	 2048 iter in   0.546351 sec =    0.267 msec
> 	Seq inner:	 2048 iter in   0.726598 sec =    0.355 msec
> Transfer rates:
> 	outside:       102400 kbytes in   2.103472 sec =    48681 kbytes/sec
> 	middle:        102400 kbytes in   2.300709 sec =    44508 kbytes/sec
> 	inside:        102400 kbytes in   3.192841 sec =    32072 kbytes/sec
> 
> 
> nothing amazing, but nothing unexpectedly bad either.
> 
> -------- Original Message --------
> Subject: Re: very basic SA-Learn performance question: is 90 seconds or so per token really, really slow or roughly normal?
> From: David Jones <djones@ena.com>
> To: users@spamassassin.apache.org
> Date: Thu Nov 02 2017 01:00:40 GMT+0300 (AST)
> 
>> If you want to try to keep your existing Bayes data, try dumping it to a backup file,
>> clear the DB, then restore it back to see if this resets things properly.  Hopefully
>> this won't take weeks to dump.  :)
>>
>> https://wiki.apache.org/spamassassin/BayesMigration
>>
>> BTW, do you have normal file IO performance?  Have you checked iotop and iostat
>> to see what kind of IOPs/Mbps you are getting on your filesystem where the Bayes DB files are?
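
For reference, the dump / clear / restore sequence suggested above can be sketched roughly as follows. This is only a sketch, not a tested procedure for this server: the backup path and the bayes_reset function name are made up for illustration, and SA_LEARN can be overridden (for example with echo) to do a dry run. The --backup, --clear and --restore options are the ones documented for sa-learn.

```shell
#!/bin/sh
# Sketch: dump the Bayes DB to a text file, wipe it, then load it back.
# SA_LEARN and BACKUP are illustrative defaults (assumptions), override as needed.
SA_LEARN="${SA_LEARN:-sa-learn}"
BACKUP="${BACKUP:-/tmp/bayes-backup.txt}"

bayes_reset() {
    # --backup writes the whole Bayes DB to stdout
    $SA_LEARN --backup > "$BACKUP" || return 1
    # --clear wipes the existing database
    $SA_LEARN --clear || return 1
    # --restore reloads the database from the dump file
    $SA_LEARN --restore "$BACKUP" || return 1
}
```

Calling bayes_reset as the user that owns the Bayes files runs the three steps in order and stops at the first failure, so the database is not cleared unless the dump succeeded.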
