From: Lars Jørgensen
To: "users@spamassassin.apache.org"
Subject: RE: Conversion Spamassassin(bayes) database to SDBM
Date: Fri, 5 Aug 2011 13:09:25 +0000
Message-ID: <6D2C830A0941EA40B6B483FE6EC98ADB1876B6ED@EXCHANGE-02.kb.dk>
In-Reply-To: <32194013.post@talk.nabble.com>

> Hello, thanks for the post. Firstly, you are wrong about performance of my
> computer - I dont have supercomputer. I didnt run 10 000 000 messages
> through spamc/spamd. In fact the number is 100 000 000 and it means the max.
> size of message I run through spamc/spamd(notice that the number is behind
> -s parametr, s as SIZE). The result about 85 minutes is for about 17000
> messages (354MB). The average is 3,33 sec per message.

That number seems pretty high. I'm not experienced enough in the general deployment of SA to say anything definite, but can only contribute numbers and hints from our own system. We use amavisd-new which doesn't spawn SA but has it running all the time, thus saving lots of time in that area.
Amavisd/postfix/SA can be configured to offer a lot of parallelism and can thus take full advantage of available system resources. Currently we have 32 parallel processes running on a rather small machine (2 cores, 3 GB RAM), and our average per message is around 1.5 seconds.

If you need to improve performance, I suggest you start by looking at the machine. Do you have a lot of iowait? Get faster disks or look at dividing access between multiple drives. Are you swapping? Add more memory. Do you have constantly high CPU usage? Add more CPUs.

Then start looking at the timing reports (I don't know if these are provided by SA or amavisd, so you might not have them in your setup). Each and every mail through the system has a timing report logged so you can see exactly how much time each step of the process took. It looks like this:

Aug 5 00:01:53 post amavis[30559]: (30559-07) TIMING-SA total 1438 ms - parse: 1.60 (0.1%), extract_message_metadata: 35 (2.5%), get_uri_detail_list: 4 (0.3%), tests_pri_-1000: 13 (0.9%), tests_pri_-950: 1.54 (0.1%), tests_pri_-900: 1.55 (0.1%), tests_pri_-400: 33 (2.3%), check_bayes: 31 (2.2%), tests_pri_0: 1280 (89.0%), check_dkim_adsp: 109 (7.6%), check_spf: 40 (2.8%), poll_dns_idle: 35 (2.4%), check_dcc: 525 (36.5%), check_razor2: 492 (34.2%), check_pyzor: 0.25 (0.0%), tests_pri_500: 28 (1.9%), learn: 23 (1.6%), get_report: 1.45 (0.1%)

Here you can see that check_dcc and check_razor2 are pretty expensive, because they have to query external servers. We are a low traffic site (less than 50k messages a day) and that's not a problem for us. But if you have a high volume of traffic and DNS lookup dependent tests take a long time, you might consider adding a local DNS server to your setup. Look at http://www.spamtips.org/2011/07/spamassassin-why-run-your-own-dns.html for further information.

-- 
Lars
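
PS: If you want to rank the steps in a timing report like the one above without reading it by eye, something along these lines might help. It is only a rough sketch: the regular expressions are guessed from the single sample line in this mail, and the names parse_timing, TOTAL_RE and STEP_RE are made up for illustration, so other amavisd or SA versions may log a different layout.

```python
import re

# Sample line copied from the mail above. The layout is assumed from this
# one example only and may differ on other setups.
SAMPLE = (
    "Aug 5 00:01:53 post amavis[30559]: (30559-07) TIMING-SA total 1438 ms - "
    "parse: 1.60 (0.1%), extract_message_metadata: 35 (2.5%), "
    "get_uri_detail_list: 4 (0.3%), tests_pri_-1000: 13 (0.9%), "
    "tests_pri_-950: 1.54 (0.1%), tests_pri_-900: 1.55 (0.1%), "
    "tests_pri_-400: 33 (2.3%), check_bayes: 31 (2.2%), "
    "tests_pri_0: 1280 (89.0%), check_dkim_adsp: 109 (7.6%), "
    "check_spf: 40 (2.8%), poll_dns_idle: 35 (2.4%), "
    "check_dcc: 525 (36.5%), check_razor2: 492 (34.2%), "
    "check_pyzor: 0.25 (0.0%), tests_pri_500: 28 (1.9%), "
    "learn: 23 (1.6%), get_report: 1.45 (0.1%)"
)

# Hypothetical patterns: "TIMING-SA total <ms> ms" followed by
# "<step>: <ms> (<pct>%)" entries.
TOTAL_RE = re.compile(r"TIMING-SA total (\d+) ms")
STEP_RE = re.compile(r"([\w-]+): ([\d.]+) \(([\d.]+)%\)")


def parse_timing(line):
    """Return (total_ms, [(step, ms), ...]) sorted slowest first, or None."""
    match = TOTAL_RE.search(line)
    if match is None:
        return None  # not a TIMING-SA line
    total_ms = int(match.group(1))
    # Only look at the text after the total, so the syslog prefix is ignored.
    entries = STEP_RE.findall(line[match.end():])
    steps = [(name, float(ms)) for name, ms, _pct in entries]
    steps.sort(key=lambda step: step[1], reverse=True)
    return total_ms, steps


if __name__ == "__main__":
    total_ms, steps = parse_timing(SAMPLE)
    print(f"total: {total_ms} ms")
    for name, ms in steps[:5]:
        print(f"{name:28s} {ms:8.2f} ms")
```

Feeding it every TIMING-SA line from the mail log and adding up the times per step should show fairly quickly whether the network checks are where the time goes.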