spamassassin-sysadmins mailing list archives

From "Kevin A. McGrail" <kevin.mcgr...@mcgrail.com>
Subject Notes about Trap - Re: colo/incoming.spamassassin.org server
Date Tue, 16 Jan 2018 15:00:35 GMT
Joe and Dave + SASA:

I have combed my notes.  Here's what I have. I have removed some 
password info from it, but I think it can help rebuild the process.  Dave, 
can you take a look?  I can get you passwords. Talon1 still exists; 
spamassassin-vm does not, but I have a backup of spamassassin-vm from 3.7 
months ago.

Regards,
KAM

#1 - Some boxes are just names for other boxes
trap-proc.spamassassin.org. Sonic has scripts set up to archive 
collected spam to that server.



#2 - My notes from spamassassin-vm.apache.org, which died catastrophically:

This was the traps cron job that needs to be added on spamassassin-vm:

20 2 * * * rsync -rze ssh --whole-file --size-only --delete jm@trap-proc.spamassassin.org:/home/jm/cor/. /export/home/bbmass/uploadedcorpora/traps/.

DONE - add this traps account
DONE - fix perms for /export/home/bbmass/uploadedcorpora/traps/
DONE - add cron job
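
For anyone rebuilding this job, here is a minimal sketch of the same rsync pull expressed as an argument list, with each flag annotated. The function name is made up for illustration; the source, destination and flags are copied from the cron line above, and the flag descriptions follow the rsync man page.

```python
import shlex


def build_trap_rsync_command(
    source="jm@trap-proc.spamassassin.org:/home/jm/cor/.",
    destination="/export/home/bbmass/uploadedcorpora/traps/.",
):
    """Return the nightly traps rsync as an argv list (hypothetical helper)."""
    return [
        "rsync",
        "-rze", "ssh",    # -r recurse, -z compress, -e ssh: use ssh as the transport
        "--whole-file",   # skip the delta-transfer algorithm, copy files whole
        "--size-only",    # treat files as unchanged when their sizes match
        "--delete",       # remove destination files that no longer exist at the source
        source,
        destination,
    ]


if __name__ == "__main__":
    # Print the command in a form that can be pasted into a shell or crontab.
    print(shlex.join(build_trap_rsync_command()))
```

Note that the job pulls over ssh from trap-proc, so nothing needs to listen on the rsync daemon port for it to work.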


#3 - From April 2017

Let me know if you are not the correct person to talk to about this, but
we are having issues reaching trap-proc.spamassassin.org. It looks like
we have some scripts set up to archive collected spam to that server,
and I haven't seen a successful connection for a few days now.

-- 
Grant Keller
System Operations
grant.keller@sonic.com


#4 - from 2014


The box at Sonic is the backend for the SpamAssassin spamtraps feed.  To 
be honest, I am not sure anyone or anything is consuming the collected 
data at this stage -- it should probably be shut down, unless someone 
wants to take it over?


incoming.spamassassin.org: this is the spamtrap machine at Sonic.
Basically, qpsmtpd handles the incoming SMTP traffic, handing it off via a
Gearman queue to "gears" -- a set of scripts running in the background
which filter out noise, crap, bounces, etc., then buffer them to mbox
files and upload.
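
As a rough illustration of the "filter, then buffer to mbox" stage, here is a minimal sketch using Python's standard mailbox module. It is not the actual gears code, which is not shown in these notes. The bounce heuristics (an empty Return-Path or an Auto-Submitted header) and both function names are assumptions made for the example.

```python
import mailbox
import os
import tempfile
from email.message import Message


def looks_like_bounce(msg):
    """Hypothetical noise rule: null return path or auto-generated mail."""
    return_path = (msg.get("Return-Path") or "").strip()
    auto_submitted = (msg.get("Auto-Submitted") or "no").strip().lower()
    return return_path == "<>" or auto_submitted != "no"


def buffer_to_mbox(messages, mbox_path):
    """Append every message that is not noise to an mbox file; return the count kept."""
    box = mailbox.mbox(mbox_path)  # created on first use if it does not exist
    kept = 0
    try:
        for msg in messages:
            if looks_like_bounce(msg):
                continue
            box.add(msg)
            kept += 1
        box.flush()
    finally:
        box.close()
    return kept


if __name__ == "__main__":
    spam = Message()
    spam["Return-Path"] = "<someone@example.com>"
    spam["Subject"] = "cheap watches"
    spam.set_payload("buy now\n")

    bounce = Message()
    bounce["Return-Path"] = "<>"
    bounce["Subject"] = "Undelivered Mail Returned to Sender"
    bounce.set_payload("delivery failed\n")

    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "buffer.mbox")
        print(buffer_to_mbox([spam, bounce], path), "message(s) buffered")
```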

/home/trap contains the code, /home/trapper is the output files.
/etc/init.d/gears starts the scripts which compose it, copying them to
/tmpfs first so they don't hit the disk where possible, for speed.

The main config file is at /home/trap/code/gears/config .

The buffered mbox files are then uploaded to my S3 account, using an IAM
credential which can only access one single bucket called "mailtrap".
After 1 day those files are auto-expired.
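
The notes do not say how the one-day expiry is configured. One common way is an S3 lifecycle rule, sketched below under that assumption. The rule ID and both function names are invented for the example, and `s3_client` is assumed to be a boto3 S3 client created elsewhere.

```python
def one_day_expiry_configuration(prefix=""):
    """Build a lifecycle configuration that expires objects one day after creation."""
    return {
        "Rules": [
            {
                "ID": "expire-buffered-mbox-after-1-day",  # hypothetical rule name
                "Filter": {"Prefix": prefix},               # empty prefix = whole bucket
                "Status": "Enabled",
                "Expiration": {"Days": 1},
            }
        ]
    }


def apply_expiry(s3_client, bucket="mailtrap"):
    """Apply the rule to the bucket; s3_client is assumed to be boto3.client("s3")."""
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=one_day_expiry_configuration(),
    )
```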

This stuff all appears to be working ok, although the volume is pretty
high (and I suspect it's costing me a fair bit of money even despite the
auto-expiration!)


Next step is spamassassin2.zones.apache.org, which has an alias of
"trap-proc.spamassassin.org" in DNS.  A cron on my user account runs
"/home/trapscripts/copy_to_corpus" which (at least at some point) appears
to have selected a randomised subset of uploaded spam corpora into
/home/jm/cor/spam and /home/jm/cor/nonspam.  Those directories are now
empty, so I think this part may have broken at some point in 2013 :(
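
The copy_to_corpus script itself is not reproduced in these notes. If it has to be rewritten, the "randomised subset" step could look something like the sketch below. The function name, the fraction parameter and the file names are assumptions, not details taken from the original script.

```python
import random


def pick_random_subset(paths, fraction, rng=None):
    """Return a random sample containing roughly `fraction` of the given paths."""
    rng = rng or random.Random()
    candidates = sorted(paths)               # stable order so a seeded rng is repeatable
    count = round(len(candidates) * fraction)
    return rng.sample(candidates, count)     # sampling without replacement


if __name__ == "__main__":
    uploaded = [f"spam.{n}.mbox" for n in range(10)]  # made-up file names
    for path in pick_random_subset(uploaded, 0.5):
        print("would copy", path, "to /home/jm/cor/spam")
```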

I can't track down the script which downloads files from the S3 account,
annoyingly!

Again, everything there runs as "jm".


Finally there is talon1. The host is talon1.pccc.com; username "jm",
password is in spamassassin2.zones.apache.org/root/sought_rules_info.txt
(readable only by root).

That host is being used to generate the SOUGHT rulesets, and as far as I
can see (apologies, I haven't been monitoring it at all recently!) it
still seems to be doing so. It all runs from the "jm" user account, every
4 hours from cron; see "crontab -l".  Part of the process is to
rsync-over-ssh the ham and spam corpus from jm@spamassassin2.


Then the final step of that script is to publish the files to my server
at taint.org, by "svn commit"ing in ~/sought on talon1.  That directory
commits back to an svn repo on that host over svn+ssh, then SSHes to that
host and runs a script; that generates GPG signatures, updates the DNS
records and pushes it to the Cloudfront/S3 bucket for rules.yerp.org.
If/when you guys take this over, this bit definitely needs to be moved to
another host and account, since that's my main personal server ;)

Having said that, I'm happy to hand over the credentials to all the other
"jm" accounts named above.  I've put the passwords into
spamassassin2.zones.apache.org/root/sought_rules_info.txt (readable only
by root). Feel free to take over those accounts and do what you will with
them ;)

Sorry I haven't handed this over earlier -- even reverse-engineering all
this took quite a lot of effort.  Legacy systems suck!

Where the trap data comes from: it's partly a typical spamtrapping MX 
capturing dead domains, and partly /etc/aliases forwards from other ISPs 
around the world, who are following the "how to donate your spamtrap to 
SA" instructions on the wiki.  Note that the latter means that we have 
to strip off a bunch of forwarding steps when/if we act on that data.
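
One way to read "stripping off forwarding steps" is to drop the Received headers added by the donating ISP's forwarding hop, so that the topmost remaining Received header is the one recorded when the spam first hit the trap. The sketch below assumes that reading. The function name and host names are hypothetical, and it only handles simple, non-multipart messages.

```python
from email import message_from_string
from email.message import Message


def strip_forwarding_hops(msg, forwarder_hosts):
    """Return a copy of msg without the leading Received headers from known forwarders."""
    cleaned = Message()
    in_forwarding_prefix = True
    for name, value in msg.items():          # headers in their original order
        if in_forwarding_prefix and name.lower() == "received":
            if any(f"from {host}" in value for host in forwarder_hosts):
                continue                     # hop added by the alias forward: drop it
            in_forwarding_prefix = False     # first real hop reached: keep the rest
        cleaned[name] = value
    cleaned.set_payload(msg.get_payload())
    return cleaned


if __name__ == "__main__":
    raw = (
        "Received: from forwarder.isp.example by incoming.spamassassin.org;"
        " Tue, 16 Jan 2018 15:00:35 +0000\n"
        "Received: from spammer.example by forwarder.isp.example;"
        " Tue, 16 Jan 2018 15:00:30 +0000\n"
        "Subject: test\n"
        "\n"
        "body\n"
    )
    cleaned = strip_forwarding_hops(message_from_string(raw), ["forwarder.isp.example"])
    print(cleaned.as_string())
```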

The domains are MX records hanging off existing, live domains; e.g. I'd
add a "mx.taint.org", seed a few email addresses in those domains (e.g.
for web scrapers), then MX the entire domain to the traps machine.

 > - the alias forwards: where are they pointing to?
Essentially there's a *@incoming.spamassassin.org catch-all, and the alias
forwards redirect spam into named addresses there.


sought_rules_info.txt:

jm@talon1 password: <removed>

incoming.spamassassin.org = traps machine:  u root   p <removed>
                 u jm     p <removed>

jm account on zones2: <removed>

Doing about 74000 messages per day as of 2/5/2014

DONE - 1 - Get SSH access working - Pinged Justin on 4/22 - WORKS as of 
4/23, going to 76.191.162.2

DONE - 2 - why does incoming.spamassassin.org have two IPs? - Emailed Justin

incoming.spamassassin.org. 3507 IN      A       76.191.162.2
incoming.spamassassin.org. 3507 IN      A       75.101.166.134

Huh. I had no idea we were still doing that ;)  That is the Mailchannels 
spamtrap IP.  If you remember back in 2008 (private@ was cc'd), they 
donated spamtrap hosting to us, in exchange for spam data.  We 
eventually moved off the donated spamtrap server (in EC2) which they 
were paying for, to the current one in PCCC. It looks like we never 
changed the 50:50 split setup on the MX record though (and I'd forgotten 
about it).  I think we can probably turn that off now….

76.191.162.2 is our one.  I've just verified that I'm able to SSH to it 
as root.

DONE - 2a - Remove 75.101.166.134 from incoming.spamassassin.org. DNS entry
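
To double-check that only the intended address is still published, a small lookup like the following could be run after the DNS change has propagated. It is a sketch using the standard socket module. The function name is made up, and the resolver parameter exists only so the logic can be exercised without network access.

```python
import socket


def a_records(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses the resolver gives for hostname."""
    infos = resolver(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr is (ip, port).
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    try:
        print(a_records("incoming.spamassassin.org"))
    except socket.gaierror as err:
        print("lookup failed:", err)
```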

3 - more?
On 1/15/2018 11:49 AM, Dave Jones wrote:
> No problem.  No rush.  Just didn't hear from you so I thought you 
> might have missed the last email from Joe.
>
> There's no rsyncd running or listening on port 873 on that box if the 
> rsyncs are supposed to be pushing to it.
>
>
> [root@colo etc]# netstat -tunlap | grep LISTEN
> tcp        0      0 0.0.0.0:35469               0.0.0.0:*                   LISTEN      9693/perl
> tcp        0      0 127.0.0.1:4243              0.0.0.0:*                   LISTEN      20805/java
> tcp        0      0 127.0.0.1:53                0.0.0.0:*                   LISTEN      2175/named
> tcp        0      0 0.0.0.0:22                  0.0.0.0:*                   LISTEN      2306/sshd
> tcp        0      0 127.0.0.1:25                0.0.0.0:*                   LISTEN      2389/master
> tcp        0      0 127.0.0.1:953               0.0.0.0:*                   LISTEN      2175/named
> tcp        0      0 0.0.0.0:7003                0.0.0.0:*                   LISTEN      9693/perl
> tcp        0      0 :::8000                     :::*                        LISTEN      757/httpd
> tcp        0      0 ::1:53                      :::*                        LISTEN      2175/named
> tcp        0      0 :::22                       :::*                        LISTEN      2306/sshd
> tcp        0      0 ::1:953                     :::*                        LISTEN      2175/named
>
>
> I didn't find any cron jobs scheduled that would be 
> pulling/transferring via rsync either.
>
> I aliased my email address to root's on that box so I would get all 
> emails and so far just a few cron jobs with minor issues.
>
> I searched the /home/trap directory for any "signs of life" since that 
> seems to be the main thing set up and running on this box for gearmand. 
> Nothing found in the logs.
>
> Dave
>
>
> On 01/15/2018 10:27 AM, Kevin A. McGrail wrote:
>> Please give me two more days.  There are some DNS issues I'm 
>> researching around trap-proc and some old notes for justin.
>>
>> I think the system is broken because trap-proc should be a cname for 
>> the colo box.
>>
>> There should be rsyncs and traps happening.
>>
>> I am traveling for business and have not had the time I hoped this week.
>>
>> On 1/15/2018 11:24 AM, Dave Jones wrote:
>>> Kevin,
>>>
>>> Are you OK with shutting down this colo box?  I didn't find anything 
>>> running on this box anymore.
>>>
>>> Dave
>>>
>>>
>>> On 01/11/2018 02:01 PM, Joe Muller wrote:
>>>> Kevin, are you okay with shutting down the old server today? It sounds
>>>> like Dave has finished migrating services.
>>>>
>>>> Also, I'll be looking through our capture scripts to make sure they're
>>>> functioning. Expect an email in the next couple days with my 
>>>> findings. :)
>>>>
>>>> -- Joe
>>>>
>>>>
>>>> On 01/11/2018 10:40 AM, Dave Jones wrote:
>>>>> On 01/08/2018 06:56 AM, Kevin A. McGrail wrote:
>>>>>> Hi Joe.
>>>>>>
>>>>>> Great to hear about ns b back online and thanks about the machine.
>>>>>>
>>>>>> Dave has really been leading the effort about the machine. We have a
>>>>>> mirror running on it now.  We're still getting some information about
>>>>>> the old server but it looks awesome.
>>>>>>
>>>>>> Out of interest, do you have any documentation on the capture 
>>>>>> scripts
>>>>>> running at Sonic?  We are trying to really improve our documentation
>>>>>> on systems.
>>>>>>
>>>>>> Regards,
>>>>>> KAM
>>>>>>
>>>>>
>>>>> I have scoured this old colo box for the past week and don't see it
>>>>> really doing anything.  It has a local gearmand running to process
>>>>> queues but its logs don't show any work happening.
>>>>>
>>>>> I added my email address to the root alias and all I see is a minimal
>>>>> logwatch email and some unimportant minor errors from cron output.
>>>>>
>>>>> The local account "trapper" is full of undeliverable email with this:
>>>>>
>>>>> trap-proc.spamassassin.org[192.87.106.247]: Connection timed out
>>>>>
>>>>> It has an Apache webserver running on port 8000 but that appears to
>>>>> only be for Munin reports.
>>>>>
>>>>> It has a local BIND DNS server only listening on 127.0.0.1:53 and
>>>>> 127.0.0.1:953.
>>>>>
>>>>> Unless anyone knows something more, I think the old server can be
>>>>> shut down and given a few days before being pulled.
>>>>>
>>>>> Dave
>>>>>
>>>>>
>>>>>> On 1/6/2018 11:37 PM, Joe Muller wrote:
>>>>>>> Kevin,
>>>>>>>
>>>>>>>     Thanks for letting me know - b.auth-ns was offline for an
>>>>>>> emergency hardware swap. Unfortunately, spinning up a fresh OS and
>>>>>>> the associated DNS backend took longer than expected (I largely
>>>>>>> blame myself) - we are back up to fully operating status as of
>>>>>>> 8:30pm PST.
>>>>>>>
>>>>>>>     On a side note, how's the migration going to the new server? No
>>>>>>> huge rush to get it done, mostly curiosity if everything is to your
>>>>>>> team's satisfaction. It's not very often that I build up systems for
>>>>>>> use by folks outside of Sonic, so I'm always looking for ways to
>>>>>>> improve the process.
>>>>>>>
>>>>>>> -- Joe Muller
>>>>>>> Sonic System Operations
>>>>>>
>>>>>>
>>>>>
>>>>
>>

