URL Body filtering plugin for SpamPal
by Paul Wright
Filtering on spam support
Introduction
James Farmer's SpamPal is a
spam filtering tool for Windows users. It works by sitting between your ISP's
server and your mail program. When you download mail, SpamPal checks the path
the mail has taken against various publicly available blacklists and marks
the mail as spam if it finds mention of a blacklisted host.
URLBody is a plugin for SpamPal which takes this a stage further.
Many blacklists will list a site for spam
support, that is, hosting the website of a spammer, even if the
spam was not sent from that site. This is a Good Thing, as many spammers
use various methods to hide where they are sending from, but they cannot
hide their website, since the point of their spam is to attract visitors
to their site, where naked cheerleaders with dodgy
degrees from unaccredited
universities will sell you herbal Viagra.
So, URLBody looks at websites mentioned in the body of
any email you receive, and marks the message as spam if any of these
sites are blacklisted. It uses the same blacklists as you have
configured SpamPal to use for your message headers. This should increase
the effectiveness of SpamPal.
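As a rough illustration of the first step, finding the websites mentioned in a message body, here is a minimal Python sketch. It is not URLBody's own code (the plugin is a Windows DLL). The names `URL_PATTERN` and `extract_hosts` are hypothetical, and the pattern is an assumption that only covers plain http and https links.

```python
import re
from urllib.parse import urlsplit

# Hypothetical pattern (an assumption, not URLBody's): an http or https URL
# running up to the next whitespace, angle bracket or quote character.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def extract_hosts(body):
    """Return the distinct hostnames of http(s) URLs in body, in order of appearance."""
    hosts = []
    for match in URL_PATTERN.finditer(body):
        try:
            host = urlsplit(match.group(0)).hostname  # lower-cased, port removed
        except ValueError:
            continue  # skip anything the URL parser rejects
        if host and host not in hosts:
            hosts.append(host)
    return hosts
```

Each hostname found this way would then be resolved to an IP address and checked against the configured blacklists.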
What's new
URLBody has moved over to Sourceforge so that I can allow other people access
to the code more easily. Although I'll be keeping the old page as a redirect
for the foreseeable future, please update your links to urlbody.sourceforge.net. You can
view the SourceForge project
page to find files to download, the bug tracker and CVS.
I no longer have access to a Windows machine of my own, and so I'm looking for people to help maintain URLBody. If you've some knowledge of programming on Windows and would like to help, please let me know.
Getting URLBody
URLBody is bundled with recent versions of SpamPal, so you may already have it.
Check on the plugins menu for the version, and see whether there's a higher
version number available on the releases
page. If you already have the latest version, there's no need to
upgrade.
The installer
James Farmer kindly created an installer package for URLBody, so the easiest way to install it is to download the installer and run it. It will then guide you through installation.
Installing by hand
Otherwise, you can download a ZIP file containing the URLBody plugin along
with the copyright details for the program. To use it, you'll need to
create a directory called
plugins underneath the main SpamPal program directory (this may
already exist if you have other plugins installed, in which case you shouldn't
try to create it again) and create another directory called
urlbody underneath that. Put the DLL in the urlbody
directory. The main SpamPal directory is usually
C:\Program Files\Spampal\, although that might change
according to where you installed SpamPal.
Once you've put the DLL in the right place, right-click on the SpamPal
icon in the system tray, and select Options and then Plugins. You should
now select URLBody from the list and click Enable.
Disabling and uninstalling
To disable URLBody, right-click on the SpamPal icon in the system tray, and
select Options and then Plugins. Select URLBody from the list and click the
Disable button.
To uninstall URLBody, use the installer's "uninstall" option if you used the
installer, or just delete the DLL if you installed it by hand.
Frequently Asked Questions
Here are some frequently asked questions and their answers:
What blacklists should I use to make URLBody effective?
I recommend the Spamhaus
SBL, as that lists the network space where the most prolific spammers
have their websites. I would also recommend blacklisting China, Korea and
Brazil (assuming you don't get any legitimate email from these countries and
don't expect to be interested in sites hosted there), as these countries are
heavily infested with American spammers' sites and full of negligent ISPs who
are only too happy to take their money.
URLBody blacklists mails which aren't spam! How do I stop it?
URLBody just uses the blacklists you've told SpamPal to use. Some blacklists
are more gung-ho than others about listing. If you have this sort of problem,
you should use a more conservative blacklist (see above for my
recommendations).
If the emails come from a particular sender, you could also use SpamPal's own
whitelisting facilities to allow them through. See the SpamPal documentation for how
to do this. You could also use SpamPal's IP address ignorelist to tell it to ignore particular IP addresses.
If a legitimate site shares an IP address with a spammer's site, won't URLBody block that?
Yes. URLBody looks up the IP address of the website and then looks up that IP
address in your blacklists. The most prolific spammers have dedicated
webservers in countries like China, and do not share IP addresses with
legitimate sites.
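To make the two lookups concrete, here is a minimal Python sketch of the general DNSBL convention: the IPv4 octets are reversed, the blacklist's zone is appended, and the resulting name is looked up in DNS. This is not URLBody's code, which relies on SpamPal for its blacklist lookups. The function names are hypothetical, and treating any failed lookup as "not listed" is a simplifying assumption.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Return the DNS name used to ask a DNSBL zone about an IPv4 address."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def host_is_listed(hostname, zone):
    """Resolve a website's hostname, then look the address up in one DNSBL zone."""
    ip = socket.gethostbyname(hostname)  # step 1: website name -> IP address
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))  # step 2: IP -> DNSBL answer
    except socket.gaierror:
        return False  # assumption: no answer is treated as "not listed"
    return True
```

For example, `dnsbl_query_name("192.0.2.99", "sbl.spamhaus.org")` builds the name `99.2.0.192.sbl.spamhaus.org`. Because the lookup is keyed on the IP address alone, every site sharing that address gets the same answer.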
How can I add particular domains to a local blacklist?
People have asked for the ability to blacklist particular domains locally.
I'm considering this, but I'm not sure it's worth it. Spammers like to rotate
through domain names to foil filtering tactics like this. IP addresses are harder for them to rotate through, so look up the IP addresses associated with
that domain and add them to SpamPal's local IP
address blacklist. The spammers will probably have a block of IP
addresses in China or Brazil, so you should probably block the entire space
containing that IP if you find spam for that domain still leaks through. If
you're not comfortable with using the IP address lookup and whois tools to work
out which IP addresses to blacklist, look up that domain at OpenRBL and note which blacklists it appears
in, and then configure SpamPal to use those.
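To show what "the entire space containing that IP" means, here is a minimal Python sketch using the standard `ipaddress` module. The function names are hypothetical, and the prefix length is an assumption: the real size of a spammer's allocation has to come from a whois lookup. The sketch does not show the format of SpamPal's local blacklist.

```python
import ipaddress


def enclosing_block(ip, prefix_length):
    """Return the network of the given prefix length that contains ip."""
    # strict=False lets us pass a host address instead of the network address.
    return ipaddress.ip_network(f"{ip}/{prefix_length}", strict=False)


def in_block(ip, block):
    """Return True if ip falls inside the given network block."""
    return ipaddress.ip_address(ip) in block
```

For example, `enclosing_block("203.0.113.77", 24)` gives the block `203.0.113.0/24`, which also covers `203.0.113.200`.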
Can I limit the DNSBLs that URLBody uses?
As people have rightly pointed out, there's not much point in URLBody checking
against, say, ORDB, which lists open mail relays rather than websites.
Unfortunately, there's no facility in the SpamPal Plugin interface to specify which
DNSBLs you want to use in a plugin. You can do raw DNS lookups, but then
URLBody would not take account of SpamPal's local blacklist and whitelist,
which would be bad (see the FAQs about how to get URLBody to ignore or
blacklist particular sites). James Farmer knows this is something people want,
so hopefully he'll change the interface at some point.
Why was this address blacklisted? Remove me!
Please don't email me to ask why a particular
address is blacklisted, or to complain about it. URLBody just uses
the blacklists which you selected when configuring SpamPal. I've no idea
why particular addresses are listed and I can't do anything about it. If
you want more information on an address, look it up at OpenRBL. You might also want to see Bill Cole's advice for the blacklisted.
Does URLBody cope with MIME encoding and attachments?
Yes.
Does URLBody cope with URL obfuscation schemes used by spammers?
Yes. I'm also thinking of marking a mail as spam if it uses such a scheme.
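The answer above doesn't say which schemes URLBody handles, so the following Python sketch is only an illustration of three common ones: a misleading `user@` prefix, percent-encoded characters in the hostname, and an IPv4 address written as a single decimal number. The function name `normalise_host` is hypothetical, and this is not URLBody's code.

```python
import ipaddress
from urllib.parse import unquote, urlsplit


def normalise_host(url):
    """Return the hostname (or dotted IPv4 address) that a URL really points at."""
    host = urlsplit(url).hostname  # drops any "user@" prefix and the port
    if host is None:
        return None
    host = unquote(host).lower()  # "%65vil.example" -> "evil.example"
    if host.isascii() and host.isdigit() and int(host) <= 0xFFFFFFFF:
        # A bare decimal number is an IPv4 address written as one integer.
        return str(ipaddress.IPv4Address(int(host)))
    return host
```

For example, `http://www.bank.example@%65vil.example/login` normalises to `evil.example`, and `http://3221225985/` to `192.0.2.1`. The normalised host is what would then be checked against the blacklists.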
I found a bug! Who do I tell?
First, make sure the bug is in URLBody and not SpamPal itself or another
plugin. The easiest way to do this is to disable URLBody and see whether you
still have the problem, by repeating whatever it was you were doing when the
problem happened.
If you're pretty sure the problem is in URLBody, check the bug reports to see whether someone has already noticed it. If not, submit a new bug report.
Where can I get help with URLBody?
The
SpamPal forum for plugins is the best place to ask general questions about URLBody.
Does the DCC's licence change affect URLBody?
To handle MIME decoding, URLBody uses some functions from version 1.1.16 of the Distributed Checksum Clearinghouse (DCC).
This code was made available a couple of years ago, under the MIT Licence. I do not
intend to change this code as it seems to do the job well enough. The DCC
itself recently
became non-free; however, the right to use the old code remains
unaffected. URLBody does not make use of digital signatures to measure
bulkiness and therefore does not fall within the patent
which seems to have brought about the DCC's licence change.
Technical details
I used the lcc-win32 compiler
to create the DLL. URLBody understands MIME encoded messages
(increasingly a favourite with spammers because it makes it harder to
write filters based on the message body). The MIME decoding is done
using code from dccproc, part of Vernon Schryver's Distributed Checksum
Clearinghouse. Note that this plugin doesn't do DCC checking (it
just uses the MIME code), and that any mistakes are probably mine and not Vernon
Schryver's. The main reason for using code from dccproc is that it has an
MIT licence, which allows linking it with SpamPal. At the time I wrote the
plugin, I wasn't certain whether the other MIME libraries I found, which were
under the GPL, could legally be used with SpamPal. SpamPal was free as in beer
but closed source at that point (this is no longer the case, but Vernon's code
seems to do the job well enough so I'm not in a hurry to remove it).
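To show why MIME decoding matters for a filter like this, here is a minimal sketch using Python's standard `email` package, swapped in purely for illustration. URLBody itself uses the C code from dccproc, not this. The names `SAMPLE` and `decoded_text_parts` are hypothetical. In the sample message the URL is base64-encoded, so it is invisible to a filter that only reads the raw body.

```python
from email import message_from_string, policy

# Hypothetical sample: a one-part message whose body is base64-encoded.
SAMPLE = (
    "MIME-Version: 1.0\n"
    'Content-Type: text/plain; charset="utf-8"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "aHR0cDovL2V4YW1wbGUuY29tLw==\n"
)


def decoded_text_parts(raw_message):
    """Return the decoded content of every text/* part in a raw message."""
    message = message_from_string(raw_message, policy=policy.default)
    texts = []
    for part in message.walk():  # visits the message and any nested parts
        if part.get_content_maintype() == "text":
            # get_content() undoes base64 or quoted-printable and the charset.
            texts.append(part.get_content())
    return texts
```

Calling `decoded_text_parts(SAMPLE)` recovers the text `http://example.com/`, which can then be scanned for websites in the usual way.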
Source code
Since you're allowing my code to mess with your email, you might want to
make sure I'm not doing anything bad. You can access the source code
using CVS or browse it on the
web. The source in CVS is a live snapshot of the development code, so
it may not correspond to the release that you have, or even work at all.
The source code to the last release is available in the source zip file. The source code is licensed under an MIT licence. This means that you can
use the source code in your own programs, as long as you maintain the copyright
notices on it (see the comment at the top of each file for the full licence).
$Id: index.html,v 1.2 2005/05/03 20:50:43 pw201 Exp $