URL Body filtering plugin for SpamPal

by Paul Wright

Filtering on spam support



Introduction

James Farmer's SpamPal is a spam filtering tool for Windows users. It works by sitting between your ISP's server and your mail program. When you download mail, SpamPal checks the path the mail has taken against various publicly available blacklists and marks the mail as spam if it finds mention of a blacklisted host.

URLBody is a plugin for SpamPal which takes this a stage further.

Many blacklists will list a site for spam support, that is, hosting the website of a spammer, even if the spam was not sent from that site. This is a Good Thing, as many spammers use various methods to hide where they are sending from, but they cannot hide their website, since the point of their spam is to attract visitors to their site, where naked cheerleaders with dodgy degrees from unaccredited universities will sell you herbal Viagra.

So, URLBody looks at websites mentioned in the body of any email you receive, and marks the message as spam if any of these sites are blacklisted. It uses the same blacklists as you have configured SpamPal to use for your message headers. This should increase the effectiveness of SpamPal.
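
As a rough illustration of the idea (this is not URLBody's actual code, which is a C DLL), the Python sketch below pulls the hostnames out of a message body and asks a caller-supplied check whether any of them is blacklisted. The names URL_PATTERN, hosts_in_body and is_spam, and the regular expression itself, are assumptions made for the example.

```python
import re
from urllib.parse import urlsplit

# Hypothetical pattern: an http/https URL runs until whitespace, a quote or an angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def hosts_in_body(body):
    """Return the distinct hostnames of URLs found in body, in order of first appearance."""
    seen = []
    for match in URL_PATTERN.finditer(body):
        try:
            host = urlsplit(match.group(0)).hostname
        except ValueError:
            continue  # malformed URL (for example a broken IPv6 literal): skip it
        if host and host not in seen:
            seen.append(host)
    return seen


def is_spam(body, is_blacklisted):
    """Mark the message as spam if any linked host is blacklisted.

    is_blacklisted is a caller-supplied function standing in for the
    blacklists configured in SpamPal.
    """
    return any(is_blacklisted(host) for host in hosts_in_body(body))
```

For example, is_spam(body, lambda host: host == "shop.example") is true for any body that links to shop.example.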

What's new

URLBody has moved over to Sourceforge so that I can allow other people access to the code more easily. Although I'll be keeping the old page as a redirect for the foreseeable future, please update your links to urlbody.sourceforge.net. You can view the SourceForge project page to find files to download, the bug tracker and CVS.

I no longer have access to a Windows machine of my own, and so I'm looking for people to help maintain URLBody. If you've some knowledge of programming on Windows and would like to help, please let me know.

Getting URLBody

URLBody is bundled with recent versions of SpamPal, so you may already have it. Check on the plugins menu for the version, and see whether there's a higher version number available on the releases page. If you already have the latest version, there's no need to upgrade.

The installer

James Farmer kindly created an installer package for URLBody, so the easiest way to install it is to download the installer and run it. It will then guide you through installation.

Installing by hand

Otherwise, you can download a ZIP file containing the URLBody plugin DLL along with the copyright details for the program. To use it, create a directory called plugins underneath the main SpamPal program directory (it may already exist if you have other plugins installed, in which case there's no need to create it again), and create another directory called urlbody underneath that. Put the DLL in the urlbody directory. The main SpamPal directory is usually C:\Program Files\Spampal\, although that depends on where you installed SpamPal.

Once you've put the DLL in the right place, right-click on the SpamPal icon in the system tray, and select Options and then Plugins. You should now select URLBody from the list and click Enable.

Disabling and uninstalling

To disable URLBody, right-click on the SpamPal icon in the system tray, and select Options and then Plugins. Select URLBody from the list and click the Disable button.

To uninstall URLBody, use the installer's "uninstall" option if you used the installer, or just delete the DLL if you installed it by hand.

Frequently Asked Questions

Here are some frequently asked questions and their answers:

What blacklists should I use to make URLBody effective?

I recommend the Spamhaus SBL, as that lists the network space where the most prolific spammers have their websites. I would also recommend blacklisting China, Korea and Brazil (assuming you don't get any legitimate email from these countries and don't expect to be interested in sites hosted there), as these countries are heavily infested with American spammers' sites and full of negligent ISPs who are only too happy to take their money.

URLBody blacklists mails which aren't spam! How do I stop it?

URLBody just uses the blacklists you've told SpamPal to use. Some blacklists are more gung-ho than others about listing. If you have this sort of problem, you should use a more conservative blacklist (see above for my recommendations).

If the emails come from a particular sender, you could also use SpamPal's own whitelisting facilities to allow them through. See the SpamPal documentation for how to do this. You could also use SpamPal's IP address ignorelist to tell it to ignore particular IP addresses.

If a legitimate site shares an IP address with a spammer's site, won't URLBody block that?

Yes. URLBody looks up the IP address of the website and then looks up that IP address in your blacklists. The most prolific spammers have dedicated webservers in countries like China, and do not share IP addresses with legitimate sites.
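
For readers unfamiliar with how a DNS blacklist is consulted, here is a minimal Python sketch of the usual convention: reverse the octets of the IPv4 address, append the blacklist's zone, and see whether that name resolves. URLBody itself goes through SpamPal's plugin interface rather than doing raw lookups like this, so treat the function names and the IPv4-only handling as assumptions made for illustration.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to query: reversed IPv4 octets followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def site_is_listed(hostname, zone):
    """Resolve a website's hostname, then ask the DNSBL about its address.

    Needs network access. Lookup failures are treated as "not listed".
    """
    try:
        ip = socket.gethostbyname(hostname)
    except OSError:
        return False  # the site does not resolve at all
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True   # the query name resolved, so the address is listed
    except socket.gaierror:
        return False  # no such name, so the address is not listed
```

With the Spamhaus SBL zone, dnsbl_query_name("192.0.2.99", "sbl.spamhaus.org") gives the name 99.2.0.192.sbl.spamhaus.org. Because the lookup is on the IP address, every site hosted on that address gets the same answer.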

How can I add particular domains to a local blacklist?

People have asked for the ability to blacklist particular domains locally. I'm considering this, but I'm not sure it's worth it. Spammers like to rotate through domain names to foil filtering tactics like this. IP addresses are harder for them to rotate through, so look up the IP addresses associated with the domain you want to block and add them to SpamPal's local IP address blacklist. The spammers will probably have a block of IP addresses in China or Brazil, so you should probably block the entire space containing that IP if you find spam for that domain still leaks through. If you're not comfortable with using the IP address lookup and whois tools to work out which IP addresses to blacklist, look up the domain at OpenRBL and note which blacklists it appears in, and then configure SpamPal to use those.
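
To show what blocking "the entire space containing that IP" amounts to, here is a small Python sketch that tests an address against a list of blocked networks. The CIDR notation and the function name in_blocked_space are assumptions for the example; check the SpamPal documentation for the format its local blacklist actually accepts.

```python
import ipaddress


def in_blocked_space(ip, blocked_networks):
    """Return True if ip falls inside any of the given networks (CIDR strings)."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(net) for net in blocked_networks)
```

Blocking 203.0.113.0/24 (a range reserved for documentation, used here as a stand-in) would catch 203.0.113.57 and every other address in that block, which is why it still works when the spammer moves a domain to a neighbouring address.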

Can I limit the DNSBLs that URLBody uses?

As people have rightly pointed out, there's not much point in URLBody checking against, say, ORDB. Unfortunately, there's no facility in the SpamPal Plugin interface to specify which DNSBLs you want to use in a plugin. You can do raw DNS lookups, but then URLBody would not take account of SpamPal's local blacklist and whitelist, which would be bad (see the FAQs about how to get URLBody to ignore or blacklist particular sites). James Farmer knows this is something people want, so hopefully he'll change the interface at some point.

Why was this address blacklisted? Remove me!

Please don't email me to ask why a particular address is blacklisted, or to complain about it. URLBody just uses the blacklists which you selected when configuring SpamPal. I've no idea why particular addresses are listed and I can't do anything about it. If you want more information on an address, look it up at OpenRBL. You might also want to see Bill Cole's advice for the blacklisted.

Does URLBody cope with MIME encoding and attachments?

Yes.

Does URLBody cope with URL obfuscation schemes used by spammers?

Yes. I'm also thinking of marking a mail as spam if it uses such a scheme.
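
This page doesn't list which schemes URLBody handles, so the Python sketch below only illustrates three tricks that are commonly seen: percent-encoding the hostname, putting a decoy name before an @ sign, and writing an IPv4 address as a single decimal number. The function name real_host and the choice of schemes are assumptions, not a description of URLBody's code.

```python
import ipaddress
import re
from urllib.parse import unquote, urlsplit


def real_host(url):
    """Return the host a URL actually points at, undoing some common disguises."""
    try:
        # urlsplit already discards any "user@" prefix and ":port" suffix.
        host = urlsplit(url).hostname or ""
    except ValueError:
        return ""  # malformed URL
    host = unquote(host).lower()  # undo %-encoding such as %77%77%77
    if re.fullmatch(r"[0-9]+", host):
        number = int(host)
        if number < 2 ** 32:
            # A bare integer is an IPv4 address written as one number.
            host = str(ipaddress.IPv4Address(number))
    return host
```

For instance, real_host("http://www.bank.example@evil.example/login") returns evil.example, the host that would really be contacted, rather than the decoy name before the @ sign.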

I found a bug! Who do I tell?

First, make sure the bug is in URLBody and not SpamPal itself or another plugin. The easiest way to do this is to disable URLBody and see whether you still have the problem, by repeating whatever it was you were doing when the problem happened.

If you're pretty sure the problem is in URLBody, check the bug reports to see whether someone has already noticed it. If not, submit a new bug report.

Where can I get help with URLBody?

The SpamPal forum for plugins is the best place to ask general questions about URLBody.

Does the DCC's licence change affect URLBody?

To handle MIME decoding, URLBody uses some functions from version 1.1.16 of the Distributed Checksum Clearinghouse (DCC). This code was made available a couple of years ago, under the MIT Licence. I do not intend to change this code as it seems to do the job well enough. The DCC itself recently became non-free; however, the right to use the old code remains unaffected. URLBody does not make use of digital signatures to measure bulkiness and therefore does not fall within the patent which seems to have brought about the DCC's licence change.

Technical details

I used the lcc-win32 compiler to create the DLL. URLBody understands MIME encoded messages (increasingly a favourite with spammers because it makes it harder to write filters based on the message body). The MIME decoding is done using code from dccproc, part of Vernon Schryver's Distributed Checksum Clearinghouse. Note that this plugin doesn't do DCC checking, it just uses the MIME code, and that any mistakes are probably mine and not Vernon Schryver's. The main reason for using code from dccproc is that it has an MIT licence, which allows linking it with SpamPal. At the time I wrote the plugin, I wasn't certain whether the other MIME libraries I found, which were under the GPL, could legally be used with SpamPal. SpamPal was free as in beer but closed source at that point (this is no longer the case, but Vernon's code seems to do the job well enough so I'm not in a hurry to remove it).
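
To see why MIME decoding matters for a body filter, consider the Python sketch below. It uses Python's standard email package rather than the dccproc code, and the function name text_parts is an assumption for the example. It decodes each text part of a message so that URLs hidden by a transfer encoding become visible again.

```python
import email
from email import policy


def text_parts(raw_message):
    """Return the decoded content of every text/* part of a raw message."""
    msg = email.message_from_string(raw_message, policy=policy.default)
    parts = []
    for part in msg.walk():
        # Multipart containers have maintype "multipart" and are skipped here;
        # their children are visited by walk() in turn.
        if part.get_content_maintype() == "text":
            # get_content() undoes base64 or quoted-printable and the charset.
            parts.append(part.get_content())
    return parts
```

A quoted-printable body containing http://pills=2Eexample/ has no recognisable URL in its raw form, but the decoded part contains http://pills.example/, which can then be checked as usual. A part declaring a charset that Python doesn't know would raise LookupError in this sketch.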

Source code

Since you're allowing my code to mess with your email, you might want to make sure I'm not doing anything bad. You can access the source code using CVS or browse it on the web. The source in CVS is a live snapshot of the development code, so it may not correspond to the release that you have, or even work at all.

The source code to the last release is available in the source zip file. The source code is licensed under an MIT licence. What this means is that you can use the source code in your own programs, as long as you maintain the copyright notices on it (see the comment at the top of each file for the full licence).


$Id: index.html,v 1.2 2005/05/03 20:50:43 pw201 Exp $