5 things you probably did not know about the spammers who spam your website
Because of shoddy spam-script programming and bandwidth-saving shortcuts on the spammers' part, a legitimate user will go through various steps that spammers often will not.
Capitalizing on this will significantly reduce spam.
Furthermore, you probably have a few design flaws in your site that make it exploitable (do you really need commenters to be able to reply to themselves?).
Here is what I have learnt:
1. Spammers do not always follow redirects (and even fewer follow in-content links); real clients will.
Suggestion: Check that the IP address of the user loading the second page is the same as the IP address of the user who sent the post, and only show "Your post was successful" if it is. A redirect alone will stop a large number of spam clients, but a few can still follow redirects, so instead of forcing a redirect, place a link on your page ("Thanks, click here to submit your comment.") to the next page.
Why: If you try to redirect them to another page they more often than not will not bother to follow it. Real clients automatically will. Take advantage of this.
Even though I am far from a TCP guru, I still believe IP address spoofing over TCP is possible even today (I have seen enough comments from invalid IPs in IIS's log file to believe this).
If the source address is incorrect, they will not be able to receive your replies. Their spam is submitted blind. To eliminate this, send a unique code that the client must send back when it requests the second page. The IP address check may also block some anonymous Tor clients, since their apparent address can change between requests.
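The two-step idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `pending` store and the `submit_comment` and `confirm_comment` names are hypothetical, and a real site would keep pending comments in a database and expire them after a while.

```python
import secrets

# Hypothetical in-memory store: confirmation code -> (poster IP, comment text).
pending = {}

def submit_comment(ip, text):
    """Step 1: hold the comment and return a one-time code for the confirmation link."""
    code = secrets.token_urlsafe(16)
    pending[code] = (ip, text)
    return code

def confirm_comment(ip, code):
    """Step 2: publish only if the code is known and comes from the same IP."""
    entry = pending.get(code)
    if entry is None or entry[0] != ip:
        return None  # unknown code, or a different address: do not publish
    del pending[code]  # the code is single-use
    return entry[1]
```

A client that posted blind never sees the code, so it cannot request the second page with it, and a client that never follows the link never calls the second step at all.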
2. Spammers often disconnect from the returned page early (even before confirmation) and do not wait for it to fully load.
Suggestion: Before telling the client the submission was successful, consider checking whether the client is still connected (ASP: If Response.IsClientConnected). If your page is long (which, ironically, helps), a client that has already disconnected has probably not read it (why load a long "Thanks for your comment" page when you can just disconnect and start spamming other sites?). If the client has disconnected, do not allow the comment. Beware: a legitimate user's connection may have been dropped, so if they refresh the page (which they will) the submission should still work.
3. Spammers often keep spamming the same form continuously.
Suggestion: Consider whether multiple form posts from the same IP address (before another IP address comments or replies) would be legitimate. If not, block them.
Why: It is rare on a site that a user would reply to their own comment. Prevent flooding.
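As a rough Python sketch of this rule, assuming you can look up who commented last on each entry: the `last_commenter` dictionary and the `allow_comment` function are hypothetical names, and the in-memory dictionary stands in for whatever storage your site uses.

```python
# Hypothetical in-memory record: entry id -> IP address of the most recent commenter.
last_commenter = {}

def allow_comment(entry_id, ip):
    """Reject a post if the previous comment on this entry came from the same IP."""
    if last_commenter.get(entry_id) == ip:
        return False  # same address posting twice in a row: treat as flooding
    last_commenter[entry_id] = ip
    return True
```

The same address can comment again once someone else has posted, which is the case where a follow-up is most likely to be legitimate.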
4. Spammers do not always request the comments page (they just learn what a valid POST is and keep submitting it).
Suggestion: Try using a unique hidden form value for every entry, and check that disabled forms really are disabled.
Why: A user cannot easily spam 200 of your blog entries if they need to have loaded your comments page for each one to get the unique code for each entry. Also, even if you have disabled comments on a page, make sure your form handling code knows it should not be accepting comments either. The lack of a comment form does not always prevent HTTP POSTs.
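One way to get a unique hidden value per entry without storing anything is to derive it from the entry id with a keyed hash. The Python sketch below assumes that approach. `SECRET_KEY`, `comments_open`, `form_token` and `accept_post` are hypothetical names, and the entry ids are made up for illustration.

```python
import hashlib
import hmac

SECRET_KEY = b"change-me"                 # hypothetical server-side secret
comments_open = {101: True, 102: False}   # hypothetical entry id -> comments enabled

def form_token(entry_id):
    """Value for the hidden field rendered on this entry's comments page."""
    return hmac.new(SECRET_KEY, str(entry_id).encode(), hashlib.sha256).hexdigest()

def accept_post(entry_id, token):
    """Accept a POST only if comments are enabled and the token matches this entry."""
    if not comments_open.get(entry_id, False):
        return False  # comments disabled: refuse even a well-formed POST
    # Compare as bytes in constant time; the token comes from the client.
    return hmac.compare_digest(form_token(entry_id).encode(), token.encode())
```

A token that is fixed per entry still forces one page load per entry. Mixing a timestamp or the visitor's address into the hashed value would make replaying it harder, at the cost of a little more bookkeeping.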
5. Spammers might pretend to be a proxy so you block the wrong address.
Suggestion: Consider being able to blanket ban a proxy and look to see what that proxy is (IP Whois) before banning it.
Why: X-Forwarded-For is only as credible as the proxy that sends it. It is trivial to add an X-Forwarded-For header. Rely on it and you may be blocking legitimate users.
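A small Python sketch of that caution, assuming you keep a list of proxies you have looked up and decided to trust. `TRUSTED_PROXIES` and `address_to_ban` are hypothetical names, the addresses are examples, and `headers` is assumed to be a plain dictionary keyed by the exact header name.

```python
TRUSTED_PROXIES = {"10.0.0.1"}  # hypothetical: proxies you operate or have vetted

def address_to_ban(remote_addr, headers):
    """Pick the address to act on; trust X-Forwarded-For only from a vetted proxy."""
    forwarded = headers.get("X-Forwarded-For", "")
    if remote_addr in TRUSTED_PROXIES and forwarded:
        # The right-most entry is the one appended by the trusted proxy itself;
        # anything to its left was supplied by the client and may be forged.
        return forwarded.split(",")[-1].strip()
    return remote_addr  # unknown sender: ignore the header entirely
```

For any connection that does not come from a vetted proxy, the header is ignored and the connecting address is the one that gets banned, so a forged header cannot point the ban at someone else.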
I have successfully blocked thousands and thousands of spam comments this way. Maybe it is time you did too?