
Diff for "WebSpam"

Differences between revisions 1 and 14 (spanning 13 versions)
Revision 1 as of 2006-08-16 17:43:19
Size: 599
Editor: mail
Comment:
Revision 14 as of 2007-11-13 01:59:55
Size: 3246
Comment:
Line 2: Line 2:
As spammers have started to focus on dynamic web pages, including wiki pages and other CGI programs, it is becoming more important to track chronic offenders. If you have experienced web spam, record any relevant information here (for example, the IP address and the date and time of the attack). If you are getting hit many times per minute from the same or different IP addresses, consider alerting administrators to the problem on IRC or the HCoop mailing lists. In this case, we may have to take action more quickly to avoid exceeding bandwidth limits.
Line 3: Line 4:
== WebSpam Ideas ==
=== Wikis ===
What are some ideas to defeat spam that defaces web pages?

 * Well, the first thing that comes to mind is enabling wiki user authentication. It would require a few extra steps in order to contribute to a webpage but could be worthwhile if the level of spam is out of control. You could have open enrollment, meaning everyone who signs up for an account can automatically contribute to the wiki.
  . --RobGubler
 * You might notice that hardly any spamming attempts succeed on our MoinMoin wikis. That's because MoinMoin has this wonderful content-based filtering support that consults a central database of all known spam content. I think this kind of program-specific support is the way to go. --AdamChlipala
 * If a wiki is very popular, even MoinMoin's content filtering isn't enough. What I generally recommend is making the FrontPage writable only by registered users, and leaving other pages writable by anyone. This at least helps to save face when new visitors come to your site. --MichaelOlson
=== Blog comments ===
 * I use pyblosxom, which has the option of putting comments in a draft status. The blog maintainer must manually give comments a ".cmt" extension for them to show up. I'm beginning to think that it was a mistake to put a link to a page on my blog called "Guestbook", because I'm now getting about 300 spam comments a day to moderate! Hopefully getting rid of that link will help. --MichaelOlson
 * Aren't [wiki:WikiPedia:CAPTCHA CAPTCHAs] enough? I thought spammers had yet to catch up with simple mechanisms like the one [http://sacha.free.net.ph/notebook/wiki/today.php here]. -- AnilNarayanan
  * Captchas are a suboptimal solution because they block blind users from participating. --MichaelOlson
   * The Wikipedia page linked above mentions that audio captchas are possible as well, but they may not be as widely implemented as purely visual ones at present.
  I guess accessibility might not be an issue if just text (question-answer pairs, say) is involved. It might work well until one's site generates enough interest to have people sit and crack the captchas. -- AnilNarayanan
Line 6: Line 20:
The following info indicates spam attempts, gathered by HCoop users:

 * [http://www.chatmroomcc.info/SpamLOGTC.html History log file by one HCoop user that records spam attempts to his forum site]
== External References ==
 * [wiki:WikiPedia:Blog_spam Wikipedia page on blog spam]
 * [http://www.linksleeve.org/ LinkSleeve]: a community-based project to block spam on different kinds of dynamic web sites.
 * [http://en.wikipedia.org/wiki/Captcha Wikipedia page on Captcha]

1. WebSpam

As spammers have started to focus on dynamic web pages, including wiki pages and other CGI programs, it is becoming more important to track chronic offenders. If you have experienced web spam, record any relevant information here (for example, the IP address and the date and time of the attack). If you are getting hit many times per minute from the same or different IP addresses, consider alerting administrators to the problem on IRC or the HCoop mailing lists. In this case, we may have to take action more quickly to avoid exceeding bandwidth limits.
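
To find out whether you are being hit many times per minute, and from which addresses, something like the following rough sketch might help. It assumes an Apache-style access log in which each line starts with the client IP and contains a bracketed timestamp; the function names and the default threshold of 30 hits per minute are made up for illustration, not anything HCoop prescribes.

```python
import re
from collections import Counter

# Assumed log layout: IP, two more fields, then [dd/Mon/yyyy:HH:MM:SS zone].
# The second group captures the timestamp truncated to the minute.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]:]+:\d{2}:\d{2}):\d{2} [^\]]*\]')


def hits_per_minute(lines):
    """Count requests for each (ip, minute) pair; unparsable lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


def heavy_hitters(lines, threshold=30):
    """Return (ip, minute, hits) tuples at or above the threshold, busiest first."""
    counts = hits_per_minute(lines)
    return [(ip, minute, hits)
            for (ip, minute), hits in counts.most_common()
            if hits >= threshold]
```

Calling heavy_hitters(open(path)) on your own log (path being wherever your log lives) gives the IP address and minute for each burst, which is the kind of information worth recording on this page.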

1.1. WebSpam Ideas

1.1.1. Wikis

What are some ideas to defeat spam that defaces web pages?

  • Well, the first thing that comes to mind is enabling wiki user authentication. It would require a few extra steps in order to contribute to a webpage but could be worthwhile if the level of spam is out of control. You could have open enrollment, meaning everyone who signs up for an account can automatically contribute to the wiki.
  • You might notice that hardly any spamming attempts succeed on our MoinMoin wikis. That's because MoinMoin has this wonderful content-based filtering support that consults a central database of all known spam content. I think this kind of program-specific support is the way to go. --AdamChlipala

  • If a wiki is very popular, even MoinMoin's content filtering isn't enough. What I generally recommend is making the FrontPage writable only by registered users, and leaving other pages writable by anyone. This at least helps to save face when new visitors come to your site. --MichaelOlson
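
The ideas in this list can be combined into a single check that runs before an edit is saved. The sketch below is only an illustration of that idea and is not MoinMoin's actual code: the pattern list, the PROTECTED_PAGES set and the may_save function are all hypothetical names, and a real wiki would load its patterns from a shared, regularly updated list rather than hard-coding them.

```python
import re

# Hypothetical spam patterns, compiled case-insensitively.
BAD_CONTENT = [re.compile(pattern, re.IGNORECASE) for pattern in (
    r"cheap-pills\.example",
    r"casino-bonus\.example",
)]

# Pages that only registered users may edit (an assumption for this sketch).
PROTECTED_PAGES = {"FrontPage"}


def matches_bad_content(text):
    """True if any known spam pattern occurs anywhere in the text."""
    return any(pattern.search(text) for pattern in BAD_CONTENT)


def may_save(page_name, text, user=None):
    """Return (allowed, reason); user is None for an anonymous visitor."""
    if page_name in PROTECTED_PAGES and user is None:
        return False, "login required"
    if matches_bad_content(text):
        return False, "matches spam pattern"
    return True, "ok"
```

Checking the login rule first means an anonymous edit to the FrontPage is refused regardless of its content, while every other page stays open to anyone whose text passes the filter.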

1.1.2. Blog comments

  • I use pyblosxom, which has the option of putting comments in a draft status. The blog maintainer must manually give comments a ".cmt" extension for them to show up. I'm beginning to think that it was a mistake to put a link to a page on my blog called "Guestbook", because I'm now getting about 300 spam comments a day to moderate! Hopefully getting rid of that link will help. --MichaelOlson

  • Aren't [wiki:CAPTCHA CAPTCHAs] enough? I thought spammers had yet to catch up with simple mechanisms like the one [http://sacha.free.net.ph/notebook/wiki/today.php here]. -- AnilNarayanan

    • Captchas are a suboptimal solution because they block blind users from participating. --MichaelOlson

      • The Wikipedia page linked above mentions that audio captchas are possible as well, but they may not be as widely implemented as purely visual ones at present.

      I guess accessibility might not be an issue if just text (question-answer pairs, say) is involved. It might work well until one's site generates enough interest to have people sit and crack the captchas. -- AnilNarayanan
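
Two of the ideas above, text question-answer pairs and holding comments as drafts until the maintainer approves them, could be sketched roughly as follows. This is not pyblosxom's own code: the questions, the ".draft" extension and the function names are assumptions chosen for illustration. Only the ".cmt" extension for published comments comes from the discussion above.

```python
import os

# Hypothetical question-answer pairs; answers are stored in lower case.
QUESTIONS = {
    "What colour is the sky on a clear day?": {"blue"},
    "How many legs does a cat have?": {"4", "four"},
}

DRAFT_EXT = ".draft"    # assumed extension for comments awaiting approval
PUBLISHED_EXT = ".cmt"  # comments with this extension show up on the blog


def answer_ok(question, answer):
    """Accept the answer if it matches, ignoring case and surrounding spaces."""
    accepted = QUESTIONS.get(question, set())
    return answer.strip().lower() in accepted


def save_draft(directory, comment_id, body):
    """Store a new comment as a draft and return the path of the draft file."""
    path = os.path.join(directory, comment_id + DRAFT_EXT)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(body)
    return path


def approve(draft_path):
    """Publish a draft by renaming it to the published extension."""
    base, ext = os.path.splitext(draft_path)
    if ext != DRAFT_EXT:
        raise ValueError("not a draft comment: " + draft_path)
    published_path = base + PUBLISHED_EXT
    os.rename(draft_path, published_path)
    return published_path
```

A comment form would call answer_ok first and only then save_draft, so the moderation queue fills up with fewer automated submissions. Text questions work with a screen reader, but as noted above, a fixed question list stops helping once someone bothers to collect the answers.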

1.2. WebSpam Log

The following info indicates spam attempts, gathered by HCoop users:

1.3. External References

WebSpam (last edited 2010-06-21 17:46:43 by RichardDarst)