
Diff for "WebSpam"

Differences between revisions 4 and 17 (spanning 13 versions)
Revision 4 as of 2006-08-16 19:40:56
Size: 1096
Editor: mail
Comment:
Revision 17 as of 2010-03-27 20:33:04
Size: 3339
Editor: 184-90-133-95
Comment:
Line 2: Line 2:

As spammers have started to focus on dynamic web pages, including wiki pages and other CGI programs, it is becoming more important to track chronic offenders.  If you have an experience with web spam, record any relevant information here (perhaps including IP address and date/time of attack).  If you are getting hit many times per minute from the same or different IP addresses, consider using IRC to alert administrators to the problem or the HCoop mailing lists.  In this case, we may have to take action more quickly to avoid surging over bandwidth limits.
As spammers have started to focus on dynamic web pages, including wiki pages and other CGI programs, it is becoming more important to track chronic offenders. If you have an experience with web spam, record any relevant information here (perhaps including IP address and date/time of attack). If you are getting hit many times per minute from the same or different IP addresses, consider using IRC to alert administrators to the problem or the HCoop mailing lists. In this case, we may have to take action more quickly to avoid surging over bandwidth limits.
Line 6: Line 5:
=== Wikis ===
Line 9: Line 8:
 * Well, the first thing that comes to mind is enabling wiki user authentication. It would require a few extra steps in order to contribute to a webpage but could be worthwhile if the level of spam is out of control. You could have open enrollment, meaning everyone who signs up for an account can automatically contribute to the wiki.
  . --RobGubler
 * You might notice that hardly any spamming attempts succeed on our MoinMoin wikis. That's because MoinMoin has this wonderful content-based filtering support that consults a central database of all known spam content. I think this kind of program-specific support is the way to go. --AdamChlipala
 * If a wiki is very popular, even MoinMoin's content filtering isn't enough. What I generally recommend is making the FrontPage writable only to registered users, and letting other pages be writable by anyone. This at least helps to save face when new visitors come to your site. --MichaelOlson
=== Blog comments ===
 * I use pyblosxom, which has the option of putting comments in a draft status. The blog maintainer must manually give comments a ".cmt" extension for them to show up. I'm beginning to think that it was a mistake to put a link to a page on my blog called "Guestbook", because I'm now getting about 300 spam comments a day to moderate! Hopefully getting rid of that link will help. --MichaelOlson
 * Aren't [[WikiPedia:CAPTCHA|CAPTCHAs]] enough? I thought spammers had yet to catch up with simple mechanisms like the one [[http://sacha.free.net.ph/notebook/wiki/today.php|here]]. -- AnilNarayanan
  * Captchas are a suboptimal solution because they block blind users from participating. --MichaelOlson
   * The Wikipedia page linked above mentions that audio captchas are possible as well, but they may not be as widely implemented as purely visual ones at present.
  I guess accessibility might not be an issue if just text (question-answer pairs, say) is involved. It might work well until one's site generates enough interest to have people sit and crack the captchas. -- AnilNarayanan
Line 10: Line 20:
Line 13: Line 22:
 * [http://www.chatmroomcc.info/SLOGTC.html log file by HCoop user that records spam attempts to his site]
 * [[http://www.chatmroomcc.info/SpamLOGTC.html|History log file by one HCoop user that records spam attempts to his forum site]]
Line 16: Line 24:

 * [http://en.wikipedia.org/wiki/Blog_spam Wikipedia page on blog spam]
 * [http://www.linksleeve.org/ LinkSleeve]: a community-based project to block spam on different kinds of dynamic web sites.
 * [[WikiPedia:Blog_spam|Wikipedia page on blog spam]]
 * [[http://www.linksleeve.org/|LinkSleeve]]: a community-based project to block spam on different kinds of dynamic web sites.
 * [[http://en.wikipedia.org/wiki/Captcha|Wikipedia page on Captcha]]

1. WebSpam

As spammers have started to focus on dynamic web pages, including wiki pages and other CGI programs, it is becoming more important to track chronic offenders. If you have an experience with web spam, record any relevant information here (perhaps including IP address and date/time of attack). If you are getting hit many times per minute from the same or different IP addresses, consider using IRC to alert administrators to the problem or the HCoop mailing lists. In this case, we may have to take action more quickly to avoid surging over bandwidth limits.
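To make "many times per minute from the same IP address" concrete, here is a minimal Python sketch that counts hits per IP per minute and lists the addresses that go over a limit. It assumes you have already parsed your server log into (IP, timestamp) pairs; the function names and the threshold of 10 are illustrative assumptions, not an HCoop tool or policy.

```python
from collections import Counter
from datetime import datetime, timedelta


def hits_per_minute(records):
    """Count requests per (IP, minute) from an iterable of (ip, datetime) pairs."""
    counts = Counter()
    for ip, when in records:
        # Truncate to the minute so all hits within the same minute share one key.
        minute = when.replace(second=0, microsecond=0)
        counts[(ip, minute)] += 1
    return counts


def chronic_offenders(records, threshold=10):
    """Return the sorted IPs with more than `threshold` hits in any single minute.

    The default threshold is an assumed value; pick one that suits your site.
    """
    counts = hits_per_minute(records)
    return sorted({ip for (ip, _minute), n in counts.items() if n > threshold})


if __name__ == "__main__":
    # Made-up sample data using documentation-only IP addresses.
    start = datetime(2010, 3, 27, 20, 33, 0)
    sample = [("192.0.2.1", start + timedelta(seconds=i)) for i in range(12)]
    sample += [("192.0.2.2", start + timedelta(minutes=i)) for i in range(12)]
    print(chronic_offenders(sample))
```

The IP addresses and minutes that this flags are the sort of detail worth recording in the log section of this page.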

1.1. WebSpam Ideas

1.1.1. Wikis

What are some ideas to defeat spam that defaces web pages?

  • Well, the first thing that comes to mind is enabling wiki user authentication. It would require a few extra steps in order to contribute to a webpage but could be worthwhile if the level of spam is out of control. You could have open enrollment, meaning everyone who signs up for an account can automatically contribute to the wiki.
  • You might notice that hardly any spamming attempts succeed on our MoinMoin wikis. That's because MoinMoin has this wonderful content-based filtering support that consults a central database of all known spam content. I think this kind of program-specific support is the way to go. --AdamChlipala

  • If a wiki is very popular, even MoinMoin's content filtering isn't enough. What I generally recommend is making the FrontPage writable only to registered users, and letting other pages be writable by anyone. This at least helps to save face when new visitors come to your site. --MichaelOlson
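The two ideas above (content-based filtering, plus a front page that only registered users can edit) can be combined in a few lines. The following is a rough Python sketch of the general approach, not MoinMoin's actual code or configuration: the pattern list, the function names, and the `protected_pages` parameter are all hypothetical, and a real filter would load its patterns from a shared, regularly updated list.

```python
import re

# Hypothetical blacklist patterns; a real deployment would fetch a shared list.
BAD_CONTENT_PATTERNS = [
    r"cheap-pills\.example",
    r"casino-\w+\.example",
]


def is_spam(text, patterns=BAD_CONTENT_PATTERNS):
    """Return True if any blacklist pattern occurs anywhere in the text."""
    return any(re.search(p, text, re.IGNORECASE) for p in patterns)


def may_save(page_name, text, user_is_registered, protected_pages=("FrontPage",)):
    """Decide whether an edit should be accepted.

    Protected pages accept edits only from registered users; every other
    page is open to anyone, as long as the new text passes the content filter.
    """
    if page_name in protected_pages and not user_is_registered:
        return False
    return not is_spam(text)
```

For example, `may_save("FrontPage", "hello", user_is_registered=False)` is rejected by the page rule, while an anonymous edit to any other page is rejected only if it matches a blacklist pattern.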

1.1.2. Blog comments

  • I use pyblosxom, which has the option of putting comments in a draft status. The blog maintainer must manually give comments a ".cmt" extension for them to show up. I'm beginning to think that it was a mistake to put a link to a page on my blog called "Guestbook", because I'm now getting about 300 spam comments a day to moderate! Hopefully getting rid of that link will help. --MichaelOlson

  • Aren't CAPTCHAs enough? I thought spammers had yet to catch up with simple mechanisms like the one here. -- AnilNarayanan

    • Captchas are a suboptimal solution because they block blind users from participating. --MichaelOlson

      • The Wikipedia page linked above mentions that audio captchas are possible as well, but they may not be as widely implemented as purely visual ones at present.

      I guess accessibility might not be an issue if just text (question-answer pairs, say) is involved. It might work well until one's site generates enough interest to have people sit and crack the captchas. -- AnilNarayanan
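A text-only question-answer challenge of the kind suggested above might look something like this in Python. It is only a sketch under stated assumptions: the questions, the `CHALLENGES` table, and both function names are made up for illustration and are not part of pyblosxom or any other blog package. Because the challenge is plain text, it works with screen readers, though a determined spammer can still collect the answers by hand.

```python
import random

# Hypothetical question/answer pairs; each site should write its own.
CHALLENGES = {
    "What is two plus three? (answer in digits)": "5",
    "What color is a clear daytime sky?": "blue",
}


def pick_challenge(rng=random):
    """Pick one question to display next to the comment form."""
    return rng.choice(sorted(CHALLENGES))


def check_answer(question, answer):
    """Return True if the answer matches, ignoring case and surrounding spaces."""
    expected = CHALLENGES.get(question)
    if expected is None:
        # Unknown question: treat as a failed challenge rather than raising.
        return False
    return answer.strip().lower() == expected
```

A comment form would show the result of `pick_challenge()`, submit the question along with the visitor's answer, and keep the comment in draft status if `check_answer` returns False.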

1.2. WebSpam Log

The following info indicates spam attempts, gathered by HCoop users:

1.3. External References

WebSpam (last edited 2010-06-21 17:46:43 by RichardDarst)