
May 25, 2004



*** SPAM *** inserted into the headers, and a spam threshold of 5.00 points, sounds very much like SpamAssassin.

SA is a weird hybrid of rule-based and Bayesian filtering, at least with the default settings intact (and it is highly configurable). It assigns scores based on regular expression matches, but also on a Bayesian estimate of the spam-icity of the mail. (A 99% spam likelihood scores 5.4 points, enough to mark the mail as spam in the absence of other indicators.) It trains the filter automatically using messages that score especially high or low on the rules. You can also train it yourself if you like; Yahoo! may or may not be training its SA/SA-like filters on everyone's Junk folder.
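To make the hybrid idea concrete, here is a rough Python sketch of rule points and Bayesian points being summed against a threshold. It is not SpamAssassin's actual code: the rule names, regexes, rule scores, and the `bayes_points` mapping are all made up for illustration. Only the 5.0 threshold and the "99% likelihood scores 5.4 points" figure come from the description above.

```python
import re

THRESHOLD = 5.0  # the spam threshold mentioned above

# Hypothetical rules: (name, pattern, points). Not real SpamAssassin rules.
RULES = [
    ("SUBJ_ALL_CAPS", re.compile(r"^Subject: [^a-z]*$", re.M), 1.5),
    ("MENTIONS_FREE", re.compile(r"\bfree\b", re.I), 0.8),
]


def bayes_points(p_spam):
    """Map a Bayesian spam probability to points (assumed mapping)."""
    if p_spam >= 0.99:
        return 5.4   # figure quoted in the comment above
    if p_spam <= 0.01:
        return -2.0  # assumption: very ham-like mail earns negative points
    return 0.0       # assumption: middling probabilities are neutral


def score_message(raw, p_spam):
    """Return (total points, is_spam) for a raw message string."""
    hits = [(name, pts) for name, rx, pts in RULES if rx.search(raw)]
    total = sum(pts for _, pts in hits) + bayes_points(p_spam)
    return total, total >= THRESHOLD
```

With these made-up numbers, a message that trips no rules at all still crosses the threshold if the Bayesian estimate is 99% or higher, which is the situation described in the parenthetical above.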

I'm under the impression that statistical filters train on the headers of a mail as well as the body, so even an otherwise seemingly innocuous mail may be picked up without rules because it has suspicious "words" (hostnames, usually) in its headers. Email headers can be quite long -- mails that come through (legitimate) mailing lists for me often have headers that contain about three times as many characters as the body. So it is possible that even a purely statistical system may trap "normal looking" mails based on their headers. And besides, these messages probably look a lot like those annoying site stat mails from the internetseer people, which I absolutely did not sign up for and nevertheless receive regularly. Innocent site stats are spammier than they appear :)
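As an illustration of how headers can end up as "words", here is a small Python sketch that tokenizes both the headers and the body of a message using the standard `email` module. The token pattern and the `header-name:token` prefixing scheme are my own assumptions for the example, not a description of what SpamAssassin or Yahoo actually does.

```python
import re
from email import message_from_string

# Assumed token pattern: keeps hostnames like mail.example.com in one piece.
TOKEN_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9.\-]*")


def tokenize(raw):
    """Return tokens from headers (prefixed with the header name) and body."""
    msg = message_from_string(raw)
    tokens = []
    for name, value in msg.items():
        for tok in TOKEN_RE.findall(value):
            # Prefixing keeps a hostname seen in Received: distinct from
            # the same string appearing in the body.
            tokens.append(f"{name.lower()}:{tok.lower()}")
    body = msg.get_payload()
    if isinstance(body, str):  # skip multipart messages in this sketch
        tokens.extend(t.lower() for t in TOKEN_RE.findall(body))
    return tokens


raw = (
    "Received: from mail.example.com\n"
    "Subject: Weekly stats\n"
    "\n"
    "Your site had 12 visits\n"
)
tokens = tokenize(raw)
header_tokens = [t for t in tokens if ":" in t]
```

Even in this tiny example nearly half the tokens come from the headers, so a filter trained on them has plenty of material to judge a message by before it looks at the body.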


However, I also have a sitemeter account that sends me emails, which I collect via my Yahoo address, and they always get through as non-spam.

Maybe it is you.

Semantic Compositions

Could be SpamAssassin; for what it's worth, Yahoo claims that it's proprietary:

SpamGuard: This proprietary system is intended to radically reduce the amount of spam you receive in your inbox. SpamGuard is designed to direct most spam to your Bulk Mail folder, to help you better manage your mail.

If it walks like a duck, and talks like a duck, though...

Sadly, without paying (and for the volume of mail I get at the official SC address, it's not worth paying) there are no filter settings for me to play with. I understand that it would be somewhat more configurable if I shelled out for it, but that's not likely to happen soon.

I'm not at all prepared to rule out the possibility that it could just be me. It would hardly be the first time I've been so lucky.


My Yahoo account wavers in what it decides is spam and what isn't. This week it has decided that my Salon newsletters should be spam, but then it goes for months where it allows them into the regular inbox before spamming them again. It also occasionally puts emails from my old college roommate in the spam folder, for absolutely no reason at all. Maybe it just wants attention.

buy valtrex

What is the number of addresses that you can send a newsletter to without Yahoo's spam filter stopping it?
I'm trying to email my high school newsletter to a select group of people. I set up groups of 25 and they are blocked. How low do I have to go per group?


The comments to this entry are closed.