.NET email spam filter

The Mail.dll .NET email component includes a high-accuracy anti-spam filter.

It uses an enhanced naive Bayesian classifier, specifically modified to handle email messages. Bayesian filtering is a very effective technique for dealing with spam.
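
To give a feel for what a naive Bayesian classifier does, here is a minimal sketch. It is not Mail.dll's implementation: the email-specific enhancements are not described in this article, and TinyBayes with all its members is a hypothetical name made up for illustration. It counts words per class, applies add-one smoothing, and combines the per-word evidence as log-odds.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration only; not part of Mail.dll.
public class TinyBayes
{
    private readonly Dictionary<string, int> _spam = new Dictionary<string, int>();
    private readonly Dictionary<string, int> _ham = new Dictionary<string, int>();
    private int _spamTokens, _hamTokens, _spamDocs, _hamDocs;

    // Very naive tokenizer: lower-case, split on whitespace and basic punctuation.
    private static IEnumerable<string> Tokens(string text) =>
        text.ToLowerInvariant()
            .Split(new[] { ' ', '\t', '\r', '\n', '.', ',', '!', '?' },
                   StringSplitOptions.RemoveEmptyEntries);

    public void Learn(string text, bool isSpam)
    {
        var counts = isSpam ? _spam : _ham;
        foreach (var token in Tokens(text))
        {
            counts.TryGetValue(token, out int n);   // n is 0 for a new token
            counts[token] = n + 1;
            if (isSpam) _spamTokens++; else _hamTokens++;
        }
        if (isSpam) _spamDocs++; else _hamDocs++;
    }

    public double SpamProbability(string text)
    {
        // +1 reserves a slot for tokens never seen in training.
        int vocabulary = _spam.Keys.Union(_ham.Keys).Count() + 1;

        // Start from the (smoothed) prior odds, then add evidence per token.
        double logOdds = Math.Log((_spamDocs + 1.0) / (_hamDocs + 1.0));
        foreach (var token in Tokens(text))
        {
            _spam.TryGetValue(token, out int s);
            _ham.TryGetValue(token, out int h);
            double pSpam = (s + 1.0) / (_spamTokens + vocabulary);
            double pHam = (h + 1.0) / (_hamTokens + vocabulary);
            logOdds += Math.Log(pSpam / pHam);
        }
        return 1.0 / (1.0 + Math.Exp(-logOdds));   // convert log-odds to probability
    }
}
```

The "naive" part is the assumption that words occur independently of each other, which is why the per-word log ratios can simply be summed.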

In our tests we achieved about 99.5% accuracy with a very low false positive rate (9 false positives in 54,972 emails tested, that’s 0.016%).

Training

First, in the learning phase, you teach the classifier to recognize spam and non-spam (ham) messages. You need to prepare 100-200 spam messages and 100-200 ham messages.

I suggest using the following folder structure:

c:\bayes
    learn
        ham
        spam
    test
        ham
        spam

The “learn” folder is used for training the filter. Both the spam and ham folders should contain around 100-200 messages each (the more the better), and the number of messages in the two folders must be equal. You can find a spam archive at the bottom of the article.

Messages must be in eml format with correct line endings (\r\n, that is CR LF: bytes 13 10 decimal, 0D 0A hex).
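
If your training messages come from a source with the wrong extension or Unix-style line endings, a small helper along these lines could prepare them. This is only a sketch using plain .NET file APIs; EmlPrep and its methods are hypothetical names and not part of Mail.dll.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper; not part of Mail.dll.
public static class EmlPrep
{
    // Turns every line break (\r\n, lone \r, or lone \n) into \r\n.
    public static string NormalizeLineEndings(string text) =>
        Regex.Replace(text, @"\r\n|\r|\n", "\r\n");

    // Copies every file from sourceDir to targetDir as *.eml with CRLF line endings.
    public static void ConvertFolder(string sourceDir, string targetDir)
    {
        // ISO-8859-1 maps bytes 0-255 one-to-one to characters,
        // so the message bytes survive the read/write round trip.
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");

        Directory.CreateDirectory(targetDir);
        foreach (string path in Directory.GetFiles(sourceDir))
        {
            string name = Path.GetFileName(path);
            if (!name.EndsWith(".eml", StringComparison.OrdinalIgnoreCase))
                name += ".eml";

            string raw = File.ReadAllText(path, latin1);
            File.WriteAllText(Path.Combine(targetDir, name),
                              NormalizeLineEndings(raw), latin1);
        }
    }
}
```

For example, EmlPrep.ConvertFolder(@"c:\download\spam", @"c:\bayes\learn\spam") would fill the training folder from a hypothetical download folder.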

Now we use the SpamFilterTeacher class to teach the BayesianMailFilter:

// C#
using Limilabs.Mail.Tools.Spam;

BayesianMailFilter filter = new BayesianMailFilter();
SpamFilterTeacher teacher = new SpamFilterTeacher(filter);
teacher.TeachSpam(@"c:\bayes\learn\spam");
teacher.TeachHam(@"c:\bayes\learn\ham");

Testing

The “test” folder is used for testing our filter:

// C#

SpamTestResults r = teacher.Test(
    @"c:\bayes\test\spam",
    @"c:\bayes\test\ham");

Console.WriteLine(r);
r.FalsePositives.ForEach(Console.WriteLine);
r.NotMarkedAsSpam.ForEach(Console.WriteLine);

The results should be similar to this:

Accuracy=0.9949, False positives=9, Not marked as spam=271, Tests count=54972
c:\bayes\test\ham/16874.eml
...
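
The figures in this output are tied together by simple arithmetic. The sketch below assumes that accuracy means the share of correctly classified messages; the output format itself doesn't define it, so treat this as my reading. SpamStats is a hypothetical helper, not a Mail.dll class.

```csharp
using System;

// Hypothetical helper showing how the reported numbers relate; not part of Mail.dll.
public static class SpamStats
{
    // Share of messages that were classified correctly.
    public static double Accuracy(int tests, int falsePositives, int notMarkedAsSpam) =>
        (tests - falsePositives - notMarkedAsSpam) / (double)tests;

    // Share of all tested messages that were ham wrongly flagged as spam.
    public static double FalsePositiveRate(int tests, int falsePositives) =>
        falsePositives / (double)tests;
}
```

With the numbers above, SpamStats.Accuracy(54972, 9, 271) gives approximately 0.9949, and SpamStats.FalsePositiveRate(54972, 9) gives approximately 0.00016, that is about 0.016%.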

When the filter is trained and the results are satisfactory, you can save it to disk:

// C#

filter.Save(@"c:\20111022.mbayes");

Using

You can load the filter from disk and check individual messages:

// C#

BayesianMailFilter filter = new BayesianMailFilter();
filter.Load(@"c:\20111022.mbayes");

// You can use Mail.dll to download the message from a POP3 or IMAP server:
var eml = ...

IMail email = new MailBuilder().CreateFromEml(eml);

SpamResult result = filter.Examine(email);
Console.WriteLine(result.Probability);
Console.WriteLine(result.IsSpam);

If the filter misclassifies a message, you can train it on that message and save the updated filter:

// C#

filter.LearnSpam(email);
// - or -
filter.LearnHam(email);

filter.Save(@"c:\20111022.mbayes");

Spam archives

For the most recent spam you can check this great archive: http://www.untroubled.org/spam/.
Unfortunately, its messages don’t have the correct extension (*.eml) and their line endings are incorrect.

You can download a spam archive containing 7874 spam messages from Oct 2011 here:
/static/mail/spam/spam201110.zip

OAuth 1.0 with Gmail (deprecated)

OAuth is an open protocol that allows secure API authorization in a simple and standard way from desktop and web applications.

In this post I’ll show how to access a Gmail account using the 3-legged OAuth authentication method with the Mail.dll .NET IMAP component. The key advantage of this method is that it allows an application to access a user’s email without knowing the user’s password.

You can read more on OAuth authentication with Google accounts here:
http://code.google.com/apis/accounts/docs/OAuth_ref.html

Gmail IMAP and SMTP using OAuth:
http://code.google.com/apis/gmail/oauth/protocol.html

If your application/website is not registered, you should use the following key and secret:
consumer key: “anonymous”
consumer secret: “anonymous”

Remember to add a reference to Mail.dll and import the appropriate namespaces.

// C#

using System;                // Console
using System.Diagnostics;    // Process
using Limilabs.Client.IMAP;
using Limilabs.Client.Authentication;
using Limilabs.Client.Authentication.Google;

const string consumerKey = "anonymous";
const string consumerSecret = "anonymous";

GmailOAuth oauth = new GmailOAuth(consumerKey, consumerSecret);

string url = oauth.GetAuthorizationUrl("http://localhost:64119/");

// ASP.NET client:
// Save oauth in permanent storage:
// Cache[oauth.RequestToken.Token] = oauth;

// Windows client:
Process.Start(url);

// ASP.NET client:
// Response.Redirect(url);

// Windows client with url:
string rawReturnUrl = Console.ReadLine();
ReturnUrl returnUrl = new ReturnUrl(rawReturnUrl);
oauth.GetAccessToken(returnUrl.OAuthVerifier);

// Windows client with verification code (oob):
// string oauthVerifier = HttpUtility.UrlDecode(Console.ReadLine());
// oauth.GetAccessToken(oauthVerifier);

// ASP.NET client:
// ReturnUrl returnUrl = new ReturnUrl(Request.RawUrl);
// Retrieve oauth from permanent storage:
// GmailOAuth oauth = (GmailOAuth)Cache[returnUrl.OAuthToken];
// oauth.GetAccessToken(returnUrl.OAuthVerifier);

using (Imap client = new Imap())
{
    client.ConnectSSL("imap.gmail.com");
    string oauthImapKey = oauth.GetXOAuthKeyForImap();
    client.LoginOAUTH(oauthImapKey);

    // Now you can access user's emails
    //...

    client.Close();
    oauth.RevokeToken(oauthImapKey);
}

1. GmailOAuth.GetAuthorizationUrl returns the URL you should redirect your user to, so they can authorize access.
As you can see, Mail.dll asks for access to the user’s email information and to Gmail:

2. If you don’t specify the callback parameter, the user will have to manually copy and paste the verification code into your application (oob):

In the case of a web project, you can specify a web address on your website. The oauth_verifier will be included as a parameter of the redirection URL.

After the redirection, your website/application needs to read the oauth_verifier query parameter:

3. GmailOAuth.GetAccessToken authorizes the token.

4. GmailOAuth.GetXOAuthKeyForImap uses the Google API to get the user’s email address and generates the XOAuth key for the IMAP protocol (use GetXOAuthKeyForSmtp for SMTP).

5. GmailOAuth.RevokeToken revokes the XOAuth key, so no further access can be made with it.
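
For readers curious about what reading the oauth_verifier query parameter involves, here is a minimal sketch using only standard .NET classes. In the code above the ReturnUrl class does this job for you; OAuthCallback below is a hypothetical stand-in written for illustration, and I am assuming the redirect carries oauth_token and oauth_verifier as ordinary query parameters.

```csharp
using System;
using System.Web;   // HttpUtility

// Hypothetical illustration; in real code use Mail.dll's ReturnUrl class.
public static class OAuthCallback
{
    // Extracts the request token and the verifier from the URL
    // the user was redirected to after granting access.
    public static (string Token, string Verifier) Parse(string rawReturnUrl)
    {
        var uri = new Uri(rawReturnUrl);

        // ParseQueryString splits on '&' and '=' and URL-decodes the values.
        var query = HttpUtility.ParseQueryString(uri.Query);

        return (query["oauth_token"], query["oauth_verifier"]);
    }
}
```

The token identifies which pending authorization the callback belongs to (which is why the ASP.NET variant stores the oauth object under it), and the verifier is what gets passed to GetAccessToken.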

…and finally the VB.NET version of the code:

' VB.NET

Imports System.Diagnostics
Imports Limilabs.Client.IMAP
Imports Limilabs.Client.Authentication
Imports Limilabs.Client.Authentication.Google

Const consumerKey As String = "anonymous"
Const consumerSecret As String = "anonymous"

Dim oauth As New GmailOAuth(consumerKey, consumerSecret)

Dim url As String = oauth.GetAuthorizationUrl("http://localhost:64119/")

' ASP.NET client:
' Save oauth in permanent storage:
' Cache(oauth.RequestToken.Token) = oauth

' Windows client:
Process.Start(url)

' ASP.NET client:
' Response.Redirect(url)

' Windows client with url:
Dim rawReturnUrl As String = Console.ReadLine()
Dim returnUrl As New ReturnUrl(rawReturnUrl)
oauth.GetAccessToken(returnUrl.OAuthVerifier)

' Windows client with verification code (oob):
' Dim oauthVerifier As String = HttpUtility.UrlDecode(Console.ReadLine())
' oauth.GetAccessToken(oauthVerifier)

' ASP.NET client:
' Dim returnUrl As New ReturnUrl(Request.RawUrl)
' Retrieve oauth from permanent storage:
' Dim oauth As GmailOAuth = CType(Cache(returnUrl.OAuthToken), GmailOAuth)
' oauth.GetAccessToken(returnUrl.OAuthVerifier)

Using client As New Imap()
	client.ConnectSSL("imap.gmail.com")
	Dim oauthImapKey As String = oauth.GetXOAuthKeyForImap()
	client.LoginOAUTH(oauthImapKey)

	' Now you can access user's emails
	'...

	client.Close()
	oauth.RevokeToken(oauthImapKey)
End Using

2-legged OAuth with Gmail

OAuth is an open protocol that allows secure API authorization in a simple and standard way from desktop and web applications.

In this post I’ll show how to access a Gmail account using the 2-legged OAuth authentication method and the Mail.dll .NET IMAP component. The basic idea is that a domain administrator can use this method to access a user’s email without knowing the user’s password.

You can read more on OAuth authentication with Google accounts here:
http://code.google.com/apis/accounts/docs/OAuth_ref.html

Gmail IMAP and SMTP using OAuth:
http://code.google.com/apis/gmail/oauth/protocol.html

Remember to add a reference to Mail.dll and import the appropriate namespaces.

// C#

using Limilabs.Client.IMAP;
using Limilabs.Client.Authentication;
using Limilabs.Client.Authentication.Google;

const string consumerKey = "example.com";
const string consumerSecret = "secret";
const string email = "pat@example.com";

Gmail2LeggedOAuth oauth = new Gmail2LeggedOAuth(
    consumerKey, consumerSecret);

using (Imap client = new Imap())
{
    client.ConnectSSL("imap.gmail.com");

    string oauthImapKey = oauth.GetXOAuthKeyForImap(email);

    client.LoginOAUTH(oauthImapKey);

    //...

    client.Close();
}

' VB.NET

Imports Limilabs.Client.IMAP
Imports Limilabs.Client.Authentication
Imports Limilabs.Client.Authentication.Google

Const consumerKey As String = "example.com"
Const consumerSecret As String = "secret"
Const email As String = "pat@example.com"

Dim oauth As New Gmail2LeggedOAuth(consumerKey, consumerSecret)

Using client As New Imap()
	client.ConnectSSL("imap.gmail.com")

	Dim oauthImapKey As String = oauth.GetXOAuthKeyForImap(email)

	client.LoginOAUTH(oauthImapKey)

	'...

	client.Close()
End Using

Here are the Google Apps configuration screens:

FTP Active vs Passive

The Ftp.dll .NET FTP component supports both Active and Passive mode FTP transfers.

In Active mode the client waits for incoming data connections from the server; in Passive mode the client establishes the data connections itself.

Passive mode is the default. You can switch to Active mode using the Mode property:

// C#

using (Ftp client = new Ftp())
{
    client.Mode = FtpMode.Active;

    client.Connect("ftp.example.com");
    client.Login("user", "password");

    // ...

    client.Close();
}

' VB.NET

Using client As New Ftp()
    client.Mode = FtpMode.Active

    client.Connect("ftp.example.com")
    client.Login("user", "password")

    ' ...

    client.Close()
End Using
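
To make the Active mode side more concrete: in the FTP protocol (RFC 959) the client tells the server where to connect by sending a PORT command that contains its IPv4 address and a port number, each split into bytes. The sketch below only builds that command text; Ftp.dll sends it for you, and FtpPortCommand is a hypothetical name used for illustration.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

// Hypothetical illustration of the wire format; Ftp.dll handles this internally.
public static class FtpPortCommand
{
    // RFC 959: PORT h1,h2,h3,h4,p1,p2 where the port equals p1 * 256 + p2.
    public static string Build(IPAddress address, int port)
    {
        if (address.AddressFamily != AddressFamily.InterNetwork)
            throw new ArgumentException("PORT carries IPv4 addresses only.", nameof(address));
        if (port < 0 || port > 65535)
            throw new ArgumentOutOfRangeException(nameof(port));

        byte[] b = address.GetAddressBytes();
        return $"PORT {b[0]},{b[1]},{b[2]},{b[3]},{port / 256},{port % 256}";
    }
}
```

For example, FtpPortCommand.Build(IPAddress.Parse("192.0.2.5"), 1025) produces "PORT 192,0,2,5,4,1". The server then connects back to that address and port, which is why Active mode often fails when the client sits behind a firewall or NAT.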

Specify different port for FTP

With the Ftp.dll .NET FTP component, establishing a connection on the default port is easy:

// C#

client.Connect("ftp.example.com");

' VB.NET

client.Connect("ftp.example.com")

If you need to specify a different port, use an overloaded version of the Connect method:

// C#

client.Connect("ftp.example.com", 999);
// -or-
client.Connect("ftp.example.com", 999, false);

' VB.NET

client.Connect("ftp.example.com", 999)
' -or-
client.Connect("ftp.example.com", 999, False)

If you are using SSL:

// C#

client.ConnectSSL("ftp.example.com", 999);
// -or-
client.Connect("ftp.example.com", 999, true);

' VB.NET

client.ConnectSSL("ftp.example.com", 999)
' -or-
client.Connect("ftp.example.com", 999, True)

You can also specify the port range used in Active mode:

// C#

client.ActiveModePorts = new Range(1024, 1025);

' VB.NET

client.ActiveModePorts = New Range(1024, 1025)

You can also set the IP address announced to the FTP server for Active mode data transfers.
By default, the value of this property is IPAddress.None, which means that the address of the network interface is used instead.

// C#

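// IPAddress.Parse accepts only IP address literals, not host names.
// 203.0.113.10 is an example address.
client.ActiveModeAddress = IPAddress.Parse("203.0.113.10");

' VB.NET

' IPAddress.Parse accepts only IP address literals, not host names.
' 203.0.113.10 is an example address.
client.ActiveModeAddress = IPAddress.Parse("203.0.113.10")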
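
Note that System.Net.IPAddress.Parse throws a FormatException when given a host name. If you only have a name for the address you want to announce, you would need to resolve it first. The helper below is a sketch using standard .NET DNS calls; AnnounceAddress is a hypothetical name and not part of Ftp.dll.

```csharp
using System;
using System.Linq;
using System.Net;
using System.Net.Sockets;

// Hypothetical helper; not part of Ftp.dll.
public static class AnnounceAddress
{
    // Accepts either an IP literal or a host name and returns an IP address.
    public static IPAddress FromHostOrIp(string hostOrIp)
    {
        // IP literals are returned directly, without any network access.
        if (IPAddress.TryParse(hostOrIp, out IPAddress literal))
            return literal;

        // DNS lookup: throws SocketException if the name cannot be resolved,
        // and First throws InvalidOperationException if there is no IPv4 address.
        return Dns.GetHostAddresses(hostOrIp)
                  .First(a => a.AddressFamily == AddressFamily.InterNetwork);
    }
}
```

You could then write client.ActiveModeAddress = AnnounceAddress.FromHostOrIp("client.example.com"), where client.example.com is a made-up name standing for the address the server should connect back to.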