Archive

Posts Tagged ‘Email’

Is old fashion protection still getting you away from spam?

by .hac on September 12th, 2009

I disappeared again after my last post talking about spam collections and DNS misconfigurations. Today, I read log0’s post which he is calling for bots/ tools for his security research. Did you see anything familiar to you? How log0 is showing his contact to us, “log0 [ at ] gmail [ dot ] com”. We were using this format for quite some time, after we realized that showing full form of our address (eg. spam@onhacks.org) increases the chance that our email get exposed to spammers.

However, these kinds of representation already appeared on the Internet for last few years. Did you ever think of one fact is that: A clever spammers just need to modify few lines of code in their bots, changing the target strings they are looking for, then everything is just working as the same as in the past.

The most interesting thing is that RSnake has blogged his finding on this form of email representation last Tuesday. In short, he has googled with “at gmail dot com”, and surprisingly there are at least 6 email addresses in the first result page. There are many variations, but they all have the same pattern, here are some examples:

spam  at  onhacks  dot  com
spam [at] onhacks [dot] com
spam (at) onhacks (dot) com
spam <at> onhacks <dot> com
spam “at” onhacks “dot” com

(Obviously, I am trying my best to let spammers know my address)

I spent an hour to write a very simple PoC parser to retrieve email addresses from the result page mentioned above. Obviously there are at least 4 valid email addresses, it is not too hard to get those email addresses by bots. The parser is just looking for 1 ‘at’ and 1 ‘dot’ keyword appears sequentially in the pattern: [any word] “at” [any word] “dot” [any word]. The code is poorly written, I will improve it later this week.

It is not so difficult to discover the pattern between these email addresses, just a piece of cake even for primary students. Then, what kind of representation we should use to show our email address on the Internet? Display the jpeg of the email? Without adding noises to the image, it is as easy as just performing text recognition. With noises on the image, it is more like CAPTCHA. Since most of the CAPTCHA solver aims on specific type of CAPTCHA, it may takes more time to decrypt an “encrypted” email using CAPTCHA. However, it is not unsolvable.

What is the takeaway then? Better not showing your address on web! Or encrypt it into CAPTCHA, at least your email address has less chance being captured by spammers.

Email , ,

How to examine a bad word filter?

by .hac on May 31st, 2009

Testing something is always a good practice before learning how to hack something, the methodologies we use in testing sometimes are applicable in hacking. So, I am planning to write some entries related to testing in the coming few months. See if we can have discover a systematic way to hack. Here is the first challenge, it is very simple.

Problem
Network managers always want to or are forced to control the information flowing around a network. Most of the time, filtering is a good way to do the control. Inside this big category, we always like to use block list to prevent information comes in or goes out, to and from the network.

Scenarios
Flora doesn’t want her daughter wallow in Japan pop star. Flora knows that her daughter always navigate to some sites with domain name ending as ‘.jp’, she is looking for a tool that can control what kinds of websites their PC can reach.

IT administrator in PC middle school discovered that their mail system started receiving porn advertisement and students are trying to share these links through the mail system, they are planning to have a filter that can block all such mail flows.

Justin loves blogging so much, he is writing them weekly. He loves to collect and read feedbacks from the audiences. However, he hates those spammer pasting unrelated advertisement on his posts. He want to figure out a way to stop them appearing from other audiences.

Solution
The trivial filtering solution to help these people out is bad word filtering. The basic idea is the same as general block list, users can specify the tokens they want to look for when deciding to block the information. In general, there are at least two different definitions to distinguish whether we found the bad word or not. Given an input message M,

  1. Split the message M into a sequence of words Ws, we found a bad word bW is in the message only if Ws contains bW.
  2. Take the message M as an input stream, we found a bad word bW when there is a list of consecutive characters equals bW.

Both definition has there own advantages and disadvantages, but we will keep this discussion later since the current topic is how to test the filter. Let’s say we pick the first definition for our filter, then what should we test? (Take some time to think about scenarios before continue reading)

Functional Test
According the input of this filter (input message M), we can design few functional test cases. Basic scenarios are,

  • empty message [Expected: Accept];
  • only a word (either good or bad word) [Expected: good - Accept, bad - Reject];
  • two words (good and bad) with different delimiter [Expected: Depends on how the feature define delimiter];
  • a list of word and contains (0, 1, 2, all) bad words [Expected: all reject];
  • a bad word is embedded in a word (eg. assume evil is bad word, message conatins residentevil.com) [Expect: By design, this message will be accepted]

Beside these functional test cases, we should to have a lengthy message to check boundary cases of the feature. Assume the longest message we accept is N characters, we need to have message with length N, N+1 and N+2. On the other hand, globalization and localization test may be required, depends on who is your target user.

Security Concern
Then we would ask: is there other way to bypass the filter (eg. message using different encoding)? Is it possible to have code injection or script injection attack? Who can use the feature? Where is the bad word list? Who have rights to touch the list? These are security concerns when testing the feature. Drawing a data flow diagram always help to identify what kind of security issues we may have. However, this post only focus on functional testing a feature. May be next time we can discuss how to design security test cases of a feature.

Conclusion
We have only discussed some elementary skills to design the test plan of a feature. You can consider what kind of input the feature can have, both valid and invalid input. Output is another way to discover new scenarios, output is anything that the feature shown. Since we assumed that this filter only say accept or reject of a message and throw some exceptions (eg. input size exceed), the test cases we found here are almost dominated by what we found with the input. Now, you are able to test your program more systematically!

Have a good weekend!

Practice (Just for fun)
Should you want to have some practice, we can discuss how to test an IP block list filter. Here is a simple definition:

INPUT: Only allow IPv4 address, one at a time
IMPLEMENTATION: An IP block list is stored as a text file in the same folder of the filter, user need to directly modify the text file if he want to Add/Remove/Edit an IP address in the block list. The filter will perform a binary search to see if the input address is on the list. If it is, then it will announce reject, otherwise output accept.
OUTPUT: Accept/ Reject the address

Email, Testing , , , ,

RE : Encryption VS Compression

by log0 on April 12th, 2009

LP gave a very good reply to the topic of “encrypt-and-compress” or “compress-and-encrypt” , and it is worth highlighting here.

The reason why compression works is that the plaintext contains redundancy. E.g. there are certain patterns in the text, character frequencies are not uniform, etc.

On the other hand, a good encryption algorithm should exhibit good diffusion and confusion. In short, it means that encrypted data should be indistinguishable from random noise. It is obvious that this property should hold regardless of the plaintext, otherwise the encryption algorithm is broken.

Therefore, compress-and-encrypt produces smaller output with no security compromise per se, but encrypt-and-compress is like feeding random noise (whose redundancy is greatly reduced) into the compression algorithm with no obvious security benefit.

In short, encrypt-and-compress poses no obvious security benefit. Moreover, given that a good compression algorithm should be like real noise, and should not contain pattern, it follows that there will be no obvious storage benefit, either.

Email, Protocol , ,