Want to really study botnets? Build your own

Written by John P Mello Jr on January 17, 2011

As all spam fighters know, the enemy’s prime vehicle for choking the Internet with junk mail is the botnet. Those networks of hijacked personal computers are used for a variety of mischief. Not only do they pump out spam, but their computer clients steal confidential information, propagate malware and deny access to Internet services.

Over the years, researchers have developed a number of ways to study botnets in the hope of devising monkey wrenches to throw into their operations. There’s analytical modeling, simulation studies and even experimenting with the nasty nets in the wild. An international team of computer scientists, though, has developed an innovative way to study botnets: they built their own from the ground up in a laboratory.

Emulating a botnet in a lab averts some of the pitfalls found with other testing methods, the octet of boffins argues in a recently released paper titled “The case for in-the-lab botnet experimentation: creating and taking down a 3000-node botnet”.

Analytical models, for example, are complex and often difficult to understand. Simulations typically don’t account for all the moving parts of a botnet.

In-the-wild studies have a number of deficiencies, too. For example, in order to scrutinize the botnet, researchers need to add “entities” to the botnet. If too few entities are added, proper analysis of the botnet may be elusive. If too many entities are introduced, they may tip off the operator of the botnet that something is amiss, which could poison the research.

What’s more, some governments have made it illegal to create entities that join botnets, which opens a legal and ethical can of worms for researchers. Add to that the fact that some researchers have had their domains attacked by the botnets they were trying to study in the wild, plus the difficulty of getting statistically significant results from in-the-wild research, and you can see why researchers are looking for a better way to study the nefarious networks.

Emulation can be a safer and more accurate way to study botnets outside the wild, the researchers maintain. Not only does emulation give researchers more control over their experimental environment, but it also allows them to collect information that would be difficult or virtually impossible to collect in the wild. In addition, by running botnet-fighting schemes within an emulated environment, researchers can maximize the effectiveness of those schemes before they’re actually used against a real botnet. What’s more, emulation studies can be completed much faster than similar research performed in the wild.

For their emulation, the researchers created a Waledac botnet consisting of nearly 3000 nodes. To create the nodes, they relied heavily on virtualization. Their emulation platform was a cluster of 98 blade computers in two 42U racks, each blade with a quad-core processor, eight gigabytes of RAM, dual 136GB SCSI drives and a network card with four gigabit Ethernet ports.

Waledac botnets consist of four layers. The first layer consists of bots called spammers, which make up about 80 percent of the botnet. They’re like the drones in a beehive. They send spam, harvest email addresses from an infected machine and capture confidential information from network traffic that passes through it. Between the spammers and the brains of the network, the command and control (CaC) server, are two additional layers: repeaters and protectors.

Each spammer carries an embedded list, called the RList, with contact information for 100 to 500 repeaters. The spammers send “job” requests to the repeaters on the RList, which forward them to the protectors, which send them on to the CaC server. After receiving the job request, the CaC server issues a job order that travels back through the protectors and repeaters to the spammers.
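To make that round trip concrete, here is a minimal Python sketch of a job request climbing the four layers and the job order coming back down. It is a toy model rather than the researchers' emulation code: the class names, the `build_toy_botnet` helper and the random choice of repeater and protector are all assumptions made for illustration, since the article does not say how a bot picks its next hop.

```python
import random


class CaCServer:
    """Top layer: answers every job request with a job order."""

    def handle(self, request, trace):
        trace.append("cac")
        return {"type": "job order", "for": request["from"]}


class Protector:
    """Relays traffic between the repeaters and the CaC server."""

    def __init__(self, name, cac):
        self.name = name
        self.cac = cac

    def forward(self, request, trace):
        trace.append(self.name)  # request on its way up
        order = self.cac.handle(request, trace)
        trace.append(self.name)  # job order on its way down
        return order


class Repeater:
    """Relays traffic between the spammers and the protectors."""

    def __init__(self, name, protectors):
        self.name = name
        self.protectors = protectors

    def forward(self, request, trace, rng):
        trace.append(self.name)
        # Assumption: the protector is picked at random.
        protector = rng.choice(self.protectors)
        order = protector.forward(request, trace)
        trace.append(self.name)
        return order


class Spammer:
    """Bottom layer: asks for work via a repeater on its RList."""

    def __init__(self, name, rlist):
        self.name = name
        self.rlist = rlist  # embedded list of known repeaters

    def request_job(self, rng):
        trace = [self.name]
        # Assumption: the repeater is picked at random from the RList.
        repeater = rng.choice(self.rlist)
        request = {"type": "job request", "from": self.name}
        order = repeater.forward(request, trace, rng)
        trace.append(self.name)
        return order, trace


def build_toy_botnet(n_repeaters=500, n_protectors=8, rlist_size=100, seed=0):
    """Hypothetical helper: wires up one spammer and the layers above it."""
    rng = random.Random(seed)
    cac = CaCServer()
    protectors = [Protector(f"protector-{i}", cac) for i in range(n_protectors)]
    repeaters = [Repeater(f"repeater-{i}", protectors) for i in range(n_repeaters)]
    spammer = Spammer("spammer-0", rng.sample(repeaters, rlist_size))
    return spammer, rng


if __name__ == "__main__":
    spammer, rng = build_toy_botnet()
    order, trace = spammer.request_job(rng)
    print(" -> ".join(trace))
```

The printed trace always has seven hops: spammer, repeater, protector, CaC server, then the same protector and repeater in reverse before the job order reaches the spammer.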

The researchers’ botnet consisted of 500 repeaters, 2300 spammers, eight protectors and one CaC server, for a total of 2809 nodes.
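Those figures are easy to sanity-check. The short Python sketch below tallies the layers, works out the spammers' share of the botnet, and estimates how many emulated nodes each blade would have to host. The per-blade estimate rests on two assumptions the article does not confirm: that every node is one virtual machine, and that the nodes are spread evenly over all 98 blades. The function names are illustrative.

```python
# Node counts per layer, as reported for the lab botnet.
LAYERS = {"spammers": 2300, "repeaters": 500, "protectors": 8, "cac_server": 1}

# Cluster figures from the article.
BLADES = 98
RAM_PER_BLADE_GB = 8


def total_nodes(layers):
    """Sum the node counts across all layers."""
    return sum(layers.values())


def layer_share(layers, name):
    """Fraction of the whole botnet made up by one layer."""
    return layers[name] / total_nodes(layers)


def nodes_per_blade(layers, blades):
    """Assumption: one VM per node, spread evenly over every blade."""
    return total_nodes(layers) / blades


if __name__ == "__main__":
    per_blade = nodes_per_blade(LAYERS, BLADES)
    print("total nodes:", total_nodes(LAYERS))
    print(f"spammer share: {layer_share(LAYERS, 'spammers'):.1%}")
    print(f"nodes per blade (even spread): {per_blade:.1f}")
    print(f"RAM per node (even spread): {RAM_PER_BLADE_GB / per_blade:.2f} GB")
```

The tally confirms the total of 2809, and the spammers come to roughly 82 percent of the nodes, in line with the "about 80 percent" typical of these botnets. Under the even-spread assumption, each blade would carry between 28 and 29 nodes.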

Laboratory emulations need to meet several criteria, the researchers said. They must be secure, so malicious code isn’t accidentally released into systems outside the experiment. They need to be scalable and realistic, so they exhibit the same kind of characteristics as an in-the-wild botnet. And they need to be flexible, so experiments can be repeated under a variety of conditions.

“In a nutshell,” the researchers wrote, “[emulation] provides a greater verifiable realism than analytical models, simulation methods or small-scale emulations, while providing greater levels of control and safety, and presenting fewer ethical and legal problems than in-the-wild experimentation.”

About John P Mello Jr

John Mello is a freelance writer who has written about business and technical subjects for more than 25 years. He is a frequent contributor to the ECT News Network and his work has appeared in a number of periodicals, including Byte magazine, PC World, Computerworld, CIO magazine and the Boston Globe.