Saturday, 26 April 2008

Doing my duty - Malicious activity detection

Linksys WRT54G
For several years I have been using a Linksys WRT54G V1.1 wireless router, running the official firmware, attached to my cable modem. It has a built-in firewall but lacks any proper network intrusion detection system. It did have a rudimentary log, which could be accessed via the web-page interface, but that was pretty much useless when looking for malicious activity such as denial of service attacks, port scans and attempts to crack into the network via vulnerabilities.

It always worried me what attempts were being made to get past the firewall and on to my network. Some of my network PCs have software firewalls as a backup and alerts have been almost non-existent, but it still nagged away. Last year I toyed with taking an old PC and creating a Smoothwall firewall, because of the logging and the ability to install Snort, but I couldn't really justify the space and expense of having another PC on 24/7.

As I posted yesterday, I finally got the nerve up to install one of the numerous third-party replacement firmwares for the router, plumping for the Tomato variety!

One of the first things I noticed was the improved logging, which can be sent to an external PC running a monitoring/analysis program. The Wikibooks page mentioned the WallWatcher software for Windows, so I downloaded and installed it, configured the router and, lo and behold, I was getting information about all those Chinese TCP/IP packets bombarding my router!

I have now signed up and installed the necessary client software to upload the logs to the SANS Institute Internet Storm Center's DShield system and to myNetWatchman. These organisations rely on volunteers who submit their data to help detect problems and analyse threats, producing technical information and alerts for the general public.

The system works by having a network of hundreds or thousands of people from all over the world submitting information from their firewalls and intrusion detection systems about unwanted traffic arriving from the Internet. This data feeds the appropriate database where analysis is made looking for abnormal trends and behaviour. In the case of DShield the resulting analysis is posted to the ISC's main web page where it can be automatically retrieved by simple scripts or can be viewed in near real time by any Internet user.

I really feel like I am doing something good, and of course it is fairly geeky! These are the 'attacked' ports from today!

My pie chart
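
A chart like this boils down to counting destination ports in the firewall log. Here is a minimal sketch of that tally in Python. It assumes the router writes iptables-style log lines with PROTO= and DPT= fields; the sample lines, addresses and the tally_ports name are invented for illustration and are not taken from my actual logs or from WallWatcher.

```python
import re
from collections import Counter

# Assumption: each log line carries iptables-style PROTO= and DPT= fields.
PORT_RE = re.compile(r"\bPROTO=(\w+)\b.*?\bDPT=(\d+)\b")


def tally_ports(lines):
    """Count log entries per (protocol, destination port)."""
    counts = Counter()
    for line in lines:
        match = PORT_RE.search(line)
        if match:  # lines without a destination port (e.g. ICMP) are skipped
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


# Invented sample lines using documentation-only IP addresses.
SAMPLE_LOG = [
    "Apr 26 10:01:02 router kernel: DROP IN=vlan1 OUT= SRC=203.0.113.7 DST=198.51.100.2 PROTO=TCP SPT=4312 DPT=445",
    "Apr 26 10:01:09 router kernel: DROP IN=vlan1 OUT= SRC=203.0.113.9 DST=198.51.100.2 PROTO=TCP SPT=2201 DPT=135",
    "Apr 26 10:02:30 router kernel: DROP IN=vlan1 OUT= SRC=203.0.113.7 DST=198.51.100.2 PROTO=TCP SPT=4313 DPT=445",
    "Apr 26 10:03:11 router kernel: DROP IN=vlan1 OUT= SRC=203.0.113.20 DST=198.51.100.2 PROTO=UDP SPT=1030 DPT=1026",
    "Apr 26 10:04:00 router kernel: DROP IN=vlan1 OUT= SRC=203.0.113.21 DST=198.51.100.2 PROTO=ICMP TYPE=8 CODE=0",
]

if __name__ == "__main__":
    for (proto, port), hits in tally_ports(SAMPLE_LOG).most_common():
        print(f"{proto}/{port}: {hits}")
```

Feeding the counts into any charting tool would give a pie chart of the most probed ports.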


Wednesday, 12 March 2008

Can data overload protect privacy?

Privacy advocates are probably foaming at the mouth with the shocking revelation that in June 2006 all conversations on the MSN instant messaging system were being collected and passed to researchers Eric Horvitz and Jure Leskovec at Microsoft Research.

They claim they weren't interested in the content of the messages but were simply investigating the behaviour of a 'planetary scale system'.

There is nothing earth-shattering about the results: they show people are more likely to chat with others in the same geographical location and age group, and of the same sex. But as is pointed out by the arXiv blogger, the most interesting aspect of the research is the fact that the researchers struggled to cope with the size of their dataset.
"The dataset consisted of 30 billion conversations generated by 240 million distinct users over one month. We found that approximately 90 million distinct Messenger accounts were accessed each day and that these users produced about 1 billion conversations, with approximately 7 billion exchanged messages per day."
"The sheer size of the data limits the kinds of analyses one can perform,"

"Each day yielded about 150 gigabytes of compressed text logs (4.5 terabytes in total). Copying the data to a dedicated eight-processor server with 32 gigabytes of memory took 12 hours. Our log-parsing system employed a pipeline of four threads that parse the data in parallel, collapse the session join/leave events into sets of conversations, and save the data in a compact compressed binary format. This process compressed the data down to 45 gigabytes per day. Processing the data took an additional 4 to 5 hours per day."
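
The figures in those quotes hang together, which a few lines of Python can check. This is only back-of-envelope arithmetic on the quoted numbers. It assumes a 30-day month (June) and decimal units (1 terabyte = 1000 gigabytes); the names are my own.

```python
# Figures taken from the quoted passages.
DAYS_IN_MONTH = 30                   # assumption: June, 30 days
RAW_GB_PER_DAY = 150                 # compressed text logs per day
PARSED_GB_PER_DAY = 45               # compact binary format per day
CONVERSATIONS_PER_DAY = 1_000_000_000
MESSAGES_PER_DAY = 7_000_000_000


def monthly_total_tb(gb_per_day, days=DAYS_IN_MONTH):
    """Monthly volume in terabytes, assuming 1 TB = 1000 GB."""
    return gb_per_day * days / 1000


raw_total_tb = monthly_total_tb(RAW_GB_PER_DAY)        # the quoted 4.5 terabytes
parsed_total_tb = monthly_total_tb(PARSED_GB_PER_DAY)  # size after re-encoding
size_ratio = PARSED_GB_PER_DAY / RAW_GB_PER_DAY        # fraction of original size
conversations_per_month = CONVERSATIONS_PER_DAY * DAYS_IN_MONTH
messages_per_conversation = MESSAGES_PER_DAY / CONVERSATIONS_PER_DAY

if __name__ == "__main__":
    print(f"raw logs for the month:      {raw_total_tb} TB")
    print(f"parsed data for the month:   {parsed_total_tb} TB")
    print(f"parsed size / raw size:      {size_ratio:.0%}")
    print(f"conversations per month:     {conversations_per_month:,}")
    print(f"messages per conversation:   {messages_per_conversation}")
```

So the binary format is still well over a terabyte for the month, and an average conversation is only about seven messages long.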
For years now various security services around the world have made moves to assemble databases of online communication. They want to watch over phone calls, social networking sites and emails. But extracting useful information, not just generalities like the study mentioned, is going to require massive amounts of storage and processing power.

But to quote the arXiv blogger:
So will data overload always protect us from Big Brother’s prying eyes? Perhaps in some circumstances like these but otherwise I wouldn’t count on it. It’s straightforward to sample big datasets like this (although that can introduce problems of its own).

I wouldn’t mind betting that with a little more effort, it would be possible to identify individuals from their travel and chatting patterns, perhaps by correlating the data with local telephone and business directories much in the same way this has been done with search data. However, it looks as if Horvitz and Leskovec have steered carefully around this issue.

Of course, Microsoft doesn’t need to do this since it can store a much fuller set of data anyway including the full text of the conversations and whatever data it has on the identity of the owners.

And you can be sure that more shadowy organisations with access to much greater computing resources will also have this full data set and be happily chewing through it as you read this.
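
On the sampling point in the quote: one standard way to take a uniform random sample from a stream too big to hold in memory is reservoir sampling. The sketch below is my own illustration in Python, not anything Horvitz and Leskovec describe using, and the reservoir_sample name is made up for the example.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return k items chosen uniformly at random from an iterable of unknown
    length, reading it once and keeping only k items in memory."""
    rng = rng or random.Random()
    sample = []
    for index, item in enumerate(stream):
        if index < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Keep the new item with probability k / (index + 1).
            slot = rng.randint(0, index)
            if slot < k:
                sample[slot] = item
    return sample


if __name__ == "__main__":
    # Stand-in for a huge log: a generator of conversation IDs.
    conversation_ids = (f"conv-{n}" for n in range(1_000_000))
    print(reservoir_sample(conversation_ids, 5))
```

The catch the blogger hints at still applies: a sample of conversations is not the same as a sample of users, so what you sample changes what you can conclude.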
