Design Proxy Server to block websites

Apr 14, 2011 at 1:23pm
Hello

As part of our networking course assignment, we have to implement a proxy server that can restrict access to some websites.

As part of the implementation, I read the sites to be blocked from a text file and store them in an array. When I get an HTTP request, I parse it to get the value of the Host header and compare it with the array. If I find a match I block the request; otherwise I forward it.
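The approach described above could be sketched roughly as follows. This is only an illustration under assumptions: the function names (`load_blocklist`, `normalize_host`, `is_blocked`) are made up here, the text file is assumed to hold one host name per line, and a set is used instead of an array so that each lookup is a single hash check.

```python
def load_blocklist(lines):
    """Return a set of lowercase host names, skipping blank lines."""
    blocked = set()
    for line in lines:
        name = line.strip().lower()
        if name:
            blocked.add(name)
    return blocked


def normalize_host(host_value):
    """Lowercase a Host header value and drop an optional ':port' suffix."""
    host = host_value.strip().lower()
    # Bracketed IPv6 literals are left untouched in this sketch.
    if not host.startswith("[") and ":" in host:
        host = host.rsplit(":", 1)[0]
    # A trailing dot ("www.fb.com.") names the same host.
    return host.rstrip(".")


def is_blocked(host_value, blocked):
    """True if the normalized Host value is in the blocklist."""
    return normalize_host(host_value) in blocked
```

With a real file this might be called as `load_blocklist(open("blocked.txt"))`, where `blocked.txt` is a placeholder name.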

A loophole in this implementation is that if anyone enters the IP address of a blocked site, the request gets forwarded. To avoid this, while creating the array I am using gethostbyname() to get the IP address of each server to be blocked, and I will also compare the requested server address with this list of IPs.
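A minimal sketch of resolving the blocklist once at start-up, assuming Python's `socket.gethostbyname_ex` as the counterpart of `gethostbyname()`. The helper names and the injectable `resolver` parameter are hypothetical; the parameter only exists so the function can be exercised without a network.

```python
import socket


def resolve_ipv4(name):
    """Return the IPv4 addresses for name, or an empty list if lookup fails."""
    try:
        # gethostbyname_ex returns (hostname, aliaslist, ipaddrlist).
        _, _, addresses = socket.gethostbyname_ex(name)
        return addresses
    except OSError:
        return []


def build_blocked_ips(blocked_names, resolver=resolve_ipv4):
    """Collect every address the blocked names resolve to into one set."""
    blocked_ips = set()
    for name in blocked_names:
        blocked_ips.update(resolver(name))
    return blocked_ips
```

Note that a large site usually has many addresses and they change over time, so a set built this way is only a snapshot.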

However, the problem began when I tried to block www.facebook.com.
I stored www.fb.com as one of the sites to be blocked in the text file, and its IP address is also looked up during the creation of the array. Now when someone enters www.facebook.com, it is allowed because I have blocked only fb.com.

As I understand it, I now have to call gethostbyname for the requested server address as well, compare the IPs obtained with those in the array, and then decide whether to forward the request or not.

However, this comparison is time-consuming, and it would be nice if anyone could suggest some other implementation for the same.
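One way to reduce the cost of the comparison, sketched under assumptions: keep the blocked names and IPs in sets, and cache the lookup for each requested host so that repeated requests do not trigger a new DNS query. The names `cached_ipv4` and `should_block` are illustrative, and `functools.lru_cache` ignores DNS TTLs, so a real proxy would need some form of expiry.

```python
import socket
from functools import lru_cache


@lru_cache(maxsize=1024)
def cached_ipv4(name):
    """Resolve name once and remember the result (no expiry in this sketch)."""
    try:
        return frozenset(socket.gethostbyname_ex(name)[2])
    except OSError:
        return frozenset()


def should_block(host, blocked_names, blocked_ips, lookup=cached_ipv4):
    """Block on a name match, or if any resolved address is a blocked IP."""
    if host in blocked_names:
        return True  # cheap check first, no DNS needed
    return not blocked_ips.isdisjoint(lookup(host))
```

The IP comparison can give false positives when unrelated sites share an address, and it can miss aliases that resolve to different servers, so it is a heuristic rather than a guarantee.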

Also, if possible, please point me to links where I can read up more on how blocking can be achieved.

Thanks
Last edited on Apr 14, 2011 at 1:26pm
Apr 14, 2011 at 2:45pm
You don't have to use gethostbyname. If you look at an HTTP request, it already gives you the host name of the site being accessed as text, before any DNS lookup. That is a ton easier than finding the IPs of all of the sites.
Apr 14, 2011 at 4:06pm
@ultifinitus - can you be a little more specific please
Apr 14, 2011 at 5:23pm
HTTP protocol site:
http://www.w3.org/Protocols/rfc2616/rfc2616.html

If you look at a simple request, the first line is something like a GET or a POST request line. One of the header lines after it (often the second line) starts with Host: and tells you the host that the user is trying to reach.
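As a rough sketch of what that looks like in code, assuming the proxy has the raw request bytes in hand, the Host value can be pulled out like this. `extract_host` is a hypothetical helper, and header names are matched case-insensitively because HTTP allows `Host`, `HOST` or `host`.

```python
def extract_host(raw_request):
    """Return the Host header value from raw HTTP request bytes, or None."""
    # Headers end at the first blank line; anything after it is the body.
    head = raw_request.split(b"\r\n\r\n", 1)[0]
    lines = head.split(b"\r\n")
    for line in lines[1:]:  # lines[0] is the request line, e.g. GET / HTTP/1.1
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"host":
            return value.strip().decode("ascii", errors="replace")
    return None
```

The returned value may still carry a port such as `www.facebook.com:80`, so it needs trimming before it is compared with the blocklist.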
Apr 15, 2011 at 4:45pm
Yeah, but what do I do if it is an IP address?
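One possible way to handle that case, as a sketch only: check whether the Host value is an IP literal, and if so compare it against the set of addresses resolved from the blocklist. The helper names are hypothetical, and the value is assumed to have had any port removed already.

```python
import ipaddress


def as_ip_literal(host):
    """Return the canonical IP string if host is an IP literal, else None."""
    try:
        # Brackets are stripped so that "[::1]" style IPv6 literals parse too.
        return str(ipaddress.ip_address(host.strip("[]")))
    except ValueError:
        return None


def blocked_by_ip(host, blocked_ips):
    """True if host is an IP literal found in the blocked address set."""
    literal = as_ip_literal(host)
    return literal is not None and literal in blocked_ips
```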