|Title||:||User Traffic Classification for Proxy Server based Internet Access Control|
|Speaker||:||Saad Yunus Sait (IITM)|
|Details||:||Fri, 1 May, 2015 2:00 PM @ BSB 361|
|Abstract||:||Peak-hour congestion is a common problem faced by institutions accessing the Internet. Most methods of control are node-based, so they fail when nodes are shared by users and when mobile devices are used (changing IP address due to DHCP). Besides, control is normally done by identifying 'rogue' nodes which generate too much traffic and placing them in a low-priority queue when their quota is exceeded. This threshold-based scheme is inaccurate because it places a hard threshold on the usage, whereas Internet traffic is dynamic and access patterns vary from day to day.
We describe a two-pronged approach for the control of users. While users may be identified by a user-authenticating proxy, machine-learning-based techniques may be used to detect abusive usage. We have experimented with two techniques. The first uses a Gaussian mixture model (GMM) and the second a naive Bayes (NB) classifier. The latter makes use of a maximum relevance minimum redundancy (mRMR) feature selection algorithm to select characterizing features for the NB classifier. Results indicate substantial improvement in the classification of usage, with accuracies of up to 95%. We then show how these models may be used to characterize the traffic belonging to normal and abusive users. Our models have been trained using data extracted from proxy server logs.
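
To make the classification step concrete, the following is a minimal sketch of a naive Bayes classifier that labels a user as normal or abusive from per-user traffic features. It is not the speaker's implementation: the Gaussian likelihood, the two features (requests per hour and megabytes per hour), and the toy training values are all assumptions chosen for illustration, and the mRMR feature selection step is omitted.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(samples, labels):
    """Estimate a log prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    total = len(samples)
    model = {}
    for y, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Small constant keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[y] = (math.log(n / total), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log posterior for feature vector x."""
    best_label, best_score = None, -math.inf
    for y, (log_prior, means, variances) in model.items():
        score = log_prior
        for v, m, var in zip(x, means, variances):
            # Log of the Gaussian density; features are treated as independent.
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = y, score
    return best_label


# Hypothetical per-user features: [requests_per_hour, megabytes_per_hour].
# These values are invented for illustration, not taken from real proxy logs.
train_x = [
    [40, 15], [55, 20], [35, 10], [60, 25],
    [400, 900], [350, 700], [500, 1200], [450, 1000],
]
train_y = ["normal"] * 4 + ["abusive"] * 4

model = fit_gaussian_nb(train_x, train_y)
print(predict(model, [50, 18]))
print(predict(model, [420, 950]))
```

Unlike a fixed quota, the decision here depends on how the observed usage compares with the distributions learned for each class, so retraining on recent logs shifts the boundary as access patterns change.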