Web traffic analyser
  • Summary

    Industry Financial services sector

    Client One of India's largest financial institutions

    Requirement A security system that monitors the Web traffic hitting the organisation's 100+ Web servers and analyses it in depth. The analysis must identify certain types of application-specific suspicious behaviour even in the absence of an actual intrusion or breach. It must also provide a picture of Web activity by correlating multiple data streams that off-the-shelf tools do not normally correlate. The resulting data will be used for forensic analysis and for alerting on suspicious behaviour.

  • Our solution

    Business benefits The benefits all come from greater insight into user behaviour, whether for post facto forensic analysis of Web activity after an incident or for alerts generated when unusual behaviour is seen. The following are examples of questions that can be answered once our Web traffic analyser is deployed:

    • Where was the user account for User XYZ created from? ("Where" here means the physical location, type of computer, and IP address.)
    • What are the full details of the HTTP requests with which User XYZ triggered a funds transfer from their bank account?
    • Which users logged in from locations far from their normal location of operation in the last seven days?
    • If three bank accounts show suspicious data, is there any clustering in time and location of the recent accesses to these accounts? For instance, were they all accessed from the same computer in Dadar West or Dar-es-Salaam on the same day?

    Normal application-maintained logs and HTTP server logs do not permit such questions to be answered. One of the most important missing pieces of the puzzle is usually the identity of the user performing the HTTP requests, which is normally very hard to correlate with the log of HTTP requests.
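
    As a rough illustration of the third question above (logins far from a user's normal location), the check can be sketched as a great-circle distance comparison. This is a minimal sketch, not the deployed code: the record shape, the `usual_location` lookup, and the 500 km threshold are all assumptions made for the example.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def far_logins(logins, usual_location, threshold_km=500.0):
    """Return logins farther than threshold_km from the user's usual location.

    logins: iterable of (username, lat, lon) tuples (hypothetical record shape)
    usual_location: dict mapping username -> (lat, lon) (hypothetical lookup)
    """
    flagged = []
    for user, lat, lon in logins:
        if user not in usual_location:
            continue  # no baseline for this user, so nothing to compare against
        home_lat, home_lon = usual_location[user]
        if haversine_km(home_lat, home_lon, lat, lon) > threshold_km:
            flagged.append((user, lat, lon))
    return flagged
```

    In practice the usual location would itself have to be derived from the user's login history, and IP-based geolocation is approximate, so a threshold of this kind is only a starting point.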

    Architecture Our solution includes input modules which pull out data from three sources:

    • Apache access logs: One line of information for each HTTP request, containing all the standard log columns plus many more optional attributes which Apache supports
    • HTTP request/response: The complete contents of each HTTP request and the header of each HTTP response are logged
    • jsession: Information from the servlet container capturing the username and session-ID cookie from the session object whenever a new HTTP session is started. Unauthenticated sessions may not carry usernames.
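
    To give a feel for the first of these inputs, the sketch below parses one access-log line. It assumes Apache's standard Combined Log Format; the extra optional attributes mentioned above depend on the site's own LogFormat directive and are not covered. The function name and the sample line are made up for illustration.

```python
import re
from datetime import datetime

# Pattern for the Combined Log Format:
# host ident user [time] "request" status size "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_access_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    m = COMBINED.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    rec["time"] = datetime.strptime(rec["time"], "%d/%b/%Y:%H:%M:%S %z")
    rec["status"] = int(rec["status"])
    rec["size"] = 0 if rec["size"] == "-" else int(rec["size"])  # "-" means no body sent
    parts = rec["request"].split()
    rec["method"], rec["path"] = (parts[0], parts[1]) if len(parts) >= 2 else (None, None)
    return rec

# Hypothetical sample line, not taken from any real server
sample = '203.0.113.7 - - [10/Oct/2014:13:55:36 +0530] "GET /login HTTP/1.1" 200 2326 "-" "Mozilla/5.0"'
record = parse_access_line(sample)
```

    A simple pattern like this does not handle escaped quotes inside the request or user-agent fields, so a production parser would need to be more careful.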

    These three types of information are logged on each Web server and application server and transferred every few hours to a central logging server. There, the HTTP requests and responses and the Apache access logs are parsed, and all the data is merged into a normalised table structure and loaded into a database. The HTTP requests are grouped into sessions, and each session is tagged with a username if that information is available from the jsession logs.
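
    The grouping and tagging step can be sketched as follows. This is an illustrative sketch only: it assumes every parsed request record already carries the session-ID cookie value, and the field names (`session_id`, `time`, `path`, `username`) are hypothetical rather than the actual table columns.

```python
from collections import defaultdict

def build_sessions(requests, jsession_records):
    """Group request records by session ID and tag each session with a username.

    requests: list of dicts with 'session_id', 'time', 'path' (hypothetical fields)
    jsession_records: list of dicts with 'session_id' and an optional 'username'
    """
    # Session ID -> username lookup built from the jsession logs
    usernames = {r["session_id"]: r.get("username") for r in jsession_records}

    grouped = defaultdict(list)
    for req in requests:
        grouped[req["session_id"]].append(req)

    sessions = []
    for sid, reqs in grouped.items():
        reqs.sort(key=lambda r: r["time"])  # order requests within the session
        sessions.append({
            "session_id": sid,
            "username": usernames.get(sid),  # None for unauthenticated sessions
            "start": reqs[0]["time"],
            "end": reqs[-1]["time"],
            "requests": reqs,
        })
    sessions.sort(key=lambda s: s["start"])
    return sessions
```

    Sessions with no matching jsession record keep a username of None, which mirrors the unauthenticated case described above.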

    A UI lets administrators view a set of standard reports (generated daily) and run ad hoc drill-down queries. Drill-down exploration can start by username, by physical location ("all accesses from Vile Parle or Chanakyapuri on the morning of Friday"), or by application type and details ("all funds transfers to a particular account on Friday"); other facets of each session and each HTTP request can then be explored.
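
    A drill-down by physical location might translate into a query along the following lines. The schema here (tables `sessions` and `requests`, a `locality` column, ISO-8601 text timestamps) is entirely hypothetical, and SQLite is used only as a stand-in so that the sketch is self-contained.

```python
import sqlite3

def accesses_from(conn, locality, start, end):
    """Return (username, session_id, path, ts) rows for one locality in [start, end)."""
    cur = conn.execute(
        "SELECT s.username, r.session_id, r.path, r.ts "
        "FROM requests r JOIN sessions s ON s.session_id = r.session_id "
        "WHERE s.locality = ? AND r.ts >= ? AND r.ts < ? "
        "ORDER BY r.ts",
        (locality, start, end),
    )
    return cur.fetchall()

def demo_connection():
    """Build a tiny in-memory database with made-up rows for the example."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sessions (session_id TEXT PRIMARY KEY, username TEXT, locality TEXT);
        CREATE TABLE requests (session_id TEXT, ts TEXT, path TEXT);
    """)
    conn.executemany("INSERT INTO sessions VALUES (?, ?, ?)", [
        ("s1", "xyz", "Vile Parle"),
        ("s2", None, "Chanakyapuri"),
    ])
    conn.executemany("INSERT INTO requests VALUES (?, ?, ?)", [
        ("s1", "2014-10-10T09:15:00", "/transfer"),
        ("s1", "2014-10-10T14:00:00", "/logout"),
        ("s2", "2014-10-10T09:30:00", "/login"),
    ])
    return conn

# All accesses from Vile Parle during one morning
rows = accesses_from(demo_connection(), "Vile Parle",
                     "2014-10-10T06:00:00", "2014-10-10T12:00:00")
```

    Storing timestamps as ISO-8601 text keeps range comparisons simple, since such strings sort in chronological order when they share one time zone.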

    Deployment Deploying this system involves adding modules to Apache and the application servers (e.g. Tomcat) to capture the different types of data. It also requires a large central log server to aggregate all the data, process it, and maintain structured records in a database. The system has been deployed on about eight Web servers and their corresponding application servers, and further testing and deployment are in progress.