My organization, like many that serve the public, hosts a guest network through which our clients, vendors, and others can get largely unfiltered Internet access. Unsurprisingly, a good portion of the traffic in that environment falls somewhere on the range from suspicious to outright eye-gougingly bad. While we track some of the worst of what goes on there (using it as a free honeynet of sorts), we don’t have the resources to do incident response on our own little slice of the general Internet. That hasn’t stopped us from asking questions about this data and how we can use it to further our work on our critical internal networks.

With a recent increase in visibility into our potentially bad traffic, my team wondered how we could explain what led up to a particular indicator of compromise (IOC) and what actions followed the traffic that finally triggered the klaxons. Getting a list of source/destination/port network flows out of our Elasticsearch environment is easily managed. Finding the right visualization to let an analyst scroll through the connections over time and understand the events around an IOC has been a surprisingly difficult design challenge. I threw out a post to the securitymetrics.org mailing list and received a good number of responses, but nothing that was an obvious winner. There are a few options I’m currently exploring, explained below. Stay tuned for mockups as I get through the always painful first few prototypes.

Chord Diagrams

An example can be found at http://bl.ocks.org/mbostock/4062006. We’ve used this format before to show connections to and from our location over a fixed period of time. This is great for big overviews, but it doesn’t account well for the time dimension. There’s some support under the igraph package that should make testing this out pretty easy to do, though.
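A chord diagram is drawn from a square matrix of connection counts between endpoints, so the first step is collapsing a flow list into that matrix. Here is a minimal sketch of that step in Python, assuming each flow is a simple (source, destination, port) tuple. The sample addresses and the chord_matrix name are made up for illustration and are not from our environment.

```python
from collections import Counter

# Hypothetical flow records: (source, destination, port).
flows = [
    ("10.0.0.5", "93.184.216.34", 443),
    ("10.0.0.5", "93.184.216.34", 80),
    ("10.0.0.7", "93.184.216.34", 443),
    ("10.0.0.7", "10.0.0.5", 22),
]


def chord_matrix(flows):
    """Return (hosts, matrix) where matrix[i][j] counts flows from hosts[i] to hosts[j]."""
    # Every endpoint gets a row and a column, whether it was a source or a destination.
    hosts = sorted({host for src, dst, _ in flows for host in (src, dst)})
    index = {host: i for i, host in enumerate(hosts)}

    # Ports are ignored here; only the source/destination pair is counted.
    counts = Counter((src, dst) for src, dst, _ in flows)

    matrix = [[0] * len(hosts) for _ in hosts]
    for (src, dst), n in counts.items():
        matrix[index[src]][index[dst]] = n
    return hosts, matrix


hosts, matrix = chord_matrix(flows)
```

The matrix covers one fixed window, which is exactly the limitation noted above: showing change over time would mean building one matrix per window.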

Movies

We recently took some code by @jayjacobs to create time-lapse movies of network traffic. That was a great visual for a wow effect, but it lacks the power to let an analyst easily play forward and backward and get more than a general sense of activity. Movie encoding also requires generating a large number of frames, making a sizable period of time very computationally expensive to display.

BioFabric

Detailed at http://www.biofabric.org/, this novel approach to taking complex graphs and detangling them is very intriguing. It doesn’t intrinsically address the time dimension, but may be useful if we aggregate the monitored time periods into a reasonable number of buckets. While I’d like to work with this format and better understand its strengths and weaknesses, the code for R looks immature, so this one may sit on the back burner.
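The bucketing idea is independent of the graph layout, so here is a rough sketch of it in Python. It assumes each flow carries a timezone-aware timestamp, and the five-minute bucket width is an arbitrary choice. The bucket_flows name and the sample records are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone


def bucket_flows(flows, bucket_seconds=300):
    """Group (timestamp, src, dst, port) flows into fixed-width time buckets.

    Returns a dict mapping each bucket's start time (UTC) to a dict of
    {(src, dst, port): connection count}, ordered by bucket start.
    """
    buckets = defaultdict(Counter)
    for ts, src, dst, port in flows:
        epoch = int(ts.timestamp())
        # Round down to the start of the bucket this flow falls in.
        start = epoch - epoch % bucket_seconds
        buckets[start][(src, dst, port)] += 1
    return {
        datetime.fromtimestamp(start, tz=timezone.utc): dict(buckets[start])
        for start in sorted(buckets)
    }


# Hypothetical sample data.
utc = timezone.utc
sample = [
    (datetime(2014, 1, 1, 12, 0, 0, tzinfo=utc), "10.0.0.5", "93.184.216.34", 443),
    (datetime(2014, 1, 1, 12, 2, 0, tzinfo=utc), "10.0.0.5", "93.184.216.34", 443),
    (datetime(2014, 1, 1, 12, 7, 0, tzinfo=utc), "10.0.0.7", "93.184.216.34", 80),
]
bucketed = bucket_flows(sample)
```

Each bucket could then be handed to a layout as its own small graph. Picking the bucket width is the real judgment call: too narrow and there are too many panels to read, too wide and the sequence around an IOC disappears.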

Hive Plots

Detailed at http://www.hiveplot.net/, these multi-axis representations are also interesting. I’m uncertain how readable these will be with dense connections, but there is a sample implementation for R (https://github.com/bryanhanson/HiveR) which I want to try out.

Gantt Charts

As all of the above solutions involve some form of aggregation and time compression, I got to sketching some traditional Gantt charts that take aggregated data and display summary connection information. This may wind up being the default format we use, as it is easy to create, relatively easy to interpret, and presents multiple time slots in a single dense view. Now I just have to code it. ;)
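To make the Gantt idea a little more concrete, here is one way the aggregation behind it might look in Python. This is only a sketch of one possible approach, not the code we will end up with. It assumes timestamps are plain seconds and that connections on the same source/destination/port less than a minute apart belong to the same bar. The gantt_rows name, the max_gap value, and the sample data are all hypothetical.

```python
def gantt_rows(flows, max_gap=60):
    """Merge (t, src, dst, port) flows into per-connection activity bars.

    Returns {(src, dst, port): [(start, end, count), ...]}. A flow extends
    the current bar if it arrives within max_gap seconds of that bar's end,
    otherwise it starts a new bar.
    """
    rows = {}
    for t, src, dst, port in sorted(flows):  # process in time order
        bars = rows.setdefault((src, dst, port), [])
        if bars and t - bars[-1][1] <= max_gap:
            start, _, count = bars[-1]
            bars[-1] = (start, t, count + 1)
        else:
            bars.append((t, t, 1))
    return rows


# Hypothetical sample data, with t in seconds from the start of the window.
sample_flows = [
    (0, "10.0.0.5", "93.184.216.34", 443),
    (30, "10.0.0.5", "93.184.216.34", 443),
    (200, "10.0.0.5", "93.184.216.34", 443),
    (10, "10.0.0.7", "10.0.0.5", 22),
]
rows = gantt_rows(sample_flows)
```

Each key becomes one row of the chart and each (start, end, count) tuple becomes one bar, with the count available for color or thickness. Bars where start equals end are single connections and would need a minimum drawn width to be visible.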

So there we are. Five possible approaches. Are there other formats which I should be considering? Most certainly. Please chime in with any suggestions in the comments. Again, I fully intend to mock up several of these and post sanitized versions here for discussion on what worked for us. There are several projects in flight right now and I’m not a fast coder.