Bilal Shebaro
Jedidiah Crandall (University of New Mexico)

Abstract

Network flow recording is an important tool with applications that range from legal compliance and security auditing to network forensics, troubleshooting, and marketing. Unfortunately, current network flow recording technologies do not allow network operators to enforce a privacy policy on the data that is recorded, in particular how this data is stored and used within the organization. Challenges to building such a technology include the public key infrastructure, scalability, and gathering statistics about the data while still preserving privacy. We present a network flow recording technology that addresses these challenges by using Identity Based Encryption in combination with privacy-preserving semantics for onthe-fly statistics. We argue that our implementation supports a wide range of policies that cover many current applications of network flow recording. We also characterize the performance and scalability of our implementation and find that the encryption and statistics scale well and can easily keep up with the rate at which commodity systems can capture traffic, with a couple of interesting caveats about the size of the subnet that data is being recorded for and how statistics generation is affected by implementation details. We conclude that privacy-preserving network flow recording is possible at 10 gigabit rates for subnets as large as a /20 (4096 hosts). Because network flow recording is one of the most serious threats to web privacy today, we believe that developing technology to enforce a privacy policy on the recorded data is an important first step before policy makers can make decisions about how network operators can and should store and use network flow data. Our goal in this paper is to explore the tradeoffs of performance and scalability vs. privacy, and the usefulness of the recorded data in forensics vs. privacy