Jon Stewart
Joel Uckelman

Abstract

This paper discusses problems arising in digital forensics with regard to Unicode, character
encodings, and search. It describes how multipattern search can handle the different text
encodings encountered in digital forensics and describes a number of issues pertaining to the
proper handling of Unicode in search patterns. Finally, we demonstrate the feasibility of the
approach and discuss the integration of our search engine, lightgrep, with the popular
bulk_extractor tool.