This webpage lists open-source projects developed by the Computer Security Group at the University of Göttingen, Germany. Further information on the research group is available on the official group webpage.
Harry is a tool for comparing strings and measuring their similarity. The tool supports several common distance and kernel functions for strings as well as some exotic similarity measures. The focus lies on implicit similarity measures, that is, comparison functions that do not give rise to an explicit vector space. Examples of such similarity measures are the Levenshtein and Jaro-Winkler distances.
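To illustrate what an implicit similarity measure looks like, the following is a minimal sketch of the Levenshtein distance using the standard dynamic-programming formulation with unit costs. It is not taken from Harry; the function name `levenshtein` is chosen here for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string in the outer loop

    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca by cb (or match)
            ))
        previous = current
    return previous[-1]
```

For example, `levenshtein("kitten", "sitting")` returns 3. The strings are compared directly, without first mapping them to vectors.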
Letter Salad, or Salad for short, is an efficient and flexible implementation of the anomaly detection method Anagram. The method uses n-grams (substrings of length n) maintained in a Bloom filter for efficiently detecting anomalies in large sets of string data. Salad extends the original method by supporting n-grams of bytes as well as n-grams of words and tokens. (Paper)
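The idea of storing n-grams in a Bloom filter can be sketched as follows. This is a simplified illustration and not Salad's implementation: the class `BloomFilter`, the helpers `ngrams`, `train` and `anomaly_score`, the filter size and the SHA-256-based hashing are all assumptions made for this example. The score is taken to be the fraction of n-grams in a string that were not seen during training.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array with several hash functions (illustrative)."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive several hash values by prefixing the item with a seed byte.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def ngrams(data: bytes, n: int):
    """Yield all byte substrings of length n."""
    return (data[i:i + n] for i in range(len(data) - n + 1))


def train(samples, n: int = 3) -> BloomFilter:
    """Insert the n-grams of all normal samples into a Bloom filter."""
    bloom = BloomFilter()
    for sample in samples:
        for gram in ngrams(sample, n):
            bloom.add(gram)
    return bloom


def anomaly_score(bloom: BloomFilter, data: bytes, n: int = 3) -> float:
    """Fraction of n-grams in data that are not in the filter."""
    grams = list(ngrams(data, n))
    if not grams:
        return 0.0
    unseen = sum(1 for gram in grams if gram not in bloom)
    return unseen / len(grams)
```

A Bloom filter never reports a stored n-gram as missing, so a string seen during training always scores 0.0. Unseen n-grams may occasionally be reported as present, which can slightly lower the score of anomalous data.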
Sally is a small tool for mapping a set of strings to a set of vectors. This mapping is referred to as embedding and allows for applying techniques of machine learning and data mining to the analysis of string data. Sally can be applied to several types of string data, such as text documents, DNA sequences or log files, and it can read common input formats such as directories, archives and text files. (Paper)
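One simple way to embed a string in a vector space is to count its character n-grams and normalize the counts. The sketch below shows this idea under stated assumptions; it is not Sally's code, and the function names `embed` and `dot` as well as the choice of L2 normalization are made up for this example.

```python
import math
from collections import Counter
from typing import Dict


def embed(text: str, n: int = 2) -> Dict[str, float]:
    """Map a string to a sparse, L2-normalized vector of n-gram counts."""
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}  # string shorter than n: empty vector
    return {gram: c / norm for gram, c in counts.items()}


def dot(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Inner product of two sparse vectors (cosine similarity if normalized)."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the smaller vector
    return sum(weight * v.get(gram, 0.0) for gram, weight in u.items())
```

Once strings are represented this way, standard learning methods that operate on vectors can be applied. For instance, `dot(embed(x), embed(y))` gives a similarity between 0 and 1 for two strings `x` and `y`.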
Joern is a tool for robust analysis of C/C++ code. It generates abstract syntax trees, control flow graphs and searchable indexes of code constructs, even for code that does not compile due to missing headers. As such, it has been specifically designed to meet the needs of code auditors, who often find themselves in a situation where constructing a working build environment is not a feasible option or is simply impossible due to missing code. (Paper)
Malheur is a tool for the automatic analysis of program behavior recorded from malware. It has been designed to support the regular analysis of malware and the development of detection and defense measures. Malheur allows for identifying novel classes of malware with similar behavior and assigning unknown malware to discovered classes using machine learning. (Paper)
Prisma is an R package for processing and analyzing huge text corpora. In combination with the tool Sally, the package provides testing-based token selection and replicate-aware, highly tuned non-negative matrix factorization and principal component analysis. Prisma allows for analyzing very large data sets even on desktop machines. (Paper)
Derrick is a simple tool for recording data streams of TCP and UDP traffic. It shares similarities with other network recorders, such as tcpflow and Wireshark; it is more advanced than the former and clearly inferior to the latter. Derrick has been specifically designed to monitor application-layer communication. In contrast to other tools, the application data is logged in a line-based ASCII format, so that common UNIX tools, such as grep, sed and awk, can be applied directly.