(2020) Mining Input Grammars from Dynamic Control Flow.
Text: fse2020-mimid.pdf (746kB)
Other (Plain Text Bibliography): bibliography.txt (7kB)
Abstract
One of the key properties of a program is its input specification. Having a formal input specification can be critical in fields such as vulnerability analysis, reverse engineering, software testing, clone detection, or refactoring. Unfortunately, accurate input specifications for typical programs are often unavailable or out of date. In this paper, we present a general algorithm that takes a program and a small set of sample inputs and automatically infers a readable context-free grammar capturing the input language of the program. We infer the syntactic input structure solely by observing how input characters are accessed at different locations in the input parser. The approach works on all stack-based recursive descent input parsers, including parser combinators, and works entirely without program-specific heuristics. Our Mimid prototype produced accurate and readable grammars for a variety of evaluation subjects, including complex languages such as JSON, TinyC, and JavaScript.
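To make the idea in the abstract concrete, here is a minimal Python sketch of attributing each consumed input character to the parser function that was active when it was read, then reading grammar rules off the resulting call tree. It is an illustration under stated assumptions, not the Mimid implementation: the toy list parser and every name in it (`traced`, `consume`, `parse_value`, `parse_list`, `parse_num`, `mine_rules`, `infer`) are hypothetical, it handles only valid inputs, and it does not generalize the mined rules as a real grammar miner would need to.

```python
# Illustrative sketch only; all names below are hypothetical, not Mimid's API.
import functools

# Stack of active tree nodes; each node is (nonterminal_name, children).
call_stack = [("<start>", [])]

def traced(fn):
    """Record every call to a parser function as a node in the call tree."""
    @functools.wraps(fn)
    def wrapper(text, pos):
        node = ("<%s>" % fn.__name__, [])
        call_stack[-1][1].append(node)   # attach to the calling function
        call_stack.append(node)
        try:
            return fn(text, pos)
        finally:
            call_stack.pop()
    return wrapper

def consume(text, pos):
    """Attribute the accessed character to the currently active function."""
    call_stack[-1][1].append(text[pos])
    return pos + 1

@traced
def parse_num(text, pos):
    while pos < len(text) and text[pos].isdigit():
        pos = consume(text, pos)
    return pos

@traced
def parse_list(text, pos):
    pos = consume(text, pos)             # opening '['
    while text[pos] != "]":
        pos = parse_value(text, pos)
        if text[pos] == ",":
            pos = consume(text, pos)
    return consume(text, pos)            # closing ']'

@traced
def parse_value(text, pos):
    if text[pos] == "[":
        return parse_list(text, pos)
    return parse_num(text, pos)

def mine_rules(node, grammar):
    """Turn each tree node into one expansion of its nonterminal."""
    name, children = node
    expansion = tuple(c if isinstance(c, str) else c[0] for c in children)
    grammar.setdefault(name, set()).add(expansion)
    for child in children:
        if not isinstance(child, str):
            mine_rules(child, grammar)
    return grammar

def infer(samples):
    """Parse each sample under tracing and collect the observed rules."""
    grammar = {}
    for sample in samples:
        call_stack[:] = [("<start>", [])]
        end = parse_value(sample, 0)
        assert end == len(sample), "sketch assumes fully valid inputs"
        mine_rules(call_stack[0], grammar)
    return grammar
```

Calling `infer(["1", "[1,23]"])` yields a dictionary mapping each nonterminal (named after a parser function) to the set of expansions seen in the samples. Digits appear as literal sequences because this sketch performs no generalization of repeated or alternative characters.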
Item Type: | Conference or Workshop Item (A Paper) (Paper) |
---|---|
Additional Information: | artifact: https://github.com/vrtrha/mimid |
Divisions: | Andreas Zeller (Software Engineering, ST) |
Conference: | ESEC/FSE European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (formerly listed as ESEC) |
Depositing User: | Rahul Gopinath |
Date Deposited: | 09 Jun 2020 08:14 |
Last Modified: | 09 Sep 2020 14:54 |
Primary Research Area: | NRA4: Secure Mobile and Autonomous Systems |
URI: | https://publications.cispa.saarland/id/eprint/3101 |