Abstract
Experimentation in computer science, whether for safety, reliability, or cybersecurity, is an important part of scientific advancement. Evaluating the relative merits of different experiments typically requires well-calibrated benchmarks against which experimental results can be measured. This paper reviews current trends in the use of benchmarks in experimental fuzzing research for cybersecurity, focusing on metrics related to coverage analysis. It evaluates the strengths and weaknesses of current techniques and proposes improvements to current approaches. The end goal is to convince researchers that experimental benchmarks must be well documented, archived, and calibrated, so that the community knows how well tools and techniques perform relative to the maximum possible on the benchmark.