Abstract
Proteins are critical to the function of cells and to life. It is well established that changes to the DNA sequence encoding a protein (its genotype) can significantly affect how the protein functions or interacts within the cell. Understanding how changes in a protein's genotype map to changes in an organism's phenotype remains a largely unsolved problem in biology. Solving this problem will require integrating experimental methods with computational and mathematical approaches. In this thesis, we use both computational and mathematical methodologies. We start by using statistical methods to investigate physical features that could explain epistasis in proteins. We find a number of intuitive features that play a role, but we can explain only ~30% of the observed epistasis in both protein binding and folding. Next, we use molecular dynamics to inform statistical models and predict the spectral sensitivity of opsin proteins with high accuracy. Following that, we investigate a suite of fast methods for predicting protein-protein binding affinity and find their performance to be largely context-dependent. Lastly, we explore using two different molecular modeling techniques to calculate free energies and build a watch list of antibody escape mutations for the current COVID-19 pandemic.