Polymorphism Analysis Reveals Reduced Negative Selection and Elevated Rate of Insertions and Deletions in Intrinsically Disordered Protein Regions

Khan T, Douglas GM, Patel P, Nguyen Ba AN, Moses AM

Genome Biol Evol 2015 Jun;7(6):1815-26

PMID: 26047845

Abstract

Intrinsically disordered protein regions are abundant in eukaryotic proteins and lack stable tertiary structures and enzymatic functions. Previous studies of disordered region evolution based on interspecific alignments have revealed an increased propensity for indels and rapid rates of amino acid substitution. How disordered regions are maintained at high abundance in the proteome and across taxa, despite apparently weak evolutionary constraints, remains unclear. Here, we use single nucleotide and indel polymorphism data in yeast and human populations to survey the population variation within disordered regions. First, we show that single nucleotide polymorphisms in disordered regions are under weaker negative selection compared with more structured protein regions and have a higher proportion of neutral non-synonymous sites. We also confirm previous findings that nonframeshifting indels are much more abundant in disordered regions relative to structured regions. We find that the rate of nonframeshifting indel polymorphism in intrinsically disordered regions resembles that of noncoding DNA and pseudogenes, and that large indels segregate in disordered regions in the human population. Our survey of polymorphism confirms patterns of evolution in disordered regions inferred based on longer evolutionary comparisons.