Gene-level complexity explains genome-wide variation in the distribution of fitness effects
The distribution of fitness effects (DFE), describing how harmful, neutral, or beneficial new mutations are, is central to understanding how populations evolve. Although the DFE varies across genomes and species, it remains unclear which aspects of genomic organization drive this variation. Here, we inferred gene-level selective constraints across the genomes of Mus musculus castaneus, Drosophila melanogaster, and Saccharomyces cerevisiae using a combination of population genetics and machine learning trained on diverse gene features. Many gene features were predictive of selective constraint, with conservation, gene structure, and expression being the most informative. These selective constr