The de novo origin of novel protein-coding genes (genes that arise from non-coding sequence) was once considered so improbable as to be impossible. In less than a decade, and especially in the last five years, this view has been overturned by extensive evidence from diverse eukaryotic lineages. There is now evidence that this mechanism has contributed a significant number of genes to the genomes of organisms as diverse as Saccharomyces, Drosophila, Plasmodium, Arabidopsis and human (McLysaght and Guerzoni, 2015). From simple beginnings, these genes have in some instances acquired complex structure, regulated expression and important functional roles. New genes can arise through mechanisms such as gene duplication, exon shuffling, retroposition, mobile elements, lateral gene transfer, gene fusion/fission, and de novo origination. Of these, de novo origination from a non-coding DNA region was long considered a very rare occurrence.

 

            For non-coding DNA to begin functioning as a protein-coding gene, an open reading frame (ORF) must originate, the DNA must be transcribed and the mRNA translated, and the protein must ultimately become integrated into cellular processes. Though it is tempting to think of this as a stepwise, directional process, the evidence from yeast and from flies points to a reversible evolutionary continuum from non-gene to gene. There are two major approaches to the systematic identification of novel genes: genomic phylostratigraphy and synteny-based methods. Both are widely used, either individually or in a complementary fashion.
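
The first requirement above, the appearance of an ORF, can be illustrated with a minimal sketch. The function name find_orfs, the min_codons parameter and the two toy sequences are assumptions made for illustration only; they are not taken from the cited studies, and real pipelines also scan the reverse strand and apply much longer length cutoffs.

```python
# Minimal sketch: scan the three forward reading frames for ATG...stop ORFs.
# find_orfs, min_codons and the toy sequences are hypothetical illustrations.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) for each ATG-to-stop ORF on the forward strand.

    start is the 0-based index of the ATG, end is the index just past the
    stop codon, and min_codons counts the ATG plus coding codons (not the stop).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the candidate and keep scanning
    return orfs


# Toy "ancestral" sequence with no start codon, and a "derived" sequence
# in which a single C-to-T change creates an ATG upstream of an in-frame stop.
ancestral = "CCACGAAATTTTAGGG"
derived = "CCATGAAATTTTAGGG"

print(find_orfs(ancestral))  # no ORF detected
print(find_orfs(derived))    # one ORF in frame 2
```

In this toy case a single substitution is enough to turn an ORF-free stretch into one that contains a short ORF, which is one way to picture how easily the boundary between non-gene and gene could be crossed in either direction.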

 

            A de novo gene must be expressed in at least some context for selection to operate on it, and many studies use evidence of expression as an inclusion criterion in defining de novo genes. Expression at the mRNA level may be confirmed for individual sequences through conventional techniques such as quantitative PCR, or globally through more modern techniques such as RNA sequencing (RNA-seq). Similarly, expression at the protein level can be determined with high confidence for individual proteins using techniques such as mass spectrometry or western blotting, while ribosome profiling (Ribo-seq) provides a global survey of translation in a given sample (McLysaght and Guerzoni, 2015).

 

            Wenfei et al. (2009) reported the de novo origin of a gene (OsDR10) in rice that negatively regulates the pathogen-induced defense response. Their evolutionary analysis found homologs of this gene only in Leersia, the genus most closely related to rice, and not in other subfamilies of the grasses. The authors further suggest that the gene may have evolved a highly conserved, rice-specific function that contributes to the difference in regulation between rice and other plant species in response to pathogen infection. Biological analyses, including gene silencing, pathological analysis and characterization of transgenic mutants, showed that OsDR10-suppressed plants had enhanced resistance to a broad spectrum of Xanthomonas oryzae pv. oryzae strains, which cause bacterial blight disease. These analyses provide fresh insight into the biological and evolutionary processes by which a de novo gene was rapidly recruited in a rice species or population.
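
The inference in this study, that a gene with homologs only in the closest relative is young, follows the logic that genomic phylostratigraphy applies systematically. A minimal sketch of that logic is given below. The function name assign_phylostratum, the clade labels and the presence/absence table are hypothetical and only mirror the pattern described above; a real analysis would derive the table from homology searches against many genomes.

```python
# Minimal sketch of phylostratum assignment from homolog presence/absence.
# All names and the example table are hypothetical illustrations.

def assign_phylostratum(strata, has_homolog):
    """Return the oldest clade in which a homolog was detected.

    strata: clade names ordered from youngest (most restricted) to oldest.
    has_homolog: dict mapping clade name to True when a homolog was found
                 in that clade; missing clades are treated as no hit.
    """
    oldest = strata[0]  # the focal lineage always contains the gene itself
    for clade in strata:
        if has_homolog.get(clade, False):
            oldest = clade  # later entries are older, so keep overwriting
    return oldest


# Nested clades from the focal species outward (illustrative labels).
strata = ["rice", "rice + Leersia", "grasses", "flowering plants"]

# Pattern resembling the description above: hits in Leersia only.
hits = {"rice": True, "rice + Leersia": True, "grasses": False}

print(assign_phylostratum(strata, hits))
```

A gene assigned to a very young stratum is only a candidate de novo gene. Synteny-based comparison with the orthologous non-coding region in outgroup species is typically needed to exclude alternatives such as rapid divergence or gene loss.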

 

            Dong et al. (2011) identified 60 new protein-coding genes that originated de novo on the human lineage since its divergence from the chimpanzee. The functionality of these genes is supported by both transcriptional and proteomic evidence. RNA-seq data indicate that these genes have their highest expression levels in the cerebral cortex and testes, which might suggest that they contribute to phenotypic traits that are unique to humans, such as improved cognitive ability.

 

            The discovery of de novo genes is more than the discovery of a set of genes in eukaryotic genomes. It demonstrates the viability of a process that can release genomic variation for testing through the filter of natural selection, and that is extremely important for the divergence and adaptation of organisms. De novo genes are not only important for their functional and biological contribution to the lineages in which they originate; they are also very informative for our growing understanding of the evolution of the genome and of new gene functions.

 

References:

 

Dong, D. W., David, M. I. and Ya, P. Z., 2011, De novo origin of human protein-coding genes. PLoS Genet., 7 (11): 1-9.

McLysaght, A. and Guerzoni, D., 2015, New genes from non-coding sequence: the role of de novo protein-coding genes in eukaryotic evolutionary innovation. Phil. Trans. R. Soc. B., 370 (1): 1-7.

Wenfei, X., Hongbo, L., Yu, L., Xianghua, L., Caiguo, X., Manyuan, L. and Shiping, W., 2009, A rice gene of de novo origin negatively regulates pathogen-induced defense response. PLoS ONE, 4 (2): 1-12.