crakshay1/HelixFlor

Helixflor

Python R SQL

🔍 Context

Although high-throughput sequencing provides access to complete genomes, the structural annotation of genes remains a key step, especially in plants, whose genomes are complex (polyploidy, numerous transposable elements). The recent application of deep learning in annotation tools should make it possible to propose both structural and functional annotations faster.

Objective of the Internship

The GBOT database contains 6 plant genomes, each with its official annotation and, for some, the annotation generated by Helixer (Stiehler et al. 2021), an annotation tool that combines deep neural networks and HMM-type models to predict gene models from the genomic sequence alone. The internship consists of:

  1. Understanding Helixer and applying it to genomes that are not yet annotated

  2. Performing a global comparison on each genome (a comparison had already been done previously)

  3. Targeting new genes defined by Helixer and highlighting their structural characteristics (gene size, number of exons) and functional characteristics (predicted protein and functional annotation)

  4. Targeting genes that correspond to known genes annotated without a 5' UTR, taking stock of their properties, and checking whether a TATA box lies near the newly annotated UTR
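As a rough illustration of the structural side of objectives 3 and 4, the sketch below computes gene length and exon count per transcript from GFF3 text, and tests an upstream sequence for a TATA-box-like motif. It is not the code used during the internship: the function names `gene_structure` and `has_tata_box`, the `gene` → `mRNA` → `exon` hierarchy, and the `TATAWAW` consensus are all assumptions made for this example.

```python
import re
from collections import defaultdict

# Assumed TATA-box consensus TATAWAW (W = A or T); an illustrative choice only.
TATA_BOX = re.compile(r"TATA[AT]A[AT]")


def gene_structure(gff3_text):
    """Return {transcript_id: {"gene", "gene_length", "exons"}} from GFF3 text.

    Assumes a gene -> mRNA -> exon hierarchy linked by ID/Parent attributes.
    """
    gene_len = {}                # gene ID -> length in bp
    tx_gene = {}                 # transcript ID -> parent gene ID
    tx_exons = defaultdict(int)  # transcript ID -> number of exons
    for line in gff3_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue
        feature, start, end = cols[2], int(cols[3]), int(cols[4])
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        if feature == "gene":
            # GFF3 coordinates are 1-based and inclusive.
            gene_len[attrs["ID"]] = end - start + 1
        elif feature == "mRNA":
            tx_gene[attrs["ID"]] = attrs["Parent"]
        elif feature == "exon":
            tx_exons[attrs["Parent"]] += 1
    return {
        tx: {"gene": gene, "gene_length": gene_len.get(gene), "exons": tx_exons[tx]}
        for tx, gene in tx_gene.items()
    }


def has_tata_box(upstream_seq):
    """True if the assumed motif occurs in the upstream sequence.

    The caller is expected to supply the sequence in transcript orientation
    (reverse-complemented for genes on the minus strand).
    """
    return TATA_BOX.search(upstream_seq.upper()) is not None
```

How far upstream of the new 5' UTR to look is a separate choice that this sketch leaves to the caller.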

⚠️ Special thanks to Franck SAMSON, my internship tutor.

About

This repository contains some code snippets and the results of my internships revolving around GBOT, a genome browser.
