Lukina hw6 by MariaLuk · Pull Request #2 · MariaLuk/Junior_Bioinformatic_tool

MariaLuk · 2023-10-19T12:57:10Z

No description provided.

nvaulin

Hi!

Good work

Great README
Be a bit more careful with commit messages. For example, it is customary to start them with a capital letter :)
The FASTQ reading and writing in the filter looks correct, but unfortunately the input I tried gave me empty output. I don't know what exactly is wrong.
Overall the work is good, but there are small slip-ups everywhere that spoil it)) Some minor garbage gets into the converter output, so pay attention to that too. In the position shift, only the first letter of the correct answer is returned)) In short, don't forget to test all of this yourself before submitting, and then it will be really good.

Points

FASTQ module improvements: 1/2 points
convert_multiline_fasta_to_oneline: 3.8/4 points
select_genes_from_gbk_to_fasta: 0/4 points

-0.5 because some argument names are wrong (not as specified in the assignment)

Total: 4.3 points + 2 extra points (they will go in a separate column)

In any case, well done! It needs a bit more attention and, as I understand it, more time, and then everything will be great.

nvaulin · 2023-10-22T23:31:12Z

+This package consists of 2 mini-tools
+`tools_for_bioinformatics` designed to work with nucleotide and amino acid sequences and FASTQ files
+`bio_files_processor` provides opportunity to  manipulate with FASTA files  


🔥

Suggested change

This package consists of 2 mini-tools

`tools_for_bioinformatics` designed to work with nucleotide and amino acid sequences and FASTQ files

`bio_files_processor` provides opportunity to manipulate with FASTA files

This package consists of 2 mini-tools

`tools_for_bioinformatics` designed to work with nucleotide and amino acid sequences and FASTQ files

`bio_files_processor` provides opportunity to manipulate with FASTA files

nvaulin · 2023-10-22T23:32:04Z

+        Returns:
+            (str): oligo- and polynucleotide  sequence
+    """
+    return complement(seq)[::-1]


Suggested change

return complement(seq)[::-1]

return reverse(complement(seq))
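The suggestion composes two named helpers instead of slicing inline, which reads closer to the biological operation. A minimal sketch, assuming `reverse` and `complement` are simple module-level helpers for DNA that preserve case (their real bodies are not shown in this diff, so the ones below are hypothetical):

```python
# Hypothetical helper bodies: only the names `reverse` and `complement`
# appear in the reviewed code. DNA only, case preserved.
COMPLEMENT_DNA = str.maketrans("ATGCatgc", "TACGtacg")


def complement(seq: str) -> str:
    """Return the complementary DNA sequence, preserving case."""
    return seq.translate(COMPLEMENT_DNA)


def reverse(seq: str) -> str:
    """Return the sequence read in the opposite direction."""
    return seq[::-1]


def reverse_complement(seq: str) -> str:
    """Compose the two helpers instead of slicing inline."""
    return reverse(complement(seq))
```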

nvaulin · 2023-10-22T23:35:13Z

+from typing import Union
+import os
+import modules.nucleic_acids_functions as na
+import modules.fastq_filters as ff
+import modules.amino_acids_functions as aa
+NUCLEOTIDES = {'U', 'A', 'g', 't', 'G', 'T', 'a', 'c', 'C', 'u'}


Suggested change

from typing import Union

import os

import modules.nucleic_acids_functions as na

import modules.fastq_filters as ff

import modules.amino_acids_functions as aa

NUCLEOTIDES = {'U', 'A', 'g', 't', 'G', 'T', 'a', 'c', 'C', 'u'}

import os

from typing import Union

import modules.nucleic_acids_functions as na

import modules.fastq_filters as ff

import modules.amino_acids_functions as aa

NUCLEOTIDES = {'U', 'A', 'g', 't', 'G', 'T', 'a', 'c', 'C', 'u'}

Imports also come with a bunch of conventions of their own. For example, here:

Separate them from the code that follows with 2 blank lines

External modules first, then your own

Plain import first, then from ... import

And there are many more; you can read about them in PEP 8

nvaulin · 2023-10-22T23:36:48Z

+        return answer
+
+
+def fastq_filtration(input_fastq, gc_bounds=(0, 100), length_bounds=(0, 2 ** 32), quality_treshold=0, output_fastq=''):


Name functions with verbs

Use None as the default for omitted values

Suggested change

def fastq_filtration(input_fastq, gc_bounds=(0, 100), length_bounds=(0, 2 ** 32), quality_treshold=0, output_fastq=''):

def run_fastq_filtration(input_fastq, gc_bounds=(0, 100), length_bounds=(0, 2 ** 32), quality_treshold=0, output_fastq=None):

or

Suggested change

def fastq_filtration(input_fastq, gc_bounds=(0, 100), length_bounds=(0, 2 ** 32), quality_treshold=0, output_fastq=''):

def filter_fastq(input_fastq, gc_bounds=(0, 100), length_bounds=(0, 2 ** 32), quality_treshold=0, output_fastq=None):

Also, in the assignment the first argument is input_path, and there is a typo in the word treshold (it should be quality_threshold) :)

nvaulin · 2023-10-22T23:37:20Z

+    """
+    if not os.path.isdir('fastq_filtrator_resuls'):
+        os.mkdir('fastq_filtrator_resuls')
+    if output_fastq == '':


Suggested change

if output_fastq == '':

if not output_fastq:

Or rather, taking the previous comment into account:

Suggested change

if output_fastq == '':

if output_fastq is None:
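To show how the `None` default and the `is None` check fit together, here is a minimal sketch. The helper name `resolve_output_path`, the `out_dir` parameter, and the fallback to the input file's base name are assumptions for illustration; the diff does not show what the original code does after the check.

```python
import os
from typing import Optional


def resolve_output_path(
    input_path: str,
    output_fastq: Optional[str] = None,
    out_dir: str = "fastq_filtrator_results",  # assumed directory name
) -> str:
    """Pick the output file path (hypothetical helper, not from the PR)."""
    # `is None` distinguishes "argument omitted" from any real value,
    # whereas `not output_fastq` would also treat '' as omitted.
    if output_fastq is None:
        output_fastq = os.path.basename(input_path)
    return os.path.join(out_dir, output_fastq)
```

With this shape, calling the function without `output_fastq` reuses the input file name, and any explicitly passed name is used as given.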

nvaulin · 2023-10-22T23:43:25Z

+    return seqs
+
+
+def write_dict_file_to_fastq(seqs, output_fastq):


Suggested change

def write_dict_file_to_fastq(seqs, output_fastq):

def write_fastq(seqs, output_fastq):

nvaulin · 2023-10-22T23:43:43Z

+    with open(output_fastq, 'w') as output_file:
+        for key, params in seqs.items():
+            output_file.write(key + '\n')
+            output_file.write(params[0] + '\n')
+            output_file.write(params[1] + '\n')
+            output_file.write(params[2] + '\n')


nvaulin · 2023-10-22T23:54:07Z

+        output_fasta = os.path.join('multiple_to_online_results', os.path.basename(input_fasta))
+    else:
+        output_fasta = os.path.join('multiple_to_online_results', output_fasta + ".fasta")
+    with open(input_fasta) as input_file, open(output_fasta, 'w') as output_file:


nvaulin · 2023-10-22T23:55:37Z

+        current = []
+        output_file.write(input_file.readline())
+        while True:
+            line = input_file.readline()
+            current.append(line.strip())
+            if line.startswith('>'):
+                output_file.write(''.join(current) + '\n')
+                output_file.write(line)
+                current = []
+                break
+
+        for line in input_file:
+            if line.startswith('>'):
+                output_file.write(''.join(current) + '\n')
+                output_file.write(line)
+                current = []
+            else:
+                current.append(line.strip())
+        output_file.write(''.join(current) + '\n')


It all looks fine and good, of course. And it's great that you decided to write everything in one place. But...
When I tested it, some minor garbage showed up in the first record :)
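One likely source of the extra text: in the `while True` loop, each line is appended to `current` before the `startswith('>')` check, so the second record's header is joined onto the first record's sequence. The same loop also never terminates on a single-record file, because `readline()` keeps returning an empty string. A minimal single-pass sketch that avoids both, assuming the goal is only to join each record's sequence lines (the name `iter_oneline_fasta` is hypothetical):

```python
from typing import Iterable, Iterator, List


def iter_oneline_fasta(lines: Iterable[str]) -> Iterator[str]:
    """Yield FASTA lines with each record's sequence joined into one line."""
    chunks: List[str] = []
    seen_header = False
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Flush the previous record's sequence before the new header.
            if seen_header:
                yield "".join(chunks)
            yield line
            chunks = []  # also discards any text found before the first header
            seen_header = True
        else:
            chunks.append(line)
    # Flush the last record.
    if seen_header:
        yield "".join(chunks)
```

Because the generator accepts any iterable of lines, it can be fed an open file directly, for example `output_file.writelines(line + "\n" for line in iter_oneline_fasta(input_file))`.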

nvaulin · 2023-10-22T23:57:02Z

+    new_fasta = s1 +s2
+    with open(output_fasta, 'w') as output_file:
+        output_file.write(name +'\n')
+        output_file.write(new_fasta[0] + '\n')


Everything here is great, but this [0] leaves only the first letter)))
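Indexing a string with `[0]` returns its first character, so the fix is to write `new_fasta` itself. A minimal sketch of the shift, assuming `s1` and `s2` are the two slices on either side of the shift position (their definitions are not shown in this diff, and the function name is hypothetical):

```python
def shift_sequence_start(seq: str, shift: int) -> str:
    """Rotate a sequence so that position `shift` becomes the new start."""
    if not seq:
        return seq
    shift %= len(seq)  # keeps negative or oversized shifts in range
    s1 = seq[shift:]   # assumed meaning of s1: tail starting at the shift
    s2 = seq[:shift]   # assumed meaning of s2: head before the shift
    return s1 + s2     # the whole string, not new_fasta[0]
```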

MariaLuk added 10 commits October 8, 2023 12:17

update README file

f2730a1

Add tool_for_bioinformatics main script

ff83160

Add directory modules with functions for sequenses

4235fdd

correct tool_for_bioinformatics file

61a6eb5

add new script bio_processor_tools

7ff376c

Update fastq_filtration for FASTQ files

539595e

updated_bio_file_processor

4fab426

updated tool_for_bioinformatics

50ee5e3

update README

dd27c63

update tool_for_bioinf: correct import of FASTQ_functions (after dl)

e7b1bf4

nvaulin reviewed Oct 23, 2023


MariaLuk added 2 commits February 23, 2024 12:42

Correct fastq filtration

ba201c5

Delete cash

2fe1b37

MariaLuk wants to merge 12 commits into main from lukina_hw6
