Hw 18#4

Open
MariaLuk wants to merge 14 commits into HW14_new from HW_18

Conversation

@MariaLuk MariaLuk commented May 1, 2024

This is the finalized and reorganized repo with my IB Python course homework.

@nvaulin nvaulin left a comment

Hi!

Great work. The README is exactly what's needed: clear and to the point.

Extra kudos for applying the fixes from the previous reviews!

About the random forest: everything is fine, you did it all correctly. Obviously predict is missing, but I would also note that the class itself should have been moved into a script, with notebooks kept purely for examples. You can open the diff of your pull request on GitHub and see how code in a notebook is displayed. At the very least, it is much harder to review :) And I would like you to get into the habit of writing code in scripts.

The texts are all great and the ideas are very good. Just a few small remarks about formatting :)

Because predict is as important a part as fit, its absence ended up affecting the grade quite heavily.

Points: 15/25 (RandomForest) + 10/10 (Tests) + 14/15 (Repository and code style) = 39

Well done!

Comment thread README.md
`bio_files_processor` provides opportunity to manipulate with FASTA files
## My first bioinformatics tools

This repository contains homeworks assignments from the Python course within the "Bioinformatics for Biologists" annual program at the Institute of Bioinformatics (2023-2024). It contains python realizations of a number of training tasks

Suggested change
This repository contains homeworks assignments from the Python course within the "Bioinformatics for Biologists" annual program at the Institute of Bioinformatics (2023-2024). It contains python realizations of a number of training tasks
This repository contains homeworks assignments from the Python course within the "Bioinformatics for Biologists" annual program at the Bioinformatics institute (2023-2024). It contains python realizations of a number of training tasks

(And the English abbreviation is "BI")

Comment thread bio_files_processor.py
Comment on lines +14 to +15
class OpenFasta:
    def __init__(self, file_path: str):

By the way, you could also add a docstring to this class here.
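As a rough illustration of that suggestion, a class docstring could look like the sketch below. Only the class name and the `file_path` parameter come from the diff; the description text and the stored attribute are assumptions made for the example, since the rest of the class is not shown here.

```python
class OpenFasta:
    """Open a FASTA file for reading.

    The wording above is a placeholder: it should describe what the
    class actually does in the real module.

    Parameters
    ----------
    file_path : str
        Path to the FASTA file.
    """

    def __init__(self, file_path: str):
        # Assumption: the constructor keeps the path for later use
        self.file_path = file_path
```

With a docstring in place, `help(OpenFasta)` shows the description and the parameter without opening the source.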

Comment thread data/sequence.py
@@ -0,0 +1 @@
SEQ = " CTTTTTCGTCTGGGCTGCCAACATGGTAGGTGTTTCGTTTCTTGCCTCCTCTTCCTTGCCGGCGGAGACCCCTAAGCTGTATTCCCATTGCCCCTAGTCATCCACTCCCTACCATGGTCGGGGCTTCCAGGCTGCGCATGGCCGCCTGCGGGGCAGGGTGGCCGGCGCGGGCCCGGGGCGGGGCTCCCGGAGCCGTGTGTTAGGCCCGCGGTTCGGATCTCTAGGACACGCGGGCCCCTGCGCTACCGTGGTGAGACCTCACGGCCCTGAGCGGATCGGTACCCTCAGCTTTCCCAAACGCTCCAGAAGTTAGGTCTTTGACCCACAGGCTTACAGGACCATCTCGGCTGGCGGGCATCGCCCCCTGCCCCTAATTCCTTAGGCCTTACCACCAAGCTTTTTCCACACAGCCATCCAGACTGAGGAAGACCCGGAAACTTAGGGGCCACGTGAGCCACGGCCACGGCCGCATAGGTAAGTGCCGGCTTCCCCTCGGGGTGGGCCTTGGGCTCTCTTCGGGTGCTTAGCTAGTCTGGAGATCGGTAGCCTATAAGTGGGTTAGAATAAGACCTTTTTGTGGTCAAGTTGCACAGCTGTTGATTTTTTTCTGACGATCCTCTAGTATTCCAGTTCTAAGGAATTTCACATCAGTGGGGTAATAGGAATTGAGCAGGCACGGTATTGGGTTAGTTGAAGACATGGAGTACTGTGGGAATGCTGTGATGTGGAACCTGAAAAGATGTTTCACCCGGAATCCTAAAGTAATCGCATTGCTGAAAACCGGCATCGGTAGGGTGGGAACAGCGTAAGCGGGACACAGAAGTCTGGGAAACACTCTGCTTTTGTGCGAGGAAGTATTGAGATGCATGAGAAGGCTGTGTGGTGCATGTAGCTTTTTTGTGTGTGTGTGAGACTGATCACTGTCGCCCAGGCTGGAGTGCAGTGGCGAAATCTCGGCTCACTGCAACCCCTGCCTGCCGGGCTCAAGCGATTCTCCTGCCTCAGCCTCCCGAGTAGCTGATATTACAGGTGTGCGCCACGACGCCCGGCTAATTCTTTTTCTATTTTTAGTAGAGTCGGGGGTTTCTCCGTGTTGGCCAGGCTGGTCTCGAACTCCTGACCTCAGGTGATCTACCCCCCCTCGTCCTCCTAAAATGCTGGGATTACAGGCATGAGCCATCACACCGGCCCCACGTAGCTTTGTATTCCTGCAGGCAAGCACCGGAAGCACCCCGGCGGCCGCGGTAATGCTGGTGGTCTGCATCACCACCGGATCAACTTCGACAAATAGTAAGTGTCCTTGGACTGCTTTTATTGACACAGCTTGGGAGGTAGGGGCAGAGAGAGGGCTGGCTTAAACAAAAAGTTTAGAAGCAAGCCTTGCCTATTGCTGTTTTTTACCAAGTTAACACTTGGTGTGAACTGAGAACCTGTCATCGAGGCTAGAGTCACGCTTGGGTATCGGCTATTGCCTGAGTGTGCTAGAGTCCTCGAAGAGTAACTGCTGACCTTATTCACTGGCTGTGGGCCTTATGGCACAGTCAGTCACCAGGTTAGAGACATGCTTCACATTCACCTACCCACAAACTAGTGGATGATAAATTTTGGCTATTCAGAAGACGTTTATTATAGGAGTATGTAGATTTTCCATAGAGTGCTGTTATGTGACTTGAATTTTAGTCTCGGCCCTGCCTCTGACATTGTCGGTGGTTTATCCTGGTTCCAGGAAATAAGACTAGCCTTTTCCTCATGATAGTCTTTGGTGGTTTTTAAAACAGTTGTTTAAGTCAACAGATGTATCATATGCCTGACACTGCTCTACACCAGTGAATAATTTACACTCTAATAGGGGGTGGTAACTATAAAGATGATAAACATAGCATCTTAATTGGAGTGTGTATGAAGGTGGTTGTTACCTCTTCCTAGCCACCCAGGCTACTTTGGGAAAGTTGGTATGAAGCATTACCACTTAAAGAGGAACCAGAGCTTCTG
CCCAACTGTCAACCTTGACAAATTGTGGACTTTGGTCAGTGAACAGACACGGGTGAATGCTGCTAAAAACAAGACTGGGGCTGCTCCCATCATTGATGTGGTGCGATCGGTAAGTTAATTGGATGTTTTTCTGTACTTCCATACCTTCCCTTACAAAACTCTGGCTTAATCTAATCCACTTATATAATCTGTACTTCCCAGTTACCTACCAGACATTGATATTCTTCCTGTGGTAGAATTATCATAGGTAGTTCCCTATCCGTAGCAGTGCCTACTGTCACTGCCCAGGTTGTATCAGGTTTGCATTTCGTGCTTGAACTATAGCTGGTTTTCACTGAGCACAGCTCTTGGCCCTTCATGTTCTCCAGATAATAGAATCCTAATATGTTCCATTGATACTCAGTGCCATGCATTATCTGAAGAGATTTTCCCCCAAAACAGATGTATTATGTCTGTCCTTGCGGGGGTTCTGGTCCCTGTGTCAGTCTTAACTCTCATGAATATAGAGGTAGTGTTAAGAGGCCAGAACCCTAGGGACGCTTTAAATTCACTTCCCAGCCTATTTAATGTCCATTGAGTAGTTCTGGTGGTCAGGAAGGTGGTTGTCTTCTTTTGCTTAGCAGGGGGTATTTGAGCAGGAGGAGGCTTATGCTTTGCCGAGACTAGAGTCACATCCTGACACAACTCTTGTCCTGGTGTGCTAGAGTACTCGAAGAGAATCTACTGGTCTTGATTCACTGGTGGGGGCAGTCGGTGCCCCCGTTAGTGCCCAGATCAGAAACATACATACCCTGCCTAGGGATTTAGAAAGTGGGTTGGCAGTCTTTCCTCACGCCCATCACGCAGTTGGTACCTACTACAGTGTATTGTAAACTTTTTTCTCTGTTCTTCTAGGGCTACTACAAAGTTCTGGGAAAGGGAAAGCTCCCAAAGCAGCCTGTCATCGTGAAGGCCAAATTCTTCAGCAGAAGAGCTGAGGAGAAGATTAAGAGTGTTGGGGGGGCCTGTGTCCTGGTGGCTTGAAGCCACATGGAGGGAGTTTCATTAAATGCTAACTACTTTTTCCTTGTGGTGTGAGTGTAGGTTCTTCAGTGGCACCTCTACATCCTGTGTGCATTGGGAGCCCAGGTTCTAGTACTTAGGGTATGAAGACATGGGGTCCTCTCCTGACTTCCCTCAAATATATGGTAAACGTAAGACCAACACAGACGTTGGCCAGTTAAACATTTCTGTTTATAAAGTCAGAATAATACCTGTTGATCACTGAAAGGCCTGCATGTATTGTACTCTGAATTTTACAGTGAATGAGAGAATGTACCCTAATTGTTCAACAGGGCTCAAAAGGAAAGATTCCATTTTGATGGGTCACATTCTAAAGAGGGGCAGTGTGATAGGAATGAGATGGTCCTTTAGGACTTAAGTTCTCAGCCCAAGGTTTTTCCACGTGGCCCCCTCATCTTTTTTTTTTTTTTAAACGGAGTCTCTCTTGCCAGGCTGGAGTGCAGTGGCACGATCTCGGCTCACTGCAGCCTCCGCCTCCCAGGTTAAGCGATTCTCCTGCCTCAGCTTCCTGACTAACTGGGATTACAGGCGCCCACCACCATGCCCAGCTAATTTTTGTATTTTCAGTAGAGATGGGGTTTCACCATGTTGGCCATGCTGGTCTCTAACTCCTAACCTCAAGTGATCTGCCCACATCGGCCTCCAAAAGTTCTGGGATTATAGTGTGAGCCACTGCGCCCGGCCATGGCTCCTTAATCTTGATCCAAATTATTGTTACATCCAGAATGTGATGAATCAAAATCTCGAGATGGGGGTCCAGCAATCTGAAATTTCAGTATGCCAGGGCTTTTCTGTATGTCAAAGTGGGTTTGAAATAGTTAATTTTTCTTCTAGTCTGAAATGTATCGGGAAAATTTGGAAATCCTGAAGGCTGGAAATTGAAATAAGTTTTTCTAGGATTTGTGTCTCTTGCTATTGGAAAACTGATGGTGACCA" 
No newline at end of file

It probably doesn't need to be in all caps here.

" self.trees = []\n",
" self.feat_ids_by_tree = []\n",
"\n",
" def fit_single_tree(self, i):\n",

The logic is all great. I would also add a leading underscore to the names of the fit and predict methods that work with a single tree, to emphasize that they are internal, while the public interface is the classic fit and predict.
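The naming convention described in this comment could be sketched roughly as follows. This is a toy ensemble, not the class from the notebook: `TinyForest` is a hypothetical name, the "tree" is just the majority label of a bootstrap sample rather than a real decision tree, and passing `X` and `y` into the helper is an assumption (the original `fit_single_tree` takes only an index).

```python
import random
from collections import Counter


class TinyForest:
    """Toy ensemble illustrating public vs. internal method names."""

    def __init__(self, n_estimators=5, random_state=0):
        self.n_estimators = n_estimators
        self.random_state = random_state
        self.trees = []

    def _fit_single_tree(self, tree_id, X, y):
        # Internal helper: draw a bootstrap sample of the labels and
        # "train" a stand-in model (the majority label of that sample).
        rng = random.Random(self.random_state + tree_id)
        sample = [y[rng.randrange(len(y))] for _ in range(len(y))]
        return Counter(sample).most_common(1)[0][0]

    def _predict_single_tree(self, tree, X):
        # Internal helper: the stand-in model predicts one label for every row.
        return [tree for _ in X]

    def fit(self, X, y):
        # Public interface: train every tree, identified by its tree_id.
        self.trees = [
            self._fit_single_tree(tree_id, X, y)
            for tree_id in range(self.n_estimators)
        ]
        return self

    def predict(self, X):
        # Public interface: majority vote across the per-tree predictions.
        votes = [self._predict_single_tree(tree, X) for tree in self.trees]
        return [Counter(row).most_common(1)[0][0] for row in zip(*votes)]
```

Callers only ever touch `fit` and `predict`; the underscore-prefixed helpers signal that they may change without notice.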

Comment on lines +62 to +63
"\n",
" def fit_single_tree(self, i):\n",

If a variable is named i at all, it is usually some kind of counter. Here it is more like, for example:

Suggested change
"\n",
" def fit_single_tree(self, i):\n",
"\n",
" def fit_single_tree(self, tree_id):\n",

Comment thread requirements.txt
Comment on lines +1 to +5
beautifulsoup4==4.12.3
Bio==1.7.0
pytest==8.2.0
python-dotenv==1.0.1
Requests==2.31.0

Awesome!

Comment thread test_module.py
Comment on lines +33 to +45
class TestDNASequences:
    def test_dna_alphabet(self):
        dna = DNASequence('')
        target = set('ATGCatgc')
        result = DNASequence.get_alphabet(dna)
        assert target == result


    def test_gc_content(self):
        dna = 'ATGC'
        target = 50.0
        result = NucleicAcidSequence(dna).gc_content()
        assert target == result, '50.0 for ATGC expected'

One blank line is enough everywhere here too :)
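For reference, PEP 8 separates methods inside a class with a single blank line and reserves two blank lines for top-level definitions. A minimal sketch of that layout follows; `gc_content` is a hypothetical stand-in for `NucleicAcidSequence(...).gc_content()`, whose real implementation is not shown in this diff.

```python
def gc_content(seq: str) -> float:
    """Hypothetical stand-in: percentage of G and C bases in seq."""
    if not seq:
        return 0.0
    gc_count = sum(base in "GCgc" for base in seq)
    return 100 * gc_count / len(seq)


class TestDNASequences:
    def test_gc_content(self):
        assert gc_content("ATGC") == 50.0, "50.0 for ATGC expected"

    def test_gc_content_empty(self):
        assert gc_content("") == 0.0
```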

Comment thread test_module.py

class TestAminoAcidSequences():
    def test_aminoacids_alphabet(self):
        peptide = AminoAcidSequence('')

Looks like something went wrong here?

Comment thread test_module.py
assert target == result


def test_zero_peptid_weight(self):

Suggested change
def test_zero_peptid_weight(self):
def test_zero_peptide_weight(self):

Comment thread test_module.py
        assert target == result, 'expected 639.81 for model peptide VALINE'

    def test_error_message_for_bad_seq(self):
        bad_peptide = 'XAXAXA'

HAHAHA
