Transitions and Transversions

곰탱이장 2024. 8. 31. 20:15

https://rosalind.info/problems/tran/

 

Problem

For DNA strings s1 and s2 having the same length, their transition/transversion ratio R(s1, s2) is the ratio of the total number of transitions to the total number of transversions, where symbol substitutions are inferred from mismatched corresponding symbols as when calculating Hamming distance (see “Counting Point Mutations”).

Given: Two DNA strings s1 and s2 of equal length (at most 1 kbp).

Return: The transition/transversion ratio R(s1, s2).

Sample Dataset

>Rosalind_0209
GCAACGCACAACGAAAACCCTTAGGGACTGGATTATTTCGTGATCGTTGTAGTTATTGGA
AGTACGGGCATCAACCCAGTT
>Rosalind_2200
TTATCTGACAAAGAAAGCCGTCAACGGCTGGATAATTTCGCGATCGTGCTGGTTACTGGC
GGTACGAGTGTTCCTTTGGGT

Sample Output

1.21428571429

 

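This problem asks for the transition/transversion ratio between two sequences. A transition is a substitution between A and G (purine to purine) or between C and T (pyrimidine to pyrimidine). A transversion is any other substitution, that is, a purine replaced by a pyrimidine or the reverse. A more detailed explanation is available at https://incodom.kr/TsTv.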
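To make the definition concrete, here is a minimal sketch that classifies a single pair of aligned bases. The names `PURINES`, `PYRIMIDINES`, and `classify` are made up for this illustration and are not part of the solution further down; the sketch also assumes both inputs are one of A, C, G, T.

```python
# Hypothetical helper for illustration only; assumes inputs are A, C, G, or T.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(a, b):
    """Label one pair of aligned bases as a match, transition, or transversion."""
    if a == b:
        return "match"
    pair = {a, b}
    # Both bases in the same chemical class -> transition
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    # One purine and one pyrimidine -> transversion
    return "transversion"


for a, b in [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T"), ("A", "A")]:
    print(a, b, classify(a, b))
```

Of the 12 possible substitutions between distinct bases, 4 are transitions (A→G, G→A, C→T, T→C) and the remaining 8 are transversions.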

Once the concept is clear, the problem reduces to a simple for loop.

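from Bio import SeqIO

# Transition partners: A <-> G (purines) and C <-> T (pyrimidines)
trans_dict = {'A': 'G', 'G': 'A', 'C': 'T', 'T': 'C'}

if __name__ == '__main__':
    # 'file_path' is a placeholder; replace it with the path to the input FASTA file
    with open(r'file_path', 'r') as fast:
        fa = SeqIO.parse(fast, 'fasta')
        s1 = str(next(fa).seq)
        s2 = str(next(fa).seq)

    transitions = 0
    transversions = 0
    for a, b in zip(s1, s2):
        if trans_dict[a] == b:  # mismatch within the same base class: transition
            transitions += 1
        elif a != b:  # any other mismatch: transversion
            transversions += 1

    # Raises ZeroDivisionError if the two strings contain no transversions
    print(transitions / transversions)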
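As a sanity check, the same counting logic can be run against the sample dataset without Biopython. This is only a sketch: `parse_fasta`, `count_ts_tv`, `TRANSITION_PARTNER`, and `SAMPLE` are hypothetical names introduced here, and the tiny parser assumes well-formed FASTA text with `>` header lines.

```python
# Sketch only: a dependency-free check against the sample dataset.
TRANSITION_PARTNER = {"A": "G", "G": "A", "C": "T", "T": "C"}

SAMPLE = """>Rosalind_0209
GCAACGCACAACGAAAACCCTTAGGGACTGGATTATTTCGTGATCGTTGTAGTTATTGGA
AGTACGGGCATCAACCCAGTT
>Rosalind_2200
TTATCTGACAAAGAAAGCCGTCAACGGCTGGATAATTTCGCGATCGTGCTGGTTACTGGC
GGTACGAGTGTTCCTTTGGGT
"""


def parse_fasta(text):
    """Return the sequences of a FASTA-formatted string, in order of appearance."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append([])  # start a new record at each header line
        elif records:
            records[-1].append(line)  # sequence lines may wrap; collect them all
    return ["".join(parts) for parts in records]


def count_ts_tv(s1, s2):
    """Count transitions and transversions between two equal-length DNA strings."""
    transitions = transversions = 0
    for a, b in zip(s1, s2):
        if a == b:
            continue
        if TRANSITION_PARTNER.get(a) == b:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


s1, s2 = parse_fasta(SAMPLE)
ts, tv = count_ts_tv(s1, s2)
print(ts, tv)
# Guard against division by zero when there are no transversions
print(ts / tv if tv else float("inf"))
```

On the sample this counts 17 transitions and 14 transversions, and 17/14 agrees with the sample output 1.21428571429.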