문제해결(PS)/ROSALIND

Wobble Bonding and RNA Secondary Structures

곰탱이장 2024. 11. 9. 16:48

https://rosalind.info/problems/rnas/

 

ROSALIND | Wobble Bonding and RNA Secondary Structures

It appears that your browser has JavaScript disabled. Rosalind requires your browser to be JavaScript enabled. Wobble Bonding and RNA Secondary Structures solved by 654 Don't Look Down We have discussed the problem of counting RNA secondary structures in p

rosalind.info

Given an RNA string ss, we will augment the bonding graph of ss by adding basepair edges connecting all occurrences of 'U' to all occurrences of 'G' in order to represent possible wobble base pairs.

We say that a matching in the bonding graph for ss is valid if it is noncrossing (to prevent pseudoknots) and has the property that a basepair edge in the matching cannot connect symbols sjsj and sksk unless kj+4k≥j+4 (to prevent nearby nucleotides from base pairing).

See Figure 1 for an example of a valid matching if we allow wobble base pairs. In this problem, we will wish to count all possible valid matchings in a given bonding graph; see Figure 2 for all possible valid matchings in a small bonding graph, assuming that we allow wobble base pairing.

Given: An RNA string ss (of length at most 200 bp).

Return: The total number of distinct valid matchings of basepair edges in the bonding graph of ss. Assume that wobble base pairing is allowed.

Sample Dataset

AUGCUAGUACGGAGCGAGUCUAGCGAGCGAUGUCGUGAGUACUAUAUAUGCGCAUAAGCCACGU

Sample Output

284850219977421

 

  이 문제는 Motzkin Numbers and RNA Secondary Structures 에서 RNA의 2차 구조를 구했던 것에 4이상 거리가 떨어지고 wobble base pair(G와 U가 붙음)을 포함하는 조건만 추가해서 2차 구조의 가짓수를 나타내는 문제이다. 단순하게 저 코딩에서 조건만 좀 바꿔주면 끝이다.

from Bio import SeqIO

def finding_mot(s,node,mot_memo):
    if node<=1:
        return 1
    if mot_memo.get((s,node),0):
        return mot_memo[(s,node)]
    Cn=finding_mot(s[1:],node-1,mot_memo)

    for k in range(4,node):#4이상 거리 추가
        if (s[0],s[k]) in [('A','U'),('U','A'),('C','G'),('G','C'),('U','G'),('G','U')]:#wobble 추가
            Cn+=finding_mot(s[1:k],k-1,mot_memo)*finding_mot(s[k+1:len(s)],node-k-1,mot_memo)
    
    mot_memo[(s,node)] = Cn

    return Cn

if __name__=='__main__':
    with open(r'파일경로','r') as fa:
        S=next(fa).rstrip()

mot_memo={}
node=len(S)

mot=finding_mot(S,node,mot_memo)
print(mot)