https://rosalind.info/problems/dbru/
ROSALIND | Constructing a De Bruijn Graph
It appears that your browser has JavaScript disabled. Rosalind requires your browser to be JavaScript enabled. Constructing a De Bruijn Graph solved by 1148 2012년 7월 2일 12:00:00 오전 by Mikhail Dvorkin Topics: Genome Assembly Wading Through the Rea
rosalind.info
Problem
Consider a set SS of (k+1)(k+1)-mers of some unknown DNA string. Let SrcSrc denote the set containing all reverse complements of the elements of SS. (recall from “Counting Subsets” that sets are not allowed to contain duplicate elements).
The de Bruijn graph BkBk of order kk corresponding to S∪SrcS∪Src is a digraph defined in the following way:
- Nodes of BkBk correspond to all kk-mers that are present as a substring of a (k+1)(k+1)-mer from S∪SrcS∪Src.
- Edges of BkBk are encoded by the (k+1)(k+1)-mers of S∪SrcS∪Src in the following way: for each (k+1)(k+1)-mer rr in S∪SrcS∪Src, form a directed edge (r[1:k]r[1:k], r[2:k+1]r[2:k+1]).
Given: A collection of up to 1000 (possibly repeating) DNA strings of equal length (not exceeding 50 bp) corresponding to a set SS of (k+1)(k+1)-mers.
Return: The adjacency list corresponding to the de Bruijn graph corresponding to S∪SrcS∪Src.
Sample Dataset
TGAT
CATG
TCAT
ATGC
CATC
CATC
Sample Output
(ATC, TCA)
(ATG, TGA)
(ATG, TGC)
(CAT, ATC)
(CAT, ATG)
(GAT, ATG)
(GCA, CAT)
(TCA, CAT)
(TGA, GAT)
이 문제는 주어진 DNA와 이들의 역상보적 DNA의 k+1-mer 들의 de Brujin graph의 edge들을 출력하는 문제이다. de Brujin graph의 node는 (k+1)-mer DNA를 함유하는 set S와 그들의 역상보적인 DNA를 함유하는 set Src의 합집합의 요소로 만든 k-mer들이고 edge는 k+1-mer인 r에서 r[1:k] -> r[2:k+1]인 방향성이 있는 edge이다.
위의 개념을 그대로 차분히 코드로 구현해내면 쉽게 구할 수 있다.
from Bio.Seq import Seq
if __name__ == "__main__":
with open(r"파일경로",'r') as f:
seqs=set()
for i in f.readlines():
seqs.add(i.rstrip())
ri = Seq(i.rstrip())
seqs.add(str(ri.reverse_complement()))
anss=[]
for node in seqs:
anss.append((node[0:len(node)-1],node[1:len(node)]))
wf = open(r"파일경로",'w')
anss.sort()
for ans in anss:
print(f'({ans[0]}, {ans[1]})',file=wf)
'문제해결(PS) > ROSALIND' 카테고리의 다른 글
Independent Segregation of Chromosomes (1) | 2024.11.02 |
---|---|
Inferring Peptide from Full Spectrum (0) | 2024.11.02 |
Creating a Character Table (1) | 2024.10.19 |
Comparing Spectra with the Spectral Convolution (2) | 2024.10.17 |
Introduction to Pattern Matching (9) | 2024.10.16 |