https://rosalind.info/problems/cstr/
ROSALIND | Creating a Character Table from Genetic Strings
Creating a Character Table from Genetic Strings, solved by 432. August 21, 2012, 12:00:00 a.m., by Rosalind Team. Topics: Phylogeny Phylogeny fro
Problem
A collection of strings is characterizable if there are at most two possible choices for the symbol at each position of the strings.
Given: A collection of at most 100 characterizable DNA strings, each of length at most 300 bp.
Return: A character table for which each nontrivial character encodes the symbol choice at a single position of the strings. (Note: the choice of assigning '1' and '0' to the two states of each SNP in the strings is arbitrary.)
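The definition of "characterizable" above can be checked directly. The following is a minimal sketch, not part of the original problem page; the helper name is_characterizable is hypothetical, and it assumes all strings have the same length.

```python
def is_characterizable(seqs):
    """Return True if every position shows at most two distinct symbols."""
    # zip(*seqs) walks the strings column by column
    # (assumption: all strings have equal length).
    return all(len(set(column)) <= 2 for column in zip(*seqs))


sample = ["ATGCTACC", "CGTTTACC", "ATTCGACC", "AGTCTCCC", "CGTCTATC"]
print(is_characterizable(sample))           # True
print(is_characterizable(["A", "C", "G"]))  # False: three symbols in one column
```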
Sample Dataset
ATGCTACC
CGTTTACC
ATTCGACC
AGTCTCCC
CGTCTATC
Sample Output
10110
10100
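This problem asks us to build characters from the SNPs in the strings. The simple approach is to take the first string as the reference: at each position, a string gets 1 if its symbol matches the reference and 0 otherwise.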
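def make_ch_table(seqs):
    # One character per position: 1 where a string matches the first string, else 0.
    ch_table = set()
    for idx, nuc in enumerate(seqs[0]):
        ch_array = [int(nuc == seq[idx]) for seq in seqs]
        # Keep only nontrivial characters: at least two strings on each side.
        if 1 < sum(ch_array) < len(ch_array) - 1:
            ch_table.add(''.join(map(str, ch_array)))
    return ch_table


if __name__ == '__main__':
    with open(r"file path", 'r') as f:
        # Skip blank lines so a trailing newline does not add an empty string.
        seqs = [line.strip() for line in f if line.strip()]
    ch_table = make_ch_table(seqs)
    with open(r"file path", 'w') as wf:
        wf.write('\n'.join(ch_table))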
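Because the solution above collects rows in a set, the order of the output lines is not fixed. As an illustration, here is a sketch of an order-preserving variant run on the sample dataset. The function name character_table is hypothetical and this is not the author's code; it assumes equal-length strings. Since every row is built relative to the first string, each row starts with 1, so a split and its complement can never both appear.

```python
def character_table(seqs):
    """Return nontrivial characters in order of first appearance."""
    reference = seqs[0]
    rows, seen = [], set()
    for idx, ref_symbol in enumerate(reference):
        # '1' where a string agrees with the reference at this position.
        row = ''.join('1' if seq[idx] == ref_symbol else '0' for seq in seqs)
        ones = row.count('1')
        # Nontrivial: at least two strings on each side of the split.
        if 2 <= ones <= len(seqs) - 2 and row not in seen:
            seen.add(row)
            rows.append(row)
    return rows


sample = ["ATGCTACC", "CGTTTACC", "ATTCGACC", "AGTCTCCC", "CGTCTATC"]
print('\n'.join(character_table(sample)))
# 10110
# 10100
```

Only positions 1 and 2 of the sample split the five strings into groups of at least two, which matches the two rows of the sample output.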