Creating a Character Table from Genetic Strings

곰탱이장 2024. 11. 11. 19:42

https://rosalind.info/problems/cstr/

 

Problem

A collection of strings is characterizable if there are at most two possible choices for the symbol at each position of the strings.

Given: A collection of at most 100 characterizable DNA strings, each of length at most 300 bp.

Return: A character table for which each nontrivial character encodes the symbol choice at a single position of the strings. (Note: the choice of assigning '1' and '0' to the two states of each SNP in the strings is arbitrary.)

Sample Dataset

ATGCTACC
CGTTTACC
ATTCGACC
AGTCTCCC
CGTCTATC

Sample Output

10110
10100

 

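This problem asks us to build characters from the SNP positions of the strings. The approach is simple: take the first string as the reference, and at each position write 1 for every string whose symbol matches the first string's symbol and 0 otherwise. A character is kept only if it is nontrivial, meaning it has at least two 1s and at least two 0s, and duplicate characters are dropped by collecting them in a set.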

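def make_ch_table(seqs):
    """Build the set of nontrivial characters, using seqs[0] as the reference."""
    ch_table = set()  # a set drops duplicate characters

    for idx, nuc in enumerate(seqs[0]):
        # 1 if the string matches the first string at this position, else 0
        ch_array = [int(nuc == seq[idx]) for seq in seqs]
        # keep only nontrivial characters: at least two 1s and at least two 0s
        if 1 < sum(ch_array) < len(ch_array) - 1:
            ch_table.add(''.join(map(str, ch_array)))

    return ch_table

if __name__ == '__main__':
    # "file path" is a placeholder for the dataset file
    with open(r"file path", 'r') as f:
        # strip line endings and skip blank lines, which would break indexing
        seqs = [line.rstrip() for line in f if line.strip()]

    ch_table = make_ch_table(seqs)

    # "file path" is a placeholder for the output file
    with open(r"file path", 'w') as wf:
        wf.write('\n'.join(ch_table))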
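
As a quick check of the rule above, here is a minimal sketch that applies it to the sample dataset. It is a variation on the code above, not the same function: the name character_table is hypothetical, and it uses a dict instead of a set so that the rows come out in first-seen column order, which makes the output reproducible. It assumes all strings have the same length.

```python
def character_table(seqs):
    """Return nontrivial characters in first-seen column order (hypothetical helper)."""
    n = len(seqs)
    seen = {}  # dicts keep insertion order: duplicates are dropped, order stays stable
    for column in zip(*seqs):  # one tuple of symbols per position
        ref = column[0]  # the first string is the reference
        row = ''.join('1' if sym == ref else '0' for sym in column)
        ones = row.count('1')
        # nontrivial: at least two strings on each side of the split
        if 2 <= ones <= n - 2:
            seen.setdefault(row, None)
    return list(seen)


sample = [
    "ATGCTACC",
    "CGTTTACC",
    "ATTCGACC",
    "AGTCTCCC",
    "CGTCTATC",
]
table = character_table(sample)
print('\n'.join(table))
# prints:
# 10110
# 10100
```

Only positions 1 and 2 of the sample split the strings into groups of at least two on each side; every other position is trivial and is skipped. Because the first string always receives 1, a split and its complement can never both appear, so removing exact duplicates is enough.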
