
Matching Random Motifs

곰탱이장 2024. 9. 16. 11:27

https://rosalind.info/problems/rstr/

 


Problem

Our aim in this problem is to determine the probability with which a given motif (a known promoter, say) occurs in a randomly constructed genome. Unfortunately, finding this probability is tricky; instead of forming a long genome, we will form a large collection of smaller random strings having the same length as the motif; these smaller strings represent the genome's substrings, which we can then test against our motif.

Given a probabilistic event $A$, the complement of $A$ is the collection $A^{\textrm{c}}$ of outcomes not belonging to $A$. Because $A^{\textrm{c}}$ takes place precisely when $A$ does not, we may also call $A^{\textrm{c}}$ "not $A$."

For a simple example, if $A$ is the event that a rolled die is 2 or 4, then $\Pr(A) = \frac{1}{3}$. $A^{\textrm{c}}$ is the event that the die is 1, 3, 5, or 6, and $\Pr(A^{\textrm{c}}) = \frac{2}{3}$. In general, for any event we will have the identity that $\Pr(A) + \Pr(A^{\textrm{c}}) = 1$.

Given: A positive integer $N \le 100000$, a number $x$ between 0 and 1, and a DNA string $s$ of length at most 10 bp.

Return: The probability that if $N$ random DNA strings having the same length as $s$ are constructed with GC-content $x$ (see “Introduction to Random Strings”), then at least one of the strings equals $s$. We allow for the same random string to be created more than once.

Sample Dataset

90000 0.6
ATAGCCGA

Sample Output

0.689

 

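 This problem gives the number of random strings N and a GC-content x, and asks for the probability that at least one of those N strings (each as long as the target) equals the target sequence given on the second line. It can be solved with the simple complement rule: compute the probability that none of the N strings equals the target sequence, then subtract that from 1.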
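As a quick check of this reasoning on the sample dataset, here is a worked sketch. The notation $P(s)$ is introduced here as shorthand for the probability that a single random string equals $s$; it is not part of the original problem statement. The motif ATAGCCGA contains 4 A/T bases and 4 G/C bases, and with $x = 0.6$ each G or C has probability $x/2 = 0.3$ while each A or T has probability $(1-x)/2 = 0.2$:

$$
P(s) = \left(\frac{x}{2}\right)^{4} \left(\frac{1-x}{2}\right)^{4} = 0.3^{4} \cdot 0.2^{4} = 1.296 \times 10^{-5}
$$

$$
\Pr(\text{at least one match}) = 1 - \bigl(1 - P(s)\bigr)^{N} = 1 - \left(1 - 1.296 \times 10^{-5}\right)^{90000} \approx 0.689
$$

This agrees with the sample output.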

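# First line: number of random strings N and GC-content x
N, x = input().split()
N = int(N)
x = float(x)
# Second line: the target motif s
s = input().strip()

# Count the A/T bases and the G/C bases in the motif
at = s.count('A') + s.count('T')
gc = s.count('G') + s.count('C')

# Probability that one random string equals s:
# each G or C has probability x/2, each A or T has probability (1-x)/2
P_s = pow(x / 2, gc) * pow((1 - x) / 2, at)

# Complement rule: 1 - P(none of the N strings equals s)
print(1 - pow(1 - P_s, N))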
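The same calculation can also be wrapped in a function, which makes it easy to check against the sample dataset. The sketch below is only one possible variant: the function name `prob_at_least_one_match` is made up for illustration, and it swaps the direct `pow` call for `math.log1p` and `math.expm1`, which may lose less precision when the per-string probability is very small. It assumes a non-empty motif made up only of A, C, G, and T.

```python
import math


def prob_at_least_one_match(n, gc_content, motif):
    """Probability that at least one of n random strings equals motif.

    Hypothetical helper for illustration; assumes motif is non-empty
    and contains only the bases A, C, G and T.
    """
    gc = motif.count("G") + motif.count("C")
    at = motif.count("A") + motif.count("T")

    # Probability that a single random string equals the motif
    p_single = (gc_content / 2) ** gc * ((1 - gc_content) / 2) ** at

    # 1 - (1 - p)^n, rewritten as -expm1(n * log1p(-p))
    return -math.expm1(n * math.log1p(-p_single))


if __name__ == "__main__":
    # Sample dataset from the problem statement
    print(round(prob_at_least_one_match(90000, 0.6, "ATAGCCGA"), 3))
```

For inputs within the problem's limits (N up to 100000, motif up to 10 bp) the original one-line `pow` version should be accurate enough, so this form is a matter of taste rather than a required fix.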