You can download all of our solutions as lab04.py
What percentage of codons across all primary reading frames are ATG?
If the sequence were completely random, we'd expect ATG to make up 1/64 = 1.5625% of the triples.
We observe 1.196% for guinea pig and 0.978% for human.
count = 0
for k in range(len(dna)-2):   # N.B. stop value
    if dna[k:k+3] == 'ATG':
        count += 1
percent = count/(len(dna)-2) * 100
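
As a sanity check on the stop value, the same loop can be wrapped in a function and run on a short made-up sequence. This is only a sketch: the function name `percent_atg` and the toy string are illustrative and are not part of lab04.py.

```python
def percent_atg(dna):
    """Percentage of all length-3 windows of dna that equal 'ATG'."""
    windows = len(dna) - 2      # a sequence of n bases has n-2 triples
    count = 0
    for k in range(windows):    # last triple starts at index len(dna)-3
        if dna[k:k+3] == 'ATG':
            count += 1
    return 100 * count / windows

# Toy sequence (made up): 7 bases give 5 triples,
# ATG TGA GAT ATG TGC, of which 2 are ATG.
print(percent_atg('ATGATGC'))   # 40.0
```

Note that the loop looks at a triple starting at every index, so all three forward reading frames are counted together.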
If two consecutive nucleotides match each other, how often is
the next nucleotide that same nucleotide?
If nucleotides were completely random, we'd expect 25%.
We observe 28.392% in guinea pig and 30.620% in human.
doubles = 0
triples = 0
for k in range(len(dna)-2):   # N.B. stop value
    if dna[k] == dna[k+1]:        # neighbors match
        doubles += 1
        if dna[k] == dna[k+2]:    # the third of the triple matches as well
            triples += 1
percent = 100*triples/doubles
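
The 25% baseline can be checked empirically by running the same loop on a randomly generated sequence. The sketch below assumes Python 3.6 or later (for `random.choices`); the function name, the seed, and the sequence length are arbitrary choices made for illustration.

```python
import random

def percent_triples_given_doubles(dna):
    """Among positions where dna[k] == dna[k+1], the percentage
    where dna[k+2] is that same nucleotide as well."""
    doubles = 0
    triples = 0
    for k in range(len(dna) - 2):
        if dna[k] == dna[k+1]:
            doubles += 1
            if dna[k] == dna[k+2]:
                triples += 1
    return 100 * triples / doubles

random.seed(0)  # arbitrary seed, only so that reruns give the same result
fake_dna = ''.join(random.choices('ACGT', k=200_000))
print(percent_triples_given_doubles(fake_dna))  # should land close to 25
```

Both observed values sit a few percentage points above this baseline, which suggests that runs of a repeated nucleotide are more common in the real sequences than uniform random chance would produce.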
How many times does a motif of the form CC?AT occur within the
sequence? (where ? could be anything)
For guinea pig, 111 times; for humans, 132 times.
total = 0
for k in range(len(dna)-4):   # N.B. stop value
    if dna[k:k+2] == 'CC' and dna[k+3:k+5] == 'AT':
        total += 1
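
The same count can also be obtained with a regular expression, which is a different technique from the slicing loop above rather than the method used in lab04.py. In this sketch the function name `count_cc_any_at` is hypothetical, and the sequence is assumed to be a single string with no newline characters (the `.` wildcard does not match a newline).

```python
import re

def count_cc_any_at(dna):
    """Count occurrences of the motif CC?AT, where ? is any one character."""
    # (?=...) is a zero-width lookahead, so every starting index is tested,
    # just as in the sliding-window loop; '.' stands for the wildcard base.
    return sum(1 for _ in re.finditer(r'(?=CC.AT)', dna))

# Toy sequence (made up): CCAAT at index 0 and CCGAT at index 5.
print(count_cc_any_at('CCAATCCGATTT'))   # 2
```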
When the motif CC?AT does occur, what percentage of the
time is the middle nucleotide an A? (A so-called cat box CCAAT)
For guinea pig, 27.027%; for humans, 21.212%.
motifs = 0
catbox = 0
for k in range(len(dna)-4):   # N.B. stop value
    if dna[k:k+2] == 'CC' and dna[k+3:k+5] == 'AT':
        motifs += 1
        if dna[k+2] == 'A':
            catbox += 1
percent = 100*catbox/motifs
The pattern CCAAT is known as a "cat" box. What are the
relative percentages of the bases immediately following the pattern
CCAA in the DNA?
| Species | A | C | G | T |
|---|---|---|---|---|
| Guinea pig | 28.431% | 31.373% | 10.784% | 29.412% |
| Human | 39.416% | 29.197% | 10.949% | 20.438% |
ca = 0   # count for A
cc = 0   # count for C
cg = 0   # count for G
ct = 0   # count for T
for k in range(len(dna)-4):   # N.B. stop value
    if dna[k:k+4] == 'CCAA':
        if dna[k+4] == 'A':
            ca += 1
        elif dna[k+4] == 'C':
            cc += 1
        elif dna[k+4] == 'G':
            cg += 1
        elif dna[k+4] == 'T':
            ct += 1
total = ca+cc+cg+ct
# ... can then display ca/total, cc/total, and so on
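
The four separate counters can also be folded into a single tally, which makes the final display step a short loop. This is one possible sketch, not the lab04.py solution: the function name `following_base_percents` and the toy sequence are made up for illustration, and characters other than A, C, G, and T are skipped, as in the if/elif chain above.

```python
from collections import Counter

def following_base_percents(dna, pattern='CCAA'):
    """Percentage of each base found immediately after pattern in dna."""
    n = len(pattern)
    tally = Counter()
    for k in range(len(dna) - n):      # leaves room for the following base
        if dna[k:k+n] == pattern and dna[k+n] in 'ACGT':
            tally[dna[k+n]] += 1
    total = sum(tally.values())
    if total == 0:                     # pattern never found; avoid dividing by zero
        return {}
    return {base: 100 * tally[base] / total for base in 'ACGT'}

# Toy sequence (made up): CCAA is followed by T, then G, then T.
for base, pct in following_base_percents('CCAATCCAAGCCAAT').items():
    print(f'{base}: {pct:.3f}%')
```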