Welcome to
Introduction to Bioinformatics
Wednesday, 19 September 2014
Scenario 2: Regulatory protein / Simulations
Array indexing
List notation
Push/Pop Shift/Unshift
foreach / for
Array-based DiceRoll.pl
Implementing a Simulation
General Strategy
Problem: How frequent are NtcA binding sites
in a random DNA sequence?
Which random sequence?
Differentiation in cyanobacteria
Find primers to PCR out hetC
ttgtcagttgtcagacgtagtagcgcgtctagtctaatgtgttgttatat
tatttgctactagaaatgaggagagggttatttttctcactgcttcccaa
ttctatgagaatataaaattttccttaagtttctcatggcaataatggaa
aaaaccgaccattctgatgaataagtccggttttttccaaaaaatatttt
tgctttttcgctttatttatctatatttccaagttttagtacatcggtga
ggggtgacaactatcttgccaatattgtcgttattgttaggttgctatcg
gaaaaaatcTGTAacatgagaTACAcaatagcatttatatttgctttagt
atctctctcttgggtgggattctgcctgcaatttaaaaaccagtgttaac
aattttcggctttattttccgggagttaaatcaaccaagggaaaatgtaa
ctaatgtttaaatatcttcggatacacacaaagtaaaaccaatttttaca
GTA…(8)…TAC
gatgtcgatgttgctcacattttttagaaatattactaaattaaaaatgt
tattaaatttatgttcatagagaaccttttccaaataaaaaaataatttt
cctgatgttttaagaaaattactgttgttataaattaaaggtgattcaac
aaaatatagatagttctttcaataactatctacttttaccattaagtgaa
cttactcatgaataatcaacaggaattaaaaataaagttcatgaatactg
gttaaagattcagtaaagtttgaggaaataccggaataaatttccaccca
aatatgattttttaaaagatacattggcagtacattaaaatgccgatgtt
Differentiation in cyanobacteria
ttctatgagaatataaaattttccttaagtttct
aaaaccgaccattctgatgaataagtccggtttt
tgctttttcgctttatttatctatatttccaagt
ggggtgacaactatcttgccaatattgtcgttat
gaaaaaatctGTAacatgagaTACacaatagcat
ttatatttgcttTAgtaTctctctcttgggtggg
GTA…(8)…TAC…(20-24)…TAnnnT
Promoter
NtcA binding site
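The pattern on this slide maps naturally onto a Perl regular expression. The following is a minimal sketch, not the course's own code: the variable names are made up for illustration, and the sequence is just the two relevant lines from the slide joined together.

```perl
use strict;
use warnings;

# Two lines of upstream sequence copied from the slide, joined together.
my $sequence = 'gaaaaaatctGTAacatgagaTACacaatagcat'
             . 'ttatatttgcttTAgtaTctctctcttgggtggg';

# GTA, any 8 nucleotides, TAC (case-insensitive, since the slide mixes cases).
my $ntca_site = qr/GTA.{8}TAC/i;

# The full pattern adds a 20-24 nucleotide gap and a TAnnnT promoter element.
my $site_plus_promoter = qr/GTA.{8}TAC.{20,24}TA.{3}T/i;

if ($sequence =~ /($ntca_site)/) {
    my $offset = $-[0];    # 0-based position where the match begins
    print "NtcA site at offset $offset: $1\n";
}
if ($sequence =~ /($site_plus_promoter)/) {
    my $offset = $-[0];
    print "Site plus promoter at offset $offset: $1\n";
}
```

The `.{8}` and `.{20,24}` quantifiers correspond directly to the (8) and (20-24) spacers in the slide's notation.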
Implementing a Simulation
General Strategy
Problem: How frequent are NtcA binding sites
in a random DNA sequence?
Which random sequence?
I still don't entirely understand
why we only need to create 847 bp
Implementing a Simulation
General Strategy
Problem: How frequent are NtcA binding sites
in a random DNA sequence?
Strategy: Modify DiceRoll.pl
- (change to use arrays)
- Modify Make_random_sequence (SQ.1-3)
- Change Random_integer -> Random nucleotide (SQ. 4)
- Modify Any_matches, test for exact match (SQ. 5)
- Modify Any_matches, allow inexact matches (SQ. 7-11)
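The strategy above might be sketched roughly as follows. This is only an illustration under stated assumptions: all four nucleotides are taken to be equally likely (the "Which random sequence?" question is left open), the number of trials is arbitrary, the 847 bp length is the figure from the student question, and only the exact-match case is handled. The subroutine bodies are guesses at what the modified DiceRoll.pl could look like, not the actual course code.

```perl
use strict;
use warnings;

my @nucleotides = ('A', 'C', 'G', 'T');

# Pick one nucleotide at random (assumes all four are equally likely).
sub Random_nucleotide {
    return $nucleotides[ int(rand(4)) ];
}

# Build a random sequence of the requested length.
sub Make_random_sequence {
    my ($length) = @_;
    my $sequence = '';
    foreach my $position (1 .. $length) {
        $sequence = $sequence . Random_nucleotide();
    }
    return $sequence;
}

# Exact-match test for the NtcA site GTA...(8)...TAC.
sub Any_matches {
    my ($sequence) = @_;
    return $sequence =~ /GTA.{8}TAC/ ? 1 : 0;
}

my $trials          = 1000;   # hypothetical number of simulated sequences
my $sequence_length = 847;    # length mentioned in the student question
my $successes       = 0;

foreach my $trial (1 .. $trials) {
    my $sequence = Make_random_sequence($sequence_length);
    $successes = $successes + Any_matches($sequence);
}

print "Sequences with at least one NtcA site: $successes of $trials\n";
```

Allowing inexact matches would mean replacing the regular expression in Any_matches with a test that tolerates mismatches.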
The Simulation
The Alternative: Straight Math
SQ1. Probability of getting at least one matched pair
in a roll of five dice.
I don't remember combination
and permutation math very well.
Probability (0 dice matching) =
Probability (1 dice matching) =
Probability (2 dice matching) =
Probability (3 dice matching) =
Probability (4 dice matching) =
Probability (5 dice matching) =
Roll that doesn’t work
Roll that works
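One way to sidestep most of the combination and permutation math is to work with the complement: a roll "doesn't work" only if all five dice are different, so P(at least one pair) = 1 - P(all different). The sketch below computes that; the variable names are illustrative and this is not necessarily the route the slide intends.

```perl
use strict;
use warnings;

my $dice  = 5;
my $sides = 6;

# Probability that all five dice show different faces:
# 6/6 * 5/6 * 4/6 * 3/6 * 2/6 = 720/7776
my $probability_no_match = 1;
foreach my $die (0 .. $dice - 1) {
    $probability_no_match = $probability_no_match * ($sides - $die) / $sides;
}

my $probability_at_least_one_pair = 1 - $probability_no_match;

printf "P(no two dice match) = %.4f\n", $probability_no_match;           # 0.0926
printf "P(at least one pair) = %.4f\n", $probability_at_least_one_pair;  # 0.9074
```

A simulation with DiceRoll.pl should converge on roughly the same figure as the number of rolls grows.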
SQ3: push / pop / shift / unshift
I am just learning about
push, pop, shift, and
unshift in 600. A quick
review of all of these
would greatly help.
YGRP
Arrays: Assignment and Access
@codons =
Codon:    ATG    GAT    GCT    TAT    TTT    CAA    ...    TAA
Index:    0      1      2      3      4      5      ...    n
Memory:   3200   3203   3206   ...                         ????
Which $codons[ ] is GCT?
Where is $codons[1]?
3200 + 3
Where is $codons[2]?
3200 + 6
Where is $codons[n]?
3200 + 3*n
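The address arithmetic can be tried out directly. The sketch below follows the slide's simplified picture (3 bytes per codon, array starting at address 3200); those two numbers come from the slide, the variable names are made up, and the layout is an illustration only, since Perl does not really store strings this way.

```perl
use strict;
use warnings;

my @codons = qw(ATG GAT GCT TAT TTT CAA);

# Indexing starts at 0, so the third codon is element 2.
print "codons[2] = $codons[2]\n";

# Slide's simplified model: element n lives at 3200 + 3*n.
my $base_address = 3200;
my $codon_size   = 3;
foreach my $index (0 .. $#codons) {
    my $address = $base_address + $codon_size * $index;
    print "codons[$index] = $codons[$index] at $address\n";
}
```

Because every element has the same size, the position of any element can be computed in one step, without walking through the elements before it.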
Arrays: Assignment and Access
Scalar assignment of array values:
my @days;
$days[0] = "Sun";
$days[1] = "Mon";
...
Array assignment of array values:
my @days = ("Sun", "Mon", ...);
my @numbers = (1 .. 47);
print @numbers;
Arrays: Assignment and Access
SQ2. @letters contains all uppercase letters.
How to print the letter "J"?
my @letters =
print
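One possible answer, as a sketch: the range operator can build the alphabet, and since indexing starts at 0, the tenth letter sits at index 9.

```perl
use strict;
use warnings;

my @letters = ('A' .. 'Z');    # range operator fills in B through Y
print $letters[9], "\n";       # J is the 10th letter, so its index is 9
```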
SQ3: push / pop / shift / unshift
SQ3. Predict output of:
$LF = "\n";   # line feed (assumed here; presumably defined elsewhere in the course code)
@protein = ("cytochrome oxidase","hexokinase","glutamine synthetase");
push @protein, "phosphofructokinase", "albumin";
$protein[1] = "deleted";
unshift @protein, "globin";
$name1 = pop @protein;
$name2 = shift @protein;
$name3 = shift @protein;
print "name1 = $name1 name2 = $name2 name3 = $name3", $LF;
print "current protein[2] = $protein[2]", $LF;
print "remaining names: ", join(", ", @protein);
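As a quick review of the four operations, here is a small sketch on a different, made-up array, so that the prediction above is left as an exercise. push and pop work on the end of an array; unshift and shift work on the beginning.

```perl
use strict;
use warnings;

my @codons = ('GAT', 'GCT');

push @codons, 'TAA';          # add to the end:        ATG not yet; now GAT GCT TAA
unshift @codons, 'ATG';       # add to the beginning:  ATG GAT GCT TAA
my $last  = pop @codons;      # remove from the end:   ATG GAT GCT  ($last is TAA)
my $first = shift @codons;    # remove from the start: GAT GCT      ($first is ATG)

print "first = $first, last = $last, remaining = @codons\n";
# prints: first = ATG, last = TAA, remaining = GAT GCT
```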
SQ4: DiceRoll if with arrays
SQ4. Rewrite these lines to use an array
if ($number_of_ones>=$matches_wanted) { return $true}
if ($number_of_twos>=$matches_wanted) { return $true}
...
if ($number_of_sixes>=$matches_wanted) { return $true}
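One possible rewrite, as a sketch: if the six counters live in an array, the six if statements collapse into a single loop. The array name @number_of and the convention of leaving index 0 unused (so that $number_of[1] counts ones) are assumptions made here, not something taken from DiceRoll.pl.

```perl
use strict;
use warnings;

my $true  = 1;
my $false = 0;

# Hypothetical version of the test: @number_of holds one count per face,
# with index 0 unused so that $number_of[1] counts ones, ... [6] counts sixes.
sub Any_matches {
    my ($matches_wanted, @number_of) = @_;
    foreach my $face (1 .. 6) {
        if ($number_of[$face] >= $matches_wanted) { return $true }
    }
    return $false;
}

my @number_of = (0, 2, 0, 1, 0, 1, 1);    # counts for a roll such as 1 1 3 5 6
print Any_matches(2, @number_of) ? "match\n" : "no match\n";
```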
for loops
Problem: Add up the numbers from 1 to 100
- Where to begin?
- Where to end?
- How to get from here to there?
- What to do in between?
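my $sum = 0;
for (my $number = 1;              # where to begin
     $number <= 100;              # where to end
     $number = $number + 1) {     # how to get from here to there
    $sum = $sum + $number;        # what to do in between
}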
foreach loops
Problem: Add up the numbers from 1 to 100
- Where to begin?
- Where to end?
- How to get from here to there?
- What to do in between?
my $sum = 0;
foreach my $number (1 .. 100) {
    $sum = $sum + $number;
}
foreach loops
SQ5. Write a loop that prints out a table of
numbers from 1 to 20 and their squares.
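A possible solution, as a sketch: a foreach loop over the range 1 .. 20, with sprintf used to line the columns up. The column widths and the @rows array are arbitrary choices made for this example.

```perl
use strict;
use warnings;

# Build one formatted row per number, then print the whole table.
my @rows;
foreach my $number (1 .. 20) {
    push @rows, sprintf("%6d  %6d", $number, $number * $number);
}

print "Number  Square\n";
print join("\n", @rows), "\n";
```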
SQ6: Rewrite DiceRoll
SQ6. Replace $number_of_ones and similar variables
with an array.
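The counting half of that change might look something like the sketch below. The roll is hard-coded for illustration (in DiceRoll.pl it would come from the random roller), and the array name @number_of, with index 0 left unused, is an assumption made for this example.

```perl
use strict;
use warnings;

# Hypothetical roll of five dice.
my @roll = (4, 2, 4, 6, 4);

# One counter per face; index 0 is unused so that $number_of[1]
# replaces $number_of_ones, $number_of[2] replaces $number_of_twos, etc.
my @number_of = (0) x 7;

foreach my $die (@roll) {
    $number_of[$die] = $number_of[$die] + 1;
}

foreach my $face (1 .. 6) {
    print "Face $face came up $number_of[$face] time(s)\n";
}
```

Using the die value itself as the array index removes the need for a separate if statement per face.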