Three experts in sequence analysis, Kiyoshi Asai (浅井潔), Kazutaka Katoh (加藤和貴), and Martin Frith, give the lectures.
Course content:
Critical thinking versus cargo cult science, and the dangers of jargon.
Diversity of genetic sequence data.
How genetic sequences mutate and evolve.
Evolutionary relationships: homology, orthology, and paralogy.
Repeats: transposable elements and tandem repeats. Facts versus definitions.
Finding clumps in sequential data, by dynamic programming. Facts versus definitions. Finding one optimal clump: Kadane's algorithm for one sequence, Smith-Waterman algorithm for comparing two sequences. Problems and solutions when there is more than one clump.
Probability models: all models are wrong, but some are useful.
Similar sequences occurring by chance: p-values and E-values, multiple testing.
Multiple sequence alignment: aligning more than two sequences to each other.
Fast algorithms: seed-and-extend alignment versus alignment-free sequence comparison. Spaced seeds and subset seeds. Suffix arrays. Sampling a subset of positions in a sequence.
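
As a rough illustration of the "finding one optimal clump" topic listed above, the following Python sketch shows Kadane's algorithm for one sequence of scores and a score-only Smith-Waterman for two sequences. It is not taken from the course materials: the function names `best_clump` and `local_alignment_score`, and the scoring parameters (match +1, mismatch -1, linear gap penalty -1), are assumptions chosen only for the example.

```python
def best_clump(scores):
    """Kadane's algorithm: find the contiguous run with the largest sum.

    Returns (best_sum, start, end) with end exclusive.
    An empty run, (0, 0, 0), is returned if no score is positive.
    """
    best_sum, best_start, best_end = 0, 0, 0
    run_sum, run_start = 0, 0
    for i, s in enumerate(scores):
        if run_sum <= 0:
            # A non-positive running sum cannot help: start a new run here.
            run_sum, run_start = s, i
        else:
            run_sum += s
        if run_sum > best_sum:
            best_sum, best_start, best_end = run_sum, run_start, i + 1
    return best_sum, best_start, best_end


def local_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman: score of the best local alignment of a and b.

    Uses a linear gap penalty (an assumption made for brevity) and keeps
    only two rows of the dynamic programming table, so it returns the
    score but not the alignment itself.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            # The 0 option lets a local alignment start afresh anywhere.
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
        best = max(best, max(cur))
        prev = cur
    return best


if __name__ == "__main__":
    print(best_clump([-1, 2, 3, -4, 5, -1]))
    print(local_alignment_score("TTACGTT", "GGACGGG"))
```

Both functions reset to zero whenever the running score would go negative, which is the shared idea behind the two algorithms. Neither handles the case of more than one clump.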