Tuesday, January 17, 2012

What I read this week

CHALK proposal

Li H, Homer N. A survey of sequence alignment algorithms for next-generation sequencing. Brief Bioinform 2010;11:473-483.

- hash tables, spaced seed, q-gram filter, multiple seed hits, seed extension, suffix/prefix tries, FM-index, inexact matches, gapped alignment

- Aligning sequence reads: gapped alignment, paired-end and mate-pair mapping, base quality, long-read aligner, capillary read aligner, SOLiD reads, bisulfite-treated reads, spliced reads, realignment

- Speed, memory considerations

Eres R, Landau GM, Parida L. Permutation pattern discovery in biosequences. J Comput Biol 2004;11(6):1050-1060.

- sliding window technique for pattern matching with examples
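
To make the sliding window idea concrete for myself, here is a minimal sketch in Python. It is not the algorithm from the paper, only an illustration of the general technique: a fixed-width window moves along the sequence, and its character counts are updated incrementally instead of being recomputed. The function name `permutation_occurrences` and the example sequence are my own assumptions.

```python
from collections import Counter


def permutation_occurrences(text, pattern):
    """Return start indices where a window of `text` is a permutation of `pattern`.

    Illustrative sketch only (hypothetical helper, not the paper's method).
    """
    k = len(pattern)
    if k == 0 or k > len(text):
        return []
    target = Counter(pattern)
    window = Counter(text[:k])
    hits = [0] if window == target else []
    for i in range(k, len(text)):
        # Slide the window one position: add the incoming character...
        window[text[i]] += 1
        # ...and drop the outgoing one, removing zero counts so that
        # the equality test below compares like with like.
        old = text[i - k]
        window[old] -= 1
        if window[old] == 0:
            del window[old]
        if window == target:
            hits.append(i - k + 1)
    return hits


# Made-up example: "ACG" and "GCA" are both permutations of "GCA".
print(permutation_occurrences("ACGTTGCA", "GCA"))
```

Each step touches only the character entering and the character leaving the window, which is the main appeal of the technique.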

Smith TF, Waterman MS. Identification of common molecular subsequences. J Mol Biol 1981;147:195-7.

- Similarity (homology) measure

Smith TF, Waterman MS. Comparison of biosequences. Adv Appl Math 1981;2:482-489.

- More detailed algorithm for the Smith-Waterman homology measure, comparison to Sellers and Needleman and Wunsch algorithms
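
A minimal sketch of the Smith-Waterman local similarity score, written to check my understanding. It assumes a simple linear gap penalty and scoring values I picked myself (`match=2`, `mismatch=-1`, `gap=-1`); the papers allow more general weights, so treat this as an illustration rather than the published formulation. The function name is hypothetical.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between sequences a and b.

    Sketch with assumed scoring parameters and a linear gap penalty.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap    # gap in b
            left = H[i][j - 1] + gap  # gap in a
            # The floor at 0 is what makes the alignment local:
            # a poorly scoring prefix is dropped instead of carried along.
            H[i][j] = max(0, diag, up, left)
            best = max(best, H[i][j])
    return best


# Made-up sequences sharing the subsequence "ACGT".
print(smith_waterman_score("GGACGTCC", "TTACGTAA"))
```

Without the floor at zero and with penalized end gaps, the same recurrence gives a global alignment in the style of Needleman and Wunsch, which is how I currently read the comparison in the paper.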

MacIsaac KD, Fraenkel E. Practical strategies for discovering regulatory DNA sequence motifs. PLoS Comput Biol 2006;2(4):e36. DOI: 10.1371/journal.pcbi.0020036

- DNA Encoding Schemes with examples

o consensus sequence of preferred nucleotides (ACGT)

o position weight matrix (PWM)

o example: seq to pwm 6 positions

- Clustering of DNA - techniques, dimensionality

o k-medoids

o self-organizing map (SOM)

o hierarchical clustering applied to the motifs; clusters with a similarity exceeding 70% are combined by computing a consensus sequence

- Distance / similarity measure for DNA sequences

o fraction of common bits as a similarity metric

o Pearson correlation coefficient between motif PWMs as the similarity measure
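
The encoding and similarity notes above can be sketched in a few lines of Python. This is my own illustration, not code from the paper: the "PWM" here is the plain frequency form (no log-odds scoring against a background), the four 6-position sequences are made up, and the helper names `build_pwm`, `consensus`, and `pearson` are hypothetical.

```python
import math

ALPHABET = "ACGT"


def build_pwm(sequences, pseudocount=0.0):
    """Per-position nucleotide frequencies for equal-length, aligned sequences."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all sequences must have the same length")
    pwm = []
    for pos in range(length):
        counts = {base: pseudocount for base in ALPHABET}
        for s in sequences:
            counts[s[pos]] += 1
        total = sum(counts.values())
        pwm.append([counts[base] / total for base in ALPHABET])
    return pwm


def consensus(pwm):
    """Most frequent nucleotide at each position (ties go to the earlier letter)."""
    return "".join(ALPHABET[max(range(4), key=lambda k: col[k])] for col in pwm)


def pearson(pwm_a, pwm_b):
    """Pearson correlation between two same-width PWMs, flattened to vectors."""
    x = [v for col in pwm_a for v in col]
    y = [v for col in pwm_b for v in col]
    if len(x) != len(y):
        raise ValueError("PWMs must have the same width")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / (sx * sy)


# Made-up binding sites, 6 positions each.
sites = ["ACGTAC", "ACGTAC", "ACGTTC", "ACGAAC"]
pwm = build_pwm(sites)
print(consensus(pwm))
print(pearson(pwm, pwm))
```

This sketch assumes the two motifs are already aligned and of equal width; motifs of different widths or offsets would need an alignment step first.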

Next week:

I need to understand the basics of DNA, computational chemistry, and bioinformatics better. I have downloaded a few book chapters from the WVU Libraries electronic collection.

