MPS-RR 2004-29
December 2004
We consider Markov processes for coding DNA sequence
evolution. In context-dependent models, the instantaneous
substitution rate for a codon depends on the neighboring codons.
This makes the model analytically intractable, and previously
Markov chain Monte Carlo methods have been used for statistical
inference. We introduce an approximate estimation method based on
pseudo-likelihood that makes inference analytically tractable.
We demonstrate that the pseudo-likelihood estimates are very
accurate, and from analyzing 348 human-mouse coding sequences we
conclude that incorporating the CpG effect improves the model fits
considerably.
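
For orientation, the generic form of a pseudo-likelihood can be sketched as follows. This is the standard textbook construction and is not taken from the report, whose exact conditioning may differ. The notation is assumed for illustration: z_i is the observation at codon site i, n is the number of sites, and theta is the vector of substitution-rate parameters.

```latex
% Generic pseudo-likelihood (assumed notation, not the report's own):
% the intractable joint likelihood is replaced by a product of
% site-wise conditional probabilities given the neighboring sites.
\[
  \mathrm{PL}(\theta) \;=\; \prod_{i=1}^{n}
    p_\theta\!\left(z_i \,\middle|\, z_{i-1},\, z_{i+1}\right),
  \qquad
  \hat{\theta}_{\mathrm{PL}} \;=\; \arg\max_{\theta}\, \log \mathrm{PL}(\theta).
\]
```

Each factor involves only one site and its immediate neighbors, so it can be evaluated without summing over whole-sequence configurations.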
Availability: [ gzipped ps-file ] [ pdf-file ]