"Inferring Phylogenies" - leeswijzer: een nieuwe leeszaal van dagboek

Joseph Felsenstein

(Published 2004 [actually released in summer 2003], Sinauer Associates, Sunderland, xx+664 pp., ISBN: 0878931775 [pbk])

I introduced this book earlier, but gave only an outline of its contents, so here is the detailed table of contents.

[Contents]
Preface xix
1. Parsimony methods 1
A simple example 1
　Evaluating a particular tree 1
　Rootedness and unrootedness 4
Methods of rooting the tree 6
Branch lengths 8
Unresolved questions 9
2. Counting evolutionary changes 11
The Fitch algorithm 11
The Sankoff algorithm 13
　Connection between the two algorithms 16
Using the algorithms when modifying trees 16
　Views 16
　Using views when a tree is altered 17
Further economies 18
3. How many trees are there? 19
Rooted bifurcating trees 20
Unrooted bifurcating trees 24
Multifurcating trees 25
　Unrooted trees with multifurcations 28
Tree shapes 28
　Rooted bifurcating tree shapes 29
　Rooted multifurcating tree shapes 30
　Unrooted shapes 32
Labeled histories 35
Perspective 36
4. Finding the best tree by heuristic search 37
Nearest-neighbor interchanges 38
Subtree pruning and regrafting 41
Tree bisection and reconnection 44
Other tree rearrangement methods 44
　Tree-fusing 44
　Genetic algorithms 44
　Tree windows and sectorial search 46
Speeding up rearrangements 46
Sequential addition 47
Star decomposition 48
Tree space 48
Search by reweighting of characters 51
Simulated annealing 52
History 53
5. Finding the best tree by branch and bound 54
A nonbiological example 54
Finding the optimal solution 57
NP-hardness 57
Branch and bound methods 60
Phylogenies: Despair and hope 60
Branch and bound for parsimony 61
Improving the bound 64
　Using still-absent states 64
　Using compatibility 64
Rules limiting the search 65
6. Ancestral states and branch lengths 67
Reconstructing ancestral states 67
Accelerated and delayed transformation 70
Branch lengths 70
7. Variants of parsimony 73
Camin-Sokal parsimony 73
Parsimony on an ordinal scale 74
Dollo parsimony 75
Polymorphism parsimony 76
Unknown ancestral states 78
Multiple states and binary coding 78
Dollo parsimony and multiple states 80
Polymorphism parsimony and multiple states 81
Transformation series analysis 81
Weighting characters 82
Successive weighting and nonlinear weighting 83
　Successive weighting 83
　Nonsuccessive algorithms 84
8. Compatibility 87
Testing compatibility 88
The Pairwise Compatibility Theorem 89
Cliques of compatible characters 91
Finding the tree from the clique 92
Other cases where cliques can be used 94
Where cliques cannot be used 94
　Perfect phylogeny 95
　Using compatibility on molecules anyway 95
9. Statistical properties of parsimony 97
Likelihood and parsimony 97
　The weights 100
　Unweighted parsimony 100
　Limitations of this justification of parsimony 101
　Farris’s proofs 102
　No common mechanism 103
　Likelihood and compatibility 105
　Parsimony versus compatibility 107
Consistency and parsimony 107
　Character patterns and parsimony 107
　Observed numbers of the patterns 110
　Observed fractions of the patterns 110
　Expected fractions of the patterns 111
　Inconsistency 113
　When inconsistency is not a problem 114
　The nucleotide sequence case 115
　Other situations where consistency is guaranteed 117
　Does a molecular clock guarantee consistency? 118
　The Farris zone 120
Some perspective 121
10. A digression on history and philosophy 123
How phylogeny algorithms developed 123
　Sokal and Sneath 123
　Edwards and Cavalli-Sforza 125
　Camin and Sokal and parsimony 128
　Eck and Dayhoff and molecular parsimony 130
　Fitch and Margoliash popularize distance matrix methods 131
　Wilson and Le Quesne introduce compatibility 133
　Jukes and Cantor and molecular distances 134
　Farris and Kluge and unordered parsimony 134
　Fitch and molecular parsimony 136
　Further work 136
　What about Willi Hennig and Walter Zimmermann? 136
Different philosophical frameworks 138
　Hypothetico-deductive 138
　Logical parsimony 140
　Logical probability? 142
　Criticisms of statistical inference 143
　The irrelevance of classification 145
11. Distance matrix methods 147
Branch lengths and times 147
The least squares methods 148
　Least squares branch lengths 148
　Finding the least squares tree topology 153
The statistical rationale 153
Generalized least squares 154
Distances 155
The Jukes-Cantor model - an example 156
Why correct for multiple changes? 158
Minimum evolution 159
Clustering algorithms 161
UPGMA and least squares 161
　A clustering algorithm 162
　An example 162
　UPGMA on nonclocklike trees 165
Neighbor-joining 166
　Performance 168
　Using neighbor-joining with other methods 169
　Relation of neighbor-joining to least squares 169
　Weighted versions of neighbor-joining 170
Other approximate distance methods 171
　Distance Wagner method 171
　A related family 171
　Minimizing the maximum discrepancy 172
　Two approaches to error in trees 172
A puzzling formula 174
Consistency and distance methods 174
A limitation of distance methods 175
12. Quartets of species 176
The four point metric 177
The split decomposition 178
　Related methods 182
　Short quartets methods 182
The disk-covering method 183
Challenges for the short quartets and DCM methods 185
Three-taxon statement methods 186
Other uses of quartets with parsimony 188
Consensus supertrees 189
Neighborliness 191
De Soete’s search method 192
Quartet puzzling and searching tree space 193
Perspective 194
13. Models of DNA evolution 196
Kimura’s two-parameter model 196
Calculation of the distance 198
The Tamura-Nei model, F84, and HKY 200
The general time-reversible model 204
　Distances from the GTR model 206
The general 12-parameter model 210
LogDet distances 211
Other distances 213
Variance of distance 214
Rate variation between sites or loci 215
　Different rates at different sites 215
　Distances with known rates 216
　Distribution of rates 216
　Gamma- and lognormally distributed rates 217
　Distances from gamma-distributed rates 217
Models with nonindependence of sites 221
14. Models of protein evolution 222
Amino acid models 222
The Dayhoff model 222
Other empirically-based models 223
　Models depending on secondary structure 225
Codon-based models 225
　Inequality of synonymous and nonsynonymous substitutions 227
Protein structure and correlated change 228
15. Restriction sites, RAPDs, AFLPs, and microsatellites 230
Restriction sites 230
　Nei and Tajima’s model 230
　Distances based on restriction sites 233
　Issues of ascertainment 234
　Parsimony for restriction sites 235
Modeling restriction fragments 236
　Parsimony with restriction fragments 239
RAPDs and AFLPs 239
　The issue of dominance 240
　Unresolved problems 240
Microsatellite models 241
　The one-step model 241
　Microsatellite distances 242
　A Brownian motion approximation 244
　Models with constraints on array size 246
　Multi-step and heterogeneous models 246
　Snakes and Ladders 246
　Complications 247
16. Likelihood methods 248
Maximum likelihood 248
　An example 249
Computing the likelihood of a tree 251
　Economizing on the computation 253
　Handling ambiguity and error 255
Unrootedness 256
Finding the maximum likelihood tree 256
Inferring ancestral sequences 259
Rates varying among sites 260
　Hidden Markov models 262
　Autocorrelation of rates 264
　HMMs for other aspects of models 265
　Estimating the states 265
Models with clocks 266
　Relaxing molecular clocks 266
　Models for relaxed clocks 267
　Covarions 268
　Empirical approaches to change of rates 269
Are ML estimates consistent? 269
　Comparability of likelihoods 270
　A nonexistent proof? 270
　A simple proof 271
　Misbehavior with the wrong model 272
　Better behavior with the wrong model 274
17. Hadamard methods 275
The edge length spectrum and conjugate spectrum 279
The closest tree criterion 281
DNA models 284
Computational effort 285
Extensions of Hadamard methods 286
18. Bayesian inference of phylogenies 288
Bayes’ theorem 288
Bayesian methods for phylogenies 289
Markov chain Monte Carlo methods 292
The Metropolis algorithm 292
　Its equilibrium distribution 293
　Bayesian MCMC 294
Bayesian MCMC for phylogenies 295
　Priors 295
Proposal distributions 296
Computing the likelihoods 298
Summarizing the posterior 299
Priors on trees 300
Controversies over Bayesian inference 301
　Universality of the prior 301
　Flat priors and doubts about them 301
Applications of Bayesian methods 304
19. Testing models, trees, and clocks 307
Likelihood and tests 307
Likelihood ratios near asymptopia 308
Multiple parameters 309
　Some parameters constrained, some not 310
　Conditions 310
　Curvature or height? 311
Interval estimates 311
Testing assertions about parameters 311
　Coins in a barrel 313
　Evolutionary rates instead of coins 314
Choosing among nonnested hypotheses: AIC and BIC 315
　An example using the AIC criterion 317
The problem of multiple topologies 318
　LRTs and single branches 319
Interior branch tests 320
　Interior branch tests using parsimony 321
　A multiple-branch counterpart of interior branch tests 322
Testing the molecular clock 322
　Parsimony-based methods 322
　Distance-based methods 323
　Likelihood-based methods 323
　The relative rate test 324
Simulation tests based on likelihood 328
　Further literature 329
More exact tests and confidence intervals 329
　Tests for three species with a clock 329
　Bremer support 330
　Zander’s conditional probability of reconstruction 331
　More generalized confidence sets 332
20. Bootstrap, jackknife, and permutation tests 335
The bootstrap and the jackknife 335
Bootstrapping and phylogenies 337
The delete-half jackknife 339
The bootstrap and jackknife for phylogenies 340
The multiple-tests problem 342
Independence of characters 342
Identical distribution - a problem? 343
Invariant characters and resampling methods 344
Biases in bootstrap and jackknife probabilities 346
　P values in a simple normal case 349
　Methods of reducing the bias 352
　The drug testing analogy 355
Alternatives to P values 356
　Probabilities of trees 357
　Using tree distances 357
　Jackknifing species 358
Parametric bootstrapping 358
　Advantages and disadvantages of the parametric bootstrap 358
Permutation tests 358
　Permuting species within characters 359
　Permuting characters 361
　Skewness of tree length distribution 362
21. Paired-sites tests 364
　An example 365
Multiple trees 369
　The SH test 369
　Other multiple-comparison tests 371
Testing other parameters 372
Perspective 372
22. Invariants 373
Symmetry invariants 374
Three-species invariants 376
Lake’s linear invariants 378
Cavender’s quadratic invariants 380
　The K invariants 380
　The L invariants 381
　Generalization of Cavender’s L invariants 382
Drolet and Sankoff’s k-state quadratic invariants 385
Clock invariants 385
General methods for finding invariants 386
　Fourier transform methods 386
　Gröbner bases and other general methods 387
　Expressions for all the 3ST invariants 387
　Finding all invariants empirically 387
　All linear invariants 388
　Special cases and extensions 389
Invariants and evolutionary rates 389
Testing invariants 389
What use are invariants? 390
23. Brownian motion and gene frequencies 391
Brownian motion 391
Likelihood for a phylogeny 392
What likelihood to compute? 395
　Assuming a clock 399
　The REML approach 400
Multiple characters and Kronecker products 402
Pruning the likelihood 404
Maximizing the likelihood 406
Inferring ancestral states 408
　Squared-change parsimony 409
Gene frequencies and Brownian motion 410
　Using approximate Brownian motion 411
　Distances from gene frequencies 412
　A more exact likelihood method 413
　Gene frequency parsimony 413
24. Quantitative characters 415
Neutral models of quantitative characters 416
Changes due to natural selection 419
　Selective correlation 419
　Covariances of multiple characters in multiple lineages 420
　Selection for an optimum 420
　Brownian motion and selection 422
Correcting for correlations 422
Punctuational models 424
Inferring phylogenies and correlations 425
Chasing a common optimum 426
The character-coding “problem” 426
Continuous-character parsimony methods 428
　Manhattan metric parsimony 428
　Other parsimony methods 429
Threshold models 429
25. Comparative methods 432
An example with discrete states 432
An example with continuous characters 433
The contrasts method 435
Correlations between characters 436
When the tree is not completely known 437
Inferring change in a branch 438
Sampling error 439
The standard regression and other variations 442
　Generalized least squares 442
　Phylogenetic autocorrelation 442
　Transformations of time 442
　Should we use the phylogeny at all? 443
Paired-lineage tests 443
Discrete characters 444
　Ridley’s method 444
　Concentrated-changes tests 445
　A paired-lineages test 446
　Methods using likelihood 446
　Advantages of the likelihood approach 448
Molecular applications 448
26. Coalescent trees 450
Kingman’s coalescent 454
Bugs in a box - an analogy 460
Effect of varying population size 460
Migration 461
Effect of recombination 464
Coalescents and natural selection 467
　Neuhauser and Krone’s method 468
27. Likelihood calculations on coalescents 470
The basic equation 470
Using accurate genealogies - a reverie 471
Two random sampling methods 473
　A Metropolis-Hastings method 473
　Griffiths and Tavaré’s method 476
Bayesian methods 482
　MCMC for a variety of coalescent models 482
Single-tree methods 484
　Slatkin and Maddison’s method 484
　Fu’s method 484
Summary-statistic methods 485
　Watterson’s method 485
　Other summary-statistic methods 486
　Testing for recombination 486
28. Coalescents and species trees 488
Methods of inferring the species phylogeny 490
　Reconciled tree parsimony approaches 492
　Likelihood 493
29. Alignment, gene families, and genomics 496
Alignment 497
　Why phylogenies are important 497
Parsimony method 497
　Approximations and progressive alignment 500
Probabilistic models 502
　Bishop and Thompson’s method 502
　The minimum message length method 502
　The TKF model 503
　Multibase insertions and deletions 506
　Tree HMMs 507
　Trees 507
　Inferring the alignment 509
Gene families 509
　Reconciled trees 509
　Reconstructing duplications 511
　Rooting unrooted trees 512
　A likelihood analysis 514
Comparative genomics 515
　Tandemly repeated genes 515
　Inversions 516
　Inversions in trees 516
　Inversions, transpositions, and translocations 516
　Breakpoint and neighbor-coding approximations 517
　Synteny 517
　Probabilistic models 518
Genome signature methods 519
30. Consensus trees and distances between trees 521
Consensus trees 521
　Strict consensus 521
　Majority-rule consensus 523
　Adams consensus tree 524
A dismaying result 525
　Consensus using branch lengths 526
　Other consensus tree methods 526
　Consensus subtrees 528
Distances between trees 528
　The symmetric difference 528
　The quartets distance 530
　The nearest-neighbor interchange distance 530
　The path-length-difference metric 531
　Distances using branch lengths 531
　Are these distances truly distances? 533
　Consensus trees and distances 534
　Trees significantly the same? different? 534
What do consensus trees and tree distances tell us? 535
　The total evidence debate 536
　A modest proposal 537
31. Biogeography, hosts, and parasites 539
Component compatibility 540
Brooks parsimony 541
Event-based parsimony methods 543
　Relation to tree reconciliation 545
Randomization tests 545
Statistical inference 546
32. Phylogenies and paleontology 547
Stratigraphic indices 548
Stratophenetics 549
Stratocladistics 549
Controversies 552
A not-quite-likelihood method 553
Stratolikelihood 553
　Making a full likelihood method 554
　More realistic fossilization models 554
Fossils within species: Sequential sampling 555
Between species 555
33. Tests based on tree shape 559
Using the topology only 559
　Imbalance at the root 560
Harding’s probabilities of tree shapes 561
Tests from shapes 562
　Measures of overall asymmetry 563
　Choosing a powerful test 564
Tests using times 564
　Lineage plots 565
　Likelihood formulas 567
　Other likelihood approaches 569
　Other statistical approaches 569
　A time transformation 570
Characters and key innovations 571
Work remaining 571
34. Drawing trees 573
Issues in drawing rooted trees 574
　Placement of interior nodes 574
　Shapes of lineages 576
Unrooted trees 578
　The equal-angle algorithm 578
　n-Body algorithms 580
　The equal-daylight algorithm 582
Challenges 584
35. Phylogeny software 585
Trees, records, and pointers 585
Declaring records 586
Traversing the tree 587
Unrooted tree data structures 589
Tree file formats 590
Widely used phylogeny programs and packages 591

References 595
Index 644