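Machine Learning: A Bayesian and Optimization Perspective (English edition)
Description
This book offers an in-depth exploration of all the major machine learning methods and recent research trends, covering probabilistic and deterministic approaches as well as Bayesian inference. The classical methods covered include mean-square and least-squares filtering, Kalman filtering, stochastic approximation and online learning, Bayesian classification, decision trees, logistic regression, and boosting. The newer trends include sparsity, convex analysis and optimization, online distributed algorithms, learning in reproducing kernel Hilbert spaces (RKHS), Bayesian inference, graphical models and hidden Markov models, particle filtering, deep learning, dictionary learning, and latent variable modeling.
The book builds a clear, coherent body of machine learning knowledge. The chapters are relatively self-contained, and the physical reasoning, mathematical modeling, and algorithmic implementation are treated precisely and in detail, supported by application examples and exercises. It is suitable for researchers and engineers in the field, and as a reference for students taking courses in pattern recognition, statistical/adaptive signal processing, and deep learning.
Table of Contents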
Preface
Acknowledgments
Notation
CHAPTER 1 Introduction
1.1 What Machine Learning is About
1.1.1 Classification
1.1.2 Regression
1.2 Structure and a Road Map of the Book
References
CHAPTER 2 Probability and Stochastic Processes
2.1 Introduction
2.2 Probability and Random Variables
2.2.1 Probability
2.2.2 Discrete Random Variables
2.2.3 Continuous Random Variables
2.2.4 Mean and Variance
2.2.5 Transformation of Random Variables
2.3 Examples of Distributions
2.3.1 Discrete Variables
2.3.2 Continuous Variables
2.4 Stochastic Processes
2.4.1 First and Second Order Statistics
2.4.2 Stationarity and Ergodicity
2.4.3 Power Spectral Density
2.4.4 Autoregressive Models
2.5 Information Theory
2.5.1 Discrete Random Variables
2.5.2 Continuous Random Variables
2.6 Stochastic Convergence
Problems
References
CHAPTER 3 Learning in Parametric Modeling: Basic Concepts and Directions
3.1 Introduction
3.2 Parameter Estimation: The Deterministic Point of View
3.3 Linear Regression
3.4 Classification
3.5 Biased Versus Unbiased Estimation
3.5.1 Biased or Unbiased Estimation?
3.6 The Cramér-Rao Lower Bound
3.7 Sufficient Statistic
3.8 Regularization
3.9 The Bias-Variance Dilemma
3.9.1 Mean-Square Error Estimation
3.9.2 Bias-Variance Tradeoff
3.10 Maximum Likelihood Method
3.10.1 Linear Regression: The Nonwhite Gaussian Noise Case
3.11 Bayesian Inference
3.11.1 The Maximum a Posteriori Probability Estimation Method
3.12 Curse of Dimensionality
3.13 Validation
3.14 Expected and Empirical Loss Functions
3.15 Nonparametric Modeling and Estimation
Problems
References
CHAPTER 4 Mean-Square Error Linear Estimation
4.1 Introduction
4.2 Mean-Square Error Linear Estimation: The Normal Equations
4.2.1 The Cost Function Surface
4.3 A Geometric Viewpoint: Orthogonality Condition
4.4 Extension to Complex-Valued Variables
4.4.1 Widely Linear Complex-Valued Estimation
4.4.2 Optimizing with Respect to Complex-Valued Variables: Wirtinger Calculus
4.5 Linear Filtering
4.6 MSE Linear Filtering: A Frequency Domain Point of View
4.7 Some Typical Applications
4.7.1 Interference Cancellation
4.7.2 System Identification
4.7.3 Deconvolution: Channel Equalization
4.8 Algorithmic Aspects: The Levinson and the Lattice-Ladder Algorithms
4.8.1 The Lattice-Ladder Scheme
4.9 Mean-Square Error Estimation of Linear Models
4.9.1 The Gauss-Markov Theorem
4.9.2 Constrained Linear Estimation: The Beamforming Case
4.10 Time-Varying Statistics: Kalman Filtering
Problems
References
CHAPTER 5 Stochastic Gradient Descent: The LMS Algorithm and its Family
5.1 Introduction
5.2 The Steepest Descent Method
5.3 Application to the Mean-Square Error Cost Function
5.3.1 The Complex-Valued Case
5.4 Stochastic Approximation
5.5 The Least-Mean-Squares Adaptive Algorithm
5.5.1 Convergence and Steady-State Performance of the LMS in Stationary Environments
5.5.2 Cumulative Loss Bounds
5.6 The Affine Projection Algorithm
5.6.1 The Normalized LMS
5.7 The Complex-Valued Case
5.8 Relatives of the LMS
5.9 Simulation Examples
5.10 Adaptive Decision Feedback Equalization
5.11 The Linearly Constrained LMS
5.12 Tracking Performance of the LMS in Nonstationary Environments
5.13 Distributed Learning: The Distributed LMS
5.13.1 Cooperation Strategies
5.13.2 The Diffusion LMS
5.13.3 Convergence and Steady-State Performance: Some Highlights
5.13.4 Consensus-Based Distributed Schemes
5.14 A Case Study: Target Localization
5.15 Some Concluding Remarks: Consensus Matrix
Problems
References
CHAPTER 6 The Least-Squares Family
6.1 Introduction
6.2 Least-Squares Linear Regression: A Geometric Perspective
6.3 Statistical Properties of the LS Estimator
6.4 Orthogonalizing the Column Space of X: The SVD Method
6.5 Ridge Regression
6.6 The Recursive Least-Squares Algorithm
6.7 Newton's Iterative Minimization Method
6.7.1 RLS and Newton's Method
6.8 Steady-State Performance of the RLS
6.9 Complex-Valued Data: The Widely Linear RLS
6.10 Computational Aspects of the LS Solution
6.11 The Coordinate and Cyclic Coordinate Descent Methods
6.12 Simulation Examples
6.13 Total-Least-Squares
Problems
References
……
CHAPTER 7 Classification: A Tour of the Classics
CHAPTER 8 Parameter Learning: A Convex Analytic Path
CHAPTER 9 Sparsity-Aware Learning: Concepts and Theoretical Foundations
CHAPTER 10 Sparsity-Aware Learning: Algorithms and Applications
CHAPTER 11 Learning in Reproducing Kernel Hilbert Spaces
CHAPTER 12 Bayesian Learning: Inference and the EM Algorithm
CHAPTER 13 Bayesian Learning: Approximate Inference and Nonparametric Models
CHAPTER 14 Monte Carlo Methods
CHAPTER 15 Probabilistic Graphical Models: Part I
CHAPTER 16 Probabilistic Graphical Models: Part II
CHAPTER 17 Particle Filtering
CHAPTER 18 Neural Networks and Deep Learning
CHAPTER 19 Dimensionality Reduction
APPENDIX A Linear Algebra
APPENDIX B Probability Theory and Statistics
APPENDIX C Hints on Constrained Optimization
Index
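About the Authors
Sergios Theodoridis is a professor in the Department of Informatics at the University of Athens, Greece. His main research interests are adaptive signal processing, communications, and pattern recognition. He served as chair of PARLE-95 (Parallel Architectures and Languages Europe) and general chair of EUSIPCO-98 (European Signal Processing Conference), and is a member of the editorial board of the journal Signal Processing.
Konstantinos Koutroumbas received his PhD from the University of Athens, Greece, in 1995. Since 2001 he has been with the Institute for Space Applications of the National Observatory of Athens, Greece, and is an internationally recognized expert.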