Automated Machine Learning (AutoML): Methods, Systems, and Challenges

Authors: Frank Hutter, Lars Kotthoff, Joaquin Vanschoren; translated by He Ming (何明) and Liu Qi (劉淇)
Publisher: Tsinghua University Press
Publication date: 2020-11-01
ISBN-13: 9787302552550
ISBN-10: 730255255X
Binding: Paperback
Pages: 256





Description


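This book gives a comprehensive introduction to automated machine learning (AutoML). It covers AutoML methods,
AutoML systems available for practical use, and the challenges the field currently faces.
On the methods side, the book covers three topics: hyperparameter optimization, meta-learning, and neural architecture search.
Each topic comes with a detailed introduction, an explanation of the underlying principles, concrete usage, and open problems.
The book also describes existing AutoML systems
such as Auto-sklearn, Auto-WEKA, and Auto-Net.
Its final chapter details representative AutoML challenges and the ideas behind their results,
which helps practitioners design their own AutoML systems.
The English edition was the first English-language book on automated machine learning published internationally. It is comprehensive and detailed,
and, importantly, it covers the latest advances and difficulties in the AutoML field.
The authors and translators have solid academic backgrounds, which underpins the quality of the content.
For researchers new to the field, the book can serve as background and a starting point for studying AutoML methods;
for industry practitioners, it gives a full account of AutoML systems and the key points of applying them in practice;
for researchers already working on AutoML, it offers an overview of the latest research results and progress.
Overall, the book addresses a broad readership: it works both as an introduction and as a reference for specialists.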


Table of Contents


Part I: AutoML Methods
Chapter 1 Hyperparameter Optimization
1.1 Introduction
1.2 Problem Statement
1.2.1 Alternatives to Optimization: Ensembling and Marginalization
1.2.2 Multi-Objective Optimization
1.3 Blackbox Hyperparameter Optimization
1.3.1 Model-Free Blackbox Optimization Methods
1.3.2 Bayesian Optimization
1.4 Multi-Fidelity Optimization
1.4.1 Early Stopping Based on Learning Curve Prediction
1.4.2 Bandit-Based Selection Methods
1.4.3 Adaptive Choice of Fidelities
1.5 Applications to AutoML
1.6 Discussion and Outlook
1.6.1 Benchmarks and Baselines
1.6.2 Gradient-Based Optimization
1.6.3 Scalability
1.6.4 Overfitting and Generalization
1.6.5 Arbitrary-Size Pipeline Construction
References
Chapter 2 Meta-Learning
2.1 Introduction
2.2 Learning from Model Evaluations
2.2.1 Task-Independent Recommendations
2.2.2 Configuration Space Design
2.2.3 Configuration Transfer
2.2.4 Learning Curves
2.3 Learning from Task Properties
2.3.1 Meta-Features
2.3.2 Learning Meta-Features
2.3.3 Warm-Starting Optimization from Similar Tasks
2.3.4 Meta-Models
2.3.5 Pipeline Synthesis
2.3.6 To Tune or Not to Tune
2.4 Learning from Prior Models
2.4.1 Transfer Learning
2.4.2 Meta-Learning for Neural Networks
2.4.3 Few-Shot Learning
2.4.4 Beyond Supervised Learning
2.5 Conclusion
References
Chapter 3 Neural Architecture Search
3.1 Introduction
3.2 Search Space
3.3 Search Strategy
3.4 Performance Estimation Strategy
3.5 Future Directions
References
Part II: AutoML Systems
Chapter 4 Auto-WEKA
4.1 Introduction
4.2 Preliminaries
4.2.1 Model Selection
4.2.2 Hyperparameter Optimization
4.3 Combined Algorithm Selection and Hyperparameter Optimization (CASH)
4.4 Auto-WEKA
4.5 Experimental Evaluation
4.5.1 Baseline Methods
4.5.2 Cross-Validation Performance
4.5.3 Test Performance
4.6 Conclusion
References
Chapter 5 Hyperopt-sklearn
5.1 Introduction
5.2 Hyperopt Background
5.3 Scikit-Learn Model Selection
5.4 Usage Examples
5.5 Experiments
5.6 Discussion and Outlook
5.7 Conclusion
References
Chapter 6 Auto-sklearn
6.1 Introduction
6.2 The CASH Problem
6.3 Improvements
6.3.1 Meta-Learning Step
6.3.2 Automated Ensemble Construction
6.4 The Auto-sklearn System
6.5 Comparative Experiments with Auto-sklearn
6.6 Evaluation of the Auto-sklearn Improvements
6.7 Detailed Analysis of Auto-sklearn Components
6.8 Discussion and Conclusion
6.8.1 Discussion
6.8.2 Usage Example
6.8.3 Extensions of Auto-sklearn
6.8.4 Conclusion and Outlook
References
Chapter 7 Auto-Net
7.1 Introduction
7.2 Auto-Net 1.0
7.3 Auto-Net 2.0
7.4 Experiments
7.4.1 Baseline Evaluation
7.4.2 Performance in the AutoML Challenge
7.4.3 Comparing Auto-Net 1.0 and Auto-Net 2.0
7.5 Conclusion
References
Chapter 8 TPOT
8.1 Introduction
8.2 Methods
8.2.1 Machine Learning Pipeline Operators
8.2.2 Constructing Tree-Based Pipelines
8.2.3 Optimizing Tree-Based Pipelines
8.2.4 Benchmark Data
8.3 Experimental Results
8.4 Conclusion and Outlook
References
Chapter 9 The Automatic Statistician
9.1 Introduction
9.2 Basic Structure of the Automatic Statistician Project
9.3 An Automatic Statistician for Time Series Data
9.3.1 The Grammar over Kernels
9.3.2 The Search and Evaluation Procedure
9.3.3 Generating Natural-Language Descriptions
9.3.4 Comparison with Humans
9.4 Other Automatic Statistician Systems
9.4.1 Core Components
9.4.2 Design Challenges
9.5 Conclusion
References
Part III: AutoML Challenges
Chapter 10 Analysis of the AutoML Challenges
10.1 Introduction
10.2 Problem Formalization and Overview
10.2.1 Scope of the Problem
10.2.2 Full Model Selection
10.2.3 Hyperparameter Optimization
10.2.4 Model Search Strategies
10.3 Data
10.4 Challenge Protocol
10.4.1 Time Budget and Computational Resources
10.4.2 Scoring Metrics
10.4.3 Rounds and Phases in the 2015/2016 Challenge
10.4.4 Phases in the 2018 Challenge
10.5 Results
10.5.1 Scores in the 2015/2016 Challenge
10.5.2 Scores in the 2018 Challenge
10.5.3 Difficulty of Datasets/Tasks
10.5.4 Hyperparameter Optimization
10.5.5 Meta-Learning
10.5.6 Methods Used in the Challenges
10.6 Discussion
10.7 Conclusion
References


About the Authors


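Frank Hutter
Professor at the University of Freiburg, Germany, and head of its Machine Learning Lab.
His research focuses on statistical machine learning, knowledge representation, and automated machine learning and its applications.
He was world champion in the first (2015/2016) and second (2018/2019) AutoML competitions.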

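Lars Kotthoff
Assistant Professor at the University of Wyoming, USA.
His research focuses on deep learning and automated machine learning.
He works on building state-of-the-art, robust machine learning systems
and leads the development and maintenance of the Auto-WEKA project.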

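Joaquin Vanschoren
Assistant Professor at Eindhoven University of Technology, the Netherlands.
His research focuses on the progressive automation of machine learning. He founded OpenML.org, an open-source platform for sharing data,
and has received the Microsoft Azure Research Award and the Amazon Research Award.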
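
About the Translators

He Ming (何明)
Holds a Ph.D. from the University of Science and Technology of China. He is currently a postdoctoral researcher in electronic science and technology at Shanghai Jiao Tong University
and an artificial intelligence algorithm researcher at the data middle platform of TAL Education Group (好未來教育集團).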

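Liu Qi (劉淇)
Distinguished Professor (特任教授) and doctoral supervisor at the School of Computer Science, University of Science and Technology of China.
He is a member of the China Computer Federation's Big Data expert committee and of the Machine Learning technical committee of the Chinese Association for Artificial Intelligence.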



