Hive bucketing and partition for existing table

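Problem description

Is it possible to create bucketing and partitioning for a table that already contains data? I have a table in Hive with more than 100M records, and I want to create a partition on it. I also need to create buckets.

Is that possible?

Thanks, Bala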

Reference solution

Method 1:

No, it is not possible to alter the bucketing or partitioning of a table that is already loaded. You may have to create a new table with the required bucketing and partitioning properties and then load it from the old table.

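-- Make inserts honor the target table's bucket definition.
SET hive.enforce.bucketing = true;
-- Copy every row from the old table into the new bucketed, partitioned table.
FROM old_table INSERT INTO TABLE new_bucketed_partitioned_table SELECT *;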

(by Venkadesh Venkat, Aman Mundra)
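
The answer above does not show how the new table is defined, so here is a minimal sketch of the whole copy, not the answerers' own code. The columns (id, name, event_date), the bucket count of 32, and the ORC storage format are all hypothetical; only the two table names come from the answer. It assumes the partition column already exists as a regular column in old_table and that dynamic partitioning is acceptable for the load.

```sql
-- Hypothetical schema: column names, bucket count, and file format are assumptions.
CREATE TABLE new_bucketed_partitioned_table (
  id   BIGINT,
  name STRING
)
PARTITIONED BY (event_date STRING)      -- partition column is declared here, not in the column list
CLUSTERED BY (id) INTO 32 BUCKETS       -- rows are hashed on id into 32 bucket files per partition
STORED AS ORC;

-- Let Hive derive the partition value of each row from the data itself.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
-- Relevant on older Hive releases; as far as I know, Hive 2.0 and later always enforce bucketing.
SET hive.enforce.bucketing = true;

-- The dynamic partition column must come last in the SELECT list.
INSERT INTO TABLE new_bucketed_partitioned_table PARTITION (event_date)
SELECT id, name, event_date
FROM old_table;
```

Listing the columns explicitly instead of using SELECT * keeps the partition column in the last position regardless of the column order in old_table. Once the row counts of both tables match, the old table can be dropped or renamed.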

References

Hive bucketing and partition for existing table (CC BY‑SA 2.5/3.0/4.0)

