Problem description
Why does my println in rdd print the string of elements?
When I try to print the contents of my RDD, it prints something like the output below. How can I print the actual contents? Thanks!
scala> lines
res15: org.apache.spark.rdd.RDD[Array[String]] = MapPartitionsRDD[3] at filter at <console>:23
scala> lines.take(5).foreach(println)
[Ljava.lang.String;@6d3db5d1
[Ljava.lang.String;@6e6be45e
[Ljava.lang.String;@6d5e0ff4
[Ljava.lang.String;@3a699444
[Ljava.lang.String;@69851a51
Reference solutions
Method 1:
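This happens because println uses the toString implementation of the object it is given. Array does not override toString, so it falls back to the default from java.lang.Object, which prints the type and hash code. If you convert the array to a List, the output is more readable because List's toString implementation prints the elements: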
scala> println(Array("foo"))
[Ljava.lang.String;@HASH
scala> println(Array("foo").toList)
List(foo)
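Applied to the question, the same idea can be sketched without a Spark session. The block below is a minimal illustration, not the answerer's code: the object name PrintRows, the helper render, and the sample values are hypothetical stand-ins for what lines.take(5) returns. It uses mkString, which joins the elements with a separator, as an alternative to toList.

```scala
object PrintRows {
  // Hypothetical stand-in for lines.take(5), which returns an Array[Array[String]].
  val sample: Array[Array[String]] = Array(Array("a", "b"), Array("c", "d"))

  // Join the elements of one inner array into a single readable line.
  def render(row: Array[String]): String = row.mkString(", ")

  def main(args: Array[String]): Unit = {
    // Prints "a, b" and then "c, d", one row per line.
    sample.foreach(row => println(render(row)))
    // Prints "List(a, b)" and then "List(c, d)", as in Method 1.
    sample.foreach(row => println(row.toList))
  }
}
```

In the Spark shell the equivalent line would presumably be lines.take(5).foreach(row => println(row.mkString(", "))).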
Method 2:
Depending on how you want the output to look, you can change the line that prints the elements so that each element of each array is printed on its own line:
scala> lines.take(5).foreach(indvArray => indvArray.foreach(println))
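To make the effect of the nested foreach concrete, here is a small sketch on made-up data. The object name PrintElements, the helper elements, and the sample values are assumptions for illustration only. Note that this approach prints every string separately, so the boundaries between the original arrays are no longer visible in the output.

```scala
object PrintElements {
  // Hypothetical stand-in for lines.take(5).
  val sample: Array[Array[String]] = Array(Array("a", "b"), Array("c"))

  // Collects the elements in the same order the nested foreach visits them.
  def elements(rows: Array[Array[String]]): Seq[String] =
    rows.toSeq.flatMap(_.toSeq)

  def main(args: Array[String]): Unit = {
    // Prints "a", "b" and "c", each on its own line.
    sample.foreach(indvArray => indvArray.foreach(println))
  }
}
```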
(by user3180835, Justin Pihony, Greg)