I'm working in Spark and, to employ the Matrix class of the Jama library, I need to convert the content of a spark.sql.DataFrame to a 2D array, i.e., Array[Array[Double]].
While I've found quite a few solutions for converting a single column of a DataFrame to an array, I don't understand how to
- transform an entire DataFrame into a 2D array (that is, an array of arrays);
- cast its content from long to Double while doing so.
The reason for that is that I need to load the content of a dataframe into a Jama matrix, which requires a 2D array of Doubles as input:
val matrix_transport = new Matrix(df_transport)
<console>:83: error: type mismatch;
found : org.apache.spark.sql.DataFrame
(which expands to) org.apache.spark.sql.Dataset[org.apache.spark.sql.Row]
required: Array[Array[Double]]
val matrix_transport = new Matrix(df_transport)
EDIT: for completeness, the df schema is:
df_transport.printSchema
root
|-- 1_51501_19962: long (nullable = true)
|-- 1_51501_26708: long (nullable = true)
|-- 1_51501_36708: long (nullable = true)
|-- 1_51501_6708: long (nullable = true)
...
with 165 columns, all of type long.
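For illustration, here is a minimal sketch of one way the two steps (casting every column to Double, then collecting the rows into an Array[Array[Double]]) might be combined. It rests on assumptions that are not part of the question: a local SparkSession, a small toy DataFrame named `df` standing in for df_transport, no null values in any column, and data small enough to fit in driver memory, since collect() brings every row to the driver.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.DoubleType

// Assumption: a local session for the sketch; in spark-shell, getOrCreate()
// simply returns the session that already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("df-to-2d-array")
  .getOrCreate()
import spark.implicits._

// Hypothetical stand-in for df_transport: a few columns of type long.
val df = Seq((1L, 2L, 3L), (4L, 5L, 6L)).toDF("c1", "c2", "c3")

// Step 1: cast every column from long to double in a single select.
val dfDouble = df.select(df.columns.map(name => col(name).cast(DoubleType)): _*)

// Step 2: collect the rows to the driver and turn each Row into an Array[Double].
// Row.getDouble throws on a null cell, so this assumes the data contains no nulls.
val data: Array[Array[Double]] =
  dfDouble.collect().map(row => Array.tabulate(row.length)(i => row.getDouble(i)))

// With Jama on the classpath, the array could then be handed to the constructor:
// val matrix_transport = new Jama.Matrix(data)
```

Note that, without an explicit sort, Spark does not in general guarantee the order in which collect() returns rows, so if the row order matters for the matrix, an ordering column would be needed.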