Why does the "parquetwrite" function yield an error when used to write a single-column table containing heterogeneous primitive types in MATLAB R2023a?

I am trying to write a single-column table of mixed primitive types, each wrapped in a cell, to a Parquet file. Since every element of the table is still a "cell", my understanding is that this should work. My code is as follows:
cellTable = table({1, [1,2,3], "hello", ["hi", "bye"]}')
parquetwrite(filename, cellTable)
On running the above code, I get this error message:
Error using parquetwrite
T.Var1{3} is a string array. Based on T.Var1{1}, expected either a double array or a scalar <missing> value.
Is this behavior expected?

Accepted Answer

MathWorks Support Team on 16 Aug 2023
The "parquetwrite" function is working as intended.
It is not possible to write mixed primitive types in one table variable to a Parquet column. Parquet columns are strongly typed, so heterogeneous primitive data cannot be written to a single column. When you use cell arrays, we map them to the Parquet LIST type, which is represented by two arrays in Parquet: a data array and an index array. The index array tells you how to partition the data array into rows.
For instance, if you have this cell array:
>> cellArray =
  3×1 cell array
    {[    1]}
    {[2 3 4]}
    {[  5 6]}
This gets mapped to a Parquet LIST column with this data array and index array:
>> data = [1 2 3 4 5 6]
>> index = [0 1 4 6] % note: Arrow uses 0-based indexing
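As a rough illustration of that mapping, the sketch below flattens the same cell array into a data array and a 0-based index array, then rebuilds the rows from the two arrays. It only mimics the layout described here in plain MATLAB; it is not how parquetwrite is implemented, and the variable names lengths and rebuilt are made up for the example.

```matlab
% Illustrative only: mimic the LIST layout (data array + 0-based index array).
cellArray = {1; [2 3 4]; [5 6]};

% Concatenate every row into one contiguous array of a single class.
data = [cellArray{:}];

% The index array holds the cumulative row lengths, starting at 0.
lengths = cellfun(@numel, cellArray)';
index = [0 cumsum(lengths)];

% Row k occupies data(index(k)+1 : index(k+1)) in MATLAB's 1-based indexing.
rebuilt = cell(numel(index) - 1, 1);
for k = 1:numel(rebuilt)
    rebuilt{k} = data(index(k) + 1 : index(k + 1));
end

disp(isequal(rebuilt, cellArray))
```

Because data is a single contiguous array, every row has to share one element type. That is the constraint the error message reports.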
In other words, the data is stored in a contiguous array, so it cannot contain mixed primitive types. This is why we cannot write a cell array containing both doubles and strings to a Parquet column.
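One possible workaround, sketched here as an assumption rather than an official recommendation, is to convert every cell to a string array so that the column holds a single primitive type. The sketch assumes that parquetwrite accepts a cell array of string arrays in the same way it accepts a cell array of double arrays; the file name and the variable names mixed, asStrings, and stringTable are hypothetical.

```matlab
% Assumption: a cell column of string arrays can be written as a LIST of strings.
mixed = {1, [1,2,3], "hello", ["hi", "bye"]}';

% Convert each cell to a string array so all rows share one element type.
asStrings = cellfun(@string, mixed, 'UniformOutput', false);

stringTable = table(asStrings, 'VariableNames', {'Var1'});

% Hypothetical output location for this example.
filename = fullfile(tempdir, "stringTable.parquet");
parquetwrite(filename, stringTable)
```

The trade-off is that numeric values are stored as text, so a reader has to convert them back (for example with str2double) after loading the file.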
