

How to create a matrix based on a CSV file?

Topics related to the set of Machine Learning libraries and Matrix processing algorithms

Wed Apr 11, 2012 1:25 am

Hello,

I'd like to read a matrix from a CSV file.

CSV format
row, col, value =>
1, 2, 10
1, 3, 14
...

However, I don't know how to define a record for the file and convert it to the matrix format used in ML.

Thank you.
jhr1021
 
Posts: 1
Joined: Wed Apr 04, 2012 3:24 pm

Wed Apr 11, 2012 1:26 pm

jhr,

There are two ECL macros that handle the conversion to and from the internal matrix representation for you.

For example, to create a matrix with 4 columns and a REAL in each cell, you could define a record layout like the one below, where the first field contains a row ID and each subsequent field contains a value. (I use generic column names for clarity in this example, but you should probably use more descriptive names.)

MyRecordLayout := RECORD
UNSIGNED RowId;
REAL column1;
REAL column2;
REAL column3;
REAL column4;
END;

Then use that record layout for your dataset, as in the inline dataset definition below (you could load a CSV file from the filesystem instead):

X2 := DATASET([
{1, 1, 5, 2.4, 5.2},
{2, 5, 7, 9.7, 1.4},
{3, 8, 1, 3.3, 6.1},
{4, 5, 2, 9.5, 3.2},
{5, 9, 3, 8.9, 1.7},
{6, 1, 4, 1.1, 2.8},
{7, 9, 4, 2.4, 6.8}], MyRecordLayout);
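
If the data lives in a CSV file rather than inline, the same record layout can be attached to a file-based DATASET. The sketch below is only an illustration: the logical file name '~example::x2_csv' is hypothetical, and it assumes the file has already been sprayed to the cluster, is comma-separated, and starts with one header line. Adjust the CSV options to match your file.

```ecl
// Same layout as above: a row ID followed by one REAL per column.
MyRecordLayout := RECORD
  UNSIGNED RowId;
  REAL column1;
  REAL column2;
  REAL column3;
  REAL column4;
END;

// '~example::x2_csv' is a hypothetical logical file name.
// HEADING(1) skips one header line; drop it if the file has no header.
X2 := DATASET('~example::x2_csv', MyRecordLayout,
              CSV(HEADING(1), SEPARATOR(',')));

OUTPUT(X2);
```

The resulting X2 can then be passed to ml.ToField() in the same way as the inline dataset.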

Finally, call ml.ToField() to convert your dataset into the internal matrix representation (the first parameter is the input dataset and the second is the name of the output dataset in the internal format):

ml.ToField(X2,fX2);

After processing, you can convert back from the internal format to your original record layout using ml.FromField() (fX3 is not defined in this example; it stands for the output of whatever processing step you ran):

ml.FromField(fX3.result(), MyRecordLayout, X3);

I hope this example is clear, but please chime in if you need more help.

Thanks,

Flavio
flavio
Community Advisory Board Member
 
Posts: 73
Joined: Wed Apr 27, 2011 8:59 pm

Wed Apr 11, 2012 3:35 pm

Good news - your data is already in the right format! If you look in the Types attribute of the MAT sub-directory you will find the matrix format.

EXPORT Element := RECORD 
   t_Index x; // X is rows
   t_Index y; // Y is columns   
   t_value value;
END;


You should be able to read your data in directly to that format using the CSV attribute on a DATASET statement.
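
A minimal sketch of that approach follows. It assumes the ML library is importable as ML, so that the record above is reachable as ML.Mat.Types.Element, and it uses a hypothetical logical file name, '~example::matrix_csv', for a comma-separated file that has already been sprayed to the cluster.

```ecl
IMPORT ML;

// Each CSV line (row, col, value) maps onto the Element fields x, y, value.
// '~example::matrix_csv' is a hypothetical logical file name.
// If the file starts with a header line, add HEADING(1) to the CSV options.
M := DATASET('~example::matrix_csv', ML.Mat.Types.Element,
             CSV(SEPARATOR(',')));

OUTPUT(M);
```

Because the file is already in sparse (row, column, value) form, no ToField conversion should be needed for this route.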

dabayliss
Community Advisory Board Member
 
Posts: 109
Joined: Fri Apr 29, 2011 1:35 pm

