

Spray CSV for a tab delimited file

Questions around writing code and queries

Mon Jul 25, 2011 9:08 pm

I'm spraying a tab-delimited file. On the [Spray CSV] page I select "\t" in the Separator field to indicate that the file is delimited by tabs. I then use the file in ECL with this code:

Code:
m_Dataset := DATASET(m_Name, m_Format, CSV(HEADING(1)));


My test file has 26 fields, and the above code splits them properly (I see data in all 26 columns of the Result pane) until I perform an operation that iterates over the entire file. At that point it seems to revert to comma-delimited splitting: the entire line is dumped into columns 1 through 3, depending on how many commas were found.

When I explicitly define the separator in ECL code it does split the fields as expected:
Code:
m_Dataset := DATASET(m_Name, m_Format, CSV(SEPARATOR('\t')));


Is this user error or a bug or just one of those things?

Thanks!

[edit]
Sorry, I read this again and realized I need to define my question a little better:

Must I explicitly define the dataset using CSV(SEPARATOR('\t')) in ECL when the Spray CSV operation already has "\t" in the Separator field?

Just curious, thanks again!
aintnomyth
 
Posts: 86
Joined: Wed Jul 13, 2011 7:40 pm

Tue Jul 26, 2011 3:45 pm

Must I explicitly define the dataset using CSV(SEPARATOR('\t')) in ECL when the Spray CSV operation already has "\t" in the Separator field?


In a word, Yes.

Spray is an operation that only gets the data file into the system so you may use it. It is part of the Distributed File Utility -- part of the infrastructure of the HPCC. The Spray CSV page can spray -any- variable-length data file (with a record delimiter), not just field-delimited files like CSV or tab-delimited.

The ECL definition of the file needs to define the file as it is on disk. That's why the DATASET must specify the SEPARATOR if it is anything other than the default comma.
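As a minimal sketch of the explicit form, the two CSV options from the question can be combined so that the header row is still skipped while the tab separator is stated in the code. The three-field layout and the logical file name below are hypothetical placeholders (the file discussed in this thread has 26 fields); only m_Name, m_Format and m_Dataset come from the original post.

```ecl
// Hypothetical record layout: stand-in for the real 26-field format.
m_Format := RECORD
    STRING field1;
    STRING field2;
    STRING field3;
END;

// Hypothetical logical file name: replace with the name used at spray time.
m_Name := '~test::tab_delimited_file';

// HEADING(1) skips the first (header) line of the file;
// SEPARATOR('\t') declares the tab delimiter explicitly instead of
// relying on the default comma.
m_Dataset := DATASET(m_Name, m_Format,
                     CSV(HEADING(1), SEPARATOR('\t')));

OUTPUT(m_Dataset);
```

Stating the separator in the DATASET definition keeps the code independent of whatever settings were recorded when the file was sprayed.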

HTH,

Richard
richard.taylor@lexisnexis.com
 
Posts: 11
Joined: Wed Jun 15, 2011 6:00 pm

Thu Jul 28, 2011 8:21 am

In a word, No (!)

If the separator/quote/terminator are not specified in the ECL and the file was sprayed, then it should pick up the settings from the spray.

It looks like a spelling mistake in the tag name (seperate vs. separate) may be causing the problem. I'll open a bug.
ghalliday
Community Advisory Board Member
 
Posts: 198
Joined: Wed May 18, 2011 9:48 am

Thu Jul 28, 2011 2:15 pm

Thanks for the replies, I'll specify the delimiter in ECL for now.
aintnomyth
 
Posts: 86
Joined: Wed Jul 13, 2011 7:40 pm

