Spraying Delimited CSV
I'm trying to spray a delimited file with integer values where null values equal to 0.
The "Spray Delimited" function on ECL Watch seems to be ignoring the delimiter and it seems to reading the integers as base 256 instead of base 10 and is having issues with variable length data values. I tried using a comma delimited file as well as a tab delimited file and had the same issues with both.
System Specs:
Spray Delimited Options:
HeaderFile/Layout_Header
HeaderFile/File_HeaderFile
Execute HeaderFile/File_HeaderFile
Re-uploaded and re-sprayed the file.
Re-executed HeaderFile/File_HeaderFile.
Output:
All values are incorrect and number of records has doubled.
Based on the values it seems as if each character is being imported as an integer based on its ASCII value (base 256).
Is this the correct interpretation of what is going on?
If so, is there a way to import integers without converting to base 256?
The "Spray Delimited" function on ECL Watch seems to be ignoring the delimiter and it seems to reading the integers as base 256 instead of base 10 and is having issues with variable length data values. I tried using a comma delimited file as well as a tab delimited file and had the same issues with both.
System Specs:
- Server
- Nodes: 1
- Processor: Intel Core i5 vPro
- RAM: 4gb
- Hard Drive: 64gb SSD
- OS: Ubuntu 13.10 Server 64 bit
- Server Platform: Community 4.2.2-1 for Ubuntu 13.10 64 bit
- Client Tools: Not Installed
- Graph Control: Not Installed
- Ganglia Monitoring: Not Installed
- Client
- Manufacturer: Hewlett-Packard
- Processor: Intel Core i7
- RAM: 8gb
- Hard Drive: 64gb SSD & 500gb HDD
- OS: Windows 7 64 bit
- ECL IDE: 4.2.2-1 for Windows x86
- Internet Browser: Google Chrome x86 Version 33.0.1750.154 m
- Code: Select all
35500555,2014,4,1,10,5,302009,0,02197881679,308,956
35904820,2014,4,1,9,1,352043,0,10152186187,404,0
10078085,2014,4,1,16,3,472017,54661600,10055782374,542,920
18701205,2014,4,1,14,1,235872,0,09440308519,748,0
32303558,2014,4,1,8,1,188447,0,0,523,0
Spray Delimited Options:
- Source
- Machine/dropzone: localhost/mydropzone
- IP Address: ***.***.***.***
- Local Path: /var/lib/HPCCSystems/mydropzone/Transactions_April2014_HeaderData.csv
- Format: ASCII
- Max Record Length: 8192
- Separators: \,
- No Separator: {FALSE}
- Escape: {NULL}
- Line Terminators: \n,\r\n
- Quote: '
- Destination
- Group: mythor
- Label: Transactions::April2014::HeaderData
- Mask: Transactions::April2014::HeaderData._$P$_of_$N$
- Prefix: {NULL}
- Options
- Overwrite: {TRUE}
- No Split: {FALSE}
- Compress: {FALSE}
- ID: D20140403-073627
- ClusterName: thor
- DFUServerName: mydfuserver
- Queue: dfuserver_queue
- Protected: {FALSE}
- Command: Spray
- TimeStarted: 2014-04-03 12:36:27
- TimeStopped: 2014-04-03 12:36:30
- PercentDone: View Progress
- ProgressMessage: 100% Done, 0 secs left (252/252bytes) [1/1nodes]
- SummaryMessage: Total time taken 2 secs
- State: finished
- SourceIP: ***.***.***.***
- SourceFilePath: /var/lib/HPCCSystems/mydropzone/Transactions.April2014.HeaderData.csv
- SourceFormat: Delimited
- SourceNumParts: 1
- SourceDirectory: /var/lib/HPCCSystems/mydropzone
- SourceSeparators: \,
- SourceTerminators: \n,\r\n
- SourceQuote: '
- DestLogicalName: transactions::april2014::headerdata
- DestGroupName: mythor
- DestDirectory: /var/lib/HPCCSystems/hpcc-data/thor/transactions/april2014/
- DestNumParts: 1
- MonitorSub: 0
- Overwrite: 1
- Replicate: 0
- Compress: 0
- AutoRefresh: 0
HeaderFile/Layout_Header
- Code: Select all
EXPORT Layout_Header := RECORD
INTEGER4 TransID;
INTEGER2 tYear;
INTEGER1 tMonth;
INTEGER1 tDay;
INTEGER1 StoreID;
INTEGER1 RegID;
INTEGER3 TrxID;
INTEGER4 AcctID;
INTEGER5 CustID;
INTEGER2 Cashier;
INTEGER2 Manager;
END;
HeaderFile/File_HeaderFile
- Code: Select all
IMPORT HeaderFile;
EXPORT File_HeaderFile :=
DATASET('~transactions::april2014::headerdata',HeaderFile.Layout_Header,THOR);
Execute HeaderFile/File_HeaderFile
- Message:
- Error:
- System error: 1:
- File /var/lib/HPCCSystems/hpcc-data/thor/transactions/april2014/headerdata.csv._1_of_1
- size is 252 which is not a multiple of 26
- (0, 0)
- Code: 1
- Location: {NULL}
Re-uploaded and re-sprayed the file.
Re-executed HeaderFile/File_HeaderFile.
Output:
All values are incorrect and number of records has doubled.
Based on the values it seems as if each character is being imported as an integer based on its ASCII value (base 256).
Is this the correct interpretation of what is going on?
If so, is there a way to import integers without converting to base 256?
- kmier
- Posts: 5
- Joined: Fri Mar 28, 2014 6:32 pm
Hi,
you have sprayed a delimited file and while reading it you are using 'THOR'
This is the reason for that Error
change the code in HeaderFile/File_HeaderFile as shown below
Regards,
Sameer
you have sprayed a delimited file and while reading it you are using 'THOR'
This is the reason for that Error
change the code in HeaderFile/File_HeaderFile as shown below
- Code: Select all
IMPORT HeaderFile;
EXPORT File_HeaderFile :=
DATASET('~transactions::april2014::headerdata',HeaderFile.Layout_Header, csv(separator(',')));
Regards,
Sameer
- sameermsc
- Posts: 66
- Joined: Wed Oct 05, 2011 10:09 am
4 posts
• Page 1 of 1
Who is online
Users browsing this forum: No registered users and 1 guest