Issue with the H2H connector

Topics related to the Hadoop Connector or migrating data from Hadoop

Thu Jul 26, 2012 10:02 pm

The H2H documentation seems to imply that you can set the CSV format parameters in the PipeIn command:

Code:
DataConnectors.HDFSConnector.PipeIn(MyDataFile,
'/user/Administrator/test/MyData1',
Layout_CSV, CSV(SEPARATOR('|')),
'192.168.56.120',
54310);


However, I have found that the PipeIn macro looks only at the TERMINATOR attribute of the CSV parameter and discards the rest. I have a file in which every field is enclosed in double quotes. PipeIn does not detect this, so every field in the resulting dataset retains its double quotes. Thanks.

- Greg
gkrasnow
 
Posts: 18
Joined: Tue Jun 19, 2012 4:25 pm

Fri Jul 27, 2012 2:09 pm

Hi Greg, thanks for contacting us regarding your concern.

If you're using double quotes to enclose the contents of all your fields, have you tried setting the QUOTE attribute? CSV(SEPARATOR('|'), QUOTE('\"'))
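
For reference, here is roughly how that would look applied to the call from your post. This is only a sketch: it reuses your file name, path, layout, host, and port unchanged and adds nothing but the QUOTE attribute, so it assumes MyDataFile and Layout_CSV are set up exactly as in your job.

```ecl
// Sketch only: the PipeIn call from the first post with QUOTE added.
// Layout_CSV, the HDFS path, host, and port are taken from that post as-is.
DataConnectors.HDFSConnector.PipeIn(MyDataFile,
'/user/Administrator/test/MyData1',
Layout_CSV, CSV(SEPARATOR('|'), QUOTE('\"')),
'192.168.56.120',
54310);
```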

In any case, the CSV attributes are honored by the ECL PIPE command. They reach it through the "HadoopFileFormat" parameter.

If you look at the PipeIn macro, the "TERMINATOR" attribute is explicitly extracted and passed to the "hdfspipe" command, but the entire "HadoopFileFormat" is also passed in to PIPE().

Let me know if that helps. Thanks.
-Rodrigo
rodrigo.pastrana@lexisnexis.com
 
Posts: 22
Joined: Fri Oct 07, 2011 2:24 pm

