Hadoop Data Integration
The HDFS to HPCC (H2H) Connector allows read/write access to HDFS data repositories.
Please note that the H2H Connector is no longer actively supported; the assets on this page are offered for archival purposes only.
The connector gives HPCC access to HDFS data files. It provides ECL macros that populate ECL datasets on the HPCC Systems platform with HDFS data, and that create HDFS files from HPCC data. Two implementations are provided: one based on libhdfs (the native API provided by Hadoop) and one based on webhdfs (the web-based API provided by Hadoop). Please view the archive for previous versions.
H2H Connector library (libhdfs based)
Release | Release Date | Platform | Size* | Version | MD5
HDFS Connector Centos5 | 02/22/2013 | Centos 64bit | 31KB | 1.4.4-1 | 5ea5103919c2d08752dcce043028e6c0
HDFS Connector Centos6 | 02/22/2013 | Centos 64bit | 31KB | 1.4.4-1 | 0c78c4eb94c3cba7d306db8361c44e8d
HDFS Connector Ubuntu 10.04 | 03/14/2014 | Ubuntu 64bit | 27KB | 1.4.4-1 | e53c6f2781a21475db25495dba0d3aed
HDFS Connector Ubuntu 12.04 | 03/14/2014 | Ubuntu 64bit | 27KB | 1.4.4-1 | cb1d95f208464d364d8a75396a68f2c9
HDFS Connector Ubuntu 12.10 | 03/14/2014 | Ubuntu 64bit | 28KB | 1.4.4-1 | 04a19606165f580996db5618030ad946
HDFS Connector Ubuntu 13.04 | 03/14/2014 | Ubuntu 64bit | 28KB | 1.4.4-1 | ef6cd32a219fe914cda2769dd7d06430
HDFS Connector Ubuntu 13.10 | 03/14/2014 | Ubuntu 64bit | 28KB | 1.4.4-1 | be305d7cf8d25f2a88290d97eeb6df74
H2H Connector library (webhdfs based)
Release | Release Date | Platform | Size* | Version | MD5
WebHDFS Connector Centos5 | 02/22/2013 | Centos 64bit | 27KB | 1.4.4-1 | d229a06032d8fda43e420092242dc836
WebHDFS Connector Centos6 | 02/22/2013 | Centos 64bit | 30KB | 1.4.4-1 | e6b6694f91278b654789cbc0ae5cd628
WebHDFS Connector Ubuntu 10.04 | 03/14/2014 | Ubuntu 64bit | 26KB | 1.4.4-1 | a05a5e78fcb61cc2bbd74800086e835e
WebHDFS Connector Ubuntu 12.04 | 03/14/2014 | Ubuntu 64bit | 26KB | 1.4.4-1 | 895ccdf844c29cdd857ced966a6e3c6e
WebHDFS Connector Ubuntu 12.10 | 03/14/2014 | Ubuntu 64bit | 26KB | 1.4.4-1 | c233125bad952bfd4239d87cd757aa41
WebHDFS Connector Ubuntu 13.04 | 03/14/2014 | Ubuntu 64bit | 26KB | 1.4.4-1 | 020ecbc43eb9d38b4a016c10df9ce351
WebHDFS Connector Ubuntu 13.10 | 03/14/2014 | Ubuntu 64bit | 26KB | 1.4.4-1 | 89d242ca1df67136ded1f30dfaee4353
* sizes are approximate
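Each release above is listed with an MD5 value, which can be compared against the digest of the downloaded file before installing. The following is a minimal sketch, assuming a Linux host with `md5sum` and `awk` available; the function name `verify_md5` and the file name in the example call are illustrative placeholders, not part of the connector packages.

```shell
#!/usr/bin/env bash
# Sketch only: compare a file's MD5 digest with an expected value.
# verify_md5 is a hypothetical helper name, not shipped with the connector.
# Returns 0 on match, 1 on mismatch or if the file cannot be read.
verify_md5() {
    local file="$1" expected="$2" actual
    # md5sum prints "<digest>  <filename>"; keep only the digest field.
    actual="$(md5sum -- "$file" 2>/dev/null | awk '{print $1}')"
    # An empty digest means md5sum could not read the file.
    [ -n "$actual" ] || return 1
    # Lower-case both sides so the comparison is case-insensitive.
    [ "$(printf '%s' "$actual" | tr 'A-F' 'a-f')" = "$(printf '%s' "$expected" | tr 'A-F' 'a-f')" ]
}

# Example call. The file name is a placeholder; the digest is the one
# listed for the libhdfs-based Centos6 package.
# verify_md5 ./downloaded-connector.rpm 0c78c4eb94c3cba7d306db8361c44e8d && echo "checksum OK"
```

A mismatch usually indicates an incomplete or corrupted download, in which case the file should be fetched again rather than installed.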
- HDFS Connector Library for IDE (ZIP)
- HDFS to HPCC Connector Doc (PDF)
- Listen to the libhdfs H2H Podcast
*****************************************************************
Known Limitations and Release Notes for H2H 1.4.4-1 and Web H2H 1.4.4-1
- Due to recent changes, you must uninstall any previous version of H2H before installing a 1.x release.
- The libhdfs-based connector requires libhdfs.so, which in turn requires a local installation of Hadoop.
- WebHDFS-based connector package notes:
  - Requires libcurl.
  - Webhdfs must be enabled on the target HDFS system.
  - Target HDFS datanode hostnames must be resolvable locally (this might require adding entries to the local hosts file).
  - PipeOutAndMerge is not supported, only PipeOut. The user is responsible for merging file parts on the Hadoop side.
- To install the rpm plugin package (centos and opensuse), use the following command:
sudo rpm -Uvh --nodeps hpccsystems-.rpm
- We have seen some issues with CSV text qualifiers (or escape characters). If your data has field values containing escape characters, in rare cases, your data may not PipeIn correctly (there could be data corruption and/or record loss in the resulting dataset). This doesn’t affect your original data.
- Occasionally, we have seen instances where the library path (LD_LIBRARY_PATH) is not set up correctly, so libjvm.so cannot be found. If you get the following error, follow the steps in the “Editing and distributing the Configuration file” section of the “HDFS to HPCC Connector” document:
…error while loading shared libraries: libjvm.so: cannot open shared object file: No such file or directory
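Two of the limitations above can be checked before running the connector: whether libjvm.so exists where you expect it, and whether the HDFS datanode hostnames resolve locally. The following is a diagnostic sketch only, assuming a Linux host with `find` and `getent`; the function names, search root, and hostnames are placeholders, and it does not replace the configuration steps described in the connector document.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, paths and hostnames below are hypothetical.

# Print the directory of the first libjvm.so found under a given root.
# Returns non-zero if none is found.
find_libjvm_dir() {
    local root="$1" hit
    hit="$(find "$root" -name 'libjvm.so' 2>/dev/null | head -n 1)"
    [ -n "$hit" ] || return 1
    dirname "$hit"
}

# Succeed only if every given hostname resolves on this machine.
# Unresolvable names are reported on stderr.
all_hosts_resolve() {
    local host rc=0
    for host in "$@"; do
        if ! getent hosts "$host" >/dev/null 2>&1; then
            echo "cannot resolve: $host" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Example calls (search root and hostnames are placeholders):
# find_libjvm_dir "${JAVA_HOME:-/usr/lib/jvm}"
# all_hosts_resolve datanode1.example.com datanode2.example.com
```

If `find_libjvm_dir` prints a directory that is not on the library path used by the platform, that is a likely cause of the libjvm.so error; any hostname reported by `all_hosts_resolve` is a candidate for an entry in the local hosts file.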
============================================================
Comprehensive list of changes from H2H 1.4.2-1 to 1.4.4-1
============================================================
HH-84 Add support for hdfsuser on read operations
HH-86 Pull error messages from master branch to 1.4.4