Spray data directly from S3 to Thor

Questions or comments related to Cloud Computing and the HPCC Systems Instant Cloud for AWS

Fri Feb 24, 2017 9:07 pm

Hi,

We are trying to run Thor applications on input datasets of at least several TB. We plan to use S3 as persistent storage and to provision an HPCC cluster sized to fit each application's requirements. Because the cluster size may change, simply backing up the data to S3 and restoring it from there might not work. However, fetching the data to the landing zone and then spraying it does not seem efficient.

We found a tool, rnet-parspray, that can spray data directly from S3 to the Thor cluster. However, this script supports only zipped XML files. Before we jump into any implementation, we would like to hear feedback on this scenario. Are there any suggestions on best practices for running HPCC on AWS? Are there good ways to handle large datasets on AWS? Thanks.

-chin
chsu6
 
Posts: 8
Joined: Fri Feb 19, 2016 9:58 pm
