
Spray data directly from S3 to Thor

Questions or comments related to Cloud Computing and the HPCC Systems Instant Cloud for AWS

Fri Feb 24, 2017 9:07 pm


We are trying to run Thor applications over input datasets of at least several TB. We plan to use S3 as persistent storage and to provision an HPCC cluster sized to fit each application's requirements. Because the cluster size may change between runs, simply backing the Thor data up to S3 and restoring it might not work. However, fetching the data to the landing zone first and then spraying it seems inefficient.
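As a rough sketch of what we mean by a direct spray that adapts to the cluster size: each Thor node would pull its own share of the S3 objects, and the assignment would be recomputed whenever the node count changes. The function name, the object keys, and the sizes below are hypothetical. The code only plans which node fetches which object; it does not transfer anything and does not call any HPCC or AWS API. In practice the key-to-size mapping would come from an S3 bucket listing.

```python
import heapq
from typing import Dict, List, Tuple


def plan_direct_spray(objects: Dict[str, int], node_count: int) -> List[List[str]]:
    """Assign each S3 object key to one Thor node, roughly balancing bytes per node.

    `objects` maps a (hypothetical) S3 key to its size in bytes.
    Returns one list of keys per node, indexed by node number.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")

    # Min-heap of (bytes_assigned, node_index): the least-loaded node pops first.
    heap: List[Tuple[int, int]] = [(0, i) for i in range(node_count)]
    heapq.heapify(heap)
    plan: List[List[str]] = [[] for _ in range(node_count)]

    # Greedy heuristic: place the largest objects first (balanced, not optimal).
    for key, size in sorted(objects.items(), key=lambda kv: (-kv[1], kv[0])):
        load, node = heapq.heappop(heap)
        plan[node].append(key)
        heapq.heappush(heap, (load + size, node))
    return plan


if __name__ == "__main__":
    # Hypothetical keys and sizes, for illustration only.
    listing = {"in/a.xml.zip": 50, "in/b.xml.zip": 30, "in/c.xml.zip": 20, "in/d.xml.zip": 10}
    for node, keys in enumerate(plan_direct_spray(listing, node_count=2)):
        print(node, keys)
```

Re-running the planner with a different `node_count` gives a new assignment for a resized cluster, which is the property a fixed backup and restore layout would not have.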

We found a tool, rnet-parspray, that can spray data directly from S3 to the Thor cluster, but the script supports only zipped XML files. Before we jump into an implementation, we would like to hear any feedback on this scenario. Are there any suggestions on best practices for running HPCC on AWS? Are there good ways to handle large datasets on AWS? Thanks.

Posts: 8
Joined: Fri Feb 19, 2016 9:58 pm
