storage configuration

Topics related to recommendations or questions on the design for HPCC Systems clusters

Wed Oct 01, 2014 12:43 am


I am comparing Big Data systems on a cluster where each node has multiple directly accessible disks. Because the cluster is also used for Hadoop/HDFS, the disks are mounted as JBOD: each is a separate Linux volume, with no abstraction between the OS and the disk other than the filesystem itself (no RAID or LVM). Many systems I have encountered accept this as one of several supported hardware configurations; it is geared towards newer systems such as Hadoop/HDFS that handle replication and failover in software.

With HPCC, however, this does not appear to be a configuration in which I can fully utilize the hardware. It seems that I must allocate one (and only one?!) location for my data, and that it must be the same on every node in the cluster. Using RAID is not an option in my situation, as the cluster's hardware and OS are shared with other (Hadoop/HDFS) users and are not mine to reconfigure. (I would expect similar situations to arise on the Big Data clusters of many enterprises today.)

I am trying to understand whether there is a simple way to better accommodate this type of hardware configuration. For example, something as simple as a startup parameter for the HPCC node process that points to a configuration file might work. Multiple processes, one per disk volume, could then coexist on the same machine; this is how systems like MongoDB deal with multiple volumes in non-RAIDed configurations, for example.
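To make the idea concrete, here is a minimal sketch of the kind of layout I have in mind. It only plans one instance per volume; it does not start anything. The mount points, the `hpcc-data` directory name, the `instance.conf` file name, and the port numbering are all hypothetical and do not correspond to any existing HPCC option.

```python
from pathlib import PurePosixPath


def plan_instances(volumes, base_port=9000):
    """Return one hypothetical instance description per JBOD volume.

    Every name here (directory, config file, port scheme) is an
    assumption made for illustration, not an actual HPCC setting.
    """
    plans = []
    for index, volume in enumerate(volumes):
        # Each instance keeps its data and its own config on one volume.
        root = PurePosixPath(volume) / "hpcc-data"
        plans.append({
            "name": f"node-{index}",
            "data_dir": str(root),
            "config_file": str(root / "instance.conf"),
            # Distinct ports let the processes coexist on one machine.
            "port": base_port + index,
        })
    return plans


if __name__ == "__main__":
    # Hypothetical mount points for three directly attached disks.
    for plan in plan_instances(["/data/disk1", "/data/disk2", "/data/disk3"]):
        print(plan)
```

Each process would then be started with a parameter pointing at its own `config_file`, so that no two processes share a data directory or a port.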

Thank you,
Keren Ouaknine
Posts: 16
Joined: Sun Aug 14, 2011 2:30 pm

Thu Oct 02, 2014 12:32 pm

Hi Keren,

I was told that some members of the HPCC team have already been working with you. The best course at this time would be to submit a feature request in the Community Issue Tracker. That way we can track your request properly and the development team can evaluate what needs to be done.

Thank You,

Community Advisory Board Member
Posts: 975
Joined: Wed Jun 29, 2011 7:13 pm
