

Disk Capacity Planning


Tue Apr 03, 2018 3:19 pm

Hello!

I'm trying to figure out the "data load" vs. configuration on a Thor cluster to plan for disk capacity. Think AWS EC2 instances using a separate EBS volume for each of the prone-to-get-huge folders, such as hpcc-data on slave nodes.
IO needs differ between Master/Slave/Support nodes, so if possible I'd rather not pay extra $$$ for, say, Support disks, and instead spend those $$$ on the Slaves' disks to get better IO (separate IO capacity for actual data reads/writes and for mirroring reads/writes, and yet another separate IO capacity for mythor/temp, for instance).

Here is my current understanding of folders on each node type. Please correct me where I'm wrong.

For Slave nodes, /var/lib/HPCCSystems/hpcc-data and hpcc-mirror are used to store Logical Files (and replicas).
The folder /var/lib/HPCCSystems/mythor will host ????. I see core.* files of up to 4 GB in there, and mythor/temp can get quite huge (right now I have 129 GB in there).

For the Master node, /var/lib/HPCCSystems/myeclccserver gets big over time as it stores workunits (one .so file per workunit, is that right?).
Folders /var/lib/HPCCSystems/hpcc-data and hpcc-mirror don't seem to store actual Logical Files and do not follow Slaves' equivalent.

For Support nodes, I don't know what's stored there (I see hpcc-data, mydafilesrv and myesp, none are meaningfully big.).

For all nodes (or is it just Master?), the logs can also get quite big (with skewed JOINs, I believe logging becomes very verbose and the log size increases drastically).

Am I missing anything? While hpcc-data and hpcc-mirror (on slave nodes) are easy to grasp, I bumped into a disk space issue because mythor/temp overfilled the disk and the query crashed. Runtime disk usage is more difficult to figure out, is what I'm saying.

Now when trying to create mounts for some of those folders, some things don't work any longer.
hpcc-data and hpcc-mirror are "symbolic link" friendly. But not so much for mythor on the Master (something to do with .sentinel file???). If I create a symbolic link for /var/lib/HPCCSystems/mythor (to say /mnt/mythor), hpcc-init says it can't start mythor.
Which leads to yet another question: I see in the environment.xml that some folders are configured at the <Instance> level, while others are in the <Directories> element. What's the difference between the two?
I see <Roxie> can specify its own "directory" (attribute), yet <Thor> cannot (or at least I don't see it...haven't checked the XSD...sorry). I could mount an EBS volume directly into /var/lib/HPCCSystems/mythor, but I'd rather not if possible (trying to keep all mounts in same folder and symlink those).



Thank you for your help!
Luke.
lpezet
 
Posts: 51
Joined: Wed Sep 10, 2014 3:14 am

Thu Apr 05, 2018 3:12 pm

Here is my current understanding of folders on each node type. Please correct me where I'm wrong.

For Slave nodes, /var/lib/HPCCSystems/hpcc-data and hpcc-mirror are used to store Logical Files (and replicas).
The folder /var/lib/HPCCSystems/mythor will host ????. I see core.* files of up to 4 GB in there, and mythor/temp can get quite huge (right now I have 129 GB in there).


In reality, the large core files don't take up as much disk space as their size might suggest, because they are sparse files.
Running 'du corefile' will reveal the actual disk space consumed.
'mythor' is the component instance directory and is only used to hold its configuration and small temporary runtime files.
It will also hold core files if any are created, as you've noted, although you can configure core file behaviour and location in Linux so that they end up elsewhere.
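
To illustrate the sparse-file point, here is a minimal Python sketch (not part of the HPCC platform) that reports both the apparent size of a file and the space actually allocated on disk, which is roughly what 'ls -l' and 'du' show respectively. The function name file_sizes is made up for this example, and it assumes a Linux system, where st_blocks is counted in 512-byte units.

```python
import os

def file_sizes(path):
    """Return (apparent_bytes, allocated_bytes) for a single file.

    apparent_bytes is the logical length of the file (what 'ls -l' shows).
    allocated_bytes is the space actually reserved on disk (what 'du' shows).
    For a sparse file, allocated_bytes can be far smaller than apparent_bytes.
    """
    st = os.stat(path)
    # Assumption: Linux semantics, where st_blocks is in 512-byte units
    # regardless of the filesystem's own block size.
    return st.st_size, st.st_blocks * 512
```

Calling file_sizes on one of the core.* files under mythor should show whether the 4 GB figure reflects real disk consumption or mostly holes.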

The instance temp directory (e.g. mythor/temp) holds internal spill files: when performing a sort or join that is larger than memory, the engines spill intermediate results to this directory. NB: The Thor master will not use this temp directory for spilling, so its size on the Thor master node should be insignificant.
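
Since spilling can fill the volume that hosts the temp directory, a simple headroom check before launching large jobs may help. The following is only a sketch using Python's standard library; the function name has_headroom and the 20% default threshold are arbitrary choices for illustration, not HPCC settings.

```python
import shutil

def has_headroom(path, min_free_fraction=0.2):
    """Return True if the filesystem holding 'path' has at least
    min_free_fraction of its total capacity free."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    if usage.total == 0:
        # Some pseudo-filesystems report zero capacity; treat as no headroom.
        return False
    return usage.free / usage.total >= min_free_fraction
```

For example, has_headroom("/var/lib/HPCCSystems/mythor/temp") could be run on each slave node, assuming the temp directory is still at its default location.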

NB: These directories can be reconfigured to other locations via the Directories section in the environment, e.g. the default for the component temp directory is: <Category dir="/home/jsmith/hpccdeb/var/lib/[NAME]/[INST]/temp" name="temp"/>
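
To make the placeholders concrete, here is a small Python sketch of how such a template might expand. It assumes that [NAME] stands for the name given on the Directories element (typically HPCCSystems) and [INST] for the component instance name (e.g. mythor); the function expand_category_dir is hypothetical and only mimics the substitution, it is not the platform's own code.

```python
def expand_category_dir(template, name, inst):
    """Expand the [NAME] and [INST] placeholders of a Category 'dir' template.

    Assumption: [NAME] is the Directories name and [INST] is the
    component instance name. Other placeholders are left untouched.
    """
    return template.replace("[NAME]", name).replace("[INST]", inst)

print(expand_category_dir("/var/lib/[NAME]/[INST]/temp", "HPCCSystems", "mythor"))
# prints /var/lib/HPCCSystems/mythor/temp
```

Under that assumption, pointing the temp category at something like /mnt/[NAME]/[INST]/temp would move every component's spill area onto a separately mounted volume.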

hpcc-data and hpcc-mirror on the slave nodes are, as you say, the root-level storage areas for logical file parts.

For the Master node, /var/lib/HPCCSystems/myeclccserver gets big over time as it stores workunits (one .so file per workunit, is that right?).
Folders /var/lib/HPCCSystems/hpcc-data and hpcc-mirror don't seem to store actual Logical Files and do not follow Slaves' equivalent.

The eclccserver needn't necessarily be on the Thor master node. Workunit query DLLs and potentially other intermediate files will build up wherever it runs; however, Sasha should be configured to automatically archive old workunits, removing them from Dali along with the related disk files, e.g. those in the myeclccserver directory.
That does mean, however, that the Sasha folders (/var/lib/HPCCSystems/hpcc-data/sasha/Archive) will grow.

For Support nodes, I don't know what's stored there (I see hpcc-data, mydafilesrv and myesp, none are meaningfully big.).

As mentioned above, the 'sasha' directory will grow substantially over time.
If hthor jobs are being executed, then files are written to '/var/lib/HPCCSystems/hpcc-data/eclagent', so that can become arbitrarily large as well.
The Dali data directory (e.g. /var/lib/HPCCSystems/hpcc-data/dali) will also grow over time, since it keeps copies/snapshots of the database. The number of copies it keeps can be configured with the 'keepStores' setting.

For all nodes (or is it just Master?), the logs can also get quite big (with skewed JOINs, I believe logging becomes very verbose and the log size increases drastically).

Am I missing anything? While hpcc-data and hpcc-mirror (on slave nodes) are easy to grasp, I bumped into a disk space issue because mythor/temp overfilled the disk and the query crashed. Runtime disk usage is more difficult to figure out, is what I'm saying.

hpcc-data, hpcc-mirror and the temp directory will be the main consumers. You could reconfigure the temp directory to be under hpcc-data for simplicity.
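
To track how these main consumers grow on a given node, something along the following lines could be run periodically. This is a rough Python sketch, not an HPCC tool: the function names allocated_bytes and usage_report are invented for the example, and it again assumes Linux, where st_blocks is in 512-byte units.

```python
import os

def allocated_bytes(root):
    """Sum the disk space actually allocated to regular files under 'root'.

    Uses st_blocks rather than st_size so sparse files are not overcounted.
    Symlinked directories are not followed. Hard-linked files are counted
    once per link, so the total can overestimate in that case.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, fname))
            except OSError:
                # File vanished or is unreadable (e.g. a temp file being removed).
                continue
            total += st.st_blocks * 512
    return total

def usage_report(roots):
    """Map each existing directory in 'roots' to its allocated bytes."""
    return {root: allocated_bytes(root) for root in roots if os.path.isdir(root)}
```

On a slave node with default paths, the directories of interest would be /var/lib/HPCCSystems/hpcc-data, /var/lib/HPCCSystems/hpcc-mirror and /var/lib/HPCCSystems/mythor/temp; directories that do not exist on a node are simply skipped.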

Now when trying to create mounts for some of those folders, some things don't work any longer.
hpcc-data and hpcc-mirror are "symbolic link" friendly. But not so much for mythor on the Master (something to do with .sentinel file???). If I create a symbolic link for /var/lib/HPCCSystems/mythor (to say /mnt/mythor), hpcc-init says it can't start mythor.

I'm not sure why that would be. I'd have to study the init logs to see why it's failing, but I wouldn't suggest relocating the instance directory itself: it should be tiny, except for the temp directory, which can be reconfigured independently.

Which leads to yet another question: I see in the environment.xml that some folders are configured at the <Instance> level, while others are in the <Directories> element. What's the difference between the two?
I see <Roxie> can specify its own "directory" (attribute), yet <Thor> cannot (or at least I don't see it...haven't checked the XSD...sorry). I could mount an EBS volume directly into /var/lib/HPCCSystems/mythor, but I'd rather not if possible (trying to keep all mounts in same folder and symlink those).


It should be possible to reconfigure the locations of everything via the <Directories> section, but the instance directories seem to bypass these directives.
However, I think the rest can be relocated via the Directories section.
You can certainly use this section to relocate the main data directories and temp directories.
jsmith
Community Advisory Board Member
 
Posts: 70
Joined: Tue Jul 19, 2011 12:58 pm

Mon Apr 09, 2018 4:30 pm

Sorry, I made some assumptions as to how the cluster was deployed.

This is extremely helpful. Thanks a ton jsmith!
lpezet
 
Posts: 51
Joined: Wed Sep 10, 2014 3:14 am

