Wed Jun 29, 2011 2:59 pm

I have roughly 10 TB of data in plaintext format. What kind of hardware do I need to host it on HPCC?
Wed Jun 29, 2011 4:05 pm

That depends on what you want to do with the data. If you plan on indexing it, or linking it in ways that cause the data to grow (one-to-many relationships, etc.), then you will need to size up accordingly. As a starting point, I would recommend having 2-3 times as much storage as you have data volume.
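
To make that rule of thumb concrete, here is the arithmetic for the 10 TB in question as a small Python sketch. The function name and parameters are made up for illustration and are not part of any HPCC tooling; the 2x and 3x factors are just the starting-point range mentioned above.

```python
def storage_range_tb(data_tb, low_factor=2.0, high_factor=3.0):
    """Return (min, max) storage in TB to plan for a given data volume.

    The factors are the 2-3x rule of thumb, not a measured value;
    heavy indexing or one-to-many joins may need more.
    """
    return data_tb * low_factor, data_tb * high_factor


low, high = storage_range_tb(10)
print(f"Plan for roughly {low:.0f}-{high:.0f} TB of storage")
```

For 10 TB of input this gives a planning range of 20 to 30 TB.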

The other factor in sizing is the speed at which you want to run certain tasks. In general, the more nodes you have, the faster a particular job will run.

The nice thing about HPCC is that if you find you need additional space or more performance, you simply add nodes and scale linearly.

So, for instance, while you could have a single server with 30TB of space, performance would be on par with, or possibly even worse than, a conventional relational database (think Oracle). A much better solution for HPCC would be to have 20 servers, each with 2TB of space (giving a total of 40TB usable). Now you have the potential of many cores, more RAM, more RAID controllers, more cache, etc. to process your data in a suitable timeframe. Want it faster? Scale it to 50 nodes, or 100, or 400.
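
If it helps, the node-count side of this can be sketched the same way in Python. Again, the function names are hypothetical, and the sketch only covers disk capacity; it says nothing about how many nodes you need for a given job runtime, which depends on the workload.

```python
import math


def nodes_needed(required_tb, per_node_tb):
    """Smallest number of nodes whose combined disk covers required_tb."""
    return math.ceil(required_tb / per_node_tb)


def cluster_capacity_tb(nodes, per_node_tb):
    """Total disk across the cluster, assuming identical nodes."""
    return nodes * per_node_tb


# 30 TB target (3x the 10 TB of raw data) on 2 TB nodes
print(nodes_needed(30, 2))

# The 20-node example: 20 servers with 2 TB each
print(cluster_capacity_tb(20, 2))
```

Fifteen 2TB nodes would be the minimum for a 30TB target; the 20-node example leaves headroom at 40TB and adds more cores and RAM at the same time.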

Jon Burger
Manager, HPCC Engineering Team
