Roxie throughput on AWS

We are performance testing a simple Roxie query on AWS. The query extracts one row, looked up by a unique identifier, from a logical file with 100 rows.
Each row holds around 100 KB of data.

The issue we are seeing is that the throughput is very low: only 2.5 queries per second. I am sure this is below-par performance. Can anyone throw any light on
this? Could we be seeing this because we are running in a shared environment (AWS)?

The CPU usage shoots to 100% the moment we have two concurrent threads (Apache JMeter) running.


Nodes in Roxie cluster: 1

Node details:

Cores: 2 virtual cores
RAM: 7.5 GB
Disk: 850 GB

Performance test results: attached (OnlineRetrievalPerf.png)
From our HPCC team:

Roxie itself is not a very CPU-intensive process; it is disk I/O intensive, but not CPU intensive. So if this is simply an index hit and data retrieval, then something is amiss. If you are doing quite a bit of post-retrieval data processing in ECL, then that could also be the issue.

That said, performance on AWS is slower than a standard cluster, but it should be better than 250 KB/sec.
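
For reference, the 250 KB/sec figure follows from the numbers in the original post: roughly 2.5 queries per second, each returning a row of about 100 KB. A quick sanity check, with a helper name that is purely illustrative:

```python
def data_rate_kb_per_sec(queries_per_sec, row_size_kb):
    """Approximate payload throughput, assuming each query returns one row."""
    return queries_per_sec * row_size_kb


# Figures from the post: 2.5 queries/sec and rows of around 100 KB each.
print(data_rate_kb_per_sec(2.5, 100))
```

Since the row size is only approximate, so is the result.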

The CPU peg is suspect. Which process is taking the CPU when pegged? Is it CCD?

Run 'top' on the node during this test to see which process is consuming all the CPU.

If it is CCD, then we need to look at the ECL. If it is Apache or some Java app, then there is not much we can do there.
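
If it is easier to capture this from a script than to watch 'top' interactively, something like the sketch below might help. It assumes a Linux node with the procps version of top (batch mode via -b), the default column layout with %CPU and COMMAND headers, and a locale that prints decimals with a dot. The function names are made up for illustration, and none of this is HPCC-specific.

```python
import subprocess


def parse_top_batch(output, limit=5):
    """Return (command, cpu_percent) pairs from the last sample in `top -b` output."""
    lines = output.splitlines()
    # Each sample repeats the header row; use the last one, because the first
    # sample reports averages since boot rather than current usage.
    header_rows = [i for i, line in enumerate(lines) if line.split()[:1] == ["PID"]]
    if not header_rows:
        raise ValueError("no process table found in top output")
    header_idx = header_rows[-1]
    columns = lines[header_idx].split()
    cpu_col = columns.index("%CPU")
    cmd_col = columns.index("COMMAND")

    rows = []
    for line in lines[header_idx + 1:]:
        # Split at most cmd_col times so a command containing spaces stays whole.
        parts = line.split(None, cmd_col)
        if len(parts) <= cmd_col:
            continue
        rows.append((parts[cmd_col].strip(), float(parts[cpu_col])))

    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


def sample_top():
    """Run top in batch mode for two samples, one second apart."""
    result = subprocess.run(
        ["top", "-b", "-n", "2", "-d", "1"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    for command, cpu in parse_top_batch(sample_top()):
        print(f"{cpu:6.1f}%  {command}")
```

Run it on the Roxie node while the JMeter test is active, then check whether the top entry is the Roxie process or the load generator.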


I will run the tests again and get back.

