Fri Aug 17, 2018 11:11 pm
Login Register Lost Password? Contact Us


How Roxie channels affect data/partition distribution?

Topics related to recommendations or questions on the design for HPCC Systems clusters

Fri Mar 11, 2016 6:59 am Change Time Zone

Hi,

I am a PhD student, and we are working on adding elasticity to Roxie. I am trying to understand how Roxie manage data. I setup a 4-node Thor cluster (not including the Thor master) and a 4-node Roxie cluster. To understand How Roxie works, I tried different channel mode (slaveConfig in environment.xml), including cyclic, overloaded and simple. I also tried different combinations of the following parameters.

Configurations
Cyclic mode: numDataCopies=[1, 2, 4]
Overload mode: channelsPerNode=[1, 2, 4]
Simple mode: numDataCopies[1, 2, 4]

Roxie Application
Six Degree from the official tutorial

Experiment Steps
1. Setup Roxie against one configuration combination from above
2. Upload data to landing zone
3. Spray data to Thor (observed two replicas per data partition) - see Figure 1
4. Publish the query to Roxie (observed only one replica per index partition, but metadata file is replicated to all Roxie nodes) - see Figure 2
5. Repeat the same experiment with remaining configurations

Questions
1. Does the channel mode (as described in Figure 3) affect how Roxie stores data? Our results do not indicate this relationship.
2. How a Roxie server process picks the slave process to run the Roxie query? It seems partition layout is fixed (as shown in Figure 2). The multicast channel does not help in this case?
3. Are the index files loaded to memory already? Does this mean I look the wrong place in my experiments?
4. Is it possible to know which slave nodes handle a certain query? Is the statistics (taken from the output of a query) used for the server or slave process?

<Statistic c="roxie"
count="1"
creator="myroxie@10.25.11.102"
desc="Graph graph1"
kind="TimeElapsed"
s="graph"
scope="graph1"
ts="1457679003666831"
unit="ns"
value="296060166"/>
<Statistic c="roxie"
count="1"
creator="myroxie@10.25.11.102"
kind="TimeElapsed"
s="global"
scope="workunit"
ts="1457679003666855"
unit="ns"
value="296060166"/>
<Statistic c="summary"
count="1"
creator="roxie"
desc="Total cluster time"
kind="TimeElapsed"
s="global"
scope="workunit"
ts="1457679003666871"
unit="ns"
value="296060166"/>


I would appreciate any feedback. Sorry for the very long questions.

-chin
Attachments
s1_index_on_roxie.png
Figure 2 - How index file is distributed in Roxie
s1_index_on_roxie.png (57.22 KiB) Viewed 1008 times
s1_data_on_thor.png
Figure 1 - Two replicas per data partition on Thor
s1_data_on_thor.png (58.47 KiB) Viewed 1008 times
Screen Shot 2016-03-11 at 1.24.31 AM.png
Figure 3 - cyclic mode configured in Roxie
Screen Shot 2016-03-11 at 1.24.31 AM.png (56.98 KiB) Viewed 1008 times
chsu6
 
Posts: 8
Joined: Fri Feb 19, 2016 9:58 pm

Fri Mar 11, 2016 2:12 pm Change Time Zone

Hi Chin,

Let me answer each question individually.

1. Does the channel mode (as described in Figure 3) affect how Roxie stores data? Our results do not indicate this relationship.

What you are looking at is the actual INDEX in Figure 2. Each ROXIE node divides the INDEX into four pieces, and also stores a meta-key (that's the 32K piece) on each node. Since your INDEX was a non-payload, or standard index, the DATA part that was copied to ROXIE is shown in Figure 1, and reflects what is described in Figure 3.

2. How a Roxie server process picks the slave process to run the Roxie query? It seems partition layout is fixed (as shown in Figure 2). The multicast channel does not help in this case?


A copy of the query is stored on each ROXIE node. The load balancer built in to the ROXIE software decides which Farmer will process the query, and then the Farmer will decide which Agent to use to retrieve the result. If one is currently busy, the replicated channel can sometimes be used.

3. Are the index files loaded to memory already? Does this mean I look the wrong place in my experiments?


The way that I understand it, the index files indeed are loaded into RAM at the start of the query, and remain there for its life cycle.

4. Is it possible to know which slave nodes handle a certain query? Is the statistics (taken from the output of a query) used for the server or slave process?


To be honest, I have never needed to dig this deep into the specific nodes, but I would imagine that the Ganglia monitoring tool can give you that information.

Regards,

Bob
bforeman
Community Advisory Board Member
Community Advisory Board Member
 
Posts: 975
Joined: Wed Jun 29, 2011 7:13 pm

Fri Mar 11, 2016 3:36 pm Change Time Zone

Hi Bob,

Thank you for quick reply, and detailed description. I made a mistake to upload the wrong Figure 1 and the correct one should be Figure 4 (as attached here). In Figure 1, I manually upload the DATA part to verify data distribution. In our case, DATA is not copied to the Roxie cluster as shown in Figure 4. Does this mean the INDEX has payload?
s1_data_on_thor.png
Figure 4 - The DATA part on Thor
s1_data_on_thor.png (62.1 KiB) Viewed 975 times


With the following configuration:
1. 4-node cluster
2. channel mode: cyclic (as in Figure 3 in my previous post)
3. numChannels: 5
4. numDataCopies: 2

INDEX partition and distribution
Screen Shot 2016-03-11 at 10.24.14 AM.png
Table 1 - Sudo INDEX distribution on Roxie
Screen Shot 2016-03-11 at 10.24.14 AM.png (23.5 KiB) Viewed 975 times


Questions
Given a particular case in the following, how does Roxie select the slave process to handle a query? For example, when the server process on node 1 (n1) receives a Roxie query, the node n1 forwards this query to other slaves (via multicast channel in default setting). In this case, node 1 can communicate with only node 2 (via channel 1) and node 4 (via channel 4). What if the query involves INDEX partition i3 on node 3?


It is still not clear to me about how it works: the multicast channels and data management in Roxie. I really appreciate your response.
chsu6
 
Posts: 8
Joined: Fri Feb 19, 2016 9:58 pm

Mon Mar 14, 2016 9:11 am Change Time Zone

Data partitioning is determined by the thor used to build the data - Roxie uses the partitioning information stored in the top level key to decide which channel(s) to communicate with to retrieve data. Each roxie channel will handle a number of index file parts, and each channel may be implemented by several roxie slave nodes.

Indexes are not loaded into RAM on the slaves (they are too large) but are heavily cached.

You can get some insights into which slaves are retrieving what data using roxie control queries such as control:indexmetrics.

Richard
richardkchapman
Community Advisory Board Member
Community Advisory Board Member
 
Posts: 108
Joined: Fri Jun 17, 2011 8:59 am

Tue Mar 15, 2016 3:47 pm Change Time Zone

Thank you for your responses. It makes some things clear. But we are still unclear about a basic concept. Maybe we are asking a much simpler question than you think.

The data presented above was for Roxie cluster cyclic with number_of_data_copies = 2 and a 4-node Thor cluster with default config.

Figure 4 shows the data distribution on a 4-node Thor cluster. It has 2 replicas (and appears to be cyclic).

Table 1 shows the channel assignments (extracted from the logs) for our 4-node Roxie cluster. This is what we expected for cyclic with 2 copies.

Figure 1, which shows the index distribution of our Roxie cluster, has no replicas of the index parts. We didn't expect this because it is contrary to the config.

We do not understand the relationship between index parts and channel assignments. Specifically:

Q1: Are channels related to index parts? We thought that there was some correspondence between channels and index parts.

Q2: Consider n1, which has index part 1 and channels 1 and 4. What queries can the slave process on n1 service?

Really appreciate your time answering our questions.

Thanks,
Chin-Jung
chsu6
 
Posts: 8
Joined: Fri Feb 19, 2016 9:58 pm

Tue Mar 15, 2016 3:49 pm Change Time Zone

richardkchapman wrote:You can get some insights into which slaves are retrieving what data using roxie control queries such as control:indexmetrics.


BTW, the information obtained from that command seems identical to that from ECL watch.

Thanks,
Chin-Jung
chsu6
 
Posts: 8
Joined: Fri Feb 19, 2016 9:58 pm

Wed Mar 16, 2016 2:04 pm Change Time Zone

Hi Chin-Jung,

I am wondering if your configuration experiments might have changed the configuration to not create replication on your ROXIE. I just ran a test of my 2-node ROXIE training cluster and for a given index, this is what I see:
Image

Looking at my configuration file on each node, I see:

cyclicOffset="1"
numChannels="2"
numDataCopies="2"

So my cluster seems to be working as documented regarding replication. My server version is the latest 5.4.10-1

Regards,

Bob
Attachments
ROXIEIndexParts.jpg
ROXIEIndexParts.jpg (97.65 KiB) Viewed 949 times
bforeman
Community Advisory Board Member
Community Advisory Board Member
 
Posts: 975
Joined: Wed Jun 29, 2011 7:13 pm

Sun Mar 20, 2016 8:55 pm Change Time Zone

Hi Bob and Richard,

I have the chance to rerun the test and do a fresh install on the our cluster. I carefully follow the instruction and now the replication mechanism works as expected, as shown in Figure A and Figure B. Thanks a lot.

I compare the new configuration (which works) and the old configuration. I found the major difference comes from channelsPerSlave in Thor. I attached the two configurations in another post due to attachment limitation.

Thanks,
Chin-Jung
Attachments
Screen Shot 2016-03-20 at 3.55.10 PM.png
Figure A: mode=cyclic, numDataCopies=2
Screen Shot 2016-03-20 at 3.55.10 PM.png (119.7 KiB) Viewed 914 times
Screen Shot 2016-03-20 at 3.59.59 PM.png
Figure 5: mode=cyclic, numDataCopies=5
Screen Shot 2016-03-20 at 3.59.59 PM.png (155.02 KiB) Viewed 914 times
chsu6
 
Posts: 8
Joined: Fri Feb 19, 2016 9:58 pm

Sun Mar 20, 2016 9:05 pm Change Time Zone

As mentioned before, here attached the two configuration files. The major difference is the channelsPerSlave parameter in Thor. Can this affect the replication behavior in Roxie?


Here is the diff result (diff new.xml old.xml):

2c2
< <!-- Edited with ConfigMgr on ip xx.xx.xx.xx on 2016-03-20T17:48:03 -->
---
> <!-- Edited with ConfigMgr on ip xx.xx.xx.xx on 2016-02-17T13:36:03 -->
218a219,222
> <AuthenticateFeature description="Access to ESDL configuration service"
> path="ESDLConfigAccess"
> resource="ESDLConfigAccess"
> service="ws_esdlconfig"/>
644a649,653
> <AuthenticateFeature authenticate="Yes"
> description="Access to ESDL configuration service"
> path="ESDLConfigAccess"
> resource="ESDLConfigAccess"
> service="ws_esdlconfig"/>
800a810,813
> <AuthenticateFeature description="Access to ESDL configuration service"
> path="ESDLConfigAccess"
> resource="ESDLConfigAccess"
> service="ws_esdlconfig"/>
989c1002
< slaveConfig="cyclic redundancy"
---
> slaveConfig="cyclic"
1105a1119
> channelsPerSlave="1"
1114a1129
> localThorPortInc="200"
1120a1136
> slaveport="20100"
1130,1131c1146
< <Storage/>
< <SwapNode/>
---
> <SwapNode AutoSwapNode="false"/>
Attachments
6_hpcc_cyclic_5.txt
Configuration: new and works
(47.91 KiB) Downloaded 156 times
hpcc2_cyclic.txt
Configuration: old and did not work
(48.73 KiB) Downloaded 155 times
chsu6
 
Posts: 8
Joined: Fri Feb 19, 2016 9:58 pm

Mon Mar 21, 2016 3:36 am Change Time Zone

A quick update. The parameter channelsPerSlave has nothing to do with the replication mechanism. I find out the following configuration can work. n is an integer greater than 0.

[Cyclic mode]
slaveConfig="cyclic redundancy"
numDataCopies=n

[Full mode]
slaveConfig="full redundancy"
numDataCopies=n

However, the overloaded configuration did not produce replication.

[Overloaded mode]
slaveConfig="overloaded"
channelsPerNode=n
numDataCopies=n
chsu6
 
Posts: 8
Joined: Fri Feb 19, 2016 9:58 pm


Return to Clustering

Who is online

Users browsing this forum: No registered users and 1 guest

cron