

My Roxie is not starting in cluster environment.

Topics related to recommendations or questions on the design for HPCC Systems clusters

Wed Feb 11, 2015 7:11 pm

Hi,
In a 4-node cluster, the myroxie service does not start, even after restarting the entire HPCC service three times. The cluster configuration is:
number of support nodes: 1
number of roxie nodes: 2
number of thor nodes: 2
number of thor slaves per node: 1

Below are the services running at each node.
X.X.X.221 hpcc-init status :
mydafilesrv ( pid 2451 ) is running...

X.X.X.241 hpcc-init status :
mydafilesrv ( pid 3323 ) is running...
myroxie is stopped

X.X.X.139 hpcc-init status :
mydafilesrv ( pid 3424 ) is running...
mydali ( pid 12375 ) is running...
mydfuserver ( pid 12479 ) is running...
myeclagent ( pid 12590 ) is running...
myeclccserver ( pid 12686 ) is running...
myeclscheduler ( pid 12787 ) is running...
myesp ( pid 12886 ) is running...
mysasha ( pid 12988 ) is running...
mythor ( pid 13226 ) is running...

X.X.X.14 hpcc-init status :
mydafilesrv ( pid 2455 ) is running...
myroxie is stopped

I have attached the environment.xml and log files for reference.

Thanks,
Lakshman
Attachments
thormaster.2015_02_11.log
Thormaster Log
(29.33 KiB) Downloaded 223 times
roxie.log_X.X.X.14.txt
Roxie log at X.X.X.14
(35.55 KiB) Downloaded 223 times
environment.txt
environment.xml file
(44.45 KiB) Downloaded 233 times
lakshmannaresh
 
Posts: 15
Joined: Tue Feb 03, 2015 5:20 am

Wed Feb 11, 2015 7:12 pm

Remaining log files.
Attachments
roxie.2015_02_11.log_X.X.X.241.txt
Roxie log at X.X.X.241
(35.45 KiB) Downloaded 224 times
roxie.2015_02_11.log_X.X.X.14.txt
Roxie log at X.X.X.14
(35.55 KiB) Downloaded 229 times
roxie.log_X.X.X.241.txt
Roxie log at X.X.X.241
(35.45 KiB) Downloaded 228 times
lakshmannaresh
 
Posts: 15
Joined: Tue Feb 03, 2015 5:20 am

Thu Feb 12, 2015 2:16 pm

Roxie is not starting due to the following error:
00000037 2015-02-11 13:35:32.959 2546 2546 "Loading empty package for QuerySet roxie"
00000038 2015-02-11 13:35:32.962 2546 2546 "/proc/sys/net/core/rmem_max value 124928 is less than 131071"
00000039 2015-02-11 13:35:32.963 2546 2568 "AutoReloadThread 0xc13548 starting"
0000003A 2015-02-11 13:35:32.963 2546 2546 "EXCEPTION: (1455): System socket max read buffer is less than 131071"

Please increase the rmem_max value and retry.
sort
 
Posts: 59
Joined: Wed May 11, 2011 1:54 pm

Thu Feb 12, 2015 2:25 pm

Hi,

The relevant lines from the roxie log file are:

00000038 2015-02-11 13:42:58.578 6441 6441 "/proc/sys/net/core/rmem_max value 124928 is less than 131071"
00000039 2015-02-11 13:42:58.578 6441 6441 "EXCEPTION: (1455): System socket max read buffer is less than 131071"

Roxie's socket buffer size is set to 128 KB, and the kernel's maximum needs to be at least that large (ideally larger). Your current rmem_max of 124928 bytes (122 KB) falls short of the 131071 bytes that Roxie checks for.
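As a rough way to repeat Roxie's check by hand, the sketch below compares the kernel limits against the 131071 figure from the log. The helper name meets_minimum is made up for illustration, and it assumes any value of at least 131071 is accepted; the log only shows the rmem_max check, so wmem_max is reported just for comparison.

```shell
#!/bin/sh
# Minimum taken from the Roxie log message quoted above.
ROXIE_MIN=131071

# Hypothetical helper: succeeds when a buffer size (in bytes) is at least
# the minimum. Assumes Roxie accepts any value >= 131071.
meets_minimum() {
    [ "$1" -ge "$ROXIE_MIN" ]
}

# Report on the two kernel limits, if this system exposes them.
for name in rmem_max wmem_max; do
    file="/proc/sys/net/core/$name"
    [ -r "$file" ] || continue
    value=$(cat "$file")
    if meets_minimum "$value"; then
        echo "$name = $value: OK"
    else
        echo "$name = $value: below $ROXIE_MIN"
    fi
done
```

Reading these files does not need root, so this can be run on each roxie node before restarting the service.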

Can you increase your kernel socket buffer sizes via settings in /etc/sysctl.conf? You can see what they are currently set to with:

sudo sysctl -a | grep core | grep mem_max

You need to increase:

net.core.wmem_max
net.core.rmem_max

Both should be greater than 128 KB. I would suggest 256 KB (262144 bytes).
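One possible way to apply the suggested value is sketched below. The helper name buffer_settings is hypothetical; it only prints the two lines, and the commands that actually change the system are left as comments because they need root and should be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print sysctl.conf lines that set both socket
# buffer maximums to the given size in bytes.
buffer_settings() {
    printf 'net.core.rmem_max = %s\n' "$1"
    printf 'net.core.wmem_max = %s\n' "$1"
}

# Preview the lines for the suggested 256 KB value.
buffer_settings 262144

# To make the change persistent (requires root), something like:
#   buffer_settings 262144 | sudo tee -a /etc/sysctl.conf
#   sudo sysctl -p
# To change only the running kernel (lost on reboot):
#   sudo sysctl -w net.core.rmem_max=262144
#   sudo sysctl -w net.core.wmem_max=262144
```

If /etc/sysctl.conf already contains entries for these keys, editing them in place is probably cleaner than appending duplicates.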

I have used much larger values for those two, and for the settings below:

net.core.wmem_default
net.core.rmem_default
net.core.optmem_max
net.ipv4.tcp_mem
net.ipv4.tcp_wmem
net.ipv4.tcp_rmem
net.ipv4.udp_mem

However, your system appears to have little physical memory (only 1 GB, is that right?), so you may not want to increase these additional settings, only the required net.core.[r,w]mem_max mentioned above.

On that note, roxie memory is set to 1 GB -

00000002 2015-02-11 12:24:15.691 3555 3555 "RoxieMemMgr: Setting memory limit to 1073741824 bytes (1024 pages)"

which suggests you really want more physical memory.
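To put the two numbers side by side, a small sketch like the following converts the limit from the log line into MiB and prints the machine's total memory next to it. The helper name bytes_to_mib is invented here, and the /proc/meminfo lookup assumes a Linux node.

```shell
#!/bin/sh
# Memory limit from the RoxieMemMgr log line quoted above, in bytes.
ROXIE_LIMIT_BYTES=1073741824

# Hypothetical helper: convert a byte count to whole MiB.
bytes_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

echo "Roxie memory limit: $(bytes_to_mib "$ROXIE_LIMIT_BYTES") MiB"

# On Linux, MemTotal in /proc/meminfo is reported in kB.
if [ -r /proc/meminfo ]; then
    total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    if [ -n "$total_kb" ]; then
        echo "Physical memory:    $(( total_kb / 1024 )) MiB"
    fi
fi
```

If the two figures come out roughly equal, Roxie alone could claim nearly all of the node's memory, which is the concern raised above.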

thanks,
mark
mkellyhpcc
 
Posts: 15
Joined: Mon Mar 10, 2014 2:51 pm

Mon Feb 16, 2015 4:13 pm

Hi Mark,
After increasing the physical memory and net.core.wmem_max/net.core.rmem_max values, the roxie service started running in the cluster environment. Thanks for the help.

-Lakshman Naresh.
lakshmannaresh
 
Posts: 15
Joined: Tue Feb 03, 2015 5:20 am

