

Hadoop/Cloudera and HPCC coexisting

Topics related to the Hadoop Connector or migrating data from Hadoop

Fri Feb 24, 2012 2:48 pm

Does anyone have any experience with installing both HPCC and Cloudera's Hadoop distro on the same systems and running them simultaneously? Any conflicts or gotchas?

Cheers,

Dan
DSC
Community Advisory Board Member
Posts: 568
Joined: Tue Oct 18, 2011 4:45 pm

Fri Feb 24, 2012 6:18 pm

I have seen it done here, but that was POC code I was viewing at the time. Whether you would want to do it in a production system is a question someone else will need to answer.
rtaylor
Community Advisory Board Member
Posts: 1598
Joined: Wed Oct 26, 2011 7:40 pm

Fri Feb 24, 2012 6:24 pm

Let me clarify a bit.

This is purely for testing. We are still evaluating big data solutions, and it makes a great deal of sense to normalize as much of the environment as possible when comparing competing products. It struck me that actually using the same nodes would be ideal. What would then make it *easy* would be the case where HPCC and Hadoop do not conflict with each other; that way, we don't have to explicitly shut down or remove one product to test the other.

So, I'm hoping that it's possible to make them coexist. You're telling me they will, right? No conflicts at runtime with things like TCP/IP ports or other system resources?
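One way to answer the port question on a given node is to compare the ports each product is configured to use and then probe which of them already have a listener. The following is only a sketch: the function names `find_conflicts` and `port_in_use` are made up for illustration, and the port lists are an assumption that you would fill in from your own HPCC and Hadoop configuration files.

```python
import socket


def find_conflicts(ports_a, ports_b):
    """Return the sorted ports that appear in both configured port lists."""
    return sorted(set(ports_a) & set(ports_b))


def port_in_use(port, host="127.0.0.1"):
    """Return True if something already accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return s.connect_ex((host, port)) == 0
```

Running `find_conflicts` on the two configured lists shows static overlaps before anything is started; calling `port_in_use` with one product running shows which of the other product's ports are already taken.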
DSC
Community Advisory Board Member
Posts: 568
Joined: Tue Oct 18, 2011 4:45 pm

Fri Feb 24, 2012 7:14 pm

I'm just saying I've SEEN it -- I was not the guy doing it so I have no idea what chuckholes and pitfalls there may be -- sorry. :(
rtaylor
Community Advisory Board Member
Posts: 1598
Joined: Wed Oct 26, 2011 7:40 pm

Fri Feb 24, 2012 7:32 pm

Dan,

We have an internal test cluster where we run both HPCC and Hadoop, alternately, to benchmark them. You wouldn't need to remove one in order to run the other, but you probably wouldn't want both running at the same time while testing, either.

Keep in mind that, for example, the JVM can hold a substantial amount of memory, even while Hadoop seems not to be doing much (or anything at all).

You also want to verify, in between runs, that you don't have significant paging (and you may also want to flush the filesystem cache, for fairness).
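As a rough sketch of that check, assuming Linux nodes: the kernel exposes cumulative swap-in and swap-out page counts as `pswpin` and `pswpout` in `/proc/vmstat`, so taking a snapshot before and after a run shows how much paging happened during it. Writing `3` to `/proc/sys/vm/drop_caches` (as root) drops the page cache plus dentries and inodes. The helper names below are hypothetical.

```python
import os


def swap_counters(vmstat_text):
    """Extract the pswpin/pswpout counters from /proc/vmstat-style text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters


def read_swap_counters(path="/proc/vmstat"):
    """Read the current swap counters from the kernel (Linux only)."""
    with open(path) as f:
        return swap_counters(f.read())


def pages_swapped(before, after):
    """Pages swapped in and out between two snapshots."""
    return {name: after[name] - before[name] for name in before}


def drop_caches():
    """Flush dirty pages, then drop the filesystem cache. Needs root."""
    os.sync()
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")
```

A run would then be bracketed by `drop_caches()` and `read_swap_counters()` beforehand and a second `read_swap_counters()` afterwards; non-trivial values from `pages_swapped` suggest the benchmark was paging.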

And don't forget to configure your memory allocation settings (for both HPCC and Hadoop) to values that effectively utilize the available hardware.
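A back-of-the-envelope way to pick such values, as a sketch only: since the two products run alternately, each can be sized against the full node, leaving some headroom for the operating system and daemons. The function name, the reserve, and the notion of "slots" (concurrent tasks or worker processes per node) are assumptions for illustration; the actual setting names depend on the product and version.

```python
def per_slot_memory_mb(total_ram_mb, reserved_mb, slots):
    """Split the RAM left after the OS reserve evenly across worker slots."""
    if slots <= 0:
        raise ValueError("slots must be positive")
    usable = total_ram_mb - reserved_mb
    if usable <= 0:
        raise ValueError("reserve leaves no memory for workers")
    # Integer division: round down so the total never exceeds usable RAM.
    return usable // slots
```

For example, a node with 32768 MB of RAM, 4096 MB held back, and 8 slots gives 3584 MB per slot.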

Flavio
flavio
Community Advisory Board Member
Posts: 73
Joined: Wed Apr 27, 2011 8:59 pm

Fri Feb 24, 2012 7:36 pm

That is exactly the information I was looking for. Thanks, guys!

Dan
DSC
Community Advisory Board Member
Posts: 568
Joined: Tue Oct 18, 2011 4:45 pm

