

Reducing Skew


Thu Dec 20, 2012 10:30 pm

Hello,

I'm facing an issue where my data file is extremely skewed (+2900%, -100%) across the slaves.
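For readers unfamiliar with skew figures, here is a minimal Python sketch of how numbers like these can arise. It assumes skew is reported as the percentage deviation of the largest and smallest per-node row counts from the average; the function name `skew_percent` and the 30-node layout are hypothetical and are not taken from the poster's cluster.

```python
def skew_percent(rows_per_node):
    """Return (max_skew, min_skew) as percentage deviations from the mean rows per node."""
    avg = sum(rows_per_node) / len(rows_per_node)
    max_skew = (max(rows_per_node) - avg) / avg * 100
    min_skew = (min(rows_per_node) - avg) / avg * 100
    return max_skew, min_skew


# Hypothetical layout: 30 nodes, with every row landing on a single node.
counts = [3000] + [0] * 29
print(skew_percent(counts))
```

Under that assumption, -100% means at least one node holds no rows at all, and +2900% means the fullest node holds 30 times the average. Other layouts can produce the same pair of figures, so this is only one possible reading.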

This is in spite of doing a hash32 distribute on two fields (one of which admittedly has lots of 0s, while the other is a mostly unique 38-digit integer).

Is there anything that I can do to reduce this skew? I could:
1. Exclude the field with 0s from the hash key
2. Use some other hashing function

I'll be happy to provide whatever other information you need.
sban
 

Fri Dec 21, 2012 6:54 pm

sban,

I would think your #1 (excluding the field with 0s from the hash key) would be your best and simplest option.
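Before redistributing the full file, it can help to compare candidate keys on a sample. The Python sketch below is only an illustration: `zlib.crc32` is a stand-in and is not the platform's hash32, the field names `flag` and `big_id` and the generated data are hypothetical, and it will not reproduce the skew reported above. It only shows how per-node row counts could be measured for each candidate key.

```python
import random
import zlib


def node_counts(records, key_fields, nodes):
    """Count rows per node when each row goes to hash(key) modulo the node count."""
    counts = [0] * nodes
    for rec in records:
        key = "|".join(str(rec[f]) for f in key_fields).encode()
        counts[zlib.crc32(key) % nodes] += 1  # crc32 is a stand-in hash (assumption)
    return counts


def max_skew(counts):
    """Percentage by which the fullest node exceeds the average rows per node."""
    avg = sum(counts) / len(counts)
    return (max(counts) - avg) / avg * 100


# Hypothetical sample: 'flag' is mostly 0, 'big_id' is a near-unique 38-digit integer.
rng = random.Random(0)
records = [
    {
        "flag": 0 if rng.random() < 0.9 else rng.randrange(1, 1000),
        "big_id": rng.randrange(10**37, 10**38),
    }
    for _ in range(20000)
]

# Print the maximum skew for each candidate distribution key.
for fields in (("flag",), ("flag", "big_id"), ("big_id",)):
    counts = node_counts(records, fields, nodes=10)
    print(fields, round(max_skew(counts)))
```

If the near-unique field alone gives an even spread on a sample of the real data, that supports dropping the other field from the key.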

HTH,

Richard
rtaylor
Community Advisory Board Member
 

