

min/max skews in Graphs view

Topics specific to the use of the ECL IDE

Fri Apr 13, 2012 4:18 am

I noticed that when Thor runs workunits against relatively small datasets (100s of MB), there are sometimes relatively large min/max skews displayed in the IDE graph. In particular, I'm seeing a -100% min skew for a node that I know has relevant data on it. I can't find much in the way of documentation on what those skews mean. Are the skews similar to DISTRIBUTE, measuring the relative difference in data processed on each node for that particular step? And when I see a -100% for a given node, is that because Dali realized that it would be faster to just use the mirrored copies for some nodes so that fewer nodes had to be queried overall?
jeremy
 
Posts: 27
Joined: Fri Mar 09, 2012 3:16 pm

Fri Apr 13, 2012 4:27 am

Interestingly enough, the graph view available via ECL Watch for the same completed workunit shows different (and much saner) skews, so maybe the issue is with the IDE's representation?
jeremy
 
Posts: 27
Joined: Fri Mar 09, 2012 3:16 pm

Fri Apr 13, 2012 8:55 am

Both the IDE and ECL Watch use the same data to visualize the graph, so both should show the same values. The only explanation I can think of for the difference is that one was showing the completed graph and the other a running graph. If you open the completed WU in the IDE and look at the graph, does it now match the ECL Watch one?

Gordon.
gsmith
 
Posts: 290
Joined: Thu May 12, 2011 9:40 am

Fri Apr 13, 2012 2:05 pm

So I checked that last night and the results were still different... I just repeated the same steps:
1. Run a workunit in the IDE, wait for completion.
2. Refresh the graph layout in the IDE.
3. Compare to the graph layout on ECL Watch.
Here are the two snapshots... both say "completed".

ECL IDE:
ecl-ide.png


ECL Watch:
ecl-watch.png
jeremy
 
Posts: 27
Joined: Fri Mar 09, 2012 3:16 pm

Fri Apr 13, 2012 2:10 pm

To further test:
1. In the IDE, I closed the local workunit and then opened it again from the workunit browser... same issue.
2. I shut down the IDE, then started it up again and loaded the workunit... and now the results match ECL Watch...

Perhaps some type of cache issue?
jeremy
 
Posts: 27
Joined: Fri Mar 09, 2012 3:16 pm

Mon Apr 16, 2012 9:33 am

Sounds like an IDE issue. I suspect the IDE sees the “completed” message before getting the last refresh of graph data. I will open an issue on GitHub.

Gordon.
gsmith
 
Posts: 290
Joined: Thu May 12, 2011 9:40 am

Sat Apr 28, 2012 4:04 am

First, let me say that "If you're not viewing your graphs, you're not developing in ECL." Part of maximizing the information provided in the graphs is understanding the skew metrics.

I've never actually seen it documented anywhere, but the max/min skew values are a percentage difference (+ for max, - for min) between the perfect distribution count of records and the actual count of records for the most extreme node in each category.

perfect distribution count per node = (total records / # of thor nodes)

maxskew = ((highest record count for a single node - (total records / # of thor nodes)) / (total records / # of thor nodes)) * 100, displayed as a positive %

minskew = (1 - (lowest record count for a single node / (total records / # of thor nodes))) * 100, displayed as a negative %

So, if there are 800,000 records in a recordset being processed by a 10 way thor, the perfect distribution would have 80,000 records on each node. If the maximum number of records on a single node is 320,000 and minimum is 20,000, the maxskew is +300% and the minskew is -75%.
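
As a minimal sketch of the two formulas above (Python, not ECL, and not how Thor itself computes them), the worked example can be checked like this. The function name `skew` and the even split of the remaining eight nodes are assumptions made only for illustration:

```python
def skew(counts):
    """Return (maxskew, minskew) in percent for a list of per-node record counts."""
    total = sum(counts)
    if total == 0:
        return 0.0, 0.0
    # Perfect distribution: every node holds the same number of records.
    perfect = total / len(counts)
    max_skew = (max(counts) - perfect) / perfect * 100
    # Negated so it matches the "-" sign shown in the graph.
    min_skew = -(1 - min(counts) / perfect) * 100
    return max_skew, min_skew


# 10-way thor, 800,000 records: one node holds 320,000, one holds 20,000,
# and the other eight hold 57,500 each (hypothetical split).
counts = [320_000, 20_000] + [57_500] * 8
max_skew, min_skew = skew(counts)
print(f"{max_skew:+.0f}% {min_skew:+.0f}%")  # prints "+300% -75%"
```

Only the largest and smallest per-node counts matter for the two numbers; how the rest of the data is spread does not change them.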

The maximum maxskew is determined by the number of nodes. A 10-way thor has a maximum maxskew of +900% when all the data is on a single node (10x the expected count, so 9x more than expected). A 100-way thor has a maximum maxskew of +9900%, a 400-way thor +39900%, etc.

The maximum (that is, most extreme) minskew is -100%.
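
A quick way to convince yourself of those limits is to apply the same formulas to the worst case, where one node holds everything. This is only a sketch in Python; `worst_case_skew` is a made-up helper name, not anything in the platform:

```python
def worst_case_skew(nodes, total_records=1_000_000):
    """Skew pair (max, min) in percent when a single node holds every record."""
    perfect = total_records / nodes
    max_skew = (total_records - perfect) / perfect * 100
    # The emptiest node holds 0 records, so min skew bottoms out at -100%.
    min_skew = -(1 - 0 / perfect) * 100
    return max_skew, min_skew


for nodes in (10, 100, 400):
    max_skew, min_skew = worst_case_skew(nodes)
    # The max skew works out to (nodes - 1) * 100 regardless of record count.
    print(nodes, round(max_skew), round(min_skew))
```

The total record count cancels out, which is why the ceiling depends only on the size of the cluster.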

Other notes on skew:

When watching the graph for a running workunit, the skew numbers will update as the data is processed through activities in the subgraphs (within the graph within the workunit). Something like a COUNT PROJECT that executes as a globally sequential activity will show a -100% minskew until it begins work on the last thor node. Some activities that redistribute the data, like a global SORT, cannot report skew while executing because the skew is not known for an individual node until the activity completes.

If an activity has values of +0% and -0%, the skew values are not displayed.

As you have seen, when you are watching a graph run, the refresh logic sometimes misses the last update to the graph, so the final values may display incorrectly.
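
To illustrate the note above about globally sequential activities, here is a rough simulation in Python. It assumes a 4-node thor with 100 records per node, processed one node at a time in two half-steps, and it assumes the displayed minskew is computed from the records processed so far using the formula given earlier. None of that is taken from the platform source; it is only meant to show why -100% appears mid-run:

```python
def min_skew(counts):
    """Min skew in percent (negative) for per-node processed-record counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    perfect = total / len(counts)
    return -(1 - min(counts) / perfect) * 100


NODES = 4          # hypothetical cluster size
PER_NODE = 100     # hypothetical records held by each node

processed = [0] * NODES
history = []
for node in range(NODES):
    # Each node is processed in two half-steps, strictly after the previous node.
    for done in (PER_NODE // 2, PER_NODE):
        processed[node] = done
        history.append(min_skew(processed))

for step, value in enumerate(history, start=1):
    print(f"step {step}: minskew {value:.1f}%")
```

As long as at least one node has not started, its count is zero and the minskew stays pinned at -100%. It only moves once the last node begins work, and it reaches zero when every node has finished.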
joecella
 
Posts: 5
Joined: Fri Jun 17, 2011 4:28 am

Tue May 01, 2012 2:22 pm

So... skew numbers closer to zero are better during a running graph?

What kinds of things should we look at -- or better yet, do -- if we see outrageous skew numbers?
DSC
Community Advisory Board Member
 
Posts: 568
Joined: Tue Oct 18, 2011 4:45 pm

Wed May 02, 2012 2:31 pm

joecella wrote: "I've never actually seen it documented anywhere, but the max/min skew values are a percentage difference (+ for max, - for min) between the perfect distribution count of records and the actual count of records for the most extreme node in each category."
This is one of the things we talk about in our Intro classes when we go over what the graphs represent and how they can be used. My usual example is a 3-node Thor, 300 records in a dataset, and skew percentages of (+200%, -100%) -- how many records are on each node? And the answer is: two of the nodes have no records and the other has all 300.
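
The classroom example above can also be worked backwards from the skew percentages. A small Python sketch, where `extreme_counts` is a hypothetical helper that simply inverts the max/min skew formulas from earlier in the thread:

```python
def extreme_counts(total_records, nodes, max_skew_pct, min_skew_pct):
    """Recover the highest and lowest per-node record counts from the skews."""
    perfect = total_records / nodes
    highest = perfect * (1 + max_skew_pct / 100)
    lowest = perfect * (1 + min_skew_pct / 100)
    return round(highest), round(lowest)


# 3-node thor, 300 records, skews of (+200%, -100%).
highest, lowest = extreme_counts(300, 3, 200, -100)
# With only three nodes, the remaining node is whatever is left over.
remaining = 300 - highest - lowest
print(highest, lowest, remaining)  # prints "300 0 0"
```

On larger clusters the skews only pin down the two extreme nodes, so the counts on the other nodes cannot be recovered this way.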
rtaylor
Community Advisory Board Member
 
Posts: 1601
Joined: Wed Oct 26, 2011 7:40 pm

