
How to deal with unbalanced input to reduce tasks?

Developer https://www.devze.com 2023-04-03 10:47 Source: Internet

Recently I was asked how to deal with unbalanced input to reduce tasks. I thought about it for a while and tried redistributing the data, but didn't come up with a good solution. Any advice?


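There are two options:

  1. Increase the number of reduce tasks, so the data may spread more evenly across them. This does not help when a single key carries most of the records, because all values for one key still go to the same reducer.
  2. Write a custom partitioner that distributes the keys more evenly over the reduce tasks. [1]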

[1] http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapreduce/Partitioner.html
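To illustrate the second option, here is a minimal sketch in plain Java with no Hadoop dependency, so it can be run on its own. It compares the hash rule used by Hadoop's default HashPartitioner with a hypothetical skew-aware rule that spreads one known hot key across all partitions. The class name `SkewAwarePartitionSketch`, the `hotKey` and `recordSeq` parameters, and the sample data are assumptions made for this example, not something from the original answer. In a real job the same logic would presumably live in a subclass of the Partitioner class linked in [1], inside `getPartition(key, value, numPartitions)`, registered with `job.setPartitionerClass(...)`; the number of reduce tasks from the first option is set with `job.setNumReduceTasks(...)`.

```java
import java.util.Arrays;

public class SkewAwarePartitionSketch {

    // Same formula as Hadoop's default HashPartitioner:
    // clear the sign bit, then take the modulus.
    static int hashPartition(String key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Hypothetical skew-aware rule: records of one known hot key are
    // spread round-robin by their sequence number; all other keys
    // fall back to plain hashing.
    static int skewAwarePartition(String key, long recordSeq,
                                  String hotKey, int numPartitions) {
        if (key.equals(hotKey)) {
            return (int) (recordSeq % numPartitions);
        }
        return hashPartition(key, numPartitions);
    }

    // Counts how many records each partition (reduce task) would receive.
    static int[] countRecords(String[] keys, int numPartitions,
                              boolean skewAware, String hotKey) {
        int[] counts = new int[numPartitions];
        for (int i = 0; i < keys.length; i++) {
            int p = skewAware
                    ? skewAwarePartition(keys[i], i, hotKey, numPartitions)
                    : hashPartition(keys[i], numPartitions);
            counts[p]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample: 90% of the records share the key "hot".
        String[] keys = new String[1000];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = (i % 10 == 0) ? "k" + i : "hot";
        }
        System.out.println("hash only:  "
                + Arrays.toString(countRecords(keys, 4, false, "hot")));
        System.out.println("skew-aware: "
                + Arrays.toString(countRecords(keys, 4, true, "hot")));
    }
}
```

Note the trade-off: once a key is split across partitions, no single reducer sees all of its values, so this only works when partial results can be merged afterwards (for example sums or counts), typically in a second pass.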
