How to process lines in a file on a specific Hadoop slave?

Developer https://www.devze.com 2023-02-01 14:54 Source: Internet
We have a custom input format extending FileInputFormat that generates a separate split for each line of the input file. The file provides the host name on which the mapper handling each line should run.

How do I achieve this?

This is needed because the mapper reads data from a DB, and I want to run the mapper on the same machine as the DB server.


This is not possible without writing your own implementation within the Hadoop code base.

If you are trying to add more data to the map input, pass it in as an argument to the job; it is then available in your map(), where you can concatenate it with the input.
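
On the placement question itself: a split can report preferred hosts through `InputSplit.getLocations()`, but the scheduler treats those hosts as a locality hint and may still run the task elsewhere, so this does not guarantee placement. The sketch below is plain Java with no Hadoop dependency; it only shows how a per-line host name could be extracted and shaped into the array such a method returns. The `HostTaggedLine` class and the `<host><TAB><payload>` line layout are assumptions for illustration, since the question does not say how the host name is encoded.

```java
/**
 * Hypothetical helper (not part of Hadoop): one input line of the assumed
 * form "<host><TAB><payload>", where <host> is the machine the mapper
 * would ideally run on.
 */
final class HostTaggedLine {
    private final String host;
    private final String payload;

    private HostTaggedLine(String host, String payload) {
        this.host = host;
        this.payload = payload;
    }

    /** Splits the line at the first tab; the layout is an assumption. */
    static HostTaggedLine parse(String line) {
        int tab = line.indexOf('\t');
        if (tab <= 0) {
            // No tab at all, or an empty host name before the tab.
            throw new IllegalArgumentException(
                "expected <host><TAB><payload>, got: " + line);
        }
        return new HostTaggedLine(line.substring(0, tab).trim(),
                                  line.substring(tab + 1));
    }

    String host() {
        return host;
    }

    String payload() {
        return payload;
    }

    /**
     * The value a custom split could return from getLocations().
     * It is a preference only; the scheduler is free to ignore it.
     */
    String[] locations() {
        return new String[] { host };
    }
}
```

A custom split built per line could hold one `HostTaggedLine` and return `locations()` from its own `getLocations()`; if the task lands on another node anyway, the mapper still has `host()` available to connect to the DB remotely.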
