
Hive: Expression Not In Group By Key

Developer https://www.devze.com 2023-02-28 11:47 Source: Internet

I created a table in Hive. It has the following columns:

id bigint, rank bigint, date string

I want to get avg(rank) per month. I can use this command. It works.

select a.lens_id, avg(a.rank)
from lensrank_archive a
group by a.lens_id, year(a.date_saved), month(a.date_saved);

However, I also want to get date information. I use this command:

select a.lens_id, avg(a.rank), a.date_saved
from lensrank_archive a
group by a.lens_id, year(a.date_saved), month(a.date_saved);

It complains: Expression Not In Group By Key


The full error message should be in the format Expression Not In Group By Key [value].
The [value] will tell you what expression needs to be in the Group By.

Just looking at the two queries, I'd say that you need to add a.date_saved explicitly to the Group By.
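
Keep in mind that adding a.date_saved to the Group By gives one row per distinct date rather than one per month. If the month itself is enough as date information, another option is to select the same expressions you are already grouping on. A rough sketch, assuming date_saved is a string Hive can parse as a date (for example yyyy-MM-dd); the aliases yr, mon and avg_rank are made up for illustration:

```sql
-- year(...) and month(...) appear in the GROUP BY,
-- so Hive accepts them in the SELECT list as well.
select a.lens_id,
       year(a.date_saved)  as yr,
       month(a.date_saved) as mon,
       avg(a.rank)         as avg_rank
from lensrank_archive a
group by a.lens_id, year(a.date_saved), month(a.date_saved);
```

This keeps the per-month grouping from the working query and only exposes the grouping keys in the output.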


A workaround is to put the additional field in a collect_set and return the first element of the set. Note that collect_set does not guarantee any ordering, so which date comes back is arbitrary. For example:

select a.lens_id, avg(a.rank), collect_set(a.date_saved)[0]
from lensrank_archive a
group by a.lens_id, year(a.date_saved), month(a.date_saved);


This is because each group can contain more than one date_saved value, so Hive cannot tell which one to return. You can collect these date_saved values into an array and output that instead.
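
If one representative date per group is enough, an ordinary aggregate such as min or max should also satisfy the Group By rule without building an array. A sketch only, assuming the date_saved strings sort chronologically (for example yyyy-MM-dd); first_saved and last_saved are illustrative aliases, not columns from the question:

```sql
-- Every selected column is either a grouping key or wrapped in an aggregate,
-- so the "Expression Not In Group By Key" check passes.
select a.lens_id,
       avg(a.rank)       as avg_rank,
       min(a.date_saved) as first_saved,  -- earliest date in the month
       max(a.date_saved) as last_saved    -- latest date in the month
from lensrank_archive a
group by a.lens_id, year(a.date_saved), month(a.date_saved);
```

Unlike collect_set(...)[0], the result here is deterministic, as long as the string format really does sort in date order.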
