How to calculate the columnwise minimum of a dask pivot table?

Developer https://www.devze.com 2022-12-07 22:36 Source: Internet
I would like to create a pivot table in dask and then calculate the column-wise minimum.

import dask.dataframe as dd
from dask.distributed import Client

client = Client()

df = dd.read_csv("data.csv")

# To use pivot_table, the columns used as index and columns need to be categorical:
df = df.categorize(columns=['A', 'B'])

#df['A'] = df['A'].cat.as_ordered()
#df['B'] = df['B'].cat.as_ordered()

pt = df.pivot_table(index='A', columns='B', values='C', aggfunc='mean')

pt.min().compute()

TypeError: Categorical is not ordered for operation min you can use .as_ordered() to change the Categorical to an ordered one

...

# Trying to uncategorize the index, takes forever
pt.index = list(pt.index)
pt.min().compute()

Is there a better way to achieve this?
