Pandas Tutorial Series (8): Sorting Data in pandas


Sorting data in pandas

1. Sorting a Series:

Series.sort_values(ascending=True, inplace=False)

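Parameters:

    1. ascending: True (the default) sorts in ascending order; False sorts in descending order.

    2. inplace: whether to modify the original Series. With the default False, a sorted copy is returned and the original is left unchanged.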
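
A minimal sketch of these two parameters on a small hand-made Series. The values and variable names are invented for illustration and are not taken from the weather data used later:

```python
import pandas as pd

# Hypothetical values, made up for this example
s = pd.Series([30, 10, 20], index=['a', 'b', 'c'])

asc = s.sort_values()                   # ascending by default, returns a new Series
desc = s.sort_values(ascending=False)   # descending, returns a new Series

print(asc.tolist())      # [10, 20, 30]
print(desc.tolist())     # [30, 20, 10]
print(s.tolist())        # [30, 10, 20], the original is untouched

# With inplace=True the Series itself is reordered and the call returns None
result = s.sort_values(inplace=True)
print(result)            # None
print(s.index.tolist())  # ['b', 'c', 'a']
```

Note that the index labels travel with their values, so the original position of each element can still be recovered after sorting.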

2. Sorting a DataFrame

DataFrame.sort_values(by, ascending=True, inplace=False)

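Parameters:

    1. by: a string or a list of strings, to sort by a single column or by several columns.

    2. ascending: a bool or a list of bools, choosing ascending or descending order. A list must correspond one to one with the columns in by.

    3. inplace: whether to modify the original DataFrame.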
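
As a rough illustration of by and inplace, here is a sketch on a tiny DataFrame. The column names level and temp and all values are hypothetical, chosen only to keep the output easy to check by eye:

```python
import pandas as pd

# Hypothetical rows, made up for this example
df = pd.DataFrame({'level': [2, 1, 2, 1], 'temp': [5, 8, 9, 3]})

# by as a string: sort by one column, returning a new DataFrame
by_temp = df.sort_values(by='temp')
print(by_temp['temp'].tolist())   # [3, 5, 8, 9]
print(df['temp'].tolist())        # [5, 8, 9, 3], df itself is unchanged

# by as a list: sort by level first, then by temp within each level
by_both = df.sort_values(by=['level', 'temp'])
print(by_both.values.tolist())    # [[1, 3], [1, 8], [2, 5], [2, 9]]

# inplace=True reorders df itself and returns None
result = df.sort_values(by='temp', ascending=False, inplace=True)
print(result)                     # None
print(df['temp'].tolist())        # [9, 8, 5, 3]
```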

1) Reading the data

import pandas as pd

file_path = "../../datas/files/beijing_tianqi_2018.csv"
df = pd.read_csv(file_path)

# Strip the ℃ suffix from the temperatures and convert to int32 (replaces the column)
df['bWendu'] = df['bWendu'].str.replace('℃', '').astype('int32')
df['yWendu'] = df['yWendu'].str.replace('℃', '').astype('int32')

print(df.head(3))
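
For readers who do not have the CSV file at hand, the cleaning step can be tried on a few inline rows. This is only a sketch: the column names bWendu and yWendu come from the tutorial, but the rows below are invented stand-ins and say nothing about the real file's contents.

```python
import io
import pandas as pd

# Hypothetical sample rows standing in for the real CSV file
csv_text = "bWendu,yWendu\n3℃,-6℃\n2℃,-5℃\n"
sample = pd.read_csv(io.StringIO(csv_text))

# Before cleaning, the temperature columns hold text such as '3℃'
for col in ['bWendu', 'yWendu']:
    # regex=False treats '℃' as a literal string rather than a pattern
    sample[col] = sample[col].str.replace('℃', '', regex=False).astype('int32')

print(sample['bWendu'].dtype)     # int32
print(sample['yWendu'].tolist())  # [-6, -5]
```

Converting to integers matters for sorting: left as text, the temperatures would be ordered as strings rather than by numeric value.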

2) Sorting a Series

import pandas as pd

file_path = "../../datas/files/beijing_tianqi_2018.csv"
df = pd.read_csv(file_path)

df['bWendu'] = df['bWendu'].str.replace('℃', '').astype('int32')
df['yWendu'] = df['yWendu'].str.replace('℃', '').astype('int32')

print('*' * 25, 'First few rows', '*' * 25)
print(df.head())

# -------------------- Sorting a Series --------------------- #
print('*' * 25, 'aqi ascending', '*' * 25)
print(df['aqi'].sort_values())

print('*' * 25, 'aqi descending', '*' * 25)
print(df['aqi'].sort_values(ascending=False))

print('*' * 25, 'tianqi (Chinese text) sorted', '*' * 25)
print(df['tianqi'].sort_values())
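
The last call sorts a text column. String values are compared by Unicode code point, so Chinese text comes out in code-point order rather than in pinyin or stroke order. A small sketch with invented weather descriptions (not rows from the real file):

```python
import pandas as pd

# Hypothetical weather descriptions, made up for this example
s = pd.Series(['晴', '多云', '阴', '小雨'])

sorted_s = s.sort_values()
print(sorted_s.tolist())

# The order matches Python's built-in string comparison, which also uses code points
print(sorted(s.tolist()))
print(sorted_s.tolist() == sorted(s.tolist()))   # True
```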

3) Sorting a DataFrame

Sorting by a single column

import pandas as pd

file_path = "../../datas/files/beijing_tianqi_2018.csv"
df = pd.read_csv(file_path)

df['bWendu'] = df['bWendu'].str.replace('℃', '').astype('int32')
df['yWendu'] = df['yWendu'].str.replace('℃', '').astype('int32')

print('*' * 25, 'First few rows', '*' * 25)
print(df.head())

# ---------------------- Sorting a DataFrame ----------------------- #
# Sort by a single column
print('*' * 25, 'aqi ascending', '*' * 25)
print(df.sort_values(by='aqi'))

print('*' * 25, 'aqi descending', '*' * 25)
print(df.sort_values(by='aqi', ascending=False))

Sorting by multiple columns

import pandas as pd

file_path = "../../datas/files/beijing_tianqi_2018.csv"
df = pd.read_csv(file_path)

df['bWendu'] = df['bWendu'].str.replace('℃', '').astype('int32')
df['yWendu'] = df['yWendu'].str.replace('℃', '').astype('int32')

print('*' * 25, 'First few rows', '*' * 25)
print(df.head())

# ---------------------- Sorting a DataFrame ----------------------- #
# Sort by multiple columns
print('*' * 25, 'By air quality level, then max temperature (ascending by default)', '*' * 25)
print(df.sort_values(by=['aqiLevel', 'bWendu']))

print('*' * 25, 'By air quality level, then max temperature (descending)', '*' * 25)
print(df.sort_values(by=['aqiLevel', 'bWendu'], ascending=False))

print('*' * 25, 'Ascending and descending specified per column', '*' * 25)
print(df.sort_values(by=['aqiLevel', 'bWendu'], ascending=[True, False]))
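
To make the difference between the last two calls concrete, the sketch below uses a tiny DataFrame with the same column names. The values are invented, and aqiLevel is assumed to be numeric here purely for illustration. A single bool applies to every column in by, while a list sets the direction column by column:

```python
import pandas as pd

# Hypothetical rows, made up for this example
df = pd.DataFrame({
    'aqiLevel': [2, 1, 2, 1],
    'bWendu': [5, 8, 9, 3],
})

# A single False is applied to both columns
both_desc = df.sort_values(by=['aqiLevel', 'bWendu'], ascending=False)
explicit = df.sort_values(by=['aqiLevel', 'bWendu'], ascending=[False, False])
print(both_desc.equals(explicit))   # True

# aqiLevel ascending, bWendu descending within each level
mixed = df.sort_values(by=['aqiLevel', 'bWendu'], ascending=[True, False])
print(mixed['bWendu'].tolist())     # [8, 3, 9, 5]
print(mixed.index.tolist())         # [1, 3, 2, 0], original row labels are kept
```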