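Pandas has historically supported three kinds of multi-axis indexing: .loc (label-based), .iloc (integer-position-based), and .ix (mixed; deprecated in 0.20 and removed in pandas 1.0).
.loc is primarily label-based, but it also accepts a boolean array. When .loc cannot find the requested items, it raises a KeyError.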
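The two .loc behaviours described above can be sketched as follows. The small frame and its labels (a to d, X to Z) are made up for illustration and do not come from the original examples.

```python
import numpy as np
import pandas as pd

# Illustrative frame: 4 rows labelled a..d, 3 columns labelled X..Z
df = pd.DataFrame(
    np.arange(12).reshape(4, 3),
    index=["a", "b", "c", "d"],
    columns=["X", "Y", "Z"],
)

# Boolean array: keep only the rows where column X is greater than 3
mask = df["X"] > 3
selected = df.loc[mask, ["X", "Z"]]
print(selected)

# A label that is not in the index raises KeyError
try:
    df.loc["missing"]
    missing_raised = False
except KeyError:
    missing_raised = True
print(missing_raised)
```

Catching KeyError explicitly is only one option; checking `label in df.index` before the lookup is another.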
The most basic indexing:
import numpy as np
import pandas as pd

dates = pd.date_range('1/1/2000', periods=8)
df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
print(df)

# Select column A as a Series, then look up a single value by its date label
s = df['A']
print(s[dates[5]])
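Indexing with .loc (selection by label)
import numpy as np
import pandas as pd

dates = pd.date_range('1/1/2000', periods=8)
df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
print(df)
print()

# All rows, columns A and C only
print(df.loc[:, ['A', 'C']])
# Rows from 2000-01-01 through 2000-01-04 (both endpoints included), all columns
print(df.loc['20000101':'20000104'])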
Indexing with .iloc (selection by position)
import numpy as np
import pandas as pd

dates = pd.date_range('20130101', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))

# A second frame with mixed column dtypes (not used in the selections below)
df2 = pd.DataFrame({
    'A': 1.,
    'B': pd.Timestamp('20130102'),
    'C': pd.Series(1, index=list(range(4)), dtype='float32'),
    'D': np.array([3] * 4, dtype='int32'),
    'E': pd.Categorical(["test", "train", "test", "train"]),
    'F': 'foo',
})

# The row at integer position 3, returned as a Series
print(df.iloc[3])
# Rows at positions 1, 2, 4 and columns at positions 0, 2
print(df.iloc[[1, 2, 4], [0, 2]])
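One practical difference between the two indexers is how slices treat their endpoint: a .loc label slice includes the end label, while an .iloc position slice excludes the end position, as ordinary Python slicing does. A minimal sketch, using a made-up Series that is not part of the original examples:

```python
import pandas as pd

# Illustrative Series with string labels
s = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])

by_position = s.iloc[0:2]   # positions 0 and 1; the end position is excluded
by_label = s.loc["a":"c"]   # labels a through c; the end label is included

print(by_position.tolist())
print(by_label.tolist())
```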
Filtering with the .isin() method
import numpy as np
import pandas as pd

dates = pd.date_range('20130101', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))

# Copy the frame and add a column of string labels to filter on
df2 = df.copy()
df2['E'] = ['one', 'one', 'two', 'three', 'four', 'three']
print(df2)

# Keep only the rows whose E value is 'two' or 'four'
print(df2[df2['E'].isin(['two', 'four'])])