I'm quite new to decorators and classes in Python, and I'd like to know whether there is a better way to decorate pandas objects. As an example, I have written the following accessor with two properties, lisa and wil:
import numpy as np
import pandas as pd

test = np.array([['john', 'meg', 2.23, 6.49],
                 ['lisa', 'wil', 9.67, 8.87],
                 ['lisa', 'fay', 3.41, 5.04],
                 ['lisa', 'wil', 0.58, 6.12],
                 ['john', 'wil', 7.31, 1.74]],
                )
test = pd.DataFrame(test)
test.columns = ['name1', 'name2', 'scoreA', 'scoreB']

@pd.api.extensions.register_dataframe_accessor('abc')
class ABCDataFrame:
    def __init__(self, pandas_obj):
        self._obj = pandas_obj

    @property
    def lisa(self):
        return self._obj.loc[self._obj['name1'] == 'lisa']

    @property
    def wil(self):
        return self._obj.loc[self._obj['name2'] == 'wil']
Example output is as follows:
test.abc.lisa.abc.wil
  name1 name2 scoreA scoreB
1  lisa   wil   9.67   8.87
3  lisa   wil   0.58   6.12
I have two questions.
First, in practice I am creating many more than two methods and need to chain several of them on the same line. Is there a way to get test.lisa.wil to return the same output as test.abc.lisa.abc.wil above, so that I don't have to type abc each time?
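To make the goal concrete, here is a minimal sketch of the kind of behavior I am after. It assumes that subclassing pd.DataFrame and overriding the `_constructor` property (the hook pandas documents for subclasses) is an acceptable route. The class name NamesFrame is hypothetical, and I build the frame from a list of rows rather than a NumPy array so that the score columns stay numeric. I am not sure this is the recommended approach.

```python
import pandas as pd


class NamesFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass whose filters can be chained directly."""

    @property
    def _constructor(self):
        # Make pandas operations such as .loc[...] return NamesFrame
        # instead of a plain DataFrame, so the properties below stay available.
        return NamesFrame

    @property
    def lisa(self):
        # Rows where the first name column equals 'lisa'.
        return self.loc[self["name1"] == "lisa"]

    @property
    def wil(self):
        # Rows where the second name column equals 'wil'.
        return self.loc[self["name2"] == "wil"]


test = NamesFrame(
    [
        ["john", "meg", 2.23, 6.49],
        ["lisa", "wil", 9.67, 8.87],
        ["lisa", "fay", 3.41, 5.04],
        ["lisa", "wil", 0.58, 6.12],
        ["john", "wil", 7.31, 1.74],
    ],
    columns=["name1", "name2", "scoreA", "scoreB"],
)

# Chaining works without an accessor name in between.
result = test.lisa.wil
print(result)
```

With this sketch, test.lisa.wil selects the same two rows (index 1 and 3) as the accessor version. The trade-off is that the data has to be wrapped in the subclass first, whereas the registered accessor works on any existing DataFrame.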
Second, if there are any other suggestions/resources on decorating pandas DataFrames, please let me know.