Consider the following code:
import numpy as np

class MyClass(object):
    def __init__(self):
        self.data_a = np.array(range(100))
        self.data_b = np.array(range(100,200))
        self.data_c = np.array(range(200,300))

    def _method_i_do_not_have_access_to(self, data, window, func):
        output = np.empty(np.size(data))
        for i in xrange(0, len(data)-window+1):
            output[i] = func(data[i:i+window])
        output[-window+1:] = np.nan
        return output

    def apply_a(self):
        a = self.data_a
        def _my_func(val):
            return sum(val)
        return self._method_i_do_not_have_access_to(a, 5, _my_func)

my_class = MyClass()
print my_class.apply_a()
The _method_i_do_not_have_access_to method takes a numpy array, a window parameter, and a user-defined function handle. It returns an array containing the values the function handle produces when applied to window data points of the input array at a time: a generic rolling method. I cannot change this method.
As you can see, _method_i_do_not_have_access_to passes a single input to the function handle: a slice of the data array passed to _method_i_do_not_have_access_to. The function handle therefore computes its output from only window data points of that one array.
What I need to do is allow _my_func (the function handle passed to _method_i_do_not_have_access_to) to operate on data_b and data_c, at the same window indexes, in addition to the array that _method_i_do_not_have_access_to passes to it. data_b and data_c are attributes of MyClass, set in __init__.
The only way I have thought of doing this is to reference data_b and data_c within _my_func, like this:
def _my_func(val):
    b = self.data_b
    c = self.data_c
    # do some calculations
    return sum(val)
However, I need to slice b and c at the same indexes as val (remember, val is the length-window slice of the array that is passed through _method_i_do_not_have_access_to).
For example, if the loop within _method_i_do_not_have_access_to is currently operating on indexes 45 -> 50 of the input array, _my_func has to operate on the same indexes of b and c.
The final result would be something like this:
def _my_func(val):
    b = self.data_b # somehow identify which slice we are at
    c = self.data_c # somehow identify which slice we are at
    # if _method_i_do_not_have_access_to is currently
    # operating on indexes 45->50, then the sum of
    # val, b, and c should be the sum of the values at
    # index 45->50 at each
    return sum(val) * sum(b) + sum(c)
Any thoughts on how I might accomplish this?
The question is how _my_func would know which indices to operate on. If you know the indices in advance when calling your function, the simplest approach is a lambda: lambda val: self._my_func(self.a, self.b, index, val), with _my_func changed to accept the additional parameters.
Since you don't know the indices, you'll have to write a wrapper around self.c that remembers which index was last accessed (or, better yet, intercepts the slice operation) and stores it in a variable for your function to use.
Edit: I knocked up a small example. It is not especially great coding style, but it should give you the idea:
class Foo():
    def __init__(self, data1, data2):
        self.data1 = data1
        self.data2 = data2
        self.key = 0

    def getData(self):
        return Foo.Wrapper(self, self.data2)

    def getKey(self):
        return self.key

    class Wrapper():
        def __init__(self, outer, data):
            self.outer = outer
            self.data = data

        def __getitem__(self, key):
            # remember the key (index or slice) that was last requested
            self.outer.key = key
            return self.data[key]

if __name__ == '__main__':
    data1 = [10, 20, 30, 40]
    data2 = [100, 200, 300, 400]
    foo = Foo(data1, data2)
    wrapped_data2 = foo.getData()
    print(wrapped_data2[2:4])
    print(data1[foo.getKey()])
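To connect the wrapper idea back to the question, here is a minimal Python 3 sketch. It assumes a stand-in function, rolling_apply, that copies the body of _method_i_do_not_have_access_to (with range instead of xrange), and a hypothetical SliceRecorder class that is not part of the original answer. Because the rolling method also calls len() and np.size() on its input, the wrapper exposes __len__ and a size attribute in addition to __getitem__.

```python
import numpy as np


def rolling_apply(data, window, func):
    """Stand-in for _method_i_do_not_have_access_to (Python 3 port)."""
    output = np.empty(np.size(data))
    for i in range(0, len(data) - window + 1):
        output[i] = func(data[i:i + window])
    output[-window + 1:] = np.nan
    return output


class SliceRecorder:
    """Hypothetical wrapper that remembers the last key used to index it."""

    def __init__(self, data):
        self.data = data
        self.size = data.size  # np.size() reads this attribute if present
        self.last_key = None

    def __len__(self):
        return len(self.data)

    def __getitem__(self, key):
        self.last_key = key  # record the slice before delegating
        return self.data[key]


data_a = np.arange(100)
data_b = np.arange(100, 200)
data_c = np.arange(200, 300)

recorder = SliceRecorder(data_a)


def my_func(val):
    # Reuse the slice that was just applied to data_a on the other arrays.
    key = recorder.last_key
    return val.sum() * data_b[key].sum() + data_c[key].sum()


result = rolling_apply(recorder, 5, my_func)
print(result[:3])
```

The function relies on shared state (recorder.last_key), so it only works while the rolling method calls func immediately after each slice, as the code in the question does.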
You can pass a two-dimensional array to _method_i_do_not_have_access_to(). len() and the slice operation will work with it:
In [29]: a = np.arange(100)
In [30]: b = np.arange(100,200)
In [31]: c = np.arange(200,300)
In [32]: data = np.c_[a,b,c] # stack the three one-dimensional arrays as columns of one two-dimensional array.
In [35]: data[0:10] # slice operation works.
Out[35]:
array([[  0, 100, 200],
       [  1, 101, 201],
       [  2, 102, 202],
       [  3, 103, 203],
       [  4, 104, 204],
       [  5, 105, 205],
       [  6, 106, 206],
       [  7, 107, 207],
       [  8, 108, 208],
       [  9, 109, 209]])
In [36]: len(data) # len() works.
Out[36]: 100
In [37]: data.shape
Out[37]: (100, 3)
So you can define your _my_func as follows:
def _my_func(val):
    s = np.sum(val, axis=0)
    return s[0]*s[1] + s[2]
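A runnable Python 3 sketch of the two-dimensional approach follows. The rolling_apply function is an assumed stand-in that copies the body of _method_i_do_not_have_access_to from the question. One caveat shows up when running it: the rolling method sizes its output with np.size(data), which counts every element of the 2-D array, so the returned array has 300 slots rather than 100. Only the first len(data) - window + 1 entries hold computed values, and the sketch trims to those.

```python
import numpy as np


def rolling_apply(data, window, func):
    """Stand-in for _method_i_do_not_have_access_to (Python 3 port)."""
    output = np.empty(np.size(data))
    for i in range(0, len(data) - window + 1):
        output[i] = func(data[i:i + window])
    output[-window + 1:] = np.nan
    return output


a = np.arange(100)
b = np.arange(100, 200)
c = np.arange(200, 300)
data = np.c_[a, b, c]  # shape (100, 3): one column per input array


def my_func(val):
    # val has shape (window, 3); sum each column separately.
    s = np.sum(val, axis=0)
    return s[0] * s[1] + s[2]


window = 5
raw = rolling_apply(data, window, my_func)

# np.size(data) is 300 here, so keep only the entries the loop filled in.
valid = raw[:len(data) - window + 1]
print(raw.shape, valid.shape)
print(valid[:3])
```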
Since it appears that _method_i_do_not.. is simply applying your function to your data, could you have the data be precisely an array of indices? Then func would use the indices for windowed access to data_a, data_b, and data_c. There might be faster ways, but I think this would work with a minimum of added complexity.
So in other words, something roughly like this, with additional processing on window added if necessary:
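def apply_a(self):
    a = self.data_a
    b = self.data_b
    c = self.data_c
    # the "data" handed to the rolling method is just the positions 0..len(a)-1
    window_indices = np.arange(len(a))
    def _my_func(window):
        # window is a length-5 slice of window_indices
        return sum(a[window]) * sum(b[window]) + sum(c[window])
    return self._method_i_do_not_have_access_to(window_indices, 5, _my_func)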
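For reference, a self-contained Python 3 sketch of the index-array idea. The rolling_apply function is an assumed stand-in that copies the body of _method_i_do_not_have_access_to from the question, and the index array is assumed to be simply np.arange over the data length. Since the index array is one-dimensional and has the same length as the data, the output keeps the expected 100 entries with the NaN tail in the usual place.

```python
import numpy as np


def rolling_apply(data, window, func):
    """Stand-in for _method_i_do_not_have_access_to (Python 3 port)."""
    output = np.empty(np.size(data))
    for i in range(0, len(data) - window + 1):
        output[i] = func(data[i:i + window])
    output[-window + 1:] = np.nan
    return output


data_a = np.arange(100)
data_b = np.arange(100, 200)
data_c = np.arange(200, 300)

# The array handed to the rolling method holds positions, not values.
indices = np.arange(len(data_a))


def my_func(idx):
    # idx is a window of positions; use it to index all three arrays.
    return data_a[idx].sum() * data_b[idx].sum() + data_c[idx].sum()


result = rolling_apply(indices, 5, my_func)
print(result[:3])
```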
Here's a hack:
Make a new class DataProxy that has a __getitem__ method, and proxies the three data arrays (which you can pass to it e.g. on initialisation). Make func act on DataProxy instances instead of standard numpy arrays, and pass the modified func and the proxy in to the inaccessible method.
Does that make sense? The idea is that there's no constraint on data to be an array, just to be subscriptable. So you can make a custom subscriptable class to use instead of an array.
Example:
class DataProxy:
    def __init__(self, *data):
        self.data = list(zip(*data))

    def __getitem__(self, item):
        return self.data[item]
Then create a new DataProxy, passing in as many arrays as you want when you do so, and make func accept the results of indexing said instance. Try it!
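Here is one way the proxy could be wired up end to end, as a Python 3 sketch rather than the answer's exact class. It is a variation: instead of zipping rows, __getitem__ returns a tuple holding the same slice of each array. The rolling method in the question also calls len() and np.size() on its input, so this version adds __len__ and a size attribute, which are assumptions beyond the original answer. rolling_apply is an assumed stand-in that copies the body of _method_i_do_not_have_access_to.

```python
import numpy as np


def rolling_apply(data, window, func):
    """Stand-in for _method_i_do_not_have_access_to (Python 3 port)."""
    output = np.empty(np.size(data))
    for i in range(0, len(data) - window + 1):
        output[i] = func(data[i:i + window])
    output[-window + 1:] = np.nan
    return output


class DataProxy:
    """Subscriptable object that slices several arrays in lockstep."""

    def __init__(self, *arrays):
        self.arrays = [np.asarray(arr) for arr in arrays]
        self.size = len(self.arrays[0])  # np.size() reads this attribute

    def __len__(self):
        return self.size

    def __getitem__(self, item):
        # Apply the same index or slice to every wrapped array.
        return tuple(arr[item] for arr in self.arrays)


data_a = np.arange(100)
data_b = np.arange(100, 200)
data_c = np.arange(200, 300)


def my_func(windows):
    a_win, b_win, c_win = windows
    return a_win.sum() * b_win.sum() + c_win.sum()


proxy = DataProxy(data_a, data_b, data_c)
result = rolling_apply(proxy, 5, my_func)
print(result[:3])
```

Unlike the slice-recording wrapper, this keeps no shared state: each call to func receives everything it needs as its argument.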