I have an array which represents some time series data:

array([[[-0.59776013],
    [-0.59776013],
    [-0.59776013],
    [-0.31863936],
    [-0.31863936],
    [-0.31863936],
    [-0.31863936],
    [-0.31863936],
    [-0.31863936],
    [ 0.31863936],
    [ 0.31863936],
    [ 0.31863936],
    [-0.31863936],
    [-0.31863936],
    [-0.31863936],
    [-0.31863936],
    [-0.31863936],
    [-0.31863936],
    [ 0.59776013],
    [ 0.59776013],
    [ 0.59776013],
    [ 0.93458929],
    [ 0.93458929],
    [ 0.93458929],
    [-0.31863936],
    [-0.31863936],
    [-0.31863936],
    [-0.31863936],
    [-0.31863936],
    [-0.31863936],
    [-0.06270678],
    [-0.06270678],
    [-0.06270678],
    [-0.06270678],
    [-0.06270678],
    [-0.06270678],
    [-0.31863936],
    [-0.31863936],
    [-0.31863936],
    [-0.31863936],
    [-0.31863936],
    [-0.31863936],
    [ 0.75541503],
    [ 0.75541503],
    [ 0.75541503],
    [ 0.93458929],
    [ 0.93458929],
    [ 0.93458929],
    [-0.31863936],
    [-0.31863936],
    [-0.31863936],
    [ 0.75541503],
    [ 0.75541503],
    [ 0.75541503],
    [-0.31863936],
    [-0.31863936],
    [-0.31863936],
    [ 0.31863936],
    [ 0.31863936],
    [ 0.31863936],
    [ 0.        ],
    [ 0.        ],
    [ 0.        ],
    [ 0.        ],
    [ 0.        ],
    [ 0.        ],
    [ 0.        ],
    [ 0.        ]]])

The unique values in this array are:

np.unique(sax_dataset_inv)
array([-0.59776013, -0.31863936, -0.06270678,  0.        ,  0.31863936,
    0.59776013,  0.75541503,  0.93458929])

My task

Assign either 'F' for fast, 'S' for slow or 'M' for medium to a given array value.

My attempt

I can do it for 2 assignments, 'F' or 'S':

sax_list = ['F' if element < 0 else 'S' for element in list(sax_dataset_inv.flatten())]

However, I cannot work out how to extend the expression above to three different labels.

Desired Output

Take an example array of [-3,-2,-1,0,1,2,3,4,5,6]

The values -3 to -1 inclusive should be assigned 'F'. Values 0 to 3 inclusive should be assigned 'M'. Values greater than 3 should be assigned 'S'.

Mazz
  • Helpful: [ternary operator in Python](https://stackoverflow.com/questions/394809/does-python-have-a-ternary-conditional-operator) – blacksite Feb 20 '19 at 13:19
  • Also, what should happen with values greater than -1 and smaller than 0? –  Feb 20 '19 at 13:21
  • Have you considered initializing your resulting array with 'M' and then checking if the entry is bigger/smaller than your bounds? – DocDriven Feb 20 '19 at 13:23
  • My attempt has been highlighted above. I edited the question to show that I can do the task for two labels: if element<0 assign 'F', else assign 'S'. Does "if element<0 assign 'F', if element in range(0,0.4) assign 'M', else assign 'S'" make sense? – Mazz Feb 20 '19 at 14:07

4 Answers

You can do this:

import numpy as np

arr = np.array([-3, -2, -1, 0, 1, 2, 3, 4, 5, 6])
# dtype=str gives a fixed-width string array ('<U1') filled with ''
new_arr = np.zeros(shape=arr.shape, dtype=str)

new_arr[arr > 3] = 'S'
new_arr[(arr >= -3) & (arr <= -1)] = 'F'
new_arr[(arr >= 0) & (arr <= 3)] = 'M'
new_arr
array(['F', 'F', 'F', 'M', 'M', 'M', 'M', 'S', 'S', 'S'], dtype='<U1')

The values that don't match your condition will remain empty strings.

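You can also use numpy.empty to allocate the array. Note that its contents are uninitialized, so values that match no condition are not guaranteed to be empty strings:

new_arr = np.empty(shape=arr.shape, dtype=str)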
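
If unmatched values should carry an explicit marker rather than an empty string, one option is to pre-fill the result with numpy.full before applying the masks. This is only a sketch: the 'U' placeholder and the extra value -0.5 (which falls in the gap between the intervals) are assumptions added for illustration.

```python
import numpy as np

# -0.5 is a made-up value that matches none of the three intervals
arr = np.array([-3, -2, -1, 0, 1, 2, 3, 4, 5, 6, -0.5])

# Pre-fill with a placeholder label ('U' for "undefined" is a hypothetical choice)
labels = np.full(arr.shape, 'U', dtype='<U1')

# Overwrite the entries that fall into each interval
labels[(arr >= -3) & (arr <= -1)] = 'F'
labels[(arr >= 0) & (arr <= 3)] = 'M'
labels[arr > 3] = 'S'

print(labels.tolist())
# ['F', 'F', 'F', 'M', 'M', 'M', 'M', 'S', 'S', 'S', 'U']
```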
Mohit Motwani

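Use numpy.select for a vectorized solution. Each condition is paired with the label at the same position, and default is used where no condition matches:

new_arr = np.select([arr > 3, (arr >= -3) & (arr <= -1), (arr >= 0) & (arr <= 3)],
                    ['S', 'F', 'M'],
                    default='')
print(new_arr)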

['F' 'F' 'F' 'M' 'M' 'M' 'M' 'S' 'S' 'S']
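
Because numpy.select evaluates its conditions in order and uses the first one that is True for each element, the conditions can sometimes be shortened. The sketch below assumes that every value below 0 should be 'F' and everything above 3 should be 'S', which is a simplification of the intervals in the question rather than an exact restatement.

```python
import numpy as np

arr = np.array([-3, -2, -1, 0, 1, 2, 3, 4, 5, 6])

# Conditions are checked in order; the first matching one decides the label.
# Assumption: all negatives are 'F', so the second test only sees values >= 0.
labels = np.select([arr < 0, arr <= 3], ['F', 'M'], default='S')

print(labels.tolist())
# ['F', 'F', 'F', 'M', 'M', 'M', 'M', 'S', 'S', 'S']
```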

Performance:

arr = np.array([-3,-2,-1,0,1,2,3,4,5,6] * 1000)

my_list = [-3,-2,-1,0,1,2,3,4,5,6] * 1000

In [276]: %timeit my_list_mapping = ['F' if ((i >= -3) & (i <= -1)) else 'M' if ((i >= 0) & (i <= 3)) else 'S' for i in my_list]
1.14 ms ± 67.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [277]: %timeit np.select([arr>3, (arr>=-3) & (arr<=-1), (arr>=0)&(arr<=3)],['S','F','M'],  default='')
172 µs ± 7.35 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
jezrael

Note: The OP's intervals are [-3,-1], [0,3] and (3,..), so I am assuming integer values only. The conditions can be adjusted as needed, but the approach stays the same.

Using list comprehensions for if-elif-else:

my_list = [-3,-2,-1,0,1,2,3,4,5,6]

my_list_mapping = ['F' if (i >= -3 and i <= -1) else 'M' if (i >= 0 and i <= 3) else 'S' for i in my_list]
print(my_list_mapping)
['F', 'F', 'F', 'M', 'M', 'M', 'M', 'S', 'S', 'S']
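
One thing to keep in mind with this pattern is that the final else is a catch-all. The sketch below uses Python's chained comparisons and a few made-up values (-0.5, 3.5, -4) to show that anything outside the first two intervals ends up as 'S', including values in the gap between -1 and 0.

```python
# Hypothetical test values: -0.5 and -4 fall outside both stated intervals
values = [-3, -1, -0.5, 0, 3, 3.5, -4]

# Chained comparisons express the closed intervals [-3, -1] and [0, 3];
# every value that matches neither interval falls through to 'S'
labels = ['F' if -3 <= v <= -1 else 'M' if 0 <= v <= 3 else 'S' for v in values]

print(labels)
# ['F', 'F', 'S', 'M', 'M', 'S', 'S']
```

If such values need separate handling, an additional condition has to be placed before the final else.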
cph_sto

You can neatly achieve your result by using Python's built-in map function. Let's say your mapping function looks something like this:

    def mapping_function(value):
        if value >= -3 and value <= -1:
            return 'F'
        elif value >= 0 and value <= 3:
            return 'M'
        elif value > 3:
            return 'S'
        else:
            return 'U'  # e.g. as undefined

    my_array = [-3, -2, -1, 0, 1, 2, 3, 4, 5, 6]
    mapped_vals = map(mapping_function, my_array)
    # if you want a list
    my_mapped_list = list(mapped_vals)

I think it is a readable and clear solution.

Since you tagged this as a pandas question, also take a look at the pandas.Series.apply function in the pandas docs. It plays the same role as map here: you pass the mapping function as an argument to apply, and it is called on each value of the Series.
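
A minimal sketch of the pandas variant, assuming the data is held in a pandas Series. The mapping function is repeated here so the snippet runs on its own, and the names series and labels are chosen only for illustration.

```python
import pandas as pd

def mapping_function(value):
    """Return 'F', 'M' or 'S' for a value, or 'U' if it matches no interval."""
    if -3 <= value <= -1:
        return 'F'
    elif 0 <= value <= 3:
        return 'M'
    elif value > 3:
        return 'S'
    else:
        return 'U'  # e.g. as undefined

series = pd.Series([-3, -2, -1, 0, 1, 2, 3, 4, 5, 6])

# apply calls mapping_function once per element and returns a new Series
labels = series.apply(mapping_function)

print(labels.tolist())
# ['F', 'F', 'F', 'M', 'M', 'M', 'M', 'S', 'S', 'S']
```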

LemurPwned