
I have a pandas Series, s, and split it into its index and its values::

print(s)
A            {B, A}
B     {B,  A   , E}
C          {B,  C}
D            {D, A}
E        {B, E,  C}
dtype: object

f = s.index
p = s.values

f is now a pandas Index; p is a NumPy array. I then strip the whitespace.

I now want to 'cross-check', i.e. see which index letters appear in each row's set::

cross_check = (p[:, None] & [{x} for x in f]).astype(bool)
print(cross_check)

array([[ True,  True, False, False, False],
       [ True,  True, False, False,  True],
       [False,  True,  True, False, False],
       [ True, False, False,  True, False],
       [False,  True,  True, False,  True]], dtype=bool)

This is great, but it fails if the case doesn't match, e.g. if "B" is 'b' in the first row.

How do I perform this logic case-insensitively? Thanks!

npross

1 Answer


You can use a list comprehension to convert the elements of each set to upper case and strip the whitespace:

import numpy as np
import pandas as pd

s = pd.Series([set(['B','A']),
               set(['B', ' a   ', 'E']),
               set(['B','  C']),
               set(['d','A']),
               set(['B','E', '  c'])], index=list('aBCDE'))
print (s)
a           {B, A}
B    {B, E,  a   }
C         {  C, B}
D           {d, A}
E      {  c, B, E}

f = s.index.str.upper().str.strip()
p = np.array([set([x.upper().strip() for x in item]) for item in s.values])
print (p)
[{'B', 'A'} {'B', 'E', 'A'} {'B', 'C'} {'D', 'A'} {'B', 'E', 'C'}]

cross_check = (p[:, None] & [{x} for x in f]).astype(bool)
print (cross_check)

[[ True  True False False False]
 [ True  True False False  True]
 [False  True  True False False]
 [ True False False  True False]
 [False  True  True False  True]]
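
The same normalize-then-compare idea can also be written with plain membership tests instead of set intersection and broadcasting. The following is only a minimal sketch under that assumption: `normalize` and `cross_check_frame` are hypothetical helper names, not part of pandas, and the result is returned as a labelled DataFrame purely for readability.

```python
import pandas as pd

def normalize(label):
    # Hypothetical helper: strip whitespace and upper-case one label.
    return label.strip().upper()

def cross_check_frame(s):
    # Normalize the index labels and every element of every set.
    labels = [normalize(x) for x in s.index]
    rows = [{normalize(v) for v in item} for item in s.values]
    # One boolean per (row set, column label) pair.
    data = [[label in row for label in labels] for row in rows]
    return pd.DataFrame(data, index=labels, columns=labels)

s = pd.Series([{'B', 'A'},
               {'B', ' a   ', 'E'},
               {'B', '  C'},
               {'d', 'A'},
               {'B', 'E', '  c'}], index=list('aBCDE'))

result = cross_check_frame(s)
print(result)
```

Calling `result.values` gives back a plain boolean array if the labels are not needed.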

Zero's solution also works nicely for me:

p = s.apply(lambda x: {v.strip().upper() for v in x})
print (p)
A       {B, A}
B    {B, E, A}
C       {B, C}
D       {D, A}
E    {B, E, C}
dtype: object
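
Note that `apply` returns a Series rather than a NumPy array, so the cross-check needs `.values` before the `[:, None]` indexing. A minimal sketch of the full pipeline, assuming the same sample data as above and the same set-intersection trick from the question:

```python
import numpy as np
import pandas as pd

s = pd.Series([{'B', 'A'},
               {'B', ' a   ', 'E'},
               {'B', '  C'},
               {'d', 'A'},
               {'B', 'E', '  c'}], index=list('aBCDE'))

# Normalize the index labels and the set elements in the same way.
f = s.index.str.strip().str.upper()
p = s.apply(lambda x: {v.strip().upper() for v in x})

# p is a Series, so take the underlying object array before broadcasting.
# Each cell holds the intersection of a row set with a one-letter set;
# a non-empty intersection converts to True.
cross_check = (p.values[:, None] & [{x} for x in f]).astype(bool)
print(cross_check)
```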
jezrael
  • Well, this likely works and *does* address the case-sensitive issue. But it's not really very generalizable and doesn't address the particular instance, given above. – npross Oct 10 '17 at 10:17
  • So, just been testing here. So, technically the above answer, with "v.strip().upper() " works for this example. However, if you change the very first "A" to a lower-case "a" (i.e. modify s.index), then strip().upper() doesn't work. – npross Oct 10 '17 at 10:56
  • I guess I was after a boolean logic operate where 'A' & 'a' evaluate to TRUE (rather than manipulating the strings). – npross Oct 10 '17 at 10:57
  • The solution is to convert the index values to upper case as well and, if necessary, strip them. The solution was modified. – jezrael Oct 10 '17 at 10:59
  • Sure, but the question was asking about case insensitive logic. – npross Oct 10 '17 at 11:01
  • Yes, but I think that is not possible here. You need to modify the strings :( – jezrael Oct 10 '17 at 11:03
  • I think it is the same problem as comparing `strings` case-insensitively: the only solution is to convert both to lower or upper case - [link](https://stackoverflow.com/a/319435/2901002) – jezrael Oct 10 '17 at 11:07
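
To illustrate the point made in these comments: Python sets compare elements by equality and hash, so `'A'` and `'a'` never intersect unless both are normalized first. The sketch below uses `str.casefold()`, which is a swapped-in alternative to the `upper()` used in the answer; `same_letter` is a hypothetical helper name.

```python
# Set intersection is case-sensitive: 'A' and 'a' are different elements.
raw_match = bool({'A'} & {'a'})
print(raw_match)

def same_letter(a, b):
    # Hypothetical helper: compare two labels after stripping whitespace
    # and case-folding both sides.
    return a.strip().casefold() == b.strip().casefold()

print(same_letter('A', 'a'))
print(same_letter(' B ', 'b'))
print(same_letter('A', 'B'))

# Normalizing both sides first makes the intersection non-empty.
folded_match = bool({'A'.casefold()} & {'a'.casefold()})
print(folded_match)
```

For plain ASCII letters `casefold()`, `lower()` and `upper()` all give equivalent comparisons; `casefold()` differs only for some non-ASCII characters.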