
Jumping off from a previous question I asked a while back:

Why is 1000000000000000 in range(1000000000000001) so fast in Python 3?

If you do this:

1000000000000000.0 in range(1000000000000001)

...it is clear that range has not been optimized to check if floats are within the specified range.
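For reference, the difference is easy to reproduce with timeit (the numbers are only illustrative, and the float test has to use a much smaller range to finish in reasonable time):

import timeit

# int containment: range answers this arithmetically, regardless of size
print(timeit.timeit('1000000000000000 in r',
                    setup='r = range(1000000000000001)', number=1000))

# float containment: falls back to a linear scan, so even a vastly
# smaller range is far slower per test
print(timeit.timeit('10000000.0 in r',
                    setup='r = range(10000001)', number=1))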

I think I understand that the intended purpose of range is to work with ints only, so you cannot, for example, do something like this:

1000000000000 in range(1000000000000001.0)
# error: float object cannot be interpreted as an integer

Or this:

1000000000000 in range(0, 1000000000000001, 1.0) 
# error: float object cannot be interpreted as an integer

However, the decision was made, for whatever reason, to allow things like this:

1.0 in range(1)

It seems clear that 1.0 (and 1000000000000000.0 above) are not being coerced into ints, because then the int optimization would work for those as well.

My question is, why the inconsistency, and why no optimization for floats? Or, alternatively, what is the rationale behind why the above code does not produce the same error as the previous examples?

This seems like an obvious optimization to include in addition to the optimization for ints. I'm guessing there are some nuanced issues preventing a clean implementation of such an optimization, or alternatively there is some kind of rationale as to why you would not actually want to include it. Or possibly both.
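To be concrete, here is roughly what I imagine such an optimization would need to check. This is just a hypothetical sketch of my own (the name float_in_range is made up, and none of this is taken from CPython):

def float_in_range(x, r):
    # Hypothetical O(1) membership test for a float x against a range r.
    if not x.is_integer():    # e.g. 3.2 can never equal any int in r
        return False
    n = int(x)                # exact, since x has no fractional part
    if r.step > 0:
        in_bounds = r.start <= n < r.stop
    else:
        in_bounds = r.stop < n <= r.start
    # n must also land exactly on the step grid, e.g. 4.0 not in range(0, 10, 3)
    return in_bounds and (n - r.start) % r.step == 0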

EDIT: To clarify the issue here a bit, all of the following statements also evaluate to False:

3.2 in range(5)
'' in range(1)
[] in range(1)
None in range(1)

This seems like unexpected behavior to me, but so far there is definitely no inconsistency. However, the following evaluates to True:

1.0 in range(2)

And as shown previously, constructions similar to the above have not been optimized.

This does seem inconsistent: at some point in the evaluation, the value 1.0 (or 1000000000000000.0 as in my original example) is being coerced into an int. This makes sense, since it is a natural thing to convert a float ending in .0 to an int. However, the question still remains: if it is being converted to an int anyway, why has 1000000000000.0 in range(1000000000001) not been optimized?

Rick supports Monica
  • In your edit, you seem to be assuming that `x in range(n)` is false for *any* float `x`. That's not true: try `1.0 in range(2)`, for example. – Mark Dickinson May 11 '16 at 19:05
  • @MarkDickinson Good point: now we are back to what I believe is an inconsistency, and the question as to WHY `1000000000000.0 in range(1000000000001)` has not been optimized (something that seems very obvious to do) still remains. Unfortunately I've already accepted an answer. – Rick supports Monica May 11 '16 at 20:08
  • @MarkDickinson I have re-edited the question. Thanks for pointing this out. – Rick supports Monica May 11 '16 at 20:33

1 Answer

There is no inconsistency here. Floating point values can't be coerced to integers; that only works the other way around. As such, range() won't implicitly convert floats to integers when testing for containment either.

A range() object is a sequence type; it contains discrete integer values (albeit virtually). As such, it has to support containment testing for any object that may test as equal. The following works too:

>>> class ThreeReally:
...     def __eq__(self, other):
...         return other == 3
...
>>> ThreeReally() in range(4)
True

This has to do a full scan over all possible values in the range to test for equality with each contained integer.

However, only when using actual integers can the optimisation be applied, as that's the only type where the range() object can know what values will be considered equal without conversion.
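In rough Python terms, the containment check behaves something like the sketch below. This is only an illustrative re-rendering of the idea (the real logic is written in C in CPython's rangeobject.c, and the helper name here is invented):

def range_contains(r, value):
    # Only exact ints (and bools) take the O(1) arithmetic path.
    if type(value) is int or type(value) is bool:
        if r.step > 0:
            in_bounds = r.start <= value < r.stop
        else:
            in_bounds = r.stop < value <= r.start
        return in_bounds and (value - r.start) % r.step == 0
    # Everything else -- floats, Decimals, objects with a custom __eq__ --
    # falls back to comparing the value against each integer in the sequence.
    return any(value == i for i in r)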

Martijn Pieters
  • That makes sense. But there could still be an optimization that checks whether a `float` is within the range of two `int`s by converting the `int` boundaries to `float`s. This seems like a very obvious thing to do. Why not do it? – Rick supports Monica May 11 '16 at 18:48
  • @RickTeachey: because a `range()` consists of a series of *discrete* integers; non-integer decimal values in-between are not part of the range. – Martijn Pieters May 11 '16 at 18:50
  • If that's the case then `1.0 in range(1)` should produce an error. It doesn't. Inconsistency. – Rick supports Monica May 11 '16 at 18:50
  • @RickTeachey: no, the documentation for sequences states that `in` tests if there is a value in the object that is equal. `1.0 == 1` is equal, so `1.0 in range(10)` is true because there is an equal value in that sequence. Python just doesn't *optimise* it because that'd require an implicit cast, and it is simply not a use case for range objects. – Martijn Pieters May 11 '16 at 18:52
  • @RickTeachey: a `range()` object is a *sequence type*, and the integers it generates don't need to be contiguous, as there is a step size too. That the step defaults to 1 doesn't make the range any less a sequence of discrete integers. Where would 3.2 fit in `range(1, 5, 2)`? – Martijn Pieters May 11 '16 at 19:04
  • @RickTeachey I deleted my comment because it wasn't really relevant to the question of why the *behavior* of `1.0 in range(whatever)` is different from `1 in range(whatever)`, even if the return value is the same. – chepner May 11 '16 at 19:06
  • @chepner Note [MarkDickinson's comment](http://stackoverflow.com/questions/37170878/why-no-optimization-of-python-3-range-object-for-floats/37170943#comment61877285_37170878) on the main question above: `1.0 in range(2)` DOES return `True`. So that means that the `float` is being converted to an `int` in that case, and assumed to be a discrete value. However, there is no optimization in this case, as shown by my original example code. There is still something to be explained here. – Rick supports Monica May 11 '16 at 20:11
  • @MartijnPieters See above comment; I have unmarked the answer as accepted since I now think there is inconsistency going on here after all, either in the treatment of a float ending in `.0` as an `int`, or in the lack of optimization for the same. – Rick supports Monica May 11 '16 at 20:12
  • @RickTeachey: I **already** address that in my answer, with the `ThreeReally` example. `1.0 == 1` is true only because `float.__eq__` exists, which does the conversion. The `range()` optimisation won't make assumptions about anything but actual integers, and does a full scan for all other types, including `float`. – Martijn Pieters May 11 '16 at 20:40
  • I see, so `range` isn't doing the conversion to `int` directly; it's just a result of the way the `float` comparison operator is implemented. Makes sense. Thanks for your patience. – Rick supports Monica May 11 '16 at 23:06