23[us] spent in [py2] to process ( x in range( 10E+0000 ) )
4[us] spent in [py2] to process ( x in range( 10E+0001 ) )
3[us] spent in [py2] to process ( x in range( 10E+0002 ) )
37[us] spent in [py2] to process ( x in range( 10E+0003 ) )
404[us] spent in [py2] to process ( x in range( 10E+0004 ) )
4433[us] spent in [py2] to process ( x in range( 10E+0005 ) )
45972[us] spent in [py2] to process ( x in range( 10E+0006 ) )
490026[us] spent in [py2] to process ( x in range( 10E+0007 ) )
2735056[us] spent in [py2] to process ( x in range( 10E+0008 ) )
MemoryError
The x in range( a ) syntax is not only slow in the [TIME]-domain ( at best O( log N ), if implemented smarter than a pure sequential search through the enumerated domain of list-ed values ), but in py2 the native range() always adds a composite O( N ) cost on top: the [TIME]-domain cost of building the list, plus the [SPACE]-domain cost of allocating the storage and pushing all that data through it, just to construct such a range-based in-memory representation.
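The py2 behaviour can be approximated on a Python 3 interpreter, using the stdlib timeit instead of the zmq Stopwatch: materialise the full list first, then scan it, exactly as py2's range() does. A minimal sketch ( the helper name py2_like_contains is illustrative only, not part of any library ):

```python
from timeit import timeit

# Approximate py2 ( x in range( a ) ): build the full list, then
# do a linear membership scan -- O( N ) build + O( N ) search.
def py2_like_contains(x, a):
    return x in list(range(a))

a = 10**6
x = a - 2

t_list  = timeit(lambda: py2_like_contains(x, a), number=3)
t_bound = timeit(lambda: 0 <= x < a,              number=3)

print(t_list > 100 * t_bound)   # → True, the gap spans orders of magnitude
```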
Let's benchmark a safe, O( 1 )-scaling approach ( + always do benchmark ):
>>> from zmq import Stopwatch
>>> aClk = Stopwatch()
>>> a = 123456789; x = 123456; aClk.start(); _ = ( 0 <= x < a );aClk.stop()
4L
>>> a = 123456789; x = 123456; aClk.start(); _ = ( 0 <= x < a );aClk.stop()
3L
It takes some 3 ~ 4 [us] to evaluate the condition-based formulation, which has O( 1 ) scaling, invariant to the magnitude of x.
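The invariance holds even for arbitrarily large Python ints, since the condition is nothing but two integer comparisons and no container is ever built. A quick sketch ( the googol-sized bound is an arbitrary illustration ):

```python
# The condition-based test is two integer comparisons -- no container
# gets built, so even an astronomically large bound costs about the same.
a = 10**100          # a googol-sized upper bound
x = 10**99

print(0 <= x < a)    # → True
```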
Next, test the very same using the x in range( a ) formulation:
>>> a = 123456789; x = 123456; aClk.start(); _ = ( x in range( a ) );aClk.stop()
and your machine will almost freeze in memory-throughput-bound CPU starvation ( not to mention the nasty swap spillovers, where costs jump several orders of magnitude, from some ~ 100 [ns] for RAM access up to some ~ 15.000.000 [ns] for swap-disk IO data-flows ).
No, no, no. This is never the way to test whether x lies inside a bounded range.
Ideas to create some other, class-based evaluator that still approaches the problem via an enumeration ( a set ) will never be able to meet the benchmarked 3 ~ 4 [us] ( short of some extraterrestrial wizardry beyond my understanding of the cause-effect laws of classical and quantum physics ).
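Why any enumeration-based evaluator loses can be sketched in a few lines ( Python 3 + stdlib timeit, assuming a one-shot test; a set pays O( 1 ) only per lookup, after an up-front O( N ) build in both [TIME] and [SPACE] ):

```python
from timeit import timeit

a = 10**6
x = a - 2

# Building the enumeration dominates: O( N ) time plus O( N ) memory ...
t_build = timeit(lambda: set(range(a)), number=1)

# ... while the arithmetic test needs neither a build nor any storage.
t_bound = timeit(lambda: 0 <= x < a, number=1)

print(t_build > t_bound)   # → True, the build alone dwarfs the whole test
```

A pre-built set could amortise its build over many lookups, but for a bounded-range test the arithmetic check already wins at lookup number one.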
Python 3 has changed the way the range()-constructor works, but this was not the core merit of the original post:
3 [us] spent in [py3] to process ( x in range( 10E+0000 ) )
2 [us] spent in [py3] to process ( x in range( 10E+0001 ) )
1 [us] spent in [py3] to process ( x in range( 10E+0002 ) )
2 [us] spent in [py3] to process ( x in range( 10E+0003 ) )
1 [us] spent in [py3] to process ( x in range( 10E+0004 ) )
1 [us] spent in [py3] to process ( x in range( 10E+0005 ) )
1 [us] spent in [py3] to process ( x in range( 10E+0006 ) )
1 [us] spent in [py3] to process ( x in range( 10E+0007 ) )
1 [us] spent in [py3] to process ( x in range( 10E+0008 ) )
1 [us] spent in [py3] to process ( x in range( 10E+0009 ) )
2 [us] spent in [py3] to process ( x in range( 10E+0010 ) )
1 [us] spent in [py3] to process ( x in range( 10E+0011 ) )
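The py3 flat timings come from range.__contains__, which answers exact-int membership arithmetically instead of iterating. A small sketch of that fast path ( stdlib timeit used in place of the zmq Stopwatch; note the fast path applies to exact ints only, other types silently fall back to a sequential scan ):

```python
from timeit import timeit

# py3 range() answers integer membership arithmetically
# ( via range.__contains__ ), so even a huge bound stays O( 1 ):
a = 10**15
t_int = timeit(lambda: (a - 2) in range(a), number=1000)

print((a - 2) in range(a))   # → True
print(t_int < 1.0)           # → True, 1000 tests finish far under a second
```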
In Python 2, neither range() nor xrange() escapes the trap of O( N ) scaling, though the xrange()-generator operates only about 2x faster:
>>> from zmq import Stopwatch
>>> aClk = Stopwatch()
>>> for expo in xrange( 8 ):
... a = int( 10**expo); x = a-2; aClk.start(); _ = ( x in range( a ) );aClk.stop()
...
3L
8L
5L
40L
337L
3787L
40466L
401572L
>>> for expo in xrange( 8 ):
... a = int( 10**expo); x = a-2; aClk.start(); _ = ( x in xrange( a ) );aClk.stop()
...
3L
10L
7L
77L
271L
2772L
28338L
280464L
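The py2 xrange() trap has a close py3 analogue: wrap the very same range in a plain iterator and membership degrades back to a sequential O( N ) walk, because the arithmetic fast path belongs to the range object, not to the iteration protocol. A sketch ( Python 3, stdlib timeit ):

```python
from timeit import timeit

a = 10**6
x = a - 2

# ( x in range( a ) ) takes the arithmetic O( 1 ) fast path in py3 ...
t_range = timeit(lambda: x in range(a), number=10)

# ... but an iterator over the same values forces the generic sequence
# protocol -- a full O( N ) walk, much like py2 xrange() membership.
t_iter = timeit(lambda: x in iter(range(a)), number=10)

print(t_iter > t_range)   # → True, by several orders of magnitude
```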
The range-bounds syntax enjoys O( 1 ) constant time of ~ < 1 [us], as demonstrated above, so the yardstick to compare against was set:
>>> for expo in xrange( 8 ):
... a = int( 10**expo); x = a-2; aClk.start(); _ = ( 0 <= x < a );aClk.stop()
...
2L
0L
1L
0L
0L
1L
0L
1L
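If the zmq Stopwatch is not at hand, the same yardstick sweep can be reproduced with the stdlib timeit alone ( a sketch, not the original measurement; timings land flat across all magnitudes, confirming the O( 1 ) claim ):

```python
from timeit import timeit

# Sweep the same exponents as above, timing only the bounds test:
timings = []
for expo in range(8):
    a = 10**expo
    x = a - 2
    timings.append(timeit(lambda: 0 <= x < a, number=10000))

# All magnitudes land in the same ballpark -- invariant to a:
print(max(timings) < 100 * min(timings))
```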