I've changed some code that used a list to using a deque. I can no longer slice into it, as I get the error:
TypeError: sequence index must be integer, not 'slice'
Here's a REPL that shows the problem.
>>> import collections >>> d = collections.deque() >>> for i in range(3): ... d.append(i) ... >>> d deque([0, 1, 2]) >>> d[2:] Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: sequence index must be integer, not 'slice'
So, is there a workaround to support slicing into deques in Python?
deque_slice = collections.deque(itertools.islice(my_deque, 10, 20))
Indexing into a
deque requires following a linked list from the beginning each time, so the
islice() approach, skipping items to get to the start of the slice, will give the best possible performance (better than coding it as an index operation for each element).
You could easily write a
deque subclass that does this automagically for you.
class sliceable_deque(collections.deque): def __getitem__(self, index): if isinstance(index, slice): return type(self)(itertools.islice(self, index.start, index.stop, index.step)) return collections.deque.__getitem__(self, index)
Note that you can't use negative indices or step values with
islice. It's possible to code around this, and might be worthwhile to do so if you take the subclass approach. For negative start or stop you can just add the length of the deque; for negative step, you'll need to throw a
reversed() in there somewhere. I'll leave that as an exercise. :-)
The performance of retrieving individual items from the
deque will be slightly reduced by the
if test for the slice. If this is an issue, you can use an EAFP pattern to ameliorate this somewhat -- at the cost of making the slice path slightly less performant due to the need to process the exception:
class sliceable_deque(collections.deque): def __getitem__(self, index): try: return collections.deque.__getitem__(self, index) except TypeError: return type(self)(itertools.islice(self, index.start, index.stop, index.step))
Of course there's an extra function call in there still, compared to a regular
deque, so if you really care about performance, you really want to add a separate
slice() method or the like.
__delitem__for slices too. Quick question, why do you return
except:branch (which suggests you want cooperative inheritance), but you reference
collections.dequeexplicitly in the
try:block instead of using
super(which suggests you don't want cooperative inheritance). Am I missing something - wim 2018-04-24 04:49
If performance is a concern, consider a direct access/comprehension method as suggested in this answer. It's much faster than
islice on large collections:
import timeit setup = """ import collections, itertools d = collections.deque(range(10000)) """ print timeit.timeit('list(itertools.islice(d, 9000, 9010))', setup, number=10000) ## 0.631947040558 print timeit.timeit('[d[i] for i in range(9000, 9010)]', setup, number=10000) ## 0.0292208194733
As per @RaymondHettinger comment below, the comprehension method is only better when slices are short. On longer slices,
islice convincingly wins. For example, here are timings for slicing a 10,000 items deque from the offset 6000:
offset length islice compr 6000 10 400.496 46.611 6000 50 424.600 183.988 6000 90 432.277 237.894 6000 130 441.289 352.383 6000 170 431.299 404.596 6000 210 456.405 546.503 6000 250 448.895 575.995 6000 290 485.802 778.294 6000 330 483.704 781.703 6000 370 490.904 948.501 6000 410 500.011 875.807 6000 450 508.213 1045.299 6000 490 518.894 1010.203 6000 530 530.887 1192.784 6000 570 534.415 1151.013 6000 610 530.887 1504.779 6000 650 539.279 1486.802 6000 690 536.084 1650.810 6000 730 549.698 1454.687 6000 770 564.909 1576.114 6000 810 545.001 1588.297 6000 850 564.504 1711.607 6000 890 584.197 1760.793 6000 930 564.480 1963.091 6000 970 586.390 1955.199 6000 1010 590.706 2117.491
The comprehension does first few slices very fast, but the performance falls down dramatically as the length grows.
islice is slower on smaller slices, but its average speed is much better.
This is how I tested:
import timeit size = 10000 repeats = 100 setup = """ import collections, itertools d = collections.deque(range(%d)) """ % size print '%5s\t%5s\t%10s\t%10s' % ('offset', 'length', 'islice', 'compr') for offset in range(0, size - 2000, 2000): for length in range(10, 2000, 40): t1 = timeit.timeit('list(itertools.islice(d, %d, %d))' % (offset, offset + length), setup, number=repeats) t2 = timeit.timeit('[d[i] for i in range(%d, %d)]' % (offset, offset + length), setup, number=repeats) print '%5d\t%5d\t%10.3f\t%10.3f' % (offset, length, t1 * 100000, t2 * 100000)
__getitem__method as well - aaronasterling 2012-04-04 00:30
i > len(d)//2. However, when you need a longer slice, then indexing one-at-a-time is a bad idea. For example, you would get a much different answer if you tried this timing for the slice
9000:10000in a deque of 30000 elements - Raymond Hettinger 2012-04-04 08:00
dequeuses a stride technique, that's nice to know - kindall 2012-04-04 22:13
dequefor performance (as a circular buffer actually) and so copying data to a list each cycle isn't ideal - Drew Noakes 2012-04-04 00:02