
I'm sorry for the poor description; the root of the problem is that I'm not sure of the terminology.

This is not a GIS problem, but an example of a general problem: in Python 2.7, how do I "freeze" a variable that takes too much time to calculate, like I might do with frozen_list_copy = original_list[:]?

Here's some code, simplified to focus on my problem. maketransformer() takes too long to run, and it outputs transformer, a GDAL (GIS stuff) coordinate system transformation object that works the same for every loop iteration. I only need to make the transformer object once and then repeatedly call a method from it. Every time the loop references the transformer variable assignment, it runs maketransformer() to make a new one.

def maketransformer(file): #contains arbitrary stuff that takes too long

    data = gdal.Open(file)
    output_wkt = data.GetProjection()
    srs_out = osr.SpatialReference()
    srs_out.ImportFromWkt(output_wkt)

    transformer = osr.CoordinateTransformation(srs_out)
    return transformer

transformer = maketransformer('data/w001001.adf')

for tuple in shapefilepoints:
    getelevation(transformer.Transform(tuple))

How might I run maketransformer() just once, and then "freeze" the object it produces, "disconnecting" it from its assignment so I can just use that object without running through the whole 0.1 second creation process 15,000 times?

What I'm trying to do is similar to:

>>> x = ['one', 'two', 'three']
>>> y = x
>>> z = x[:]     # I want to do this, but with a non-list object
>>> x[2] = 'purple'
>>> x
['one', 'two', 'purple']
>>> y
['one', 'two', 'purple']
>>> z     # So that the following happens when I use z
['one', 'two', 'three']

In short, what is the word that I google to describe this? Many thanks, and apologies for the uselessly specific beginner question.

Edit: I found a helpful tool along the way in discovering what the holdup was:

>>> import profile
>>> profile.run("maketransformer('data/w001001.adf')")

Profile broke down the function into its parts, returning the number and time of calls for everything it did, revealing that it's running in under 0.001 seconds. The problem lies elsewhere. Thanks, everyone, for teaching me all this helpful stuff.

  • what version of python are you using? 3.2 and up has `@functools.lru_cache` which can cache the results of running a function for you. – roippi Jan 04 '14 at 14:34
  • I'm using 2.7, but that's exactly what I'm looking for! Thank you. Edited question – Cegan Dodge Jan 04 '14 at 14:45
  • @CeganDodge Since you mention 2.7 you'd need to pick a memoization library: See [this post](http://stackoverflow.com/questions/1988804/what-is-memoization-and-how-can-i-use-it-in-python) for a few bits – Jon Clements Jan 04 '14 at 14:50
  • @JonClements good info, thank you – Cegan Dodge Jan 04 '14 at 15:07
  • The question is actually quite pointless now, as the question itself contains the answer and is based on some misunderstanding of the code. – Michael Jan 04 '14 at 15:09
  • @Michael I agree, is deleting the question the proper SO etiquette in this instance? – Cegan Dodge Jan 04 '14 at 15:23
  • When in doubt, leave it. Someone could be in a similar spot, search for "cache" or even "freeze object", stumble on your thread, find the word "memoization" in the comments, and learn something. Many SO threads are the top google hit for a certain phrase for that very reason. – roippi Jan 04 '14 at 16:42
  • Good point @roippi, that happens to me all the time. Cheers! – Cegan Dodge Jan 04 '14 at 17:20
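For anyone landing here from a search: the memoization idea mentioned in the comments can be sketched by hand on Python 2.7, where `functools.lru_cache` is not available. This is only a minimal illustration, not a replacement for a tested library; `memoize`, `make_expensive_object` and `calls` are hypothetical names, and the decorated function is a stand-in rather than the real maketransformer().

```python
import functools


def memoize(func):
    """Cache results of func, keyed by its (hashable) positional arguments."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        # Only call the wrapped function the first time these arguments are seen.
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return wrapper


calls = []  # records each real invocation, for demonstration only


@memoize
def make_expensive_object(path):
    # Hypothetical stand-in for a slow constructor such as maketransformer().
    calls.append(path)
    return {"source": path}


first = make_expensive_object("data/w001001.adf")
second = make_expensive_object("data/w001001.adf")  # served from the cache
```

Both calls return the very same object, and the function body runs only once. The cache here never evicts anything, which is fine for a handful of inputs but is a design choice to revisit if the set of arguments is large.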

2 Answers


The Python pickle module will 'reconstitute' an object from its data footprint (either from disk or a memory string) without having to go through the initialization code.

import pickle

...
transstr = pickle.dumps(maketransformer(...))  # serialize the object once

for ....
    # pickle.loads rebuilds a fresh copy from the serialized string
    getelevation(pickle.loads(transstr).Transform(tuple))

...

Something to that effect should, I believe, do what you want.

norlesh
  • Perfect. I had the idea that pickle was for some other purpose, I'll dig deeper into it. Thank you! – Cegan Dodge Jan 04 '14 at 14:47
  • @CeganDodge: `pickle` is not the answer. Serializing and deserializing a data structure to make a copy isn't very efficient. – user2357112 supports Monica Jan 04 '14 at 14:54
  • @user2357112 I agree it is not efficient, and fixing the code at a higher level is probably a better solution - I was just proposing it as a solution to the stated question. – norlesh Jan 04 '14 at 15:03
  • @norlesh: I used pickle in a test before learning that I misinterpreted my problem, it works perfectly for what I thought the problem was, and will actually come in handy later for this project. Spot on! Thanks anyway. – Cegan Dodge Jan 04 '14 at 15:06
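To make the efficiency point in the comments concrete: if the goal really is an independent copy, like `x[:]` in the question but for an arbitrary object, the standard library's `copy.deepcopy` does it directly, without a serialize/deserialize round trip. A small sketch using a plain dict as a hypothetical stand-in (whether a given extension object such as a GDAL transformer supports either approach is a separate matter):

```python
import copy
import pickle

# Hypothetical stand-in object: a dict holding a mutable list.
original = {"names": ["one", "two", "three"]}

alias = original                     # same object under a second name
snapshot = copy.deepcopy(original)   # independent copy, no serialization
via_pickle = pickle.loads(pickle.dumps(original))  # copy by round trip

original["names"][2] = "purple"      # mutate the original afterwards

print(alias["names"])       # follows the change
print(snapshot["names"])    # unaffected
print(via_pickle["names"])  # unaffected
```

Both copies keep the old contents, while the alias reflects the mutation.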
0
def maketransformer(file): #contains arbitrary stuff that takes too long

    data = gdal.Open(file)
    output_wkt = data.GetProjection()
    srs_out = osr.SpatialReference()
    srs_out.ImportFromWkt(output_wkt)

    transformer = osr.CoordinateTransformation(srs_out)
    return transformer

transformer = maketransformer('data/w001001.adf')

for tuple in shapefilepoints:
    getelevation(transformer.Transform(tuple))

As you've written it, this code does not rerun maketransformer every time. maketransformer is only run once, and then the same object is used for every Transform call. This is probably all you need.
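A quick way to convince yourself of this is to count the calls. In the sketch below, `make_thing` and `call_count` are hypothetical stand-ins, not part of the original code:

```python
call_count = [0]  # a list used as a mutable counter (works on Python 2 and 3)


def make_thing():
    # Hypothetical stand-in for maketransformer().
    call_count[0] += 1
    return object()


thing = make_thing()  # the function body runs here, exactly once

seen_ids = set()
for _ in range(15000):
    # Referencing the name just looks up the already-built object;
    # it does not call make_thing() again.
    seen_ids.add(id(thing))

print(call_count[0])
print(len(seen_ids))
```

The counter stays at 1 and only one object identity is ever seen, because assignment binds the name to the returned object rather than to the call that produced it.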

user2357112 supports Monica
  • Thank you so much! Well, I shouldn't have trusted my perception. Now to find out which part of this whole thing is actually taking so long. – Cegan Dodge Jan 04 '14 at 14:59
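One low-tech way to narrow down where the time goes is to time the one-off setup and the loop separately. The sketch below assumes nothing about GDAL; `build` and `per_point` are hypothetical stand-ins for maketransformer() and for the work done per point.

```python
from timeit import default_timer


def timed(func, *args):
    """Return (result, seconds elapsed) for a single call of func."""
    start = default_timer()
    result = func(*args)
    return result, default_timer() - start


def build():
    # Hypothetical stand-in for the one-off setup (e.g. maketransformer()).
    return sum(range(1000))


def per_point(x):
    # Hypothetical stand-in for the work done on each loop iteration.
    return x * 2


_, build_seconds = timed(build)

loop_start = default_timer()
results = [per_point(x) for x in range(15000)]
loop_seconds = default_timer() - loop_start

print("build: %.6f s, loop: %.6f s" % (build_seconds, loop_seconds))
```

If the loop dominates, the next step would be to time the individual calls inside it, or to run the whole script under the profile module mentioned in the question's edit.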