I am writing a class to represent some data from a spacecraft that can provide many different data products, which can be used to compute other variables. Loading the raw data takes a few seconds per variable so each should only be loaded when needed, as not all variables will always be needed.

This is a minimal working example of what I mean:

class Spacecraft:
    def __init__(self, time_range, configuration='default'):
        self.time_range = time_range

        # Initialise all possible variables to None
        self.magnetometer = None
        self.temperature = None
        self.complicated_thing = None
        # There are 30+ of these

    def get_magnetometer(self):
        self.magnetometer = self.load_data('magnetometer')
        return self.magnetometer

    def get_temperature(self):
        self.temperature = self.load_data('temperature')
        return self.temperature

    def get_complicated_thing(self):
        # Check each variable has a value
        if self.temperature is None:
            self.get_temperature()
        if self.magnetometer is None:
            self.get_magnetometer()
        
        self.complicated_thing = self.temperature * self.magnetometer
        return self.complicated_thing


    # Plus more get_ functions for each instance variable

    def load_data(self, product_name):
        """Loads raw 'product_name' data as np array.
        """
        pass


# Use case 1: Only need limited number of parameters
only_need_magnetometer = Spacecraft('10:30-10:45')
plot(only_need_magnetometer.get_magnetometer())


# Use case 2: Need a different set of parameters
complicated_analysis = Spacecraft('10:45-11:00')
result = analysis(complicated_analysis.get_complicated_thing())

I'm not sure about initialising every variable to None in __init__(). I also don't like the lengthy checks at the beginning of each function that calculates a derived variable.

Is there a better/more pythonic way?

Ideally I would like to skip the initialisation to None & be able to define get_complicated_thing() as:

def get_complicated_thing(self):
    self.complicated_thing = self.temperature * self.magnetometer
    return self.complicated_thing

Where, if self.temperature or self.magnetometer don't yet exist, a getter is automatically called, e.g. get_magnetometer(). The use cases could then be simplified to:

# Use case 1: Only need limited number of parameters
only_need_magnetometer = Spacecraft('10:30-10:45')
plot(only_need_magnetometer.magnetometer)


# Use case 2: Need a different set of parameters
complicated_analysis = Spacecraft('10:45-11:00')
result = analysis(complicated_analysis.complicated_thing)

Which, to me at least, is much easier to read.

James

1 Answer

There is indeed a construct for this: property. Used as a decorator, it lets you define a getter, a setter and a deleter for a managed attribute.

In your case, you only need getters:

class MyClass:
    def __init__(self, *args, **kwargs):
        # do stuff
        pass

    @property
    def magnetometer(self):
        # do heavy calculation/loading here; 1.0 is only a placeholder value
        result = 1.0
        return result

    @property
    def complicated_thing(self):
        # accessing self.magnetometer here just runs the magnetometer getter above
        return self.magnetometer * 2

You said that the calculations/loading operations are heavy. A plain property reruns its getter on every access, but it is fairly easy to cache the calculated values as follows:

from functools import cache

class MyClass:
    @property
    @cache
    def myproperty(self):
        # the body runs on the first access only; later accesses reuse the cached value
        return 42

Note, however, that cache was only introduced in Python 3.9. Since Python 3.8 there is also functools.cached_property, which you can use instead of the chained decorators and which stores the value on the instance itself. Alternatively, you can write your own cache along the lines of this thread.
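Applied to the class from the question, the cached_property variant might look roughly like the sketch below. The load_data body, its dummy return values and the load_count counter are placeholders invented for illustration; the real loader would return the arrays described in the question.

```python
from functools import cached_property


class Spacecraft:
    def __init__(self, time_range):
        self.time_range = time_range
        self.load_count = 0  # illustration only: counts calls to load_data

    def load_data(self, product_name):
        # Placeholder for the slow loader from the question; returns dummy values.
        self.load_count += 1
        return {"magnetometer": 2.0, "temperature": 3.0}[product_name]

    @cached_property
    def magnetometer(self):
        return self.load_data("magnetometer")

    @cached_property
    def temperature(self):
        return self.load_data("temperature")

    @cached_property
    def complicated_thing(self):
        # Reading the attributes triggers their loaders on first use only.
        return self.temperature * self.magnetometer


sc = Spacecraft("10:45-11:00")
result = sc.complicated_thing  # loads temperature and magnetometer once each
```

No initialisation to None is needed here, and nothing is loaded until an attribute is first read. Deleting an attribute (for example del sc.magnetometer) clears its cached value, so the next access loads it again.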

Additional note: There is in principle nothing wrong with setting the attributes to None in __init__.
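If you do keep the None initialisation, for instance on a Python version older than 3.8, a hand-written cache can sit behind a plain property. This is only a sketch: the underscore-prefixed attribute name, the dummy load_data body and the load_count counter are assumptions made for illustration.

```python
class Spacecraft:
    def __init__(self, time_range):
        self.time_range = time_range
        self._magnetometer = None  # filled in on first access
        self.load_count = 0  # illustration only: counts calls to load_data

    def load_data(self, product_name):
        # Placeholder for the slow loader from the question; returns a dummy value.
        self.load_count += 1
        return 2.0

    @property
    def magnetometer(self):
        # Load lazily, then reuse the stored value on later accesses.
        if self._magnetometer is None:
            self._magnetometer = self.load_data("magnetometer")
        return self._magnetometer


sc = Spacecraft("10:30-10:45")
first = sc.magnetometer   # triggers the load
second = sc.magnetometer  # served from self._magnetometer
```

The None check now lives in one place per variable instead of being repeated in every method that uses it.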

sim