Returning from a function while continuing to execute

Developer https://www.devze.com 2023-04-06 18:59 Source: Internet
I am working on a Django application where I would like to populate several fields within my model when an object is first created. Currently I am able to do this in the save() routine of my model like so:

def save(self, *args, **kwargs):
    file = fileinfo.getfileinfo(self.file_path)
    if not self.file_size:
        self.file_size = file.FileSize
    if not self.file_inode:
        self.file_inode = file.FileInode
    if not self.duration:
        self.duration = file.Duration
    if not self.frame_width:
        self.frame_width = file.ImageWidth
    if not self.frame_height:
        self.frame_height = file.ImageHeight
    if not self.frame_rate:
        self.frame_rate = file.VideoFrameRate
    super(SourceVideo, self).save(*args, **kwargs)

I created a function called getfileinfo within a separate module called fileinfo. This is what part of my function looks like:

def getfileinfo(source):
    fstats = os.stat(source)
    info = dict({
        u'FileSize': fstats.st_size,
        u'FileInode': fstats.st_ino
    })
    output = subprocess.Popen(
        [exiftool, '-json', source], stdout=subprocess.PIPE)
    info.update(
        json.loads(output.communicate()[0], parse_float=decimal.Decimal)[0])
    return DotDict(info)

Although all of this works, I would like to avoid blocking the save process should the retrieval process be delayed for some reason. The information is not needed at object creation time and could be populated shortly thereafter. My thought was that I would alter my function to accept both the file path in question as well as the primary key for the object. With this information, I could obtain the information and then update my object entry as a separate operation.

Something like:

def save(self, *args, **kwargs):
    fileinfo.getfileinfo(self.file_path, self.id)
    super(SourceVideo, self).save(*args, **kwargs)

What I would like help with is how to return from the function before it has actually completed. I want to call the function and have it return nothing as long as it was called correctly. The function should continue to run, however, and then update the object on its end once it is done. Please let me know if I need to clarify something. Also, is this even something worth doing?

Thanks


Your best bet in this case is to use Celery.

It lets you create tasks that run in the background, without blocking the current request.

In your case, you can call .save(), create the task that updates the fields, push it to your Celery queue, and then return the desired response to the user.
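
Celery needs a broker and a running worker process, so the block below is not Celery code. It is a minimal standard-library stand-in for the same flow: save the row first, enqueue a job that carries the primary key, and return without waiting. The names DATABASE, slow_getfileinfo, worker and create_video are all hypothetical and the metadata values are placeholders. With Celery, the worker function would be declared as a task and the jobs.put(...) call would become a call to that task's delay() method.

```python
import queue
import threading

# Hypothetical in-memory "table" standing in for the real database.
DATABASE = {}

# Stand-in for the task queue: each job is a (primary key, file path) pair.
jobs = queue.Queue()


def slow_getfileinfo(path):
    """Placeholder for the real lookup (os.stat plus exiftool)."""
    return {"FileSize": len(path), "Duration": 0}


def worker():
    """Background worker: fills in metadata for rows that are already saved."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel value: shut the worker down
            jobs.task_done()
            break
        pk, path = job
        info = slow_getfileinfo(path)
        DATABASE[pk].update(file_size=info["FileSize"], duration=info["Duration"])
        jobs.task_done()


def create_video(pk, path):
    """Save the row first, then enqueue the slow work and return at once."""
    DATABASE[pk] = {"file_path": path, "file_size": None, "duration": None}
    jobs.put((pk, path))


threading.Thread(target=worker, daemon=True).start()
create_video(1, "/tmp/clip.mov")
jobs.join()  # only for this demo; a request handler would not wait here
print(DATABASE[1])
```

Passing the primary key rather than the object itself means the row has to exist before the job is queued, which is why the save happens first.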


I don't know your requirements, but if this operation takes an unacceptable time on save but an acceptable one on access, I would consider treating FileSize, Duration, VideoFrameRate, etc, as lazy-loaded properties of the model, assuming that a longer initial load time is a decent trade-off for a shorter save time.

There are many ways you can do this: you could cache the frame rate, for instance, with the caching framework the first time it's accessed. If you prefer to make it something stored in the database, you could access the frame rate via a property, and calculate it (and other values, if appropriate), the first time it's accessed and then store them in the database. Theoretically, these are attributes of the file itself, and therefore your interface shouldn't allow them to be changed and hence made out of sync with the file they refer to. Along those lines, I might do something like this:

from django.db import models


class MyMediaFile(models.Model):
    file = models.FileField()
    _file_size = models.IntegerField(null=True, editable=False)
    _duration = models.IntegerField(null=True, editable=False)
    # ... etc ...

    @property
    def file_size(self):
        # Populate on first access; None means "not calculated yet".
        if self._file_size is None:
            self.populate_file_info()
        return self._file_size

    def populate_file_info(self):
        # ... do your thing here ...
        self._file_size = my_calculated_file_size
        # ... etc ...

The logic of each property can easily be split into a general lazy-loading @property so the boilerplate doesn't need to be repeated for each one.
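
One way to factor out that boilerplate is a small property factory. The sketch below uses a plain Python class instead of a Django model so that it runs on its own. The names lazy_file_attr and MediaFile are hypothetical, and the values set in populate_file_info are placeholders for whatever the real file inspection returns.

```python
def lazy_file_attr(backing_name):
    """Build a read-only property that populates file info on first access."""

    def getter(self):
        # None in the backing attribute means "not calculated yet".
        if getattr(self, backing_name) is None:
            self.populate_file_info()
        return getattr(self, backing_name)

    return property(getter)


class MediaFile:
    """Plain-Python stand-in for the model.

    The underscore attributes play the role of the nullable database fields.
    """

    def __init__(self, path):
        self.path = path
        self._file_size = None
        self._duration = None
        self.populate_calls = 0  # counts how often the slow work runs

    def populate_file_info(self):
        # Placeholder values; the real method would inspect the file
        # and would probably save the model afterwards.
        self.populate_calls += 1
        self._file_size = 1024
        self._duration = 60

    file_size = lazy_file_attr("_file_size")
    duration = lazy_file_attr("_duration")


media = MediaFile("/tmp/clip.mov")
print(media.file_size, media.duration, media.populate_calls)
```

Because populate_file_info fills every backing attribute in one pass, reading a second property does not trigger the slow work again.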


I don't know if your specific case will work like this, but what I would probably do is spawn a new thread pointing at your super.save, like so:

import threading

#other code here
def save(self, *args, **kwargs):
    fileinfo.getfileinfo(self.file_path, self.id)
    my_thread = threading.Thread(target=super(SourceVideo, self).save,
            args=args, kwargs=kwargs)
    my_thread.start()

This way save will run in the background while the rest of your code executes.

This will only work, however, if save doesn't block any data that might be needed elsewhere while the execution takes place.
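
If the slow part is the metadata lookup rather than the database write, a different arrangement may fit the question better: save first, then run the lookup in the thread. The sketch below is a plain-Python stand-in for the model and only illustrates the ordering. SourceVideoSketch, _write_to_db and _fill_in_file_info are hypothetical names, the sleep simulates a slow lookup, and the file size is a placeholder.

```python
import threading
import time


class SourceVideoSketch:
    """Plain-Python stand-in for the SourceVideo model."""

    def __init__(self, file_path):
        self.file_path = file_path
        self.file_size = None
        self.saved = 0  # counts writes to the (pretend) database

    def _write_to_db(self):
        # Stands in for super().save() on a real model.
        self.saved += 1

    def _fill_in_file_info(self):
        time.sleep(0.1)  # simulate a slow metadata lookup
        self.file_size = 4096  # placeholder value
        self._write_to_db()  # second write, once the info is known

    def save(self):
        # Write the row first so the caller is not held up by the lookup.
        self._write_to_db()
        thread = threading.Thread(target=self._fill_in_file_info)
        thread.start()
        # Returned only so the demo can wait; a real save() returns None.
        return thread


video = SourceVideoSketch("/tmp/clip.mov")
thread = video.save()  # returns while the lookup is still running
thread.join()  # only for the demo
print(video.file_size, video.saved)
```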


What it sounds like you really want to do is return an object that represents the work that still needs to be done, and then attach a completion handler or observer to that returned object, which populates the model object with the results and then calls super.save().

Caveat being that I'm not sure how well this kind of approach fits into the Django application model.
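
In standard-library Python, the closest match to that description is probably a Future from concurrent.futures with a done-callback attached. The sketch below assumes a thread pool is acceptable and uses a plain class in place of the model. VideoRecord and apply_file_info are hypothetical names, and this getfileinfo is a placeholder that returns fixed values.

```python
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)


def getfileinfo(path):
    """Placeholder for the real lookup; returns fixed values."""
    return {"FileSize": 2048, "Duration": 12}


class VideoRecord:
    """Plain-Python stand-in for the model object."""

    def __init__(self, file_path):
        self.file_path = file_path
        self.file_size = None
        self.duration = None
        self.saved = False

    def save(self):
        # Stands in for the real save().
        self.saved = True

    def apply_file_info(self, future):
        """Completion handler: copy the results onto the record, then save."""
        info = future.result()  # re-raises here if the lookup failed
        self.file_size = info["FileSize"]
        self.duration = info["Duration"]
        self.save()


record = VideoRecord("/tmp/clip.mov")
# submit() returns at once with a Future representing the pending work.
future = executor.submit(getfileinfo, record.file_path)
future.add_done_callback(record.apply_file_info)

executor.shutdown(wait=True)  # only for the demo, so the output is complete
print(record.file_size, record.duration, record.saved)
```

The callback runs in the worker thread, or immediately in the calling thread if the work has already finished by the time it is attached.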
