Python script in crontab gets the input arguments of another process running at the same time

Developer (开发者) https://www.devze.com 2023-01-16 13:50 Source: the web
I run two Python scripts from crontab at the same time, every 30 minutes, e.g.

00,30 6-19 * * 0-5 /.../x.py site1
*/3 6-19 * * 0-5 /.../y.py site2

At startup, both scripts import a module that prints some data to a log, e.g.

name = os.path.basename(sys.argv[0])
site = sys.argv[1]
pid = os.getpid()

Occasionally (!) the second script, y, prints the input arguments of script x to the log: name = x and site = site1. The printed PIDs of the two processes are not the same. Why is this happening, and how can I avoid it?

P.S. I suspect the problem is related to the logger I use. Can a script use a logger created in another script? In that case it would print data related to the first script on each line. Each script executes the same code:

log = logging.getLogger('MyLog')
log.setLevel(logging.INFO)
dh = RotatingSqliteHandler(os.path.join(progDirs['log'], 'sqlitelog'), processMeta, 5000000)
log.addHandler(dh)

The logger handler is defined as follows:

class RotatingSqliteHandler(logging.Handler):
   def __init__(self, filename, progData, maxBytes=0):
       logging.Handler.__init__(self)
       self.user = progData['user']
       self.host = progData['host']
       self.progName = progData['name']
       self.site = progData['site']
       self.pid = random.getrandbits(50)
    .....

In the log I see that the process ID the logger generates in the last line (self.pid) is the same for both scripts.

I'll try using a logger name unique to each script run instead of 'MyLog', although it is strange that a logger instance could be obtained from another process.


When two scripts "run at the same time", the lines that they print can be mixed, depending on how the operating system allocates priority to the processes.

You can thus obtain, in your logs, something like:

x.py: /tmp/x.py
…
… # Other processes logging information
…
y.py: /tmp/y.py
x.py: site1  # Not printed by y!!
x.py: PID = 123
…
… # Other processes logging information
…
y.py: site2
y.py: PID = 124

Do you still observe the problem if you prefix each line with the base name of the program that wrote it?
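One way to prefix every line could look like the sketch below. It is only an illustration: the logger name "prefixed-demo" and the helper make_prefixed_logger are made up for this example, and it writes to an in-memory stream rather than to your SQLite log. The %(process)d field is filled in by the logging module itself with the PID of the process that emits the record.

```python
import io
import logging
import os
import sys


def make_prefixed_logger(stream, prog_name=None):
    """Return a logger whose lines start with the program name and the real PID.

    Hypothetical helper for illustration; not part of the original scripts.
    """
    prog_name = prog_name or os.path.basename(sys.argv[0])
    logger = logging.getLogger("prefixed-demo")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)
    # Escape '%' so a strange program name cannot break the format string.
    prefix = prog_name.replace("%", "%%")
    handler.setFormatter(
        logging.Formatter(prefix + ": pid=%(process)d %(message)s")
    )
    logger.addHandler(handler)
    logger.propagate = False  # keep the demo output out of the root logger
    return logger


buf = io.StringIO()
log = make_prefixed_logger(buf, prog_name="y.py")
log.info("site=%s", "site2")
line = buf.getvalue()
print(line, end="")
```

With a prefix like this on every line, interleaved output from x.py and y.py stays attributable to the process that wrote it.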


It's not possible for one Python process to access an object from another Python process, unless specific provision is made for this using e.g. the multiprocessing module. So I don't believe that's what's happening, no matter how much it looks like it on the surface.

To confirm this, use an alternative handler (such as a FileHandler or RotatingFileHandler) to see if the problem still occurs. If it doesn't, then you should examine the RotatingSqliteHandler logic.
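A minimal sketch of that check, assuming a temporary directory stands in for progDirs['log'] and that "plain.log" is just a placeholder file name:

```python
import logging
import logging.handlers
import os
import tempfile

# Assumption: a temp directory replaces progDirs['log'] for this demo.
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "plain.log")  # placeholder file name

log = logging.getLogger("MyLog")
log.setLevel(logging.INFO)

# Standard-library handler swapped in for RotatingSqliteHandler.
fh = logging.handlers.RotatingFileHandler(
    log_path, maxBytes=5000000, backupCount=3
)
# %(process)d records the PID of the process that emitted the line.
fh.setFormatter(logging.Formatter("%(asctime)s %(process)d %(message)s"))
log.addHandler(fh)

log.info("site=%s", "site2")
fh.flush()

with open(log_path) as f:
    contents = f.read()
print(contents, end="")
```

Since the logging package does not support several processes writing to one file, it is probably safest to give each script its own file for this test, for example by putting the script name into the file name.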

If it does, and if you can come up with a small standalone script which demonstrates the problem repeatably, please post an issue to bugs.python.org and I'll certainly take a look. (I maintain the Python logging package.)


This question puzzles me, so here is yet another idea. The random generator can be seeded with "the current system time" (if no source of random numbers exists on the computer). In Python 2.7, this is done through a call to time.time(). The point is that "not all systems provide time with a better precision than 1 second." More generally, could your x.py and y.py sometimes start so close to each other that time.time() is the same for both processes, so that random.getrandbits(50) yields the same value in both of them? This would be compatible with the problem appearing only occasionally, as you observed.

What is the "resolution" of time.time() on your machine (the smallest interval between two different readings)? Maybe it's large enough for the two random generators to occasionally be seeded in the same way.
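To illustrate the effect, here is a small sketch in which two independent generators are given the same seed. The seed value is a made-up timestamp, and two random.Random instances in one process stand in for the two separate scripts:

```python
import random

# Hypothetical timestamp, as if time.time() returned the same value
# in both processes.
seed = 1673877000

rng_x = random.Random(seed)  # stands in for the generator in x.py
rng_y = random.Random(seed)  # stands in for the generator in y.py

id_x = rng_x.getrandbits(50)
id_y = rng_y.getrandbits(50)

same = id_x == id_y
print(same)  # identical seeds give identical sequences
```

If this is what happens, an identifier that comes from the operating system, such as os.getpid(), would not depend on how the generator was seeded.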
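A rough way to estimate that resolution is to poll time.time() until its value changes and keep the smallest step seen. The function name time_resolution and the sample count are arbitrary choices for this sketch:

```python
import time


def time_resolution(samples=5):
    """Return the smallest positive step observed between time.time() readings."""
    smallest = None
    for _ in range(samples):
        start = time.time()
        now = time.time()
        # Busy-wait until the clock reports a later value.
        while now <= start:
            now = time.time()
        step = now - start
        if smallest is None or step < smallest:
            smallest = step
    return smallest


resolution = time_resolution()
print("observed time.time() step: %.9f s" % resolution)
```

The result is an upper bound on the true resolution, since the loop itself takes some time between readings.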


Could this be related in any way to the following point?

getLogger() returns a reference to a logger instance with the specified name if it is provided, or root if not. The names are period-separated hierarchical structures. Multiple calls to getLogger() with the same name will return a reference to the same logger object.

Could your two scripts be "connected" enough to each other that this plays a role? For instance, if y.py imports x.py, then you get the same logger in both x.py and y.py when you call logging.getLogger('MyLog') in each of them.
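Within a single process, this behaviour is easy to check. The sketch below uses NullHandler only as a stand-in for a real handler:

```python
import logging

a = logging.getLogger("MyLog")
b = logging.getLogger("MyLog")
c = logging.getLogger("mylog")  # logger names are case-sensitive

same_object = a is b      # same name, same logger object
different = a is not c    # different name, different logger object

# Because a and b are one object, handlers added through either name
# accumulate on the same logger.
before = len(a.handlers)
a.addHandler(logging.NullHandler())
b.addHandler(logging.NullHandler())
added = len(a.handlers) - before

print(same_object, different, added)
```

So if one script imports the other, the handler set up by the first import (with its own site and pid values) stays attached to the shared logger.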
