开发者

log messages appearing twice with Python Logging

开发者 https://www.devze.com 2023-03-20 19:35 出处:网络
I\'m using Python logging, and for some reason, all of my messages are appearing twice. I have a module to configure logging:

I'm using Python logging, and for some reason, all of my messages are appearing twice.

I have a module to configure logging:

# BUG: It's outputting logging messages twice - not sure why - it's not the propagate setting.
def configure_logging(self, logging_file):
    self.logger = logging.getLogger("my_logger")
    self.logger.setLevel(logging.DEBUG)
    self.logger.propagate = 0
    # Format for our loglines
    formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(m开发者_Python百科essage)s")
    # Setup console logging
    ch = logging.StreamHandler()
    ch.setLevel(logging.DEBUG)
    ch.setFormatter(formatter)
    self.logger.addHandler(ch)
    # Setup file logging as well
    fh = logging.FileHandler(LOG_FILENAME)
    fh.setLevel(logging.DEBUG)
    fh.setFormatter(formatter)
    self.logger.addHandler(fh)

Later on, I call this method to configure logging:

if __name__ == '__main__':
    tom = Boy()
    tom.configure_logging(LOG_FILENAME)
    tom.buy_ham()

And then within say, the buy_ham module, I'd call:

self.logger.info('Successfully able to write to %s' % path)

And for some reason, all the messages are appearing twice. I commented out one of the stream handlers, still the same thing. Bit of a weird one, not sure why this is happening. I'm assuming I've missed something obvious.


You are calling configure_logging twice (maybe in the __init__ method of Boy) : getLogger will return the same object, but addHandler does not check if a similar handler has already been added to the logger.

Try tracing calls to that method and eliminating one of these. Or set up a flag logging_initialized initialized to False in the __init__ method of Boy and change configure_logging to do nothing if logging_initialized is True, and to set it to True after you've initialized the logger.

If your program creates several Boy instances, you'll have to change the way you do things with a global configure_logging function adding the handlers, and the Boy.configure_logging method only initializing the self.logger attribute.

Another way of solving this is by checking the handlers attribute of your logger:

logger = logging.getLogger('my_logger')
if not logger.handlers:
    # create the handlers and call logger.addHandler(logging_handler)


If you are seeing this problem and you're not adding the handler twice then see abarnert's answer here

From the docs:

Note: If you attach a handler to a logger and one or more of its ancestors, it may emit the same record multiple times. In general, you should not need to attach a handler to more than one logger - if you just attach it to the appropriate logger which is highest in the logger hierarchy, then it will see all events logged by all descendant loggers, provided that their propagate setting is left set to True. A common scenario is to attach handlers only to the root logger, and to let propagation take care of the rest.

So, if you want a custom handler on "test", and you don't want its messages also going to the root handler, the answer is simple: turn off its propagate flag:

logger.propagate = False


I'm a python newbie, but this seemed to work for me (Python 2.7)

while logger.handlers:
     logger.handlers.pop()


The handler is added each time you call from outside. Try Removeing the Handler after you finish your job:

self.logger.removeHandler(ch)


In my case I'd to set logger.propagate = False to prevent double printing.

In below code if you remove logger.propagate = False then you will see double printing.

import logging
from typing import Optional

_logger: Optional[logging.Logger] = None

def get_logger() -> logging.Logger:
    global _logger
    if _logger is None:
        raise RuntimeError('get_logger call made before logger was setup!')
    return _logger

def set_logger(name:str, level=logging.DEBUG) -> None:
    global _logger
    if _logger is not None:
        raise RuntimeError('_logger is already setup!')
    _logger = logging.getLogger(name)
    _logger.handlers.clear()
    _logger.setLevel(level)
    ch = logging.StreamHandler()
    ch.setLevel(level)
    # warnings.filterwarnings("ignore", "(Possibly )?corrupt EXIF data", UserWarning)
    ch.setFormatter(_get_formatter())
    _logger.addHandler(ch)
    _logger.propagate = False # otherwise root logger prints things again


def _get_formatter() -> logging.Formatter:
    return logging.Formatter(
        '[%(asctime)s] [%(name)s] [%(levelname)s] %(message)s')


This can also happen if you are trying to create a logging object from the parent file. For e.g. This is the main application file test.py

import logging

# create logger
logger = logging.getLogger('simple_example')
logger.setLevel(logging.DEBUG)

# create console handler and set level to debug
ch = logging.StreamHandler()
ch.setLevel(logging.DEBUG)

# create formatter
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')

# add formatter to ch
ch.setFormatter(formatter)

# add ch to logger
logger.addHandler(ch)

def my_code():
# 'application' code
    logger.debug('debug message')
    logger.info('info message')
    logger.warning('warn message')
    logger.error('error message')
    logger.critical('critical message')

And below is the parent file main.py

import test

test.my_code()

The output of this will print only once

2021-09-26 11:10:20,514 - simple_example - DEBUG - debug message
2021-09-26 11:10:20,514 - simple_example - INFO - info message
2021-09-26 11:10:20,514 - simple_example - WARNING - warn message
2021-09-26 11:10:20,514 - simple_example - ERROR - error message
2021-09-26 11:10:20,514 - simple_example - CRITICAL - critical message

But if we had a parent logging object, then it will be printed twice. For e.g. if this is the parent file

import test
import logging
logging.basicConfig(level=logging.DEBUG,format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')


test.my_code()

The the output will be

2021-09-26 11:16:28,679 - simple_example - DEBUG - debug message
2021-09-26 11:16:28,679 - simple_example - DEBUG - debug message
2021-09-26 11:16:28,679 - simple_example - INFO - info message
2021-09-26 11:16:28,679 - simple_example - INFO - info message
2021-09-26 11:16:28,679 - simple_example - WARNING - warn message
2021-09-26 11:16:28,679 - simple_example - WARNING - warn message
2021-09-26 11:16:28,679 - simple_example - ERROR - error message
2021-09-26 11:16:28,679 - simple_example - ERROR - error message
2021-09-26 11:16:28,679 - simple_example - CRITICAL - critical message
2021-09-26 11:16:28,679 - simple_example - CRITICAL - critical message


A call to logging.debug() calls logging.basicConfig() if there are no root handlers installed. That was happening for me in a test framework where I couldn't control the order that test cases fired. My initialization code was installing the second one. The default uses logging.BASIC_FORMAT that I didn't want.


It seems that if you output something to the logger (accidentally) then configure it, it is too late. For example, in my code I had

logging.warning("look out)"

...
ch = logging.StreamHandler(sys.stdout)
root = logging.getLogger()
root.addHandler(ch)

root.info("hello")

I would get something like (ignoring the format options)

look out
hello
hello

and everything was written to stdout twice. I believe this is because the first call to logging.warning creates a new handler automatically, and then I explicitly added another handler. The problem went away when I removed the accidental first logging.warning call.


I was struggling with the same issue in the context of multiple processes. (For the code see the docs which I was following almost verbatim.) Namely, all log messages originating from any of the child processes got duplicated.

My mistake was to call worker_configurer(),

def worker_configurer(logging_queue):
    queue_handler = logging.handlers.QueueHandler(logging_queue)
    root = logging.getLogger()
    root.addHandler(queue_handler)
    root.setLevel(level)

both in the child processes and also in the main process (since I wanted the main process to log stuff, too). The reason this led to trouble (on my Linux machine) is that on Linux the child processes got started through forking and therefore inherited the existing log handlers from the main process. That is, on Linux the QueueHandler got registered twice.

Now, preventing the QueueHandler from getting registered twice in the worker_configurer() function is not as trivial as it seems:

  • Logger objects like the root logger root have a handlers property but it is undocumented.

  • In my experience, testing whether any([handler is queue_handler for handler in root.handlers]) (identity) or any([handler == queue_handler for handler in root.handlers]) (equality) fails after forking, even if root.handlers seemingly contains the same QueueHandler. (Obviously, the previous two expressions can be abbreviated by queue_handler in root.handlers, since the in operator checks for both identity and equality in the case of lists.)

  • The root logger gets modified by packages like pytest, so root.handlers and root.hasHandlers() are not very reliable to begin with. (They are global state, after all.)

The clean solution, therefore, is to replace forking with spawning to prevent these kinds of multiprocessing bugs right from the start (provided you can live with the additional memory footprint, of course). Or to use an alternative to the logging package that doesn't rely on global state and instead requires you to do proper dependency injection but I'm digressing… :)

With that being said, I ended up going for a rather trivial check:

def worker_configurer(logging_queue):
    queue_handler = logging.handlers.QueueHandler(logging_queue)
    root = logging.getLogger()

    for handler in root.handlers:
        if isinstance(handler, logging.handlers.QueueHandler):
            return

    root.addHandler(queue_handler)
    root.setLevel(level)

Obviously, this will have nasty side effects the second I decide to register a second queue handler somewhere else.


From the docs:

"Loggers have the following attributes and methods. Note that Loggers should NEVER be instantiated directly, but always through the module-level function logging.getLogger(name). Multiple calls to getLogger() with the same name will always return a reference to the same Logger object".

Make sure you don't initialise your loggers with the same name I advise you to initialise the logger with __name__ as name param i.e:

import logging
logger = logging.getLogger(__name__)

NOTE: even if you init a loggers from other modules with same name, you will still get the same logger, therefore calling i.e logger.info('somthing') will log as many times as you initiated the logger class.


Adding to others' useful responses...

In case you are NOT accidentally configuring more than one handler on your logger:

When your logger has ancestors(root logger is always one) and they have their own handlers, they will also output when your logger outputs (by default), which will create duplicates. You have two options:

  • Don't propagate your log event to your ancestors by setting:
my_logger.propagate = False
  • If you only have one ancestor(root logger), you could directly configure their handler instead. For example:
# directly change the formatting of root's handler
root_logger = logging.getLogger()
roots_handler = root_logger.handlers[0]
roots_handler.setFormatter(logging.Formatter(': %(message)s')) # change format

my_logger = logging.getLogger('my_logger') # my_logger will use new formatting


I was getting a strange situation where console logs were doubled but my file logs were not. After a ton of digging I figured it out.

Please be aware that third party packages can register loggers. This is something to watch out for (and in some cases can't be prevented). In many cases third party code checks to see if there are any existing root logger handlers; and if there isn't--they register a new console handler.

My solution to this was to register my console logger at the root level:

rootLogger = logging.getLogger()  # note no text passed in--that's how we grab the root logger
if not rootLogger.handlers:
        ch = logging.StreamHandler()
        ch.setLevel(logging.INFO)
        ch.setFormatter(logging.Formatter('%(process)s|%(levelname)s] %(message)s'))
        rootLogger.addHandler(ch)


If you are using any config for logging, For instance log.conf

In .conf file you can do it by adding this line in the [logger_myLogger] section: propagate=0

[logger_myLogger]
level=DEBUG
handlers=validate,consoleHandler
qualname=VALIDATOR
propagate=0


I had the same issue. In my case, it was not due to handlers or duplicate initial configuration but a stupid typo. In main.py I was using a logger object but in my_tool.py I was directly calling to the logging module by mistake, hence after invoking functions from my_tool module everything was messed up and the messages appeared duplicated.

This was the code:

main.py

import logging
import my_tool

logger_name = "cli"
logger = logging.getLogger(logger_name)

logger.info("potato")
logger.debug("potato)
my_tool.function()
logger.info("tomato")

my_tool.py

import logging
logger_name = "cli"
logger = logging.getLogger(logger_name)
# some code
logging.info("carrot")

and the result

terminal

>> potato
>> potato
>> carrot
>> tomato
>> tomato


If you use the standard construction logger = logging.getLogger('mymodule') and then accidentally mistype loggger as logging i.e.

logger = logging.getLogger('mymodule')

# configure your handlers

logger.info("my info message")  # won't make duplicate 
logging.info("my info message")  # will make duplicate logs

then this will cause duplicate messages to come up because the call to logging creates a new logger.

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号