The PyYAML package loads unmarked strings as either unicode or str objects, depending on their content.
I would like to use unicode objects throughout my program (and, unfortunately, can't switch to Python 3 just yet).
Is there an easy way to force PyYAML to always load strings as unicode objects? I do not want to clutter my YAML with !!python/unicode tags.
# Encoding: UTF-8
import yaml
menu = u"""---
- spam
- eggs
- bacon
- crème brûlée
- spam
"""
print yaml.load(menu)
Output: ['spam', 'eggs', 'bacon', u'cr\xe8me br\xfbl\xe9e', 'spam']
I would like: [u'spam', u'eggs', u'bacon', u'cr\xe8me br\xfbl\xe9e', u'spam']
Here's a version which overrides PyYAML's string handling so that it always returns unicode objects. In practice it should give the same result as the other answer I posted, only shorter (you still need to make sure that strings in custom classes are converted to unicode, or pass in unicode strings yourself, if you use custom handlers):
# -*- coding: utf-8 -*-
import yaml
from yaml import Loader, SafeLoader
def construct_yaml_str(self, node):
    # Override the default string handling function 
    # to always return unicode objects
    return self.construct_scalar(node)
Loader.add_constructor(u'tag:yaml.org,2002:str', construct_yaml_str)
SafeLoader.add_constructor(u'tag:yaml.org,2002:str', construct_yaml_str)
print yaml.load(u"""---
- spam
- eggs
- bacon
- crème brûlée
- spam
""")
(The above gives [u'spam', u'eggs', u'bacon', u'cr\xe8me br\xfbl\xe9e', u'spam'])
I haven't tested it with LibYAML (the C-based parser), as I couldn't compile it, so I'll leave the other answer as it is.
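If patching the global Loader and SafeLoader classes feels too invasive, the same override can probably be scoped to a subclass instead. This is only a minimal sketch: UnicodeSafeLoader is a made-up name, not something PyYAML ships, and it relies on add_constructor registering on the class it is called on and on yaml.load accepting a Loader argument. It is written so that it should run on both Python 2 and Python 3 (on Python 3 every string is already unicode, so the override changes nothing there).

```python
# -*- coding: utf-8 -*-
import yaml


class UnicodeSafeLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader subclass that never downgrades strings to str."""


def construct_yaml_str(loader, node):
    # construct_scalar returns the scalar as parsed (unicode on Python 2);
    # the stock constructor would then try to encode ASCII-only values to str.
    return loader.construct_scalar(node)


# Registered on the subclass only, so yaml.SafeLoader keeps its default behaviour.
UnicodeSafeLoader.add_constructor(u'tag:yaml.org,2002:str', construct_yaml_str)

menu = u"""---
- spam
- crème brûlée
"""

items = yaml.load(menu, Loader=UnicodeSafeLoader)
print(items)
```

Code elsewhere in the program that calls yaml.load or yaml.safe_load without this Loader keeps getting the default mix of str and unicode, which may or may not be what you want.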
Here's a function you could use to replace str with unicode objects in the decoded output of PyYAML:
def make_str_unicode(obj):
    # Recursively return a copy of obj with every str replaced by unicode.
    # unicode(s) decodes as ASCII, which is fine for PyYAML output because
    # it only returns str for ASCII-only strings.
    t = type(obj)
    if t in (list, tuple):
        # Rebuild the sequence, converting each item recursively
        # and keeping the original type (list or tuple)
        return t(make_str_unicode(x) for x in obj)
    elif t == dict:
        # Build a new dict: assigning obj[unicode(k)] in place would keep
        # the old str key, because 'a' == u'a' and both hash the same
        return dict((make_str_unicode(k), make_str_unicode(v))
                    for k, v in obj.items())
    elif t == str:
        # Convert strings to unicode objects
        return unicode(obj)
    return obj
print make_str_unicode({'blah': ['the', 'quick', u'brown', 124]})