
Walking/iterating over a nested dictionary of arbitrary depth (the dictionary represents a directory tree)

devze.com (https://www.devze.com), 2023-04-12 05:54, source: web
Python newbie at time of writing.

This came up because I want a user to be able to select a group of files from within a directory (and also any subdirectory), and unfortunately Tkinter's default ability for selecting multiple files in a file dialog is broken on Windows 7 (http://bugs.python.org/issue8010).

So I am attempting to represent a directory structure by an alternative method (still using Tkinter): constructing a facsimile of the directory structure, made of labeled and indented checkboxes (organized in a tree). So a directory like this:

\SomeRootDirectory
    \foo.txt
    \bar.txt
    \Stories
        \Horror
            \scary.txt
            \Trash
                \notscary.txt
        \Cyberpunk
    \Poems
        \doyoureadme.txt

will look something like this (where # represents a checkbutton):

SomeRootDirectory
    # foo.txt
    # bar.txt
    Stories
        Horror
            # scary.txt
            Trash
                # notscary.txt
        Cyberpunk
    Poems
        # doyoureadme.txt

Building the original dictionary from the directory structure is easy using a certain recipe I found at ActiveState (see below), but I hit a wall when I try to iterate over the nicely nested dictionary I am left with.
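The ActiveState recipe itself is not reproduced here. As a stand-in, here is a minimal sketch of one way to build such a dictionary with os.listdir. The name build_tree and the convention that files map to None and directories map to dicts are assumptions for illustration, not the recipe's actual code.

```python
import os

def build_tree(root):
    """Return a nested dict describing the directory at `root`.

    Assumed convention: a directory maps to a dict (an empty dict for
    an empty directory) and a file maps to None.
    """
    tree = {}
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if os.path.isdir(path):
            # Recurse into the sub-directory and nest its dict under its name.
            tree[name] = build_tree(path)
        else:
            tree[name] = None
    return tree
```

Calling build_tree on the root directory returns the dict of its contents; wrap it as {os.path.basename(root): build_tree(root)} if the root's own name should be the single top-level key.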


Here is a function that prints all your file names. It goes through all the keys in the dictionary, and if they map to things that are not dictionaries (in your case, the filename), we print out the name. Otherwise, we call the function on the dictionary that is mapped to.

def print_all_files(directory):
    # A dict value means a sub-directory; anything else is treated as a file.
    for filename, value in directory.items():
        if not isinstance(value, dict):
            print(filename)
        else:
            print_all_files(value)

So this code can be modified to do whatever you want, but it's just an example of how you can avoid fixing the depth through use of recursion.

The key thing to understand is that each time print_all_files gets called, it has no knowledge of how deep it is in the tree. It just looks at the files that are right there, and prints the names. If there are directories, it just runs itself on them.
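If the depth is needed, for example to indent each entry the way the checkbox tree in the question is laid out, it can be passed along as an extra argument. The following is only a sketch: the name indented_lines, the depth and indent parameters, and the use of "# " as a stand-in for a checkbutton are assumptions for illustration. It relies on Python 3.7+ dicts keeping insertion order.

```python
def indented_lines(directory, depth=0, indent="    "):
    """Yield one display line per entry, indented by its nesting depth."""
    for name, value in directory.items():
        if isinstance(value, dict):
            # A directory: emit its name, then its contents one level deeper.
            yield indent * depth + name
            yield from indented_lines(value, depth + 1, indent)
        else:
            # A file: "# " stands in for the checkbutton.
            yield indent * depth + "# " + name

tree = {
    "SomeRootDirectory": {
        "foo.txt": None,
        "bar.txt": None,
        "Stories": {
            "Horror": {"scary.txt": None, "Trash": {"notscary.txt": None}},
            "Cyberpunk": {},
        },
        "Poems": {"doyoureadme.txt": None},
    }
}

for line in indented_lines(tree):
    print(line)
```

Each recursive call still only looks at its own level; the depth argument is the only extra state it carries.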


This is preliminary code. Go through it and tell me where you run into problems.

Parents = {-1: "Root"}  # maps a directory's index to its name

def add_dir(level, parent, index, k):
    print("Directory")
    print("Level=%d, Parent=%s, Index=%d, value=%s" % (level, Parents[parent], index, k))

def add_file(parent, index, k):
    print("File")
    print("Parent=%s, Index=%d, value=%s" % (Parents[parent], index, k))

def f(level=0, parent=-1, index=0, di=None):
    # Returns the last index used, so entries after a sub-directory get unique indexes.
    for k in (di or {}):
        index += 1
        if isinstance(di[k], dict):
            Parents[index] = k
            add_dir(level, parent, index, k)
            index = f(level + 1, index, index, di[k])
        else:
            add_file(parent, index, k)
    return index

a={
    'SomeRootDirectory': {
        'foo.txt': None,
        'bar.txt': None,
        'Stories': {
            'Horror': {
                'scary.txt' : None,
                'Trash' : {
                    'notscary.txt' : None,
                    },
                },
            'Cyberpunk' : None
            },
        'Poems' : {
            'doyoureadme.txt' : None
        }
    }
}

f(di=a)


I realize this is an old question, but I was just looking for a simple, clean way to walk nested dicts and this is the closest thing my limited searching has come up with. oadams' answer isn't useful enough if you want more than just filenames and spicavigo's answer looks complicated.

I ended up just rolling my own, which acts similarly to how os.walk treats directories, except that it returns all key/value information.

It returns an iterator and for each directory in the "tree" of nested dicts, the iterator returns (path, sub-dicts, values) where:

  • path is the path to the dict
  • sub-dicts is a tuple of (key,dict) pairs for each sub-dict in this dict
  • values is a tuple of (key,value) pairs for each (non-dict) item in this dict

def walk(d):
    '''
    Walk a tree (nested dicts).
    
    For each 'path', or dict, in the tree, returns a 3-tuple containing:
    (path, sub-dicts, values)
    
    where:
    * path is the path to the dict
    * sub-dicts is a tuple of (key,dict) pairs for each sub-dict in this dict
    * values is a tuple of (key,value) pairs for each (non-dict) item in this dict
    '''
    # nested dict keys
    nested_keys = tuple(k for k in d.keys() if isinstance(d[k],dict))
    # key/value pairs for non-dicts
    items = tuple((k,d[k]) for k in d.keys() if k not in nested_keys)
    
    # return path, key/sub-dict pairs, and key/value pairs
    yield ('/', [(k,d[k]) for k in nested_keys], items)
    
    # recurse each subdict
    for k in nested_keys:
        for res in walk(d[k]):
            # for each result, stick key in path and pass on
            res = ('/%s' % k + res[0], res[1], res[2])
            yield res

Here is the code I used to test it, though it has a couple of other unrelated (but neat) things in it:

import simplejson as json
from collections import defaultdict

# see https://gist.github.com/2012250
tree = lambda: defaultdict(tree)

def walk(d):
    '''
    Walk a tree (nested dicts).
    
    For each 'path', or dict, in the tree, returns a 3-tuple containing:
    (path, sub-dicts, values)
    
    where:
    * path is the path to the dict
    * sub-dicts is a tuple of (key,dict) pairs for each sub-dict in this dict
    * values is a tuple of (key,value) pairs for each (non-dict) item in this dict
    '''
    # nested dict keys
    nested_keys = tuple(k for k in d.keys() if isinstance(d[k],dict))
    # key/value pairs for non-dicts
    items = tuple((k,d[k]) for k in d.keys() if k not in nested_keys)
    
    # return path, key/sub-dict pairs, and key/value pairs
    yield ('/', [(k,d[k]) for k in nested_keys], items)
    
    # recurse each subdict
    for k in nested_keys:
        for res in walk(d[k]):
            # for each result, stick key in path and pass on
            res = ('/%s' % k + res[0], res[1], res[2])
            yield res

# use fancy tree to store arbitrary nested paths/values
mem = tree()

root = mem['SomeRootDirectory']
root['foo.txt'] = None
root['bar.txt'] = None
root['Stories']['Horror']['scary.txt'] = None
root['Stories']['Horror']['Trash']['notscary.txt'] = None
root['Stories']['Cyberpunk']
root['Poems']['doyoureadme.txt'] = None

# convert to json string
s = json.dumps(mem, indent=2)

#print(mem)
print(s)
print()

# json.loads converts to nested dicts, need to walk them
for (path, dicts, items) in walk(json.loads(s)):
    # this will print every path
    print('[%s]' % path)
    for key,val in items:
        # this will print every key,value pair (skips empty paths)
        print('%s = %s' % (path+key,val))
    print()

The output looks like:

{
  "SomeRootDirectory": {
    "foo.txt": null,
    "Stories": {
      "Horror": {
        "scary.txt": null,
        "Trash": {
          "notscary.txt": null
        }
      },
      "Cyberpunk": {}
    },
    "Poems": {
      "doyoureadme.txt": null
    },
    "bar.txt": null
  }
}

[/]

[/SomeRootDirectory/]
/SomeRootDirectory/foo.txt = None
/SomeRootDirectory/bar.txt = None

[/SomeRootDirectory/Stories/]

[/SomeRootDirectory/Stories/Horror/]
/SomeRootDirectory/Stories/Horror/scary.txt = None

[/SomeRootDirectory/Stories/Horror/Trash/]
/SomeRootDirectory/Stories/Horror/Trash/notscary.txt = None

[/SomeRootDirectory/Stories/Cyberpunk/]

[/SomeRootDirectory/Poems/]
/SomeRootDirectory/Poems/doyoureadme.txt = None
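For very deep trees, the same traversal can be written without recursion by keeping an explicit stack. This is a sketch of that alternative, not the code above: the name walk_iter is an assumption, and it yields plain key lists rather than (key, value) pairs to keep it short.

```python
def walk_iter(tree):
    """Yield (path, subdir_names, file_names) for every dict in `tree`.

    Uses an explicit stack instead of recursion, so the depth of the
    tree is not limited by Python's recursion limit.
    """
    stack = [("/", tree)]
    while stack:
        path, node = stack.pop()
        subdirs = [k for k, v in node.items() if isinstance(v, dict)]
        files = [k for k, v in node.items() if not isinstance(v, dict)]
        yield path, subdirs, files
        # Push in reverse so sub-dicts are visited in their original order.
        for name in reversed(subdirs):
            stack.append((path + name + "/", node[name]))
```

The paths it produces have the same shape as in the output above ('/', '/SomeRootDirectory/', and so on), and the visiting order is the same depth-first, parent-before-children order.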


You can walk a nested dictionary using recursion:

def walk_dict(dictionary):
    for key in dictionary:
        if isinstance(dictionary[key], dict):
            walk_dict(dictionary[key])
        else:
            # do something with dictionary[key]
            pass
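One way to fill in the "do something" step without hard-coding it is to turn the walk into a generator and let the caller decide what to do with each leaf. This is a sketch under that assumption; the name iter_leaves and the path parameter are made up for illustration.

```python
def iter_leaves(dictionary, path=()):
    """Yield (path, value) for every non-dict value.

    `path` is a tuple of the keys leading to the value.
    """
    for key, value in dictionary.items():
        if isinstance(value, dict):
            # Descend, extending the path with this key.
            yield from iter_leaves(value, path + (key,))
        else:
            yield path + (key,), value

tree = {"Poems": {"doyoureadme.txt": None}, "foo.txt": None}
for path, value in iter_leaves(tree):
    print("/".join(path), value)
```

Note that empty sub-dictionaries produce no output here, since only leaves are yielded.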

Hope that helps :)


a={
    'SomeRootDirectory': {
        'foo.txt': None,
        'bar.txt': None,
        'Stories': {
            'Horror': {
                'scary.txt' : None,
                'Trash' : {
                    'notscary.txt' : None,
                    },
                },
            'Cyberpunk' : None
            },
        'Poems' : {
            'doyoureadme.txt' : None
        }
    }
}

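def dict_paths(dictionary, level=0, parents=None, paths=None):
    # Default to None: mutable default lists would be shared between calls,
    # so a second call to dict_paths would keep the first call's paths.
    if parents is None:
        parents = []
    if paths is None:
        paths = []
    for key in dictionary:
        parents = parents[0:level]
        paths.append(parents + [key])
        if dictionary[key]:
            parents.append(key)
            dict_paths(dictionary[key], level+1, parents, paths)
    return paths

dp = dict_paths(a)
for p in dp:
    print('/'.join(p))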