Get a list of variables using regexp

Source: https://www.devze.com, 2023-04-13 07:28 (reposted from the web)
I have a string expression like param1=123,param2=bbb

I would like to get a dictionary like {'param1':'123','param2':'bbb'}

Or at least ['param1=123','param2=bbb']

Unfortunately, the expression

re.match(r'^(\w+?=\w+?,?)+$','param1=123,param2=bbb').groups()

does not produce the desired result.

Of course, this is part of a larger expression, and I would like to get this result using a regexp.


>>> import re
>>> dict(re.findall(r'(\w+)=(\w+)','param1=123,param2=bbb'))
{'param1': '123', 'param2': 'bbb'}


I'd suggest avoiding regexps and splitting on the delimiters, e.g.:

>>> sample = 'param1=123,param2=bbb'
>>> [ x.split('=',1) for x in sample.split(',') ]
[['param1', '123'], ['param2', 'bbb']]
>>> dict([ x.split('=',1) for x in sample.split(',') ])
{'param1': '123', 'param2': 'bbb'}
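
If the input might carry stray spaces or a trailing comma, the same split-based idea can be wrapped in a small helper. This is only a sketch: parse_pairs and its parameters are hypothetical names, and the choice to raise on an item without '=' is an assumption about what you want.

```python
def parse_pairs(text, item_sep=",", kv_sep="="):
    """Split 'k=v,k=v' text into a dict (hypothetical helper)."""
    result = {}
    for item in text.split(item_sep):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing or doubled separator
        # partition splits on the first kv_sep only, like split('=', 1)
        key, found, value = item.partition(kv_sep)
        if not found:
            raise ValueError(f"missing {kv_sep!r} in {item!r}")
        result[key.strip()] = value.strip()
    return result


print(parse_pairs("param1=123, param2=bbb"))
```

Because partition only splits on the first '=', a value that itself contains '=' stays intact.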


Regexes can only return strings. Each group in the pattern produces one string. You've only got one group in your pattern, so it can only return one string for that group. What you want isn't possible with a single match of a regex pattern.
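
To see this concretely, here is a small check using the pattern from the question. When a group sits inside a repetition, Python's re keeps only the text from its last iteration:

```python
import re

m = re.match(r'^(\w+?=\w+?,?)+$', 'param1=123,param2=bbb')

# The whole string matched, but the single group holds only
# the text from its final repetition.
print(m.group(0))   # param1=123,param2=bbb
print(m.groups())   # ('param2=bbb',)
```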

Instead, you could use finditer to find a pattern many times in the string, but that breaks your requirement that this be part of a larger pattern.
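
A minimal finditer sketch, assuming names and values consist only of \w characters as in the example:

```python
import re

text = 'param1=123,param2=bbb'

# Each Match object carries the two groups plus the position of the pair.
pairs = [(m.group(1), m.group(2), m.span())
         for m in re.finditer(r'(\w+)=(\w+)', text)]
print(pairs)
```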

Your only option is to match all the assignments as one string, then split on the commas afterward.
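
A sketch of that two-step approach. The surrounding "larger expression" here (a name followed by a parenthesised parameter list) is made up for illustration, since the question doesn't show the real one:

```python
import re

# Hypothetical larger pattern: group 2 captures the whole assignment list.
pattern = re.compile(r'^(\w+)\(((?:\w+=\w+,?)*)\)$')

m = pattern.match('func(param1=123,param2=bbb)')
name, raw_params = m.groups()

# Second step: split the captured string outside the regex.
params = dict(item.split('=', 1) for item in raw_params.split(',') if item)
print(name, params)
```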


Your string looks very much like query string parameters. What about using Python's urlparse library (urllib.parse in Python 3)? It won't work with commas as separators, but you could change them to ampersands (older Python versions also accepted semicolons).

from urllib.parse import parse_qs  # Python 2: from urlparse import parse_qs

params = 'param1=123,param2=bbb'
params2 = params.replace(',', '&')

parse_qs(params2)  # => {'param1': ['123'], 'param2': ['bbb']}
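
On recent Python versions the replace step may not be needed at all. Assuming Python 3.10 or later (where parse_qs accepts a separator keyword), something like this should work:

```python
from urllib.parse import parse_qs

params = 'param1=123,param2=bbb'

# separator tells parse_qs which single character divides the pairs.
result = parse_qs(params, separator=',')
print(result)
```

Note that parse_qs always returns lists of values, so each entry is ['123'] rather than '123'.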


For these answers, I assume you have a string with parameter name and parameter value pairs formatted just as in your example, like 'param1=value1,param2=value2,param3=value3'.

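This is a general regex that parses each name=value pair into two groups per match. Python's re module rejects the variable-width lookbehind (?<=^|,), so the "start of string or comma" test is written as (?<![^,]), and the lookahead accepts either a comma or the end of the string so that the last pair matches too:

(?<![^,])([^=]*)=([^,]*)(?=,|$)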

If you want a string out like {'param1':'123','param2':'bbb'}, you can run this replacement regex:

match expression:       (?<![^,])([^=]*)=([^,]*)(,?)
replace expression:     '\1':'\2'\3

... then wrap all of that in curly brackets { and }, feed that into an eval statement, and you have a dictionary. (I have NEVER programmed in Python, but...) I believe you could do the following:

import re

inputString = "param1=value1,param2=value2,param3=value3"
# raw strings keep \1, \2, \3 as backreferences instead of octal escapes
myParamDictionary = eval('{' + re.sub(r"(?<![^,])([^=]*)=([^,]*)(,?)", r"'\1':'\2'\3", inputString) + '}')

...but I have NEVER programmed in Python... python's flexibility seems like there might be a better way...

If you simply want an array with the names and values (not identified except by their indexes being even or odd), you could use this expression in a re.findall(regex, subject) statement:

(?<![^,])[^=]*|(?<==)[^,]*

...this matches either the part after a comma (,) or the start of the string but before an equals sign (=), or the part after an equals sign but before a comma. The pattern has no capturing groups, so re.findall returns the matched strings themselves as one flat list. It will match zero-length names and values, so that the indexes can represent the type of data. To match only names or values with at least one character, use + instead of *; doing so may cause the indexes to be misaligned.
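
A sketch of the flat-list idea in Python. Two things here are adaptations rather than part of the expression as first written: Python's re requires fixed-width lookbehind, so (?<![^,]) stands in for "start of string or comma", and the pattern has no capturing groups so that re.findall returns plain strings instead of tuples.

```python
import re

subject = 'param1=123,param2=bbb'

# Names sit at even indexes, values at odd indexes.
flat = re.findall(r'(?<![^,])[^=]*|(?<==)[^,]*', subject)
print(flat)

# Pair the even and odd positions back up if a dict is wanted.
as_dict = dict(zip(flat[0::2], flat[1::2]))
print(as_dict)
```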
