Python App Engine: form-posted UTF-8 file issue

Developer (开发者) https://www.devze.com 2022-12-15 12:09 Source: the web
I am trying to form-post a SQL file that consists of many INSERT statements, e.g.

INSERT INTO `TABLE` VALUES ('abcdé', 2759);

Then I use re.search to parse it and extract the fields to put into my own datastore. The problem is that, although the file contains accented characters (see the e is an é), once uploaded the accent is lost: the code either raises an error or stores a bytestring representation of the character.

Here's what I am currently using (and I have tried loads of alternatives):

import cgi

form = cgi.FieldStorage()
uFile = form['sql']        # the uploaded file field
uSql = uFile.file.read()   # raw bytes exactly as uploaded, not yet decoded
lineX = uSql.split("\n")   # to get each line

and so on.

Has anyone got a robust way of making this work? Remember that I am on App Engine, so access to some libraries is restricted or forbidden.


You mention utf8 in the question's title but then never again: what are you doing (in terms of setting headers and checking them) to verify which encoding is in use? There should be a header of the form

Content-Type: text/plain; charset=utf-8

and the charset= part is where the encoding is specified. So what are the values upon sending and receiving this? If the charset is wrong, you may have to perform some encoding and decoding manually. To help us gauge what the encoding seems to be, besides the headers, what's the ord value of that accented e? E.g., if the encoding were actually iso-8859-1, that ord value would be 233 (in decimal; 0xE9 in hex).
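To make that check concrete, here is a minimal sketch, not a definitive fix. It assumes the upload arrives as raw bytes and is either UTF-8 or iso-8859-1. The helper names (describe_bytes, decode_upload, parse_inserts), the declared_charset parameter, and the INSERT pattern are hypothetical and are not from the question. It tries UTF-8 first because that decoder is strict, then any charset you read from the header, then falls back to iso-8859-1. That order is only a heuristic: a wrongly declared charset can still decode without error and give garbled text.

```python
import re

# Hypothetical pattern for the single-line INSERT format shown in the question.
INSERT_RE = re.compile(r"INSERT INTO `(\w+)` VALUES \('(.*?)', (\d+)\);")


def describe_bytes(raw):
    """Return the hex value of every non-ASCII byte, to help guess the encoding."""
    return ["0x%02X" % b for b in bytearray(raw) if b > 0x7F]


def decode_upload(raw, declared_charset=None):
    """Decode uploaded bytes to text: UTF-8, then the declared charset, then iso-8859-1."""
    candidates = [c for c in ("utf-8", declared_charset) if c]
    for charset in candidates:
        try:
            return raw.decode(charset)
        except (UnicodeDecodeError, LookupError):
            continue  # wrong or unknown charset, try the next candidate
    # iso-8859-1 maps every byte to a character, so this cannot raise.
    return raw.decode("iso-8859-1")


def parse_inserts(text):
    """Extract (table, string value, integer value) from each matching line."""
    rows = []
    for line in text.splitlines():
        match = INSERT_RE.search(line)
        if match:
            rows.append((match.group(1), match.group(2), int(match.group(3))))
    return rows


# The same line as it would arrive in each encoding.
utf8_raw = b"INSERT INTO `TABLE` VALUES ('abcd\xc3\xa9', 2759);\n"
latin_raw = b"INSERT INTO `TABLE` VALUES ('abcd\xe9', 2759);\n"

print(describe_bytes(utf8_raw))   # two bytes for the accented e in UTF-8
print(describe_bytes(latin_raw))  # a single byte in iso-8859-1
print(parse_inserts(decode_upload(utf8_raw)))
```

In the handler from the question this would be used as decode_upload(uFile.file.read()), with the regex run on the decoded text rather than on the raw bytes.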
