开发者

Parse a malformed CSV line

开发者 https://www.devze.com 2023-03-30 03:17 出处:网络
I am parsing the follow开发者_StackOverflow中文版ing CSV lines. I need to rescue malformed lines that look like \"Malformed\" below. What is a regular expression I can use to do this? What considerati

I am parsing the follow开发者_StackOverflow中文版ing CSV lines. I need to rescue malformed lines that look like "Malformed" below. What is a regular expression I can use to do this? What considerations do I need to make?

body = %(
"Sensitive",2416,159,"Test "Malformed" Failure",2789,111,7-24-11,1800,0600,"R2","12323","",""
"Sensitive",2742,107,"Test",2791,112,7-24-11,1800,0600,"R1","","",""
"Sensitive",2700,135,"Test",2792,113,7-24-11,1800,0600,"R1","12110","","")

rows = []
body.each_line do |line|
  begin
    rows << FasterCSV.parse_line(line)
  rescue FasterCSV::MalformedCSVError => e
    rows << line if rescue_from_malformed_line(line)
  rescue => e
    Rails.logger.error(e.to_s)
    Rails.logger.info(line)
  end
end


I am not sure how malformed your data is malformed, but here is one approach for that line.

> puts line
"Sensitive",2416,159,"Test "Malformed" Failure",2789,111,7-24-11,1800,0600,"R2","12323","",""
>
> puts line.scan /[\d.-]+|(?:"[^"]*"[^",]*)+/
"Sensitive"
2416
159
"Test "Malformed" Failure"
2789
111
7-24-11
1800
0600
"R2"
"12323"
""
""

Note: Tested on ruby 1.9.2p290


You could use a regex to replace the nested double quotes with single quotes before it's passed to the parser.

Something like

.gsub(/(?<!^|,)"(?!,|$)/,"'")
0

精彩评论

暂无评论...
验证码 换一张
取 消