Ruby无法解析CSV文件:CSV :: MalformedCSVError(第1行中的非法引用) - 代码日志

Ruby无法解析CSV文件:CSV :: MalformedCSVError(第1行中的非法引用)

Ubuntu 12.04 LTS

Ruby ruby​​ 1.9.3dev(2011-09-23修订版33323)[i686-linux]

轨3.2.9

以下是我收到的CSV文件的内容:

"date/time","settlement id","type","order id","sku","description","quantity","marketplace","fulfillment","order city","order state","order postal","product sales","shipping credits","gift wrap credits","promotional rebates","sales tax collected","selling fees","fba fees","other transaction fees","other","total"
"Mar 1, 2013 12:03:54 AM PST","5481545091","Order","108-0938567-7009852","ALS2GL36LED","Solar Two Directional 36 Bright White LED Security Flood Light with Motion Activated Sensor","1","amazon.com","Amazon","Pasadena","CA","91104-1056","43.00","3.25","0","-3.25","0","-6.45","-3.75","0","0","32.80"

但是,当我尝试解析CSV文件时,我收到错误:

1.9.3dev :016 > options = { col_sep: ",", quote_char:'"' }
=> {:col_sep=>",", :quote_char=>"\""} 

1.9.3dev :022 > CSV.foreach("/tmp/my_data.csv", options) { |row| puts row }
CSV::MalformedCSVError: Illegal quoting in line 1.
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift'
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `each'
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `block in shift'
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `loop'
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `shift'
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1791:in `each'
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1208:in `block in foreach'
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1354:in `open'
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1207:in `foreach'
    from (irb):22
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/bin/irb:16:in `<main>'

然后我尝试简化数据,即

"name","age","email"
"jignesh","30","jignesh@example.com"

但是我仍然遇到同样的错误:

      1.9.3dev :023 > CSV.foreach("/tmp/my_data.csv", options) { |row| puts row }
  CSV::MalformedCSVError: Illegal quoting in line 1.
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `each'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `block in shift'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `loop'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `shift'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1791:in `each'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1208:in `block in foreach'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1354:in `open'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1207:in `foreach'
      from (irb):23
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/bin/irb:16:in `<main>'

我再次尝试简化数据,如下所示:

name,age,email
jignesh,30,jignesh@example.com

并且它的工作原理。参见下面的输出:

  1.9.3dev :024 > CSV.foreach("/tmp/my_data.csv") { |row| puts row }
  name
  age
  email
  jignesh
  30
  jignesh@example.com
   => nil 

但是我会收到引用数据的CSV文件,所以删除引号解决方案实际上并不是我正在寻找。我无法弄清楚导致错误的原因是什么:CSV :: MalformedCSVError:我之前的例子中的第1行中的非法引用。

我已经通过在文本编辑器中启用“显示空白字符”和“显示行结束”,在CSV中没有前导/尾随空格。此外,我已经使用以下内容验证了编码。

  1.9.3dev :026 > File.open("/tmp/my_data.csv").read.encoding
  => #<Encoding:UTF-8> 

注意:我尝试使用CSV.read,但与该方法相同的错误。

有人可以帮我解决问题,让我明白错误的地方吗?

=====================

我刚刚发现以下帖子:http://www.ruby-forum.com/topic/448070并尝试以下:

  file_data = file.read
  file_data.gsub!('"', "'")
  arr_of_arrs = CSV.parse(file_data)

  arr_of_arrs.each do |arr|
    Rails.logger.debug "=======#{arr}"
  end

并得到以下输出:

   =======["\xEF\xBB\xBF'date/time'", "'settlement id'", "'type'", "'order id'", "'sku'", "'description'", "'quantity'", "'marketplace'", "'fulfillment'", "'order city'", "'order state'", "'order postal'", "'product sales'", "'shipping credits'", "'gift wrap credits'", "'promotional rebates'", "'sales tax collected'", "'selling fees'", "'fba fees'", "'other transaction fees'", "'other'", "'total'"]
    =======["'Mar 1", " 2013 12:03:54 AM PST'", "'5481545091'", "'Order'", "'108-0938567-7009852'", "'ALS2GL36LED'", "'Solar Two Directional 36 Bright White LED Security Flood Light with Motion Activated Sensor'", "'1'", "'amazon.com'", "'Amazon'", "'Pasadena'", "'CA'", "'91104-1056'", "'43.00'", "'3.25'", "'0'", "'-3.25'", "'0'", "'-6.45'", "'-3.75'", "'0'", "'0'", "'32.80'"]

这使得正确读取数据,因为使用的默认col_sep是逗号字符。
但是我尝试使用这样的quote_char选项:

  arr_of_arrs = CSV.parse(file_data, :quote_char => "'")

但它最终出现以下错误:

   CSV::MalformedCSVError (Illegal quoting in line 1.):

谢谢,
Jignesh

quote_chars = %w(" | ~ ^ & *)
begin
  @report = CSV.read(csv_file, headers: :first_row, quote_char: quote_chars.shift)
rescue CSV::MalformedCSVError
  quote_chars.empty? ? raise : retry 
end

它不完美,但它的工作时间大部分时间。

注: CSV.parse采用与CSV.read相同的参数,因此可以使用来自内存的文件或数据

http://stackoverflow.com/questions/16772830/ruby-unable-to-parse-a-csv-file-csvmalformedcsverror-illegal-quoting-in-line

本站文章除注明转载外,均为本站原创或编译
转载请明显位置注明出处:Ruby无法解析CSV文件:CSV :: MalformedCSVError(第1行中的非法引用)