CsvMapper gets a little more magical

Have you checked out CsvMapper by Luke Pillow? It makes some of the drudgery of CSV parsing fade away. It really is a fantastic library. As an example from the doc:

Given the csv


First Name,Last Name,Age
John,Doe,27
Jane,Doe,26
Bat,Man,52
...etc...

you can do

include CsvMapper

results = import('/path/to/file.csv') do
  start_at_row 1
  [first_name, last_name, age]
end

results.first.first_name  # John
results.first.last_name   # Doe
results.first.age         # 27

Cool, right? There’s a lot of magic built in, but I felt there was an extra mile left to go… By default, CsvMapper requires you to specify the columns you want. Most of the time that is exactly what you want, but it doesn’t feel DRY when the column names simply repeat those on the first line of the file. And it’s a pain to reproduce column names when there are many, many fields. So, presto:

include CsvMapper

results = import('/path/to/file.csv') do
  read_attributes_from_file
end

results.first.first_name  # John
results.first.last_name   # Doe
results.first.age         # 27
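If you're curious what reading attributes from the header row might involve, here is a rough standalone sketch using only Ruby's bundled csv and ostruct libraries. It is not the code from my fork: the helper names header_to_attribute and rows_from_header are made up for illustration, and the normalization rule (downcase, then turn runs of non-alphanumerics into underscores) is an assumption.

```ruby
require 'csv'
require 'ostruct'

# Hypothetical helper (not part of CsvMapper): turn a header cell such as
# "First Name" into an attribute-friendly name such as "first_name".
def header_to_attribute(header)
  header.strip.downcase.gsub(/[^a-z0-9]+/, '_').gsub(/\A_+|_+\z/, '')
end

# Hypothetical helper: treat the first row as attribute names and build
# one OpenStruct per remaining row.
def rows_from_header(csv_text)
  headers, *rows = CSV.parse(csv_text)
  names = headers.map { |h| header_to_attribute(h) }
  rows.map { |row| OpenStruct.new(names.zip(row).to_h) }
end

people = rows_from_header("First Name,Last Name,Age\nJohn,Doe,27\nJane,Doe,26\n")
puts people.first.first_name  # John
puts people.last.age          # 26
```

Note that in this sketch every value stays a string; any type conversion would be a separate step.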

Sweet, you say, but what if my field names are something absurd and clunky? Fine, say I, just specify an alias for them.

results = import('/path/to/file.csv') do
  read_attributes_from_file('First Name' => 'what_my_friends_call_me')
end

results.first.what_my_friends_call_me  # John
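The aliasing idea boils down to a lookup with a fallback: use the alias if one was given for a header, otherwise derive a name from the header itself. A minimal sketch of that idea in plain Ruby follows. The attribute_names helper is hypothetical and the fallback rule is my assumption here, so don't read it as the fork's actual implementation.

```ruby
# Hypothetical helper (not CsvMapper's API): pick an attribute name for each
# header, preferring an explicit alias and otherwise falling back to a
# downcased, underscored version of the header text.
def attribute_names(headers, aliases = {})
  headers.map do |header|
    aliases.fetch(header) { header.strip.downcase.gsub(/\W+/, '_') }
  end
end

names = attribute_names(['First Name', 'Last Name', 'Age'],
                        { 'First Name' => 'what_my_friends_call_me' })
p names  # ["what_my_friends_call_me", "last_name", "age"]
```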

What about a more realistic example with a bigger csv file? OK, here’s one from Subsidyscope:

results = import('subsidyscope-tarp.csv') { read_attributes_from_file }

results.first  # <OpenStruct type_of_institution="holding company", city="New York",
               # description="Preferred Stock w/Warrants",
               # total_assets="1856207282000.00", date="2008-10-28",
               # fdic_number="1039502", ots_number=nil, transaction_type="Purchase",
               # price_paid="25000000000.00", state="NY",
               # name="JPMorgan Chase & Co.", stock_symbol="JPM",
               # pricing_mechanism="Par", regulator="Federal Reserve">

We didn’t have to define a single attribute name; the csv file did it for us.

OK, I’m sold, you say. Where can I get this? Try my CsvMapper GitHub fork until it (maybe) gets merged into the official branch.