Names in a Hat v2.1

24 Feb 2013

I just submitted version 2.1 of Names in a Hat to Apple. This is a relatively small update that (finally) brings iPhone 5 support.

3.0 is also in the works. It will be adding some requested features and perhaps a small UI change. More details will be coming soon. I plan to submit 3.0 to Apple by the middle of March.

Text::CSV::Easy

10 Feb 2013

So, I often need to create CSV files. Typically, I'm aggregating data and need to quickly export it to a format that a non-techie can understand (the target is usually my wife). To accomplish this, I could use Perl's de facto standard, Text::CSV.

my $csv = Text::CSV->new;
$csv->combine(@fields);
my $string = $csv->string;

Text::CSV does a lot. You can change the separator, allow whitespace, change the EOL, determine whether blank is undef or an empty string, allow loose quotes, etc. I don't need all these features. More importantly, I don't want to write three lines of code for something that should only take a single line. There are other modules, such as Text::CSV::Simple, that claim to make CSV handling easier, but there are still things I don't like about them (an OO interface, whole-file reading only, etc.).

Let me switch topics for a second. I work at Synacor (yes, we are hiring). One of my projects requires a lot of token processing in strings. The performance isn't bad, but it could be better. So I tasked one of my developers to look at porting the Perl code to C using XS. It also sparked my interest in XS, so I've been reading up on it during my spare time.

So now there are three things.

  1. I want to start contributing to CPAN.
  2. I want to mess around with XS.
  3. I want a better CSV parser module.

Enter Text::CSV::Easy

my $string = csv_build(@fields);

No object instantiation. A single line to build your CSV (and a single line to parse). It conforms to RFC 4180, which means it can handle the vast majority of CSV files out there without the need to configure anything.
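To give a feel for what RFC 4180 quoting involves, here is a minimal pure-Perl sketch. The name sketch_csv_build is hypothetical and this is not the module's implementation. It assumes every defined field gets quoted (RFC 4180 permits that), that undef becomes an empty field, and it leaves the line ending to the caller.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Text::CSV::Easy: build one CSV record
# using the RFC 4180 quoting rules.
sub sketch_csv_build {
    my @fields = @_;
    my @out;
    for my $field (@fields) {
        if ( !defined $field ) {
            push @out, '';    # assumption: undef becomes an empty field
            next;
        }
        ( my $copy = $field ) =~ s/"/""/g;    # escape quotes by doubling them
        push @out, qq{"$copy"};               # always quote the field
    }
    return join ',', @out;
}

print sketch_csv_build( 'abc', 'say "hi"', undef, 'a,b' ), "\n";
```

Quoting every field keeps the sketch short. A real builder may well leave simple values such as integers unquoted.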

I've written two modules. The first is Text::CSV::Easy, which ships two packages: Text::CSV::Easy and Text::CSV::Easy_PP. Text::CSV::Easy_PP contains the pure Perl implementations of csv_build() and csv_parse(). The second module is Text::CSV::Easy_XS, which provides the same subroutines implemented in XS. Text::CSV::Easy itself is a wrapper package: it uses the XS version of the subroutines if that module is installed and otherwise falls back to the pure Perl version.
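The "use XS if installed, otherwise fall back" idea can be sketched with a guarded require. This is only an illustration of the pattern, not the code inside Text::CSV::Easy. The helper name first_loadable is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the name of the first module in the list
# that can be loaded, or die if none of them can.
sub first_loadable {
    my @candidates = @_;
    for my $module (@candidates) {
        ( my $file = "$module.pm" ) =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
        return $module if eval { require $file; 1 };
    }
    die "None of these modules could be loaded: @candidates\n";
}

# Prefer the XS backend and fall back to the pure Perl one.
my $backend =
  eval { first_loadable( 'Text::CSV::Easy_XS', 'Text::CSV::Easy_PP' ) };
print defined $backend
  ? "Using $backend\n"
  : "Neither backend is installed\n";
```

Wrapping require in eval keeps a missing XS module from being fatal, so callers never need to know which backend they ended up with.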

Comparison of Text::CSV::Easy_PP and Text::CSV::Easy_XS

I performed some basic performance test comparisons between Text::CSV::Easy_PP and Text::CSV::Easy_XS using the Benchmark module.

csv_build()

                        Rate Text::CSV::Easy_PP Text::CSV::Easy_XS
Text::CSV::Easy_PP  119048/s                 --               -92%
Text::CSV::Easy_XS 1509434/s              1168%                 --

csv_parse()

                       Rate Text::CSV::Easy_PP Text::CSV::Easy_XS
Text::CSV::Easy_PP  40424/s                 --               -96%
Text::CSV::Easy_XS 987654/s              2343%                 --

In these runs, XS is roughly 12x faster when building CSV strings and roughly 24x faster when parsing them.
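For anyone who wants to run a similar comparison, the core Benchmark module's cmpthese() prints a rate table in this style. The sketch below is not the script used for the numbers above. The two subs are hypothetical stand-ins for whatever implementations are being compared, and the field list is made up.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @fields = ( 'abc', 'say "hi"', 'a,b', 42 );

# Quote a single field, doubling any embedded quotes.
sub quote_field {
    my ($field) = @_;
    $field =~ s/"/""/g;
    return qq{"$field"};
}

# Two hypothetical implementations of the same task.
sub build_with_map {
    return join ',', map { quote_field($_) } @fields;
}

sub build_with_loop {
    my $out = '';
    for my $field (@fields) {
        $out .= ',' if length $out;
        $out .= quote_field($field);
    }
    return $out;
}

# A negative count asks Benchmark to run each sub for at least that many
# CPU seconds instead of a fixed number of iterations.
cmpthese(
    -1,
    {
        map_version  => \&build_with_map,
        loop_version => \&build_with_loop,
    }
);
```

The rates depend heavily on the machine and the input, so the percentages matter more than the absolute numbers.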

I'm not entirely surprised at the huge gap with parsing CSV strings. The XS version of csv_parse() makes a single pass through the string and never has to backtrack. The PP version uses a somewhat verbose regex (below) to parse the CSV string, and it can take a fair number of steps to find a field (fun experiment: test the PP version using Conway's Regexp::Debugger).

lib/Text/CSV/Easy_PP.pm

while (
  $str =~ / (?:^|,)
    (?: ""                # don't want a capture group here
      | "(.*?)(?<![^"]")" # find quote which isn't being escaped
      | ([^",\r\n]*)      # try to match an unquoted field
    )
    (?:\r?\n(?=$)|)       # allow a trailing newline only
    (?=,|$) /xsg
)
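For contrast, the single-pass approach described for the XS version can be sketched in plain Perl as a small state machine. This is not the XS code. The name sketch_csv_parse is hypothetical, and as a simplification every field comes back as a string. It walks the string once with one character of lookahead for doubled quotes.

```perl
use strict;
use warnings;

# Hypothetical single-pass parser for one CSV record. It walks the string
# left to right and never revisits a character it has already consumed.
sub sketch_csv_parse {
    my ($str) = @_;
    $str =~ s/\r?\n\z//;    # allow a single trailing newline

    my @fields;
    my $field     = '';
    my $in_quotes = 0;      # currently inside a quoted field
    my $closed    = 0;      # just saw the closing quote of a field
    my $len       = length $str;

    for ( my $i = 0 ; $i < $len ; $i++ ) {
        my $ch = substr $str, $i, 1;
        if ($in_quotes) {
            if ( $ch ne '"' ) {
                $field .= $ch;
            }
            elsif ( $i + 1 < $len && substr( $str, $i + 1, 1 ) eq '"' ) {
                $field .= '"';    # a doubled quote is a literal quote
                $i++;
            }
            else {
                $in_quotes = 0;    # closing quote
                $closed    = 1;
            }
        }
        elsif ( $closed && $ch ne ',' ) {
            die "unexpected character after closing quote at offset $i\n";
        }
        elsif ( $ch eq ',' ) {
            push @fields, $field;
            $field  = '';
            $closed = 0;
        }
        elsif ( $ch eq '"' ) {
            die "unexpected quote at offset $i\n" if length $field;
            $in_quotes = 1;
        }
        else {
            $field .= $ch;
        }
    }
    die "unterminated quoted field\n" if $in_quotes;
    push @fields, $field;
    return @fields;
}

print join( '|', sketch_csv_parse(qq{"abc","say ""hi""",,"a,b"\r\n}) ), "\n";
```

Because each step only looks at the current character and the two flags, the work is linear in the length of the string, which is the property that makes a C implementation of the same idea so fast.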

Comparison of Text::CSV::Easy and Text::CSV

Obviously, my XS implementation is much faster than my PP one. But how does my XS version stack up against the XS version of Text::CSV?

CSV Building

                        Rate Text::CSV_PP Text::CSV::Easy_PP Text::CSV_XS Text::CSV::Easy_XS
      Text::CSV_PP   49505/s           --               -59%         -83%               -97%
Text::CSV::Easy_PP  121766/s         146%                 --         -57%               -92%
      Text::CSV_XS  284698/s         475%               134%           --               -82%
Text::CSV::Easy_XS 1600000/s        3132%              1214%         462%                 --

Note: The Text::CSV* modules used $csv->combine() and $csv->string. The Text::CSV::Easy* modules used csv_build().

CSV Parsing

                       Rate Text::CSV_PP Text::CSV::Easy_PP Text::CSV_XS Text::CSV::Easy_XS
      Text::CSV_PP  15549/s           --               -61%         -86%               -98%
Text::CSV::Easy_PP  40343/s         159%                 --         -65%               -96%
      Text::CSV_XS 114286/s         635%               183%           --               -88%
Text::CSV::Easy_XS 975610/s        6174%              2318%         754%                 --

Note: The Text::CSV* modules used $csv->parse() and $csv->fields. The Text::CSV::Easy* modules used csv_parse().

Text::CSV::Easy for the win! In all fairness, Text::CSV does a lot more than my module, but then again I don't care about all that extra stuff. I wanted something easy to use (as few lines as possible), standard (RFC 4180 compliant), and fast.

DraftDay

1 Aug 2011

So I've been keeping myself busy since I finished version 2 of Names in a Hat. I decided to switch focus for a little bit to a new application called DraftDay. It's a MacOS-based application that manages a live fantasy draft.

Instead of using a boring spreadsheet, DraftDay will add excitement to your live draft experience. Picking a player will reveal a highlight reel as well as last season's stats. The team on the clock will have information on last year's record, some different facts and their current roster. Owners can submit their picks either directly into their app or text/email them in.

It's built using a mixture of Cocoa and Perl. The primary application is built entirely in Cocoa, so it's a native MacOS application (Lion+). I used a Perl Catalyst application to handle the outside interaction. For example, hitting the root URL will reveal a draft board. At /draft, you can log in to make selections. /video will serve up videos to be consumed in a WebView in the Cocoa app. Stuff under /api is used for email communication.

I'll go into more detail later, but I plan to open source all of the code once it's complete, cleaned up, and commented, and some basic tests are in place.

Names in a Hat Released

7 Jul 2011

Names in a Hat has been approved by Apple. It's a free download if you have purchased a past version and it's just $0.99 otherwise. Please consider writing a review if you like it.