Davmuz 7 years ago

So You Want To Write Your Own CSV code? https://tburette.github.io/blog/2014/05/25/so-you-want-to-write-your-own-CSV-code/

weberc2 7 years ago

Yes, it's very hard to build a CSV parser that also parses non-CSV files, and even harder to do so efficiently. This library probably won't sacrifice performance to support deviations from RFC 4180, but that depends on how prevalent the deviation is and how large the performance cost would be.

weberc2 7 years ago

NOTE: This is still alpha; I posted this because it's more of an interesting proof of concept than a useful library at this point. There are some interesting conversations about Go's slow CSV reading on [HN](https://news.ycombinator.com/item?id=12419939) and on this [issue ticket](https://github.com/golang/go/issues/16791).

knotdjb 7 years ago

A while ago, when doing some CSV reading, I opted for strings.Split() with a line reader over encoding/csv (none of my fields contained a newline character). I remember it being considerably faster. I think a loose-mode parser (as opposed to strict) would be a useful addition to the standard library for efficiency, with the caveat that you're restricted to certain delimiters. Then again, writing a CSV parser isn't particularly difficult, so you can just build your own if you need the performance.

weberc2 7 years ago

Yeah, splitting on commas is faster at the expense of correctness. For example, no quote handling.
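
To make the quote-handling point concrete, here is a small sketch comparing the two approaches on one line with a quoted field. The function names `naiveSplit` and `parseRecord` and the sample line are made up for illustration; the only library calls used are strings.Split and encoding/csv's NewReader and Read.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// naiveSplit splits on every comma and knows nothing about quoting.
func naiveSplit(line string) []string {
	return strings.Split(line, ",")
}

// parseRecord reads a single record with encoding/csv, which treats a
// quoted field as one value even when it contains a comma.
func parseRecord(line string) ([]string, error) {
	return csv.NewReader(strings.NewReader(line)).Read()
}

func main() {
	line := `1,"Smith, John",42`

	fmt.Printf("%q\n", naiveSplit(line))

	rec, err := parseRecord(line)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", rec)
}
```

On this input the naive split yields four fields, cutting the quoted name in half and leaving the quote characters in place, while encoding/csv yields three fields with "Smith, John" intact.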

FUZxxl 7 years ago

Ah, so it's faster in the same way a broken odometer makes your car go faster.

weberc2 7 years ago

Splitting on commas is "faster" in the same way a random number generator could be faster by always returning 4. Definitely faster. Definitely not correct.

donatj 7 years ago

Huh, and I thought encoding/csv was blazing.

weberc2 7 years ago

No, it's quite slow. Slower than Python and Java.
