
LaTeX + Unicode → XeTeX

I’m co-organizing a scientific meeting at the end of May. The abstracts are all in.

They arrive as an Excel file, and I was writing a Perl script to parse it and generate a LaTeX file of the abstracts, so that we’d have nicely formatted versions for review. (I’m using Spreadsheet::XLSX for the first time; it’s really easy. Why have I always converted Excel files to CSV before parsing them?)

I spent way too much time trying to deal with special characters. I was planning a search-and-replace for every non-ASCII character that might turn up (for example, to change \xE9 aka é into \'{e}, or \xD7 aka × into $\times$).

MBT/Pas × BALB/cByJ
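For illustration, the substitution approach looks roughly like this. It's a minimal sketch in Python rather than my actual Perl script; the table `LATEX_REPLACEMENTS` and the function `latex_escape` are names made up for the example, and the table covers only the two characters mentioned above.

```python
# Hypothetical mapping from Unicode characters to LaTeX markup.
# Only the two characters from the text are listed; a real table
# would need an entry for every character that might appear.
LATEX_REPLACEMENTS = {
    "\u00e9": r"\'{e}",     # é
    "\u00d7": r"$\times$",  # ×
}


def latex_escape(text):
    """Replace each mapped character with its LaTeX equivalent."""
    return "".join(LATEX_REPLACEMENTS.get(ch, ch) for ch in text)


print(latex_escape("MBT/Pas \u00d7 BALB/cByJ"))
# prints: MBT/Pas $\times$ BALB/cByJ
```

The weakness is obvious: any character missing from the table passes through untouched, so the table has to keep growing with every new abstract.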

But then I discovered that XeTeX supports Unicode, so there’s no need to do these sorts of substitutions.

I changed pdflatex to xelatex in my Makefile, and I’m done. I think.
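With xelatex, the generating script only has to write the text out as UTF-8. Here's a rough sketch, again in Python rather than Perl; `build_tex` is a made-up name, and loading fontspec is my assumption about what a minimal preamble needs in order to get a font that covers the characters.

```python
def build_tex(abstract):
    """Wrap an abstract, left as raw Unicode, in a minimal XeLaTeX document."""
    return "\n".join([
        r"\documentclass{article}",
        # Assumption: fontspec is loaded to select a Unicode-capable font.
        r"\usepackage{fontspec}",
        r"\begin{document}",
        abstract,  # no character substitutions needed
        r"\end{document}",
        "",
    ])


def write_tex(path, abstract):
    """Write the document as UTF-8, which xelatex reads by default."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(build_tex(abstract))
```

Calling `write_tex("abstracts.tex", "MBT/Pas × BALB/cByJ")` and then running `xelatex abstracts.tex` should typeset the × directly. Characters that are special to LaTeX itself (such as & or %) would still need escaping; that part doesn't change.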

Update: Now that I think about it, CSV is way more convenient than XLS(X) for simple data files, as you don’t have to walk through the worksheet with the whole $cell->{Val} business. But working with the Excel file directly is easier when the cells may contain lots of text with commas and such, like my abstracts.