Many years ago I wrote a parser for KBibTeX for KDE3 that loaded plain-text BibTeX files into an in-memory data structure. Part of this process was to interpret strings like {\"a} or --- and to protect inline math commands. Internally, this encoder/decoder kept a long table of mappings between LaTeX representations and their Unicode counterparts. Detecting, extracting, and replacing LaTeX representations with Unicode characters was done primarily with regular expressions.
Once most of the bugs had been ironed out, it worked well. Indeed, this piece of code survives in KBibTeX for KDE4 to this day.
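
To make the table-plus-regular-expressions idea concrete, here is a minimal sketch in Python. It is not KBibTeX's actual code: the table `LATEX_TO_UNICODE`, the function `decode`, and the four sample entries are all made up for illustration, and inline math is assumed to be delimited by plain `$...$` with no escaped dollar signs.

```python
import re

# Hypothetical, tiny excerpt of a LaTeX-to-Unicode table.
# A real table would be far longer; these entries are illustrative only.
LATEX_TO_UNICODE = {
    '{\\"a}': "\u00e4",  # a with diaeresis
    '{\\"o}': "\u00f6",  # o with diaeresis
    "---": "\u2014",     # em dash
    "--": "\u2013",      # en dash
}

# One alternation over all table keys, longest first,
# so that "---" is tried before "--".
_TOKEN = re.compile(
    "|".join(
        re.escape(key)
        for key in sorted(LATEX_TO_UNICODE, key=len, reverse=True)
    )
)

# Simplifying assumption: inline math is $...$ and contains no escaped "$".
_MATH = re.compile(r"\$[^$]*\$")


def _replace_tokens(chunk):
    """Replace every known LaTeX representation in a non-math chunk."""
    return _TOKEN.sub(lambda m: LATEX_TO_UNICODE[m.group(0)], chunk)


def decode(text):
    """Convert LaTeX representations to Unicode, leaving inline math untouched."""
    parts = []
    pos = 0
    for math in _MATH.finditer(text):
        parts.append(_replace_tokens(text[pos:math.start()]))  # ordinary text
        parts.append(math.group(0))                            # protected math
        pos = math.end()
    parts.append(_replace_tokens(text[pos:]))                  # trailing text
    return "".join(parts)
```

Calling `decode('M{\\"a}rz --- $a--b$')` turns the umlaut sequence and the triple hyphen into single Unicode characters, while the `--` inside the math span is left alone. The sketch also hints at why such code attracts bugs: the order of alternatives, nested braces, and escaped delimiters all need special care that a flat regular expression does not provide on its own.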