We had a nice chat about the uses of regular expressions in R and determined that we use them mainly for three things: dealing with messy data files, changing the file names of data files, and doing some linguistic data analysis tasks. That doesn’t sound like much, but they’re really amazing, and once you’ve started to use them, you’ll wonder how you ever went without.
Check out some of the useful base R functions that make use of them. Their R help page describes them like this:
Description: ‘grep’, ‘grepl’, ‘regexpr’ and ‘gregexpr’ search for matches to argument ‘pattern’ within each element of a character vector: they differ in the format of and amount of detail in the results. ‘sub’ and ‘gsub’ perform replacement of the first and all matches respectively. ...
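To make that description concrete, here is a minimal sketch of how these functions differ in what they return. The file names are invented for illustration and are not from our discussion; the patterns use a bracketed `[.]` so that no backslashes are needed yet.

```r
# Hypothetical file names, made up for this example
files <- c("subj01_pre.csv", "subj02_post.csv", "notes.txt", "subj10_pre.csv")

# grep() returns the indices of the matching elements,
# or the elements themselves when value = TRUE
csv_idx   <- grep("[.]csv$", files)
csv_files <- grep("[.]csv$", files, value = TRUE)

# grepl() returns one TRUE/FALSE per element
is_pre <- grepl("_pre", files)

# sub() replaces only the first match in each element;
# gsub() replaces every match
first_only <- sub("[0-9]", "#", files)
all_digits <- gsub("[0-9]", "#", files)

# regexpr() records where the first match starts in each element (-1 if none);
# regmatches() pulls out the matched text for the elements that matched
first_run <- regexpr("[0-9]+", files)
subj_ids  <- regmatches(files, first_run)
```

The logical vector from `grepl()` is handy for subsetting, while `sub()` and `gsub()` are the ones you would likely reach for when renaming files.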
And if you use tidyr, you’ll love using them there too.
How do you get started? Check out RegexOne. Once you complete all the lessons you’ll be set for a good long while. There are many other resources on the Internet.
Note well for regular expression usage in R: you’ll learn that the backslash (\) gets used a lot in regular expressions. Well, it’s also a special character in R strings (for example, newline is '\n'). For that reason, when you write regular expressions in R, you need to use two backslashes, so '\w' should actually be '\\w'.
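Here is a small sketch of the doubled backslash in practice. The input strings are invented for illustration; the point is that R processes the string first, so the regular expression engine only sees a single backslash.

```r
# "\n" is one character: R turns the escape into a newline
newline_len <- nchar("\n")

# "\\w" is two characters, a backslash and a w,
# which is exactly what the regex engine needs to receive
pattern_len <- nchar("\\w")

# Replace every run of word characters with a placeholder
masked <- gsub("\\w+", "X", "messy data, file_01")

# Split on one or more whitespace characters (spaces or tabs here)
parts <- strsplit("a  b\tc", "\\s+")[[1]]
```

The same doubling applies to any other backslash sequence you pick up from a regular expression tutorial.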