Find and Replace with Regular Expressions in Google Sheets

google sheets with love in the title

Share This Post

Share on facebook
Share on linkedin
Share on twitter
Share on email

File under: anecdotes from my love affair with Google Sheets.

This post describes a handy trick for data wrangling in Google Sheets. If you’ve ever had to consolidate a bunch of messy data into a few stem patterns, read on.

I recently found myself needing to consolidate a bunch of Google Analytics page view data into a shortened list. The client had not made use of the ‘Exclude URL Query Parameters’ feature, so the page views included session IDs and other dynamic parameters that turned a few pages into a list of 1,500 unique URLs. I knew I could do this with regular expressions, and I thought to check if Sheets has a way to do it since Google includes support for regular expressions in many of their products. I was happy to find that Sheets was no exception. If you are not familiar with regular expressions, they are instances of a standard syntax for pattern matching text. Check out this post for some basic how-tos: Regular Expressions 101

The find and replace function in Google Sheets has a checkbox ‘Search using regular expressions’. Next to the checkbox, there is a link to their help page. Dive in there if you want to get into the real nitty-gritty. I’m going to describe one simple use of this functionality that doesn’t require in-depth knowledge of regular expressions.

This is a sample of the data I started with:

PagePageviews
/products/calendar_new?arrival=2017-02-04&nights;=4&session;=Nb1lmU0p1
/products/calendar_new?arrival=28+Jan+2017&nights;=4&session;=M0tpd0I01
/products/calendar_new?arrival=19+Jan+2017&nights;=6&session;=KZjV0cksz1
/products/calendar_new?arrival=21+Oct+2016&nights;=2&session;=b3I2SzUE1
/products/new?arrival=11+Oct+2016&nights;=4&session;=EVjNMZn41
/products/new?arrival=1+Jan+2017&nights;=6&session;=MR1lCR09aV1
/products/new?arrival=1+Nov+2016&nights;=4&session;=JeHFpZ2o1
/products/new?arrival=1+Oct+2016&nights;=7&session;=GRVV2SU91

To replace the “arrival” value in the URL, I use a regular expression that looks like this:

  arrival=.+?&

Here’s a breakdown of what that says:

  • Look for the text “arrival=”
  • . and + are special characters in regular expressions. The pattern .+ looks for a sequence of one or more of any character
  • The ? is also a special character – in this case, it indicates to stop searching after the first instance it finds of the & character
  • Altogether, it says to look for “arrival=” followed by anything until you reach an ampersand.

One thing you might be wondering: if ? is a special character, how do you match a ? in text? To do this, use a front slash to escape the ?, like this \?. There are a number of other special characters that need to be escaped. For example, periods and plus signs need to be escaped if you are searching for them as text, e.g. \. and \+

If you want to test out a regular expression to see if it matches what you are looking for, www.regexr.com is a great, free tool. It also has a concise reference on how to write regular expressions, including more detail on escape characters.

Here’s what my pattern looks like in the Find and replace box:

Google Sheets find and replace dialog

Notice that I am replacing it with “arrival=&” – I didn’t want to remove the parameter entirely, I just wanted to get rid of the dynamic values. You could also replace with nothing to get rid of the parameter completely. Next, I do the following replacements:

  • Find nights=.+?& and replace with nights=&
  • Find session=.+ and replace with session=

Once I’m done replacing, I run a pivot to get summary totals on the consolidated URLs:

PagePageviews
/products/calendar_new?arrival=&nights;=&session;=4
/products/new?arrival=&nights;=&session;=4

How cool is that? With Excel, I used to do this kind of thing with ridiculously complicated text formulas. Did I mention how much I love Google Sheets?

Nico Brooks

Nico Brooks

Nico loves marketing analytics, running, and analytics about running. At Two Octobers, Nico is driven to build a culture where people with diverse perspectives, great communication, and no small amount of talent accomplish more than they ever could as individuals. Learn more about Nico or read more blogs he has written.

Explore more blogs

Regular expressions 101

In this video tutorial, we walk through key building blocks of regular expressions using the incredibly powerful REGEXEXTRACT function in Google Sheets. To follow along,

➔ Read more

JOIN THE CONVERSATION