Regex match help

takbb · June 15, 2021, 12:31pm

If you are writing a regex for UK postcodes, then whilst XX99 9XX and X99 9XX are the major formats, you may also need to handle a few more unusual ones:

e.g. X9X 9XX (for example, BBC Broadcasting House in London, where I used to work, has the postcode W1A 1AA)

The full set of UK postcode formats is:
XX9X 9XX
X9X 9XX
X9 9XX
X99 9XX
XX9 9XX
XX99 9XX

The following String Manipulation regex should find all these formats. It allows whitespace at the beginning and end, and requires at least one whitespace character between the two parts (though it doesn’t allow whitespace within either of the two component sections of the postcode):

regexMatcher($yourColumnName$,"^\\s*[A-Za-z]{1,2}(?:[0-9][A-Za-z]{0,1}|[0-9]{2})\\s+[0-9][A-Za-z]{2}\\s*$")

This requires the format to conform to the following:
^ start of line
\\s* any amount of whitespace (including none)
[A-Za-z]{1,2} one or two letters (don’t care about case)
(?: start of a non-capturing group; then from within this group, either…
option 1 → [0-9][A-Za-z]{0,1} A single digit followed by zero or one letter, ignoring case
| or…
option 2 → [0-9]{2} two digits
) end of group
\\s+ one or more whitespace characters
[0-9][A-Za-z]{2} a single digit followed by two letters, any case
\\s* any amount of whitespace (including none)
$ end of line
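
If you want to sanity-check the pattern outside KNIME, here is a minimal sketch in Python. It assumes the same regex carries over unchanged, apart from the doubled backslashes (\\s), which become single ones (\s) in a Python raw string. The function name and the sample postcodes are made up for illustration (only W1A 1AA comes from the post), and the check only covers the shape of the string, not whether the postcode really exists.

```python
import re

# Same pattern as the KNIME expression, written as a Python raw string,
# so each doubled backslash (\\s) becomes a single one (\s).
POSTCODE_RE = re.compile(
    r"^\s*[A-Za-z]{1,2}(?:[0-9][A-Za-z]{0,1}|[0-9]{2})\s+[0-9][A-Za-z]{2}\s*$"
)


def looks_like_postcode(text: str) -> bool:
    """Return True if text has the shape of one of the six formats."""
    return POSTCODE_RE.match(text) is not None


# One illustrative sample per format: XX9X, X9X, X9, X99, XX9, XX99.
samples = ["AB1C 2DE", "W1A 1AA", "A1 2BC", "A12 3BC", "AB1 2CD", "AB12 3CD"]

if __name__ == "__main__":
    for s in samples + ["  w1a 1aa  ", "W1A1AA", "ABC1 2DE"]:
        print(repr(s), looks_like_postcode(s))
```

Note that a value with no space in the middle (such as W1A1AA) is rejected, because \s+ needs at least one whitespace character there.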

The above regex accepts both upper- and lower-case letters; if that doesn’t suit, change the letter ranges (for example, [A-Z] to accept upper case only).
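
As a rough illustration of changing the letter ranges, the sketch below swaps every [A-Za-z] for [A-Z] so that only upper-case postcodes pass. It is written in Python rather than as a KNIME expression, and the names are hypothetical; the same substitution should apply to the pattern inside regexMatcher, keeping the doubled backslashes there.

```python
import re

# Upper-case-only variant: each [A-Za-z] range narrowed to [A-Z].
UPPER_ONLY_RE = re.compile(
    r"^\s*[A-Z]{1,2}(?:[0-9][A-Z]{0,1}|[0-9]{2})\s+[0-9][A-Z]{2}\s*$"
)


def is_upper_postcode(text: str) -> bool:
    """Return True only for postcode-shaped text written in upper case."""
    return UPPER_ONLY_RE.match(text) is not None


if __name__ == "__main__":
    for s in ["W1A 1AA", "w1a 1aa", "W1a 1AA"]:
        print(repr(s), is_upper_postcode(s))
```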