Hey folks 👋🏻,
Well, this is my first article. I am doing the #100DaysOfCode challenge, where I am learning must-know things for a developer, and this is what I learned in week 1. There may be plenty of mistakes as I write it out, so please give me feedback so that I can improve it.
So let's start!
Introduction
Whether you are a frontend or a backend developer, you will come across regular expressions at some point in your career. I remember the first time I used form validation to validate passwords: I had no idea what /^[a-z0-9_-]{6,18}$/ did. I just copied and pasted it from Google.
So this blog is my attempt to explain regular expressions with visuals.
What is Regular Expression?
Regular expressions are a way to describe patterns in string data. They are both terribly awkward and extremely useful. Regular expressions are difficult to learn: they have a very compact syntax that ends up looking like gibberish. However, they can be extremely powerful when it comes to form validation, finding and replacing strings, and searching through a body of text. Properly understanding regular expressions will make you a more effective programmer.
Create a Regular Expression
//Long Syntax --- new RegExp("expression","flags");
const longRegExp = new RegExp("[A-Z]+", 'g');
//Short Syntax --- /expression/flags
const shortRegExp = /[A-Z]+/g;
Slashes /.../ tell JavaScript that you are creating a regular expression.
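Here is a small sketch of how the two syntaxes relate. The variable names and the sample text are only for illustration. One thing that is easy to miss: the long syntax takes the pattern as a string, so every backslash has to be doubled.

```javascript
// Short syntax: backslashes are written once.
const shortDigits = /\d+/g;
// Long syntax: the pattern is a string, so "\\d" is needed to produce \d.
const longDigits = new RegExp("\\d+", "g");

// Both objects describe the same pattern with the same flags.
const samePattern = shortDigits.source === longDigits.source;
const sameFlags = shortDigits.flags === longDigits.flags;
console.log(samePattern, sameFlags); // Output → true true

// The long syntax is handy when the pattern is only known at runtime.
const word = "habit"; // hypothetical value, e.g. coming from user input
const dynamicRegExp = new RegExp(word, "gi");
const found = "Habit after habit".match(dynamicRegExp);
console.log(found); // Output → ['Habit', 'habit']
```

If the runtime value can contain special characters such as . or *, it would need escaping first; that is left out of this sketch.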
Character classes
Character | Meaning |
\w | Word character (letter, digit or underscore) |
\W | NOT a word character |
\d | Digit |
\D | NOT a digit |
\s | Whitespace |
\S | NOT whitespace |
\t | Tab |
\n | Line break |
. | Any character (except newline) |
Flags
Flag | Meaning |
g | Global: all matches in the string are found (or replaced), not just the first. |
i | With this flag the search is case-insensitive: no difference between A and a. |
m | Multiline: ^ and $ match at the start and end of each line. |
s | Enables “dotall” mode, which allows a dot . to match the newline character \n. |
u | Enables full Unicode (😊, 🤩...) support. |
y | “Sticky” mode: searching at the exact position in the text. |
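To make the flags a bit more concrete, here is a minimal sketch that runs the same kind of pattern with different flags. The sample text and variable names are made up for this illustration.

```javascript
const text = "Cat\ncat\nCAT"; // three lines

// No flags: only the first case-sensitive match is returned.
const firstIndex = text.match(/cat/).index;
console.log(firstIndex); // Output → 4

// g: all matches, i: ignore case.
const allIgnoreCase = text.match(/cat/gi);
console.log(allIgnoreCase); // Output → ['Cat', 'cat', 'CAT']

// Without m, ^ and $ only match at the start and end of the whole string.
const wholeString = text.match(/^cat$/gi);
console.log(wholeString); // Output → null
// With m, ^ and $ match at the start and end of every line.
const perLine = text.match(/^cat$/gim);
console.log(perLine); // Output → ['Cat', 'cat', 'CAT']

// s: lets the dot match the newline between the first two lines.
const dotResults = [/Cat.cat/.test(text), /Cat.cat/s.test(text)];
console.log(dotResults); // Output → [false, true]

// u: treats an emoji as one character instead of two code units.
const emojiResults = [/^.$/.test("😊"), /^.$/u.test("😊")];
console.log(emojiResults); // Output → [false, true]

// y: matches only exactly at lastIndex, without scanning forward.
const sticky = /cat/y;
sticky.lastIndex = 4;
const stickyResult = sticky.test(text);
console.log(stickyResult); // Output → true
```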
Brackets and Grouping
[...] : Sets of characters
Say you want to match any digit. In a regular expression, putting a set of characters between square brackets makes that part of the expression match any one of the characters between the brackets.
// Example 1. Test whether the string contains a digit.
const str = "The Independence Year 1947";
const regExp = /[0123456789]/; // works, but a range is recommended instead
console.log(regExp.test(str));
// ✅ Output → true
// The characters in the set correspond to exactly one character in the match.
// Example 2. Find "Free", then [d or o], then "m".
const str2 = "Freedom";
const regExp2 = /Free[do]m/; // matches "Freedm" or "Freeom"
console.log(regExp2.test(str2));
// ❌ Output → false
[.-.] : Range
There is another version of /[0123456789]/ that works with a range of items: /[0-9]/.
\d – is the same as [0-9],
\w – is the same as [a-zA-Z0-9_],
\s – is the same as [\t\n\v\f\r ], plus a few other rare Unicode space characters.
const str = "The Independence Year 1947";
const regExp = /[0-9]/;
console.log(regExp.test(str));
// ✅ Output → true
[^...] : Excluding range
[^aeiou] – any character except 'a', 'e', 'i', 'o' or 'u'.
[^0-9] – any character except a digit, the same as \D.
[^\s] – any non-space character, the same as \S.
const str = "abc";
const regExp = /[^A-Za-z0-9]/; // matches any special (non-alphanumeric) character
console.log(regExp.test(str));
// ❌ Output → false
(...) : Capturing group
Say you want to use an operator like * or + on more than one element at a time. Then you can use ().
const regExp = /woo+(hoo+)+/i;
console.log(regExp.test("Woohoooohoohooo"));
// ✅ Output → true
// Domain name match
const domain = "example.com";
const domainExp = /(\w+\.)+\w+/g;
console.log(domainExp.test(domain));
// ✅ Output → true
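Parentheses do more than group: they also remember what they matched. The sketch below is only an illustration (the pattern and names are made up here). It reads the captured parts of a domain name, and also shows the non-capturing form (?:...), which groups without remembering.

```javascript
// Two capturing groups: the name before the dot and the part after it.
const simpleDomainExp = /(\w+)\.(\w+)/;
const domainMatch = "example.com".match(simpleDomainExp);

console.log(domainMatch[0]); // Output → example.com (full match)
console.log(domainMatch[1]); // Output → example (first group)
console.log(domainMatch[2]); // Output → com (second group)

// (?:...) groups for the + operator but does not create a capture.
const laugh = "hahaha!".match(/(?:ha)+/);
console.log(laugh[0]);     // Output → hahaha
console.log(laugh.length); // Output → 1 (only the full match, no groups)
```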
Anchors: Word and string boundaries
^ : If you want to enforce that a string must start with a specific pattern or string, use ^.
const regExp = /^tiny/i;
const str = "Tiny habits make a big difference.";
console.log(regExp.test(str));
// ✅ Output → true
const str1 = "The Tiny habits make a big difference.";
console.log(regExp.test(str1));
// ❌ Output → false
$ : If you want to enforce that a string must end with a specific pattern or string, use $.
const regExp = /progress$/i;
const str = "Goals are good for setting a direction, but systems are best for making progress";
console.log(regExp.test(str));
// ✅ Output → true
const str1 = "Goals are good for setting a direction, but systems are best for making progress everyday";
console.log(regExp.test(str1));
// ❌ Output → false
Testing full match: ^ and $ together are mostly used to test that a match spans the whole string.
Example: Let's check whether or not a string consists of digits only.
const regExp = /^\d+$/;
const str = "123456789";
console.log(regExp.test(str));
// ✅ Output → true
const str1 = "1234e122";
console.log(regExp.test(str1));
// ❌ Output → false
\b : Word boundary
This lets you check whether a position is at the beginning or at the end of a word:
const regExp = /\bworld\b/i;
console.log(regExp.test("Hello, World!")); // here "World" is a standalone word
// ✅ Output → true
console.log(regExp.test("Hello WorldMap!")); // here "World" is not a standalone word
// ❌ Output → false
Quantifiers
x? : Zero or one occurrence
(str.match is explained further down 👇🏼. For now, just note that it returns the matches in an array.)
const str = "Life is Art Live yours in colour";
console.log(str.match(/colou?r/g)); // works with both → color, colour
// Output → ["colour"]
x* : Zero or more occurrences
const str = "255 25 2";
console.log(str.match(/\d5*/g));
// Output → ['255', '25', '2']
x+ : One or more occurrences
const str = "255 25 2";
console.log(str.match(/\d5+/g));
// Output → ['255', '25']
x{n} : n occurrences
const apples = "🍎🍎🍎";
console.log(apples.match(/🍎{3}/gu)); // u → enables emoji 🤩 support
// Output → ["🍎🍎🍎"]
const address = "Street: 432,SomeStreet, Local, City: Pune, Zip code: 411001";
// now if you want to get the zip code
console.log(address.match(/\d{6}/g));
// Output → ['411001']
x{n,m} : Between n and m occurrences (n → min, m → max)
If you leave out the maximum and use {n,}, it looks for sequences of length n or more. For example, \d{1,} looks for sequences of one or more digits.
const visitStr = "I visited beautiful place on 01-30-2010.";
const regExp = /\d{1,2}-\d{1,2}-\d{4}/g;
console.log(visitStr.match(regExp)); // works with both date formats → M-D-YYYY and MM-DD-YYYY
// Output → ['01-30-2010']
const str = "+7(902)-223-25-87";
const numbers = str.match(/\d{1,}/g);
console.log(numbers);
// Output → ['7', '902', '223', '25', '87']
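With sets, anchors and quantifiers covered, the password pattern from the introduction can be read piece by piece. The test strings below are made up for this sketch.

```javascript
// ^            start of the string
// [a-z0-9_-]   one lowercase letter, digit, underscore or hyphen
// {6,18}       the previous set repeated 6 to 18 times
// $            end of the string
const passwordRegExp = /^[a-z0-9_-]{6,18}$/;

const checks = [
  passwordRegExp.test("tiny_habits"), // 11 allowed characters
  passwordRegExp.test("tiny"),        // too short
  passwordRegExp.test("Tiny_Habits"), // uppercase letters are not in the set
  passwordRegExp.test("tiny habits"), // a space is not in the set
];
console.log(checks); // Output → [true, false, false, false]
```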
Methods
str.search(regExp) : At what index is the match?
The method str.search(regExp) returns the position of the first match, or -1 if none is found:
const str = "Nothing changes if nothing changes";
const regExp1 = /changes/;
console.log(str.search(regExp1));
// ✅ Output → 8 (first match position)
const regExp2 = /life/;
console.log(str.search(regExp2));
// ❌ Output → -1
str.match(regExp) : Getting all group 0 captures
The method str.match(regExp) finds matches for regExp in the string str.
It has 2 modes:
If the regex doesn't have the flag g: it returns the first match as an array with capturing groups and the properties index (position of the match) and input (input string, equal to str):
const str = "Take small steps Everyday and you'll eventually get there.";
const regExp = /every(day)/i;
const result = str.match(regExp);
console.log(result[0]); // Output → Everyday (full match)
console.log(result[1]); // Output → day (first capturing group)
console.log(result.length); // Output → 2
// Additional information:
console.log(result.index); // Output → 17 (match position)
console.log(result.input); // Output → Take small steps Everyday and you'll eventually get there. (source string)
If the regex has the flag g: it returns an array of all matches as strings, without capturing groups and other details.
const str = "Take small steps Everyday and you'll eventually get there.";
const regExp = /every(day)/gi; // i is needed here because "Everyday" starts with a capital E
const result = str.match(regExp);
console.log(result); // Output → ["Everyday"]
console.log(result[0]); // Output → Everyday (full match)
console.log(result.length); // Output → 1
If there are no matches, null is returned, no matter whether the flag g is set or not.
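Because match returns null instead of an empty array, reading .length on the result directly can throw. One common way to guard against that (just one option, with names made up for this sketch) is to fall back to an empty array:

```javascript
const quote = "Take small steps Everyday";

// There are no digits in the quote, so match returns null.
const rawDigits = quote.match(/\d+/g);
console.log(rawDigits); // Output → null

// rawDigits.length would throw a TypeError here.
// Falling back to [] keeps the result safe to use as an array.
const safeDigits = quote.match(/\d+/g) || [];
console.log(safeDigits.length); // Output → 0

// The same guard works when there are matches.
const words = quote.match(/\w+/g) || [];
console.log(words.length); // Output → 4
```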
str.matchAll(regExp) : Getting an iterable over all match objects [ES2020]
The method matchAll() must be called with the g flag. It returns an iterable object with matches instead of an array. You can make a regular array from it using Array.from, or loop over it using for..of.
Every match is returned as an array with capturing groups (the same format as str.match without the flag g).
const book = "Atomic Habits An Easy & Proven Way to Build Good Habits & Break Bad Ones";
const regex = /Habi(t)[a-z]/g;
const result = book.matchAll(regex);
console.log(result); // Output → RegExpStringIterator {} (a RegExp String Iterator object)
Array.from(result, (res) => console.log(res));
// Output → ['Habits', 't', index: 7, input: 'Atomic Habits An Easy & Proven Way to Build Good Habits & Break Bad Ones', groups: undefined]
// ['Habits', 't', index: 49, input: 'Atomic Habits An Easy & Proven Way to Build Good Habits & Break Bad Ones', groups: undefined]
If there are no results, it returns an empty iterable object instead of null.
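As a small sketch of the for..of route (the pattern and variable names are made up here), each match can be destructured into the full match and its groups:

```javascript
const bookTitle = "Atomic Habits An Easy & Proven Way to Build Good Habits & Break Bad Ones";
const afterAmpersand = /& (\w+)/g; // the word that follows each "&"

for (const match of bookTitle.matchAll(afterAmpersand)) {
  const [full, word] = match; // full match first, then capturing groups
  console.log(`"${full}" → ${word} at index ${match.index}`);
}
// Output → "& Proven" → Proven at index 22
//          "& Break" → Break at index 56

// Spreading the iterable gives a regular array of match objects.
const matches = [...bookTitle.matchAll(afterAmpersand)];
const wordsAfter = matches.map((m) => m[1]);
console.log(wordsAfter); // Output → ['Proven', 'Break']
```

Note that the iterable can only be consumed once, which is why matchAll is called again for the array.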
str.replace(str|regExp, str|function) :
If you want to not only search and match but also replace strings, the replace() method will do the job.
Without the g flag (or when the first argument is a plain string), only the first occurrence is replaced:
const date = "26-02-2022";
// replace the first dash with a slash
const result = date.replace("-", "/");
console.log(result);
// Output → 26/02-2022
With /g, all occurrences are replaced:
const date = "26-02-2022";
// replace all dashes with slashes; the pattern must be a regex literal, not the string "/-/g"
const result = date.replace(/-/g, "/");
console.log(result);
// Output → 26/02/2022
💪🏻 The real power of replace comes into play when you refer to matched groups in the replacement string, or pass a function as the second parameter.
For example: say you have a comma-separated list of author names in the format Lastname Firstname. If you want to swap these names and remove the commas to get the Firstname Lastname format, one name per line, you can use the following code:
const authors = 'Clear James, Holiday Ryan, Housel Morgan';
// \w → alphanumeric character; ",? ?" → allows an optional comma and an optional space after each name
const authorRegExp = /(\w+) (\w+),? ?/g;
const rearrangeName = authors.replace(authorRegExp, "$2 $1\n");
console.log(rearrangeName);
// Output → James Clear
// Ryan Holiday
// Morgan Housel
Suppose you want to replace some words in a quote with UPPERCASE to highlight them. You can pass a function as the second argument.
const str = "Aim for the moon. If you miss, you may hit a star.";
const result = str.replace(/moon|star/gi, (word) => word.toUpperCase()); // the pipe character (|) denotes a choice
console.log(result);
// Output → Aim for the MOON. If you miss, you may hit a STAR.
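The replacement function also receives the capturing groups as extra arguments after the full match. Here is a minimal sketch of that, using made-up sample strings:

```javascript
// Arguments: full match first, then one argument per capturing group.
const shortDate = "26-02-2022";
const isoDate = shortDate.replace(
  /(\d{2})-(\d{2})-(\d{4})/,
  (match, day, month, year) => `${year}-${month}-${day}`
);
console.log(isoDate); // Output → 2022-02-26

// A function can also compute the replacement, which "$1"-style strings cannot do.
const order = "3 apples and 12 pears";
const doubled = order.replace(/\d+/g, (digits) => String(Number(digits) * 2));
console.log(doubled); // Output → 6 apples and 24 pears
```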
regExp.exec(str) : Capturing groups
The method regExp.exec(str) returns a match for regExp in the string str. Unlike the previous methods, it's called on a regex, not on a string.
There are 2 ways in which exec works:
If the regex doesn't have the flag g: getting a match object for the first match
If there's no g, then regExp.exec(str) returns the first match, just like str.match(regExp).
const str = "You must MAKE a change to SEE a change";
const regExp1 = /change/;
console.log(regExp1.exec(str));
// Output → {
//   0: "change"
//   groups: undefined
//   index: 16
//   input: "You must MAKE a change to SEE a change"
//   length: 1
// }
If the regex has the flag g: you can loop over matches
- A call to regExp.exec(str) returns the first match and saves the position immediately after it in the property regExp.lastIndex.
- The next such call starts the search from position regExp.lastIndex, returns the next match and saves the position after it in regExp.lastIndex.
- …And so on.
If there are no more matches, regExp.exec returns null and resets regExp.lastIndex to 0.
const str = "You must MAKE a change to SEE a change";
const regExp = /change/ig;
let match;
while ((match = regExp.exec(str)) !== null) {
  console.log(`Found ${match[0]} at ${match.index}, Next starts at ${regExp.lastIndex}.`);
}
// Output → Found change at 16, Next starts at 22.
// Found change at 32, Next starts at 38.
Before the method str.matchAll [ES2020] was added to JavaScript, calls of regExp.exec in a loop were used to get all matches with groups.
regExp.test(str) : Is there a match?
The method regExp.test(str) looks for at least one match and returns true if one is found, otherwise false.
const str = "Hello World";
const regExp1 = /hello/i; // i → case-insensitive
console.log(regExp1.test(str));
// ✅ Output → true
const regExp2 = /script/;
console.log(regExp2.test(str));
// ❌ Output → false
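One thing to watch out for: like exec, test uses and updates lastIndex when the regex has the g flag, so calling it repeatedly on the same regex object can give surprising results. A small sketch (the names are made up for illustration):

```javascript
const greeting = "Hello World";
const globalRegExp = /hello/gi;

// Each call continues from where the previous one stopped.
const results = [
  globalRegExp.test(greeting), // match found, lastIndex becomes 5
  globalRegExp.test(greeting), // search starts at 5, nothing found, lastIndex resets to 0
  globalRegExp.test(greeting), // starts from 0 again, match found
];
console.log(results); // Output → [true, false, true]

// Without the g flag, test always starts from the beginning.
const plainRegExp = /hello/i;
const plainResults = [plainRegExp.test(greeting), plainRegExp.test(greeting)];
console.log(plainResults); // Output → [true, true]
```

So for a simple yes/no check, leaving out the g flag is usually the safer choice.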
Look-around :
There are many cases where you want to find a pattern only if it is followed or preceded by another pattern. There is a special syntax for this, called “lookahead” and “lookbehind”, together referred to as “look-around”.
Lookahead: matches a pattern only depending on what follows it
Positive Lookahead: (?=«pattern») matches if pattern matches what comes next.
const regExp = /James(?= Clear)/;
const str = "James is a writer.";
console.log(regExp.test(str));
// ❌ Output → false
const str1 = "James Clear is the author of the bestselling book.";
console.log(regExp.test(str1));
// ✅ Output → true
Negative Lookahead: (?!«pattern») matches if pattern does not match what comes next.
const regExp = /James(?! Clear)/;
const str = "James is a writer.";
console.log(regExp.test(str));
// ✅ Output → true
const str1 = "James Clear is the author of the bestselling book.";
console.log(regExp.test(str1));
// ❌ Output → false
Lookbehind: matches a pattern only depending on what precedes it
Positive Lookbehind: (?<=«pattern») matches if pattern matches what came before.
const regExp = /(?<=James) Clear/;
const str = "James is a writer.";
console.log(regExp.test(str));
// ❌ Output → false
const str1 = "James Clear is the author of the bestselling book.";
console.log(regExp.test(str1));
// ✅ Output → true
Negative Lookbehind: (?<!«pattern») matches if pattern does not match what came before.
const regExp = /(?<!James) Clear/;
const str = "Ray Clear is a writer.";
console.log(regExp.test(str));
// ✅ Output → true
const str1 = "James Clear is the author of the bestselling book.";
console.log(regExp.test(str1));
// ❌ Output → false
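The key point of look-around is that the surrounding pattern is checked but not included in the match. The sketch below, with made-up sample strings, uses that to pull out only the numbers:

```javascript
// Lookahead: digits that are followed by "kg" (the "kg" itself is not part of the match).
const weights = "10kg 20cm 30kg".match(/\d+(?=kg)/g);
console.log(weights); // Output → ['10', '30']

const receipt = "Book: $25, shipping: $4, pages: 320";

// Lookbehind: digits that come right after a dollar sign.
const prices = receipt.match(/(?<=\$)\d+/g);
console.log(prices); // Output → ['25', '4']

// Negative lookbehind: digits NOT preceded by a dollar sign.
// The \d inside the set stops the match from starting in the middle of "25".
const nonPrices = receipt.match(/(?<![$\d])\d+/g);
console.log(nonPrices); // Output → ['320']
```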
Backtracking
As the name suggests, the regex engine can step back. When entering a branch, it remembers its current position so that it can go back and try another branch if the current one does not work out. (This is not the same thing as a back-reference, which refers to a captured group.)
To find a match, the regex engine will consume characters one by one. When a partial match begins, the engine will remember the start position so it can go back in case the following characters don't complete the match.
- If the match is complete, there is no backtracking.
- If the match isn't complete, the engine will backtrack the string (like when you rewind an old tape) to try to find a whole match.
Let's take one example: \d{2}[a-z]{2}, where the first two characters must be digits (\d), followed by two characters in the range a-z. Let's try to match it against the string abc123def. The engine first tries 12, finds that the next character 3 is not a letter, goes back, and then succeeds with 23de.
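The result of this example can be checked in code. This only shows the final match, not the engine's internal steps:

```javascript
const tracePattern = /\d{2}[a-z]{2}/;
const traced = "abc123def".match(tracePattern);

// Starting at "1": \d{2} takes "12", but "3" is not in [a-z], so the attempt fails.
// Starting at "2": \d{2} takes "23", then [a-z]{2} takes "de", so the match succeeds.
console.log(traced[0]);    // Output → 23de
console.log(traced.index); // Output → 4
```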
⚠️ In the above example there is only one backtracking step, which is fine. But some regular expressions look simple and yet can take a very long time to execute, and can even “hang” the JavaScript engine. In that case, the web browser suggests killing the script and reloading the page. Not a good thing for sure.
For server-side JavaScript, such a regExp may hang the server process, which is even worse. So be careful with backtracking.
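A classic shape that can cause this is a quantifier nested inside another quantifier. The sketch below is only an illustration: the patterns and the helper name are made up, and the input is kept short on purpose so that it finishes quickly. In a backtracking engine, every extra digit in the input can roughly double the work for the first pattern, so the timings will differ from machine to machine.

```javascript
// Nested quantifiers: (\d+)* can split a run of digits in very many ways.
const risky = /^(\d+)*$/;
// Same intent without nesting: the whole string is zero or more digits.
const safe = /^\d*$/;

// 18 digits followed by a letter, so neither pattern can match.
const input = "123456789012345678z";

// Hypothetical helper: runs test() and reports how long it took.
function timeTest(regExp, text) {
  const start = Date.now();
  const matched = regExp.test(text);
  return { matched, ms: Date.now() - start };
}

const riskyResult = timeTest(risky, input);
const safeResult = timeTest(safe, input);

console.log(`risky: ${riskyResult.matched} in ${riskyResult.ms} ms`);
console.log(`safe: ${safeResult.matched} in ${safeResult.ms} ms`);
```

Both calls return false; the difference is only in how much work the engine may do before giving up.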
Using REGEX in VSCode
VSCode has a nice feature in its search tool: it can search using regular expressions. Press cmd+f (on a Mac, or ctrl+f on Windows) to open the search tool, and then press cmd+option+r (alt+r on Windows) to enable regex search.
For example, say you have a large .json file in which you want to change the date format. You can replace all the dates using the VSCode regex search and replace option.
Regex to find all dates in the YYYYMMDD format: ((19|20)[0-9]{2})(0[0-9]|1[0-2])([0-2][0-9]|3[0-1]). It captures the year in group 1, the month in group 3 and the day in group 4 (group 2 is the nested century, 19 or 20). Replace it with the group references $4-$3-$1 to get DD-MM-YYYY.
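The same find and replace can be tried out in JavaScript before running it on a real file. This is a sketch with made-up sample data. Groups are numbered by their opening parenthesis, so the full year is $1, the nested century (19 or 20) is $2, the month is $3 and the day is $4.

```javascript
const dateRegExp = /((19|20)[0-9]{2})(0[0-9]|1[0-2])([0-2][0-9]|3[0-1])/g;

// Hypothetical file content with dates in YYYYMMDD format.
const json = '{"created": "20220226", "updated": "19991231"}';

// $4 = day, $3 = month, $1 = year
const converted = json.replace(dateRegExp, "$4-$3-$1");
console.log(converted);
// Output → {"created": "26-02-2022", "updated": "31-12-1999"}
```

Keep in mind that this pattern matches any eight digits that look like a date, so it is worth reviewing the matches before replacing them all.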
Conclusion
So, this concludes Regular Expression (Zero to Hero). There are many things you can try with regex. I tried my best to explain the idea, and I hope you learned something new today.
Happy Learning 👩🏼💻
References and Learning Resources
Book:
https://eloquentjavascript.net/09_regexp.html
Articles:
https://fireship.io/lessons/regex-cheat-sheet-js/
https://javascript.info/regular-expressions
https://flaviocopes.com/javascript-regular-expressions
https://www.janmeppe.com/blog/regex-for-noobs/
Learning playground:
https://regexlearn.com/playground
https://www.freecodecamp.org/learn/javascript-algorithms-and-data-structures/regular-expressions/
It is possible that I forgot to mention some references.