Regex Matching
Question (LC.10)
Match a given string with a regex pattern '.' Matches any single character '*' Matches zero or more of the preceding element
Example
("a", "ab*") => true
("", "b****") => invalid regex assume never happens
("aabceef", "*a*") => false
("minutes","m?nu*s") => falseDP Approach
This question is pretty much wildcard matching with a change of power on *. Here * is no longer a wildcard but a count of a capture group. For example, if we want to capture two lower case letters, we can do [a-z][a-z]. If we want to capture 0 to many lower case letters, we can do [a-z]*.
1. Define optimal subproblem
RX[i,j] = whether A[0:i] matches with B[0:j]
2. Solve by prefix
Case 1 a-z:
if A[i] == B[j]
RX[i,j] = RX[i-1,j-1] // match one and done
else
RX[i,j] = false
Case 2 .:
if B[j] == '.'
RX[i,j] = RX[i-1,j-1] // match one and done
Case 3:
if B[j] == '*'
Case 3.1:
if A[i] == B[j-1] || A[i-1] == '.'
RX[i,j] = RX[i,j-2] // skip ex. A* = ""
RX[i,j] = RX[i,j-1] // match one and done ex. A* = A
RX[i,j] = RX[i-1,j] // match more than one ex. A* = AAA
Case 3.2:
if A[i] != B[j-1]
RX[i,j] = RX[i,j-1] // skip
3. Init base case
RX[0,0] = true
Since each '*' can eliminate the charter before it,
the first row needs to be updated by the one before previous.
for j from 2 to n // j = 1 will be false
RX[0][j] = RX[0][j-2] and B[j-1] == '*'
// first col auto init to false
4. Topo order
row by row
for i from 1 to m
for j from 1 to n
5. Final answer
RX[m,n]DP Code
Time & Space Complexity
O(n^2)
Last updated