Wildcard Matching
Last updated
Last updated
Match a given string with a wildcard pattern
'?' Matches any single character
'*' Matches any sequence of characters (including empty)
From the usage of '*', we can clearly see there are overlapping problems. Also this problem is asking doability so 0 or 1 edge can translate into an optimization problem. But what's wrong with this DP solution?
The way it initializes the base cases obviously doesn't match with the definition of the subproblem. Yes, in 2D DP we normally initialize the first col and the first row. But always draw an example to made sure if it actually makes sense in the context of the question. So WD[0][0]
should be true
because an empty string matches up with an empty string. Then the rest of the first column should be all false
because empty string cannot match with anything else. The first row is more interesting. Consider case ""
and "******"
. The first row should be all true
in this set up. We want to keep the first row
true until we hit the first non-star character.
A second issue is the recurrence when B[j]
is star. Think about the definition of the subproblem. Consider this example "lol"
and "**"
. The current recurrence only gives WD[1,0]
. The intention of the current recurrence is clear - it tries to match 1 or more characters with one *
symbol. But we are doing DP. We are not solving these subproblems. They are already solved. All the dependencies are cached and ready to be reused. We only care about the current step (or the last step if we solve by prefix). Then, what are the possibilities for the *
in the current step? 1. It can stand by do nothing pass l
to rely on its previous buddy to match it 2. Or I can match with the l
to get the job done and call it a day (just as the average Joe ?
) 3. I need to match with l
and might need to match more in the future so you need to pay me overtime. (Or in the context of edit distance, we are deleting the character from s1) The third option might be hard to come up right away. Solving by prefix guarantees us that any previous subproblems are solved and taken care of, but it doesn't guarantee us any subproblems afterward.
Time and space are both O(mn)
. Space can be further reduced to O(m)
.