Arjan's World: Regular Expressions *Can* Be Your Friend If You Treat Them Well

Monday, October 10, 2005

Regular Expressions *Can* Be Your Friend If You Treat Them Well

Regular expressions always look like Perl to me... Incomprehensible.


Today I was looking for a way to handle some URLs in text to be displayed on a webpage. This webpage is fed some old input containing web links that could not be changed. The not-too-difficult task ahead was to rewrite these old URLs, which follow a predictable scheme, so that they automatically appear correctly on the new page according to the new scheme (the old text could not simply be changed with a find-and-replace action because it must remain available to the old application). It had been some time since I last used regexps, and I can say I learned what a greedy regular expression is by working with one :)

The expression "/pathtourl/.*?/" did the trick. A first attempt did not include the ?, which made the expression greedy: it keeps on matching up to the last / character it can find. In my case that is normally the one in the anchor closing tag </a>. That way the complete URL plus the part between the tags, up to and including '</', is replaced, leading to some very invalid HTML.
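To make the difference concrete, here is a minimal sketch in Python (not necessarily the language the original page used). The sample link and the replacement "/newpath/" are made up for illustration; only the two patterns come from the post.

```python
import re

# Hypothetical sample input; the real URL scheme is not shown in this post.
html = '<a href="/pathtourl/page1/index.html">Page 1</a>'

# Greedy: .* runs to the end of the line, then backs up to the LAST '/',
# which is the one inside the closing tag </a>.
greedy = re.sub(r"/pathtourl/.*/", "/newpath/", html)

# Non-greedy: .*? stops at the FIRST '/' it can reach after the prefix.
lazy = re.sub(r"/pathtourl/.*?/", "/newpath/", html)

print(greedy)  # <a href="/newpath/a>
print(lazy)    # <a href="/newpath/index.html">Page 1</a>
```

The greedy result swallows the rest of the URL, the link text, and the '</' of the closing tag, which is exactly the broken HTML described above.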

So, as I mentioned, the '?' did the trick... However, I found that not all URLs in the text adhered to this scheme. Some did not have a further '/' in the URL after the /pathtourl/ part, so even the non-greedy expression ran on until the '/' of the closing tag, leading to the same situation as described above :)
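A small Python sketch of this second failure follows. The sample links and the replacement "/newpath/" are invented, and the negated character class is only one possible way around the problem, not the fix I actually ended up with. It assumes the URL segment never contains a slash, a double quote, whitespace, or an angle bracket.

```python
import re

# Hypothetical sample links; the real URL scheme is not shown in this post.
with_slash = '<a href="/pathtourl/page1/index.html">Page 1</a>'
without_slash = '<a href="/pathtourl/page1">Page 1</a>'

# The non-greedy pattern from the post.
lazy = re.compile(r"/pathtourl/.*?/")

# Assumed alternative: limit the match to characters that can belong to
# one URL segment, and make the trailing '/' optional.
bounded = re.compile(r'/pathtourl/[^/"\s<>]*/?')

# With no further '/' in the URL, .*? keeps going until the '/' in </a>.
lazy_result = lazy.sub("/newpath/", without_slash)

# The bounded pattern cannot leave the URL, with or without a further '/'.
bounded_result = bounded.sub("/newpath/", without_slash)
bounded_with_slash = bounded.sub("/newpath/", with_slash)

print(lazy_result)         # <a href="/newpath/a>
print(bounded_result)      # <a href="/newpath/">Page 1</a>
print(bounded_with_slash)  # <a href="/newpath/index.html">Page 1</a>
```

The idea is that a non-greedy quantifier only changes where the match prefers to stop, while a negated character class restricts what it is allowed to cross at all.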

Sometimes I feel like a bad bugfixer when creating regexps: fix the expression and see another bug pop up...
