tsJensen

A quest for software excellence...

.NET Regex Fixed on x64

The Bug - Submitted June 5, 2006
On June 5, 2006, I submitted a bug to Microsoft through their Connect site, expecting it to be an exercise in futility. Who would listen to me? Here's what I submitted to them.

Title: Long pattern string results in race condition on x64 system but not x86 system
A long pattern string in a Regex constructor works fine on my x86 Windows XP development machine but results in a race condition that eats RAM very fast until an "out of memory" condition occurs and the process is killed on the x64 Windows Server 2003 machine. In the steps below, I will cut and paste the code which resulted in the condition. The input exceeds 2000 characters, so I will remove some of the lines that concatenate the pattern string, but one can easily add additional lines to achieve the result that I experienced. To work around it, I simply split the Regex into 19 Regex objects, and that resolved the problem.

This happened after I developed a pattern to remove unwanted words from text before running the keyword extraction algorithm I wrote, which is based on this whitepaper by a brilliant young researcher named Yutaka Matsuo and the honorable professor Mitsuru Ishizuka at the University of Tokyo.
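For anyone wondering what the "split it into several Regex objects" workaround might look like, here is a minimal sketch. It is not the code from the bug report, which isn't reproduced here. The class name, method names, and chunk size are all hypothetical, and it assumes the unwanted words are available as a simple list. Instead of one very long alternation pattern, it builds one short pattern per chunk of words and applies them in turn.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only: not the original code.
public static class StopWordStripper
{
    // Builds one small Regex per chunk of words instead of a single
    // very long alternation pattern.
    public static List<Regex> BuildChunkedPatterns(IList<string> words, int chunkSize)
    {
        if (chunkSize < 1) throw new ArgumentOutOfRangeException("chunkSize");

        var patterns = new List<Regex>();
        for (int i = 0; i < words.Count; i += chunkSize)
        {
            // Escape each word so it is matched literally.
            var escaped = new List<string>();
            int end = Math.Min(i + chunkSize, words.Count);
            for (int j = i; j < end; j++)
            {
                escaped.Add(Regex.Escape(words[j]));
            }

            // Match any word in this chunk as a whole word, ignoring case.
            string pattern = @"\b(?:" + string.Join("|", escaped.ToArray()) + @")\b";
            patterns.Add(new Regex(pattern, RegexOptions.IgnoreCase));
        }
        return patterns;
    }

    // Applies every pattern in turn, removing each match from the text.
    public static string RemoveWords(string text, IEnumerable<Regex> patterns)
    {
        foreach (Regex regex in patterns)
        {
            text = regex.Replace(text, string.Empty);
        }
        return text;
    }
}
```

Calling `StopWordStripper.BuildChunkedPatterns(words, 50)` once and reusing the resulting list keeps each individual pattern short. The trade-off is one pass over the text per chunk rather than a single pass.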

Fixed - July 7, 2006
I received an email from the Connect system to let me know the bug had been fixed. Of course, it was a nice, poorly formatted plain text email spit out by the system. Where's the love? Still, I was impressed that the system (hence somebody who programmed the system) bothered to send a note to let me know that the bug was confirmed and fixed.

I'm sure a million or more of my readers have already enjoyed this experience, but for those few who have not, I thought I'd share. So don't hold back. When you find a bug in the framework or just want to complain, visit the Connect site and see what happens.