Java Regex Speed
There are all sorts of variations around I/O and so on, but my finding is that for this problem, the Java 1.4.2 regex processing is somewhere around twice as fast as Perl 5.8.1. Frankly, I’m astounded.
3 Comments
Behold, the power of UTF-8!
Tim's likely running over Unicode data. Perl 5 stores Unicode in UTF-8, a variable-width encoding. It's really, really inefficient to access by character position, though it does take up very little space. Java uses UTF-16, which is, for most practical purposes, a fixed-width format. (And yes, I know about combining characters and alternate planes and such.) I fully expect the place Perl buys it big time is in the code that has to do character boundary checking. (This is one of the reasons Parrot's going with a fixed-width encoding scheme. Variable-width schemes suck.)
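To make the access-cost point concrete, here is a minimal Java sketch. It is not taken from Perl's or Java's internals. The class name `EncodingAccessSketch`, the helper `utf8OffsetOfChar`, and the sample string are all made up for illustration, and the helper assumes well-formed UTF-8 input. Finding the n-th character in UTF-8 bytes means walking every lead byte before it, while a UTF-16 string can be indexed directly (ignoring surrogate pairs, as noted above).

```java
import java.nio.charset.StandardCharsets;

class EncodingAccessSketch {
    // Hypothetical helper: byte offset where the n-th code point (0-based)
    // starts in well-formed UTF-8 data. It must scan from the beginning,
    // because each character occupies between 1 and 4 bytes.
    static int utf8OffsetOfChar(byte[] utf8, int n) {
        int offset = 0;
        for (int seen = 0; seen < n; seen++) {
            int lead = utf8[offset] & 0xFF;
            if (lead < 0x80) {
                offset += 1;        // ASCII
            } else if (lead < 0xE0) {
                offset += 2;        // 2-byte sequence
            } else if (lead < 0xF0) {
                offset += 3;        // 3-byte sequence
            } else {
                offset += 4;        // 4-byte sequence
            }
        }
        return offset;
    }

    public static void main(String[] args) {
        String text = "na\u00EFve caf\u00E9";   // two non-ASCII characters
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);

        // UTF-16 (Java String): constant-time indexing by code unit.
        char direct = text.charAt(6);

        // UTF-8: linear scan to find where character 6 begins.
        int offset = utf8OffsetOfChar(utf8, 6);

        System.out.println("charAt(6) = " + direct);
        System.out.println("UTF-8 byte offset of character 6 = " + offset);
    }
}
```

A regex engine that checks character boundaries repeatedly pays some version of that scan (or extra bookkeeping to avoid it) on variable-width data.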
Dan,
If we call a validation method some 1,000,000 times, I have found a tenfold difference in speed between a normal (hand-written) validation method and one using Regex. Is there any way I can speed up Regex? I need it for validations.
Thanks,
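One thing worth checking for the question above, offered as a sketch and not as a measured fix: `String.matches` and `Pattern.matches` compile the pattern on every call, so a validator that runs a million times may spend much of its time recompiling. Compiling once and reusing the `Pattern` avoids that. The class name `PrecompiledValidator` and the digits-only rule are hypothetical stand-ins for whatever validation is actually being done.

```java
import java.util.regex.Pattern;

class PrecompiledValidator {
    // Hypothetical rule for illustration: 1 to 10 digits.
    private static final String RULE = "\\d{1,10}";

    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern DIGITS = Pattern.compile(RULE);

    // Recompiles the regex on every call.
    static boolean isValidSlow(String s) {
        return s.matches(RULE);
    }

    // Reuses the precompiled pattern; only a new Matcher is created per call.
    static boolean isValidFast(String s) {
        return DIGITS.matcher(s).matches();
    }

    public static void main(String[] args) {
        int calls = 1_000_000;
        int accepted = 0;

        long t0 = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            if (isValidSlow("12345")) accepted++;
        }
        long t1 = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            if (isValidFast("12345")) accepted++;
        }
        long t2 = System.nanoTime();

        // Rough timings only; a real benchmark needs warm-up and repeated runs.
        System.out.println("accepted: " + accepted);
        System.out.println("recompiling each call: " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("precompiled pattern:   " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

How much this helps depends on the pattern and the JVM, so it is worth timing both versions against the real validation rules before drawing conclusions.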