Wednesday, August 16, 2006

Byte-lexing JRuby.

Since I began the effort to recreate our lexer with JFlex, Charles, Wes Nakamura and I have discussed how MRI's lexer can be so much faster than JRuby's. One suggestion was that JRuby lexes with Readers instead of handling bytes directly. I didn't believe this would make a big difference, but I set out to test it anyway. Basically, I replaced all references to chars with bytes, and Readers with InputStreams, in our Lexer. A few other places had to be changed as well. I haven't done anything else to the lexer (there are many places where we could do some good optimization).

The results amazed me. My first test case focused entirely on the parsing step; there is no evaluation in these numbers. First, trunk JRuby:

Did 10_000 parses in 4336
Did 10_000 parses in 3885
Did 10_000 parses in 3866
Did 10_000 parses in 3835
Did 10_000 parses in 3836
Did 10_000 parses in 3815
Did 10_000 parses in 3836
Did 10_000 parses in 3896
Did 10_000 parses in 3845
Did 10_000 parses in 3926
-- Full time for 1_000_000 parses: 39076

And with my byte enhancements:

Did 10_000 parses in 3595
Did 10_000 parses in 3214
Did 10_000 parses in 3215
Did 10_000 parses in 3205
Did 10_000 parses in 3204
Did 10_000 parses in 3184
Did 10_000 parses in 3205
Did 10_000 parses in 3235
Did 10_000 parses in 3204
Did 10_000 parses in 3215
-- Full time for 1_000_000 parses: 32476

These times show the total dropping from 39076 to 32476. That is about 17% less time per parse, or equivalently about 20% more parses in the same time, just by working with raw bytes instead of characters.
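To make the difference between the two approaches concrete, here is a minimal sketch of the same trivial scan written once against a Reader and once against an InputStream. This is not JRuby's lexer: the class and method names are made up for illustration, and the byte version assumes plain ASCII source.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; none of these names come from JRuby.
final class ScanSketch {
    // Character-based scan: the Reader decodes bytes into chars before we see them.
    static int countWordsWithReader(Reader in) throws IOException {
        int words = 0;
        boolean inWord = false;
        for (int c = in.read(); c != -1; c = in.read()) {
            boolean wordChar = Character.isLetterOrDigit((char) c) || c == '_';
            if (wordChar && !inWord) {
                words++;
            }
            inWord = wordChar;
        }
        return words;
    }

    // Byte-based scan: no decoding step, each byte is compared directly.
    // Assumption: the source is ASCII, so one byte is one character.
    static int countWordsWithStream(InputStream in) throws IOException {
        int words = 0;
        boolean inWord = false;
        for (int b = in.read(); b != -1; b = in.read()) {
            boolean wordByte = (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z')
                    || (b >= '0' && b <= '9') || b == '_';
            if (wordByte && !inWord) {
                words++;
            }
            inWord = wordByte;
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        byte[] source = "def foo_bar(x) x + 1 end".getBytes(StandardCharsets.US_ASCII);
        Reader reader = new InputStreamReader(
                new ByteArrayInputStream(source), StandardCharsets.US_ASCII);
        System.out.println("Reader words: " + countWordsWithReader(reader));
        System.out.println("Stream words: " + countWordsWithStream(new ByteArrayInputStream(source)));
    }
}
```

Both methods find the same words in ASCII input. The Reader version pays for a charset decoding layer on every character, which is one plausible source of the overhead measured above.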

After I integrated the changes and fixed a few bugs, I could finally test the gain in regular JRuby evaluation speed. My test case this time consists of requiring webrick and IRB. First, trunk JRuby:

real 0m5.191s
user 0m0.000s
sys 0m0.010s

And with the byte-patch:

real 0m4.743s
user 0m0.010s
sys 0m0.010s

So, for this particular test case, we get about 5-8%, depending on circumstances. I would never have imagined that it was so expensive to work with Readers instead of InputStreams. The question now is: is it worth it? Should we sacrifice full Unicode (or at least non-astral plane) reading for speed? I'm not sure.
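The Unicode cost of going byte-based can be shown with a small sketch. It assumes the source file is UTF-8 encoded (an assumption, not something the lexer dictates), and the class and method names are hypothetical. An InputStream hands the lexer one unit per byte, while a Reader hands it one unit per Java char.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of how many units a lexer sees per character.
final class UnicodeTradeoffSketch {
    // Units seen when reading UTF-8 source through an InputStream.
    static int byteCount(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Units seen when reading through a Reader (Java chars, i.e. UTF-16 code units).
    static int charCount(String s) {
        return s.length();
    }

    // Actual number of Unicode characters (code points).
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String bmp = "\u00e5";                                  // a-ring, inside the basic plane
        String astral = new String(Character.toChars(0x1D11E)); // musical G clef, astral plane
        for (String s : new String[] {bmp, astral}) {
            System.out.println("bytes=" + byteCount(s)
                    + " chars=" + charCount(s)
                    + " codePoints=" + codePointCount(s));
        }
    }
}
```

A byte-based lexer sees the basic-plane character as two separate bytes, whereas a Reader delivers it as a single char. The astral character is split either way: four bytes through an InputStream, two chars (a surrogate pair) through a Reader. That is why the Reader approach only covers the non-astral planes directly.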
