Behdad Esfahbod
2003-11-10 05:36:19 UTC
Hi,
I was tweaking the grep patch while also following a thread on
another list, which showed that a simple text-intensive program
(called hspell) is much slower on Red Hat 9 than on Red Hat 8.
Investigation so far suggests it is all caused by /lib/tls:
switching to /lib/i686 makes things go much faster. Any idea why?
And it's not a multi-threaded application.
So I focused on sed, which is pretty slow in non-C locales :-(
[***@mces behdad]$ echo $LANG
en_US.UTF-8
[***@mces behdad]$ ll /bin/ls
-rwxr-xr-x 1 root root 73460 Oct 12 04:50 /bin/ls
[***@mces behdad]$ time sed -e 's/./x/g' /bin/ls > /dev/null
real 0m4.248s
user 0m3.800s
sys 0m0.000s
[***@mces behdad]$ time LANG=C sed -e 's/./x/g' /bin/ls > /dev/null
real 0m0.180s
user 0m0.050s
sys 0m0.000s
[***@mces behdad]$
And /bin/ls is only about 72 KB!
You will have noticed that /bin/ls is not valid UTF-8. And
/bin/ls is very small: if you run sed on a bigger piece of
garbage (encoding-wise, of course), it's hard not to get a
segfault. I have reported that here:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=109606
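For anyone who wants to tell the two cases apart (valid UTF-8
input vs. garbage), here is a minimal sketch of a validity check.
The function name is_valid_utf8 is just an illustrative name, and
it assumes an iconv command is installed; it relies on iconv
exiting non-zero when it meets a byte sequence that is not valid
in the source encoding.

```shell
# Hypothetical helper: succeeds only if the file decodes as UTF-8.
# Assumes iconv is on PATH; converting UTF-8 to UTF-8 makes iconv
# fail on the first invalid byte sequence.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Example use:
#   is_valid_utf8 /bin/ls || echo "/bin/ls is not valid UTF-8"
```

Running the sed test once on a file that passes this check and
once on one that fails should show whether the slowdown depends
on the invalid input or on the locale alone.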
Any ideas? Could someone check whether the same caching can be
done here?
behdad