Bug 15851 - Sed ranges are broken in non-C locales
Status: NEW
Product: ALT Linux Desktop
Component: bugs
Version: 4.0.2
Platform: all Linux
Importance: P2 normal
Reported: 2008-05-30 18:47
Modified: 2008-06-13 13:00


Description From 2008-05-30 18:47:52
When I use sed ranges (like [a-z]) in locales other than "C", uppercase letters
are matched by lowercase ranges.
In the "C" locale this problem does not appear.

In other Linux distributions (tested on Debian, Gentoo, Fedora) lowercase ranges
match only lowercase letters, as one would expect. It seems this is because they
configure sed with --without-included-regex, while ALT does not.

Other utilities (like grep and tr) are not affected by this problem.

Steps to Reproduce:
1. Make sure that the locale is not "C" (for example "ru_RU.UTF8")
2. Run: echo abcdABCD | sed -e "s/[a-c]/0/g"
Actual Results:  
000d00CD

Expected Results:  
000dABCD
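
The steps above can be wrapped in a small script that compares the result in the "C" locale with the result in another locale. This is only a sketch: the function name run_sed_range is hypothetical, and ru_RU.UTF-8 is assumed to be installed. If the named locale is missing, sed typically falls back to "C" and no difference will be reported.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original report): run the reporter's
# substitution under the locale given as $1 and print the result.
run_sed_range() {
    echo abcdABCD | LC_ALL="$1" sed -e 's/[a-c]/0/g'
}

# Baseline: in the C locale the range is defined by byte values.
run_sed_range C            # prints 000dABCD

# Compare against another locale; ru_RU.UTF-8 is assumed to be installed.
if [ "$(run_sed_range ru_RU.UTF-8)" = "$(run_sed_range C)" ]; then
    echo "range behaves the same as in the C locale"
else
    echo "range also matched other letters (the behaviour reported here)"
fi
```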
------- Comment #1 From 2008-05-30 19:42:29 -------
https://bugs.gentoo.org/show_bug.cgi?id=149526#c4

*** This bug has been marked as a duplicate of 13870 ***
------- Comment #2 From 2008-05-30 20:37:09 -------
(In reply to comment #1)
> https://bugs.gentoo.org/show_bug.cgi?id=149526#c4
> 
This problem has been solved in Gentoo (the --without-included-regex configure
option is used now).

Anyway, the sorting order should be the same for all utilities.

The following results are from the same locale:

[altlinux@localhost ~]$ echo abcdABCD | sed -e "s/[a-c]/0/g"
000d00CD
[altlinux@localhost ~]$ echo abcdABCD | grep -o "[a-c]"
a
b
c
[altlinux@localhost ~]$ echo abcdABCD | tr "[a-c]" 0
000dABCD

This can be reproduced in both the ru_RU.UTF-8 and ru_RU.KOI8-r locales, so it
is not a UTF-8 issue.
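
The comparison between the three utilities can also be sketched as a script that prints which characters of the test string each tool considers to be inside [a-c]. The function names and the LOCALE_TO_TEST variable are hypothetical. The sed and tr invocations are variations on the commands above (they delete everything outside the range instead of substituting), so treat the output as an illustration rather than an exact replay.

```shell
#!/bin/sh
# Hypothetical helpers: each prints the characters of "abcdABCD" that the
# tool treats as members of the range a-c under the locale given as $1.
matched_by_sed()  { echo abcdABCD | LC_ALL="$1" sed -e 's/[^a-c]//g'; }
matched_by_grep() { echo abcdABCD | LC_ALL="$1" grep -o '[a-c]' | tr -d '\n'; echo; }
matched_by_tr()   { echo abcdABCD | LC_ALL="$1" tr -cd 'a-c'; echo; }

# Locale to examine; defaults to C. Set e.g. LOCALE_TO_TEST=ru_RU.UTF-8
# (assumed to be installed) to look for the disagreement described above.
LOCALE_TO_TEST=${LOCALE_TO_TEST:-C}

for tool in sed grep tr; do
    printf '%-4s matched: %s\n' "$tool" "$(matched_by_$tool "$LOCALE_TO_TEST")"
done
```

If the three lines differ for a given locale, the utilities disagree about what the range contains in that locale.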
------- Comment #3 From 2008-06-13 13:00:03 -------
*** This bug has been confirmed by popular vote. ***