Bug 15851

Summary: Sed ranges are broken in non-C locales
Product: ALT Linux Desktop Reporter: Julia Jomantaite <juliette>
Component: bugs Assignee: Anton V. Boyarshinov <boyarsh>
Status: CLOSED WONTFIX QA Contact: Andrey Cherepanov <cas>
Severity: normal    
Priority: P2    
Version: 4.0.2   
Hardware: all   
OS: Linux   

Description Julia Jomantaite 2008-05-30 18:47:52 MSD
When I use sed ranges (like [a-z]) in locales other than "C", uppercase letters
are matched by lowercase ranges.
In the "C" locale this problem does not appear.

In other Linux distributions (tested on Debian, Gentoo, Fedora) lowercase ranges
match only lowercase letters, as one would expect. This seems to be because they
configure sed with --without-included-regex, while ALT does not.

Other utilities (like grep and tr) are not affected by this problem.
Steps to Reproduce:
1. Make sure that the locale is not "C" (for example "ru_RU.UTF8")
2. echo abcdABCD | sed -e "s/[a-c]/0/g"
Actual Results:  
000d00CD

Expected Results:  
000dABCD
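
A possible workaround, sketched under the assumption that byte-wise matching is acceptable for the input in question: force the "C" locale for the single sed invocation, or list the characters explicitly instead of using a range, which should not depend on collation order. The function names below are hypothetical and only serve the illustration.

```shell
#!/bin/sh
# Hypothetical helper: force the C locale for this one sed process only,
# so that [a-c] is interpreted as a plain byte range.
replace_range_c_locale() {
    printf '%s\n' "$1" | LC_ALL=C sed -e 's/[a-c]/0/g'
}

# Hypothetical helper: avoid the range entirely by enumerating the characters.
replace_explicit_set() {
    printf '%s\n' "$1" | sed -e 's/[abc]/0/g'
}

replace_range_c_locale abcdABCD   # prints 000dABCD
replace_explicit_set abcdABCD     # prints 000dABCD
```

Note that LC_ALL=C also changes how sed treats multibyte input, so the first variant is only suitable when the pattern and the data are plain ASCII.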
Comment 1 Andrey Rahmatullin 2008-05-30 19:42:29 MSD
https://bugs.gentoo.org/show_bug.cgi?id=149526#c4

*** This bug has been marked as a duplicate of 13870 ***
Comment 2 Julia Jomantaite 2008-05-30 20:37:09 MSD
(In reply to comment #1)
> https://bugs.gentoo.org/show_bug.cgi?id=149526#c4
> 
This problem has been solved in Gentoo (the --without-included-regex option is used now).

In any case, the sorting order should be the same for all utilities.

The following results are from the same locale:

[altlinux@localhost ~]$ echo abcdABCD | sed -e "s/[a-c]/0/g"
000d00CD
[altlinux@localhost ~]$ echo abcdABCD | grep -o "[a-c]"
a
b
c
[altlinux@localhost ~]$ echo abcdABCD | tr "[a-c]" 0
000dABCD

This can be reproduced in both the ru_RU.UTF-8 and ru_RU.KOI8-r locales, so it is
not a UTF-8 issue.
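
The comparison above could be scripted roughly as follows. This is only a sketch: the helper names are hypothetical, the locale to test is passed as an argument, and each helper counts how many characters of the sample the tool treats as belonging to [a-c].

```shell
#!/bin/sh
# Sample string from the report; it contains no '0', so counting zeros
# after the substitution gives the number of matched characters.
sample=abcdABCD

# Hypothetical helpers: $1 is the locale to run the tool under.
count_sed() {
    printf '%s\n' "$sample" | LC_ALL="$1" sed -e 's/[a-c]/0/g' | tr -cd '0' | wc -c | tr -d ' '
}
count_grep() {
    printf '%s\n' "$sample" | LC_ALL="$1" grep -o '[a-c]' | wc -l | tr -d ' '
}
count_tr() {
    printf '%s\n' "$sample" | LC_ALL="$1" tr 'a-c' '000' | tr -cd '0' | wc -c | tr -d ' '
}

# Print the three counts side by side for one locale.
compare_tools() {
    printf 'sed=%s grep=%s tr=%s\n' "$(count_sed "$1")" "$(count_grep "$1")" "$(count_tr "$1")"
}

compare_tools C   # prints sed=3 grep=3 tr=3
```

Running compare_tools ru_RU.UTF-8 on an affected system (with that locale installed) would be expected to show a larger count for sed only, matching the 000d00CD output above.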
Comment 3 Mikhail Gusarov 2008-06-13 13:00:03 MSD

    
Comment 4 Andrey Cherepanov 2020-07-09 12:03:20 MSK
Support for this distribution has ended. Please use newer versions.