Summary: | GROUP value printed in Russian in other locales | |
---|---|---|---
Product: | Sisyphus | Reporter: | imz <vanyaz>
Component: | rpm | Assignee: | placeholder <placeholder>
Status: | CLOSED FIXED | QA Contact: | qa-sisyphus
Severity: | normal | |
Priority: | P5 | CC: | at, glebfm, imz, ldv, placeholder, vt
Version: | unstable | |
Hardware: | all | |
OS: | Linux | |
Description
imz
2002-10-08 23:24:04 MSD
It's because of the algorithm implemented in rpm. lib/header.c:headerFindI18NString() checks environment variables in this order: LC_ALL, LC_MESSAGES, LANG.

What does "checks environment variables in this order" mean: do later variables override earlier ones, or vice versa? This still doesn't explain why, by changing only LANG and leaving the rest of the environment the same, we can get either English Group names (with invalid (?) LANG='', LANG=en and LANG=de) or Russian ones (LANG=de_AT.UTF-8):

    [imz@altair imz]$ LANG=de_AT.UTF-8 rpmquery -q minicom --qf=%{GROUP}\\n | iconv -f utf-8 -t koi8-r
    Коммуникации
    [imz@altair imz]$ LANG=de rpmquery -q minicom --qf=%{GROUP}\\n
    Communications
    [imz@altair imz]$

Here are the corresponding locale values: LC_MESSAGES is unset, LC_ALL is empty (also unset).

    [imz@altair imz]$ locale
    LANG=ru_RU.KOI8-R
    LC_CTYPE="ru_RU.KOI8-R"
    LC_NUMERIC="ru_RU.KOI8-R"
    LC_TIME="ru_RU.KOI8-R"
    LC_COLLATE="ru_RU.KOI8-R"
    LC_MONETARY="ru_RU.KOI8-R"
    LC_MESSAGES="ru_RU.KOI8-R"
    LC_PAPER="ru_RU.KOI8-R"
    LC_NAME="ru_RU.KOI8-R"
    LC_ADDRESS="ru_RU.KOI8-R"
    LC_TELEPHONE="ru_RU.KOI8-R"
    LC_MEASUREMENT="ru_RU.KOI8-R"
    LC_IDENTIFICATION="ru_RU.KOI8-R"
    LC_ALL=
    [imz@altair imz]$ LANG=de_AT.UTF-8 locale
    LANG=de_AT.UTF-8
    LC_CTYPE="de_AT.UTF-8"
    LC_NUMERIC="de_AT.UTF-8"
    LC_TIME="de_AT.UTF-8"
    LC_COLLATE="de_AT.UTF-8"
    LC_MONETARY="de_AT.UTF-8"
    LC_MESSAGES="de_AT.UTF-8"
    LC_PAPER="de_AT.UTF-8"
    LC_NAME="de_AT.UTF-8"
    LC_ADDRESS="de_AT.UTF-8"
    LC_TELEPHONE="de_AT.UTF-8"
    LC_MEASUREMENT="de_AT.UTF-8"
    LC_IDENTIFICATION="de_AT.UTF-8"
    LC_ALL=
    [imz@altair imz]$ LANG=de locale
    LANG=de
    LC_CTYPE="de"
    LC_NUMERIC="de"
    LC_TIME="de"
    LC_COLLATE="de"
    LC_MONETARY="de"
    LC_MESSAGES="de"
    LC_PAPER="de"
    LC_NAME="de"
    LC_ADDRESS="de"
    LC_TELEPHONE="de"
    LC_MEASUREMENT="de"
    LC_IDENTIFICATION="de"
    LC_ALL=
    [imz@altair imz]$

Several more strange examples:

    [imz@altair imz]$ LANG=de_AT.UTF-8 rpmquery -q minicom --qf=%{GROUP}\\n | iconv -f utf-8 -t koi8-r
    Коммуникации
    [imz@altair imz]$ LANGUAGE=de LANG=de_AT.UTF-8 rpmquery -q minicom --qf=%{GROUP}\\n | iconv -f utf-8 -t koi8-r
    Communications
    [imz@altair imz]$ LANGUAGE=de_AT.UTF-8 LANG=de_AT.UTF-8 rpmquery -q minicom --qf=%{GROUP}\\n | iconv -f utf-8 -t koi8-r
    Communications
    [imz@altair imz]$ echo $LANGUAGE
    ru_RU.KOI8-R
    [imz@altair imz]$

If we assume that LANGUAGE is the variable that specifies the language of the Group value (and neither LANG nor LC_*), then this test remains unexplained:

    [imz@altair imz]$ LANGUAGE=ru_RU.KOI8-R LANG=invalid rpmquery -q minicom --qf=%{GROUP}\\n
    Communications

I've planned to add LANGUAGE to the head of the list of variables rpm uses in that algorithm. rpm-4.1 also checks LANGUAGE. (The first non-empty variable from that list is used.)

After reading the info page on "Using gettextized software", I think that two parallel lists have to be used by rpm to get different information: (LANGUAGE,) LC_ALL, LC_MESSAGES, LANG should be examined to get the language (perhaps a short name like "de"), and LANGUAGE, LC_ALL, LC_CTYPE, LANG for the codeset to convert to.
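
The "first non-empty variable is used" rule and the two parallel lists can be sketched roughly as follows. This is a minimal Python illustration, not rpm's C code from lib/header.c; the names `first_nonempty`, `split_locale` and `pick_language_and_codeset` are hypothetical, the codeset list is assumed to omit LANGUAGE, and a colon-separated LANGUAGE value is returned unsplit.

```python
# Hypothetical sketch: variable lists as discussed in this report.
LANGUAGE_VARS = ("LANGUAGE", "LC_ALL", "LC_MESSAGES", "LANG")
CODESET_VARS = ("LC_ALL", "LC_CTYPE", "LANG")  # assumption: LANGUAGE left out


def first_nonempty(env, names):
    """Return the value of the first variable in `names` that is set and non-empty."""
    for name in names:
        value = env.get(name)
        if value:  # unset and empty are treated alike
            return value
    return None


def split_locale(value):
    """Split 'll_CC.CODESET@modifier' into (locale without codeset, codeset or None)."""
    base = value.partition("@")[0]
    name, _, codeset = base.partition(".")
    return name, (codeset or None)


def pick_language_and_codeset(env):
    """Look up the message language and the output codeset independently."""
    language = first_nonempty(env, LANGUAGE_VARS)
    codeset_source = first_nonempty(env, CODESET_VARS)
    codeset = split_locale(codeset_source)[1] if codeset_source else None
    return language, codeset
```

Passing `os.environ` as `env` would apply the same lookup to the current process environment. Keeping the two lookups separate is what lets the message language and the output encoding be set through different variables.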
I'm not sure whether LANGUAGE should be included in the second list. It seems that rpm does all this already (without LANGUAGE in the second list):

    [imz@altair imz]$ LANGUAGE=de LC_CTYPE=ru_RU.KOI8-R rpmquery -q minicom --qf=%{GROUP}\\n
    Communications
    [imz@altair imz]$ LANGUAGE=ru LC_CTYPE=ru_RU.KOI8-R rpmquery -q minicom --qf=%{GROUP}\\n
    Коммуникации
    [imz@altair imz]$ LANGUAGE=en LC_CTYPE=ru_RU.KOI8-R rpmquery -q minicom --qf=%{GROUP}\\n
    Communications
    [imz@altair imz]$ LANGUAGE=ru LC_CTYPE=de_DE.UTF-8 rpmquery -q minicom --qf=%{GROUP}\\n | iconv -f utf-8 -t koi8-r
    Коммуникации
    [imz@altair imz]$ LANGUAGE=ru LC_CTYPE=de rpmquery -q minicom --qf=%{GROUP}\\n | iconv -f utf-8 -t koi8-r
    Communications
    [imz@altair imz]$ LANGUAGE=ru LC_CTYPE=invalid rpmquery -q minicom --qf=%{GROUP}\\n | iconv -f utf-8 -t koi8-r
    Communications
    [imz@altair imz]$ LANGUAGE=de:ru LC_CTYPE=ru_RU.KOI8-R rpmquery -q minicom --qf=%{GROUP}\\n
    Коммуникации
    [imz@altair imz]$ LANGUAGE=de:ru_RU.UTF-8 LC_CTYPE=ru_RU.KOI8-R rpmquery -q minicom --qf=%{GROUP}\\n
    Коммуникации
    [imz@altair imz]$ LANGUAGE=de:ru_RU.UTF-8 LC_CTYPE=de_DE.ISO8859-1 rpmquery -q minicom --qf=%{GROUP}\\n
    ????????????

If LC_CTYPE is invalid, it outputs the English value.

I do not see any important misbehaviour. It seems it was my error to report this bug, because I didn't understand quite well how all the locale variables work together.

This concerns GROUP. I do see misbehaviour in the recoding of SUMMARY. Summaries should be recoded just as the GROUPs (and the error messages) are, I think. A testing script is attached.
I ran it twice:

    cat /user/imz/test_i18n_rpm_language.sh | sh
    cat /user/imz/test_i18n_rpm_language.sh | sed -e 's!LANGUAGE!LC_MESSAGES!g' | sh

The results are different.

Fixed in rpm-4.0.4-alt39
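
The results for LANGUAGE=de:ru and LANGUAGE=de:ru_RU.UTF-8 in the transcripts above are consistent with LANGUAGE being treated as a colon-separated priority list, where each entry is tried in progressively shorter forms until a translation is found. The sketch below illustrates that idea only. It is an assumption, not a description of what rpm actually does: the names `candidates` and `find_translation` are hypothetical, and the `translations` table (a Russian entry, no German one) is made up to resemble the minicom package queried here.

```python
def candidates(entry):
    """Return progressively shorter forms, e.g. 'ru_RU.UTF-8' -> ['ru_RU.UTF-8', 'ru_RU', 'ru']."""
    base = entry.partition("@")[0]          # drop any @modifier
    no_codeset = base.partition(".")[0]     # drop the .CODESET part
    no_territory = no_codeset.partition("_")[0]  # drop the _TERRITORY part
    forms = []
    for form in (entry, base, no_codeset, no_territory):
        if form and form not in forms:
            forms.append(form)
    return forms


def find_translation(priority_list, translations, default):
    """Return the first translation matching any entry of a colon-separated list."""
    for entry in priority_list.split(":"):
        for form in candidates(entry):
            if form in translations:
                return translations[form]
    return default  # nothing matched: fall back to the untranslated value


# Hypothetical translation table for a package's Group tag.
translations = {"ru": "Коммуникации"}
print(find_translation("de:ru_RU.UTF-8", translations, "Communications"))
```

Under these assumptions, an entry with no matching translation (such as `de`) is skipped rather than ending the search, which is why the Russian value can still be reached from the second entry.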