Bug 2281

Summary: Cyrillic symbols are not exported into X primary selection
Product: Sisyphus
Component: emacs21-X11
Status: CLOSED FIXED
Severity: major
Priority: P5
Version: unstable
Hardware: all
OS: Linux
Reporter: Mikhail Zabaluev <mhz>
Assignee: Eugene Vlasov <eugvv>
QA Contact: qa-sisyphus
CC: viy

Description Mikhail Zabaluev 2003-02-19 02:05:02 MSK
It's possible to paste Cyrillic via the X selection _into_ an Emacs buffer, but when some Cyrillic text is selected in Emacs and pasted into other windows, all Cyrillic symbols are left out.
emacs-nox handles the selection properly.

---
emacs-speedbar-0.14-alt0.5.imz1.beta4
emacs-common-21.2-alt14
emacs-nox-21.2-alt14
emacs-leim-21.2-alt14
emacsen-startscripts-0.0.3-alt1
emacs-X11-21.2-alt14

Comment 1 imz 2003-02-20 12:36:02 MSK
It seems to me it wasn't like this before. (BTW, the way the selection is encoded hasn't changed since 21.2-alt4, or even 21.2-alt1; 21.2-alt4 only added more variants for decoding.)

It may be that the cause is a change in the X server or X libs:

XFree86-4.2.1.1-alt3
XFree86-server-4.2.1.1-alt3

(I can reproduce this).

I'm working in ru_RU.KOI8-R and can reproduce the described behaviour.

However, pasting the Emacs selection into an xterm started like this:

LANG=ru_RU.ISO8859-5 xterm

results in some Russian letters (not blank space). If I start it like this:

LANG=ru_RU.ISO8859-5 xterm  -fn '-*-iso8859-5'

I can even see the correct Russian words. (The same goes for LANG=ru_RU.ISO8859-5 Eterm  -F '-*-iso8859-5'.)

The described behaviour looks similar to what X does when it doesn't pass Cyrillic letters entered on the keyboard through to a client running in a "non-Cyrillic" locale. E.g., trying to type Russian in a

LANG=C xterm

results in blank space. (The same goes for LANG=C emacs, which is even more inconvenient: we know that Emacs is able to handle Cyrillic in any locale.)

We should probably test this on other XFree86 versions and examine the changes that have been made in XFree86. It looks like a misbehaviour of X, which doesn't want to accept an ISO8859-5 encoded selection as suitable for pasting into a KOI8-R client.
Comment 3 Mikhail Zabaluev 2003-02-20 19:05:43 MSK
Why does the selection come out of Emacs in ISO8859-5 when buffers are in KOI8-R by default? Is there an Emacs variable that controls the encoding of the outgoing selection?
Comment 5 imz 2003-02-20 20:26:55 MSK
ISO8859-5 is the standard encoding for Cyrillic in X Compound Text, as described in ctext.ps. KOI8-R can be used in Compound Text only as an extension.

I'm not sure yet what is happening, so I can't give a precise explanation.

Emacs can decode any Compound Text it receives (including the extensions for KOI8-R, CP1251, KOI8-U), but AFAIK when it encodes, it uses the standard ways. Internally, it stores the buffer content in a complex multibyte encoding, so it doesn't matter whether the buffer (actually, even the file) the selection comes from is in KOI8-R or any other encoding.
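
To illustrate the "standard way" mentioned above, here is a minimal Python sketch of how Cyrillic text could be framed as Compound Text: the escape sequence ESC 2/13 4/12 ("ESC - L") designates the right half of ISO8859-5, and the letters follow as ISO8859-5 bytes. The helper name encode_cyrillic_ctext and the sample word are assumptions made for illustration; this is not what Emacs runs internally, and it handles only ASCII plus Cyrillic.

```python
# Sketch only: frame ASCII + Cyrillic text as X Compound Text.
# Assumption: the text contains nothing outside ASCII and ISO8859-5.

ESC = b"\x1b"
# ESC 2/13 4/12 ("ESC - L"): designate the right half of ISO8859-5 as GR.
DESIGNATE_ISO8859_5_GR = ESC + b"-L"


def encode_cyrillic_ctext(text: str) -> bytes:
    """Hypothetical helper: return the designation followed by ISO8859-5 bytes."""
    return DESIGNATE_ISO8859_5_GR + text.encode("iso8859_5")


sample = "Привет"  # illustrative sample word
ctext = encode_cyrillic_ctext(sample)
payload = ctext[len(DESIGNATE_ISO8859_5_GR):]
koi8_bytes = sample.encode("koi8_r")

# The same letters have different byte values in the two encodings,
# so a receiver must honour the designation instead of assuming KOI8-R.
print("Compound Text payload:", payload.hex())
print("KOI8-R bytes:         ", koi8_bytes.hex())
```

The point of the sketch is that the selection's bytes are independent of the locale of the buffer they came from; a KOI8-R client has to convert them on receipt.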
Comment 7 imz 2003-02-20 20:42:41 MSK
There is a variable, and it has the right default value (to be able to transmit multilingual texts):

selection-coding-system's value is
compound-text-with-extensions

Documentation:
Coding system for communicating with other X clients.
When sending or receiving text via cut_buffer, selection, and clipboard,
the text is encoded or decoded by this coding system.
The default value is `compound-text-with-extensions'.

The encoder for compound-text-with-extensions encodes Cyrillic in such a way that it shows up as blank space in some other KOI8-R clients.
Comment 9 Mikhail Zabaluev 2003-02-21 17:41:08 MSK
C-x RET x has nothing to do with it, has it? Anyway, setting it to koi8-r doesn't help now.
Comment 11 imz 2003-02-21 21:09:47 MSK
It has. It checks the coding-system value and sets the variable.

Setting it to x-ctext-with-extensions and to compound-text-with-extensions (the default) causes different behaviour, although these coding-system names are synonyms. This alone seems to be a bug.

Second, with the default setting (compound-text-with-extensions), pasting the primary selection with the middle mouse button into an xclipboard produces a message about an ILLEGAL selection (in a ru_RU.KOI8-R environment):

$ xclipboard 
Xaw Text Widget: An attempt was made to insert an illegal selection.

Pasting in the same state into a

LANG=ru_RU.ISO8859-5 xclipboard

doesn't lead to an error. Probably, this also means there's a bug in Emacs.

With x-ctext-with-extensions, there is no such error in either case.

I could reproduce the originally described behaviour on XFree86-4.1.0.
Comment 13 Mikhail Zabaluev 2003-02-27 13:33:35 MSK
x-ctext-with-extensions doesn't help me either.
With a selection containing both Latin and Cyrillic symbols,
gnome-terminal pastes nothing, while xterm and rxvt paste the Cyrillic in an incorrect encoding.
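
For reference, the "incorrect encoding" symptom can be reproduced outside X with a small Python sketch. It assumes (this is only a guess at what xterm and rxvt do) that the receiving client takes ISO8859-5 bytes and interprets them in its own KOI8-R locale without conversion; the function name misread_as and the sample word are hypothetical.

```python
# Sketch only: what Cyrillic looks like when bytes sent in one encoding
# are interpreted in another. Not taken from xterm, rxvt or Emacs code.


def misread_as(text: str, sent_encoding: str, assumed_encoding: str) -> str:
    """Hypothetical helper: encode with one codec, decode with another."""
    return text.encode(sent_encoding).decode(assumed_encoding)


original = "Привет"  # illustrative sample word
garbled = misread_as(original, "iso8859_5", "koi8_r")
round_trip = misread_as(original, "iso8859_5", "iso8859_5")

print("sent:     ", original)
print("displayed:", garbled)     # still printable characters, but the wrong ones
print("expected: ", round_trip)  # correct when the receiver uses the right codec
```

Every byte is still a valid KOI8-R character, so nothing fails visibly; the text is simply wrong, which would match the xterm and rxvt result above if that assumption holds.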
Comment 15 inger@altlinux.org 2004-04-29 12:36:05 MSD
Reassigned.
Comment 16 viy 2005-11-20 23:01:15 MSK
I still see it in 21.4-alt4.
Comment 17 Eugene Vlasov 2005-11-23 16:29:36 MSK
Fixed in emacs22. We are waiting for a release candidate; once it is out, it will be in Sisyphus.
Comment 18 Alex Ott 2005-11-23 20:48:41 MSK
Are there any preliminary builds with correct obsoletes?
I'd like to see a properly split-up version, so that some packages can be
replaced with packages from the corresponding CVS versions or new releases.
That is how it is done with gnus.
Comment 19 Eugene Vlasov 2005-11-23 22:59:32 MSK
(In reply to comment #18)
> Are there any preliminary builds with correct obsoletes?

The builds are in Daedalus (a new snapshot should appear there soon), but they
have no obsoletes yet. To be honest, I haven't yet decided whether we will
obsolete emacs21 at all and, if we do, whether it will happen right away. For now there are only conflicts.

> I'd like to see a properly split-up version, so that some packages can be
> replaced with packages from the corresponding CVS versions or new releases.
> That is how it is done with gnus.

Nothing has been split up yet. I thought about it and decided not to move
small things such as ses into separate packages, at least until the need
arises. I will probably make speedbar a separate package, though, and in the
future I may split out tramp (and url?).
Can you suggest anything else to split out?

Comment 20 Eugene Vlasov 2006-02-10 15:12:11 MSK
No longer relevant for the current emacs22 from Sisyphus.