Bug 17300 - can't open a file if its name contains strange chars
Status: CLOSED WONTFIX
Product: Branch 4.0
Component: openoffice.org
Version: 4.0
Platform: all Linux
Importance: P2 normal
Reported: 2008-09-23 17:31 by
Modified: 2008-09-30 17:25 (History)


Attachments
encoding.tar (60.00 KB, application/octet-stream)
2008-09-23 17:31, Ivan Zakharyaschev



Description From 2008-09-23 17:31:08
Created an attachment (id=2945)
encoding.tar

openoffice.org-2.3.1.1-alt4.M40.1 in Lite 4.0.3

I have a file with a strange name (I got it via rsync from a system with another
locale). But whatever the name of a file is, a program must open it if it is
given as an argument. This is not the case with OOo:

$ echo *
óÐÉÓÏË.doc
$ ooffice óÐÉÓÏË.doc

An error message appears: "/home/imz/bugreports/encoding/??????.doc не
существует." (that is, "... does not exist."). The question marks are not
ordinary question marks, but question marks in black diamonds.

The environment:

$ locale
LANG=ru_RU.UTF-8
LC_CTYPE="ru_RU.UTF-8"
LC_NUMERIC="ru_RU.UTF-8"
LC_TIME="ru_RU.UTF-8"
LC_COLLATE="ru_RU.UTF-8"
LC_MONETARY="ru_RU.UTF-8"
LC_MESSAGES="ru_RU.UTF-8"
LC_PAPER="ru_RU.UTF-8"
LC_NAME="ru_RU.UTF-8"
LC_ADDRESS="ru_RU.UTF-8"
LC_TELEPHONE="ru_RU.UTF-8"
LC_MEASUREMENT="ru_RU.UTF-8"
LC_IDENTIFICATION="ru_RU.UTF-8"
LC_ALL=
$ 

A proof that good programs can open this file:

$ file óÐÉÓÏË.doc 
óÐÉÓÏË.doc: Microsoft Office Document
$ 

(How to reproduce: if you don't know how to create a file with such a name, try
the attached .tar.)
------- Comment #1 From 2008-09-23 18:01:24 -------
It looks like this output of 'echo *' is an artifact of your terminal emulator,
which renders invalid UTF-8 sequences in some way (you can check this with
echo * | iconv -f UTF-8 -t UTF-8).

The ability to use non-UTF-8 names in a UTF-8 locale should be considered a
feature, and refusing to open such files should not be treated as a bug.
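
For anyone without iconv at hand, the same check can be sketched in Python. This is only an illustration: RAW_NAME is an assumption, namely the bytes one gets by reading the displayed name "óÐÉÓÏË.doc" as Latin-1, and is_valid_utf8 is a hypothetical helper, not something taken from OOo or from the attached tarball.

```python
# Assumption: the raw bytes behind the name displayed as "óÐÉÓÏË.doc",
# obtained by reading each displayed character as one Latin-1 byte.
RAW_NAME = b"\xf3\xd0\xc9\xd3\xcf\xcb.doc"


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes as strict UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    # Same question that `iconv -f UTF-8 -t UTF-8` answers for the name.
    print(is_valid_utf8(RAW_NAME))
```

Under that assumption the name is not valid UTF-8. Decoding the same bytes as KOI8-R (a guess, since the report only says the file came from a system with another locale) gives a readable Russian word, which would fit the rsync story.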
------- Comment #2 From 2008-09-23 19:42:02 -------
no comments
------- Comment #3 From 2008-09-23 20:47:10 -------
(In reply to comment #1)
> It looks like this output of 'echo *' is an artifact of your terminal emulator,
> which renders invalid UTF-8 sequences in some way (you can check this with
> echo * | iconv -f UTF-8 -t UTF-8).

I don't care how it is displayed; it's a real path, and it points to an
existing file. If the name were invalid, the filesystem should have refused to
create it.

"file" can open it, "abiword" can open it:

$ abiword * -t txt -o a.txt; cat a.txt
a
$ 

but OOo can't. Why? Because the file opening or option parsing code is broken
in OOo.
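
One way such breakage can arise, sketched in Python. This is a guess at the failure mode, not taken from the OOo source: if the argument is decoded as UTF-8 with undecodable bytes replaced by U+FFFD (the "question mark in a black diamond"), the original bytes are lost and the resulting path no longer names the file. RAW_ARG is an assumed byte sequence for the file name, and both function names are hypothetical.

```python
# Assumption: raw bytes of the command-line argument (the name shown
# as "óÐÉÓÏË.doc", read as one Latin-1 byte per displayed character).
RAW_ARG = b"\xf3\xd0\xc9\xd3\xcf\xcb.doc"


def lossy_round_trip(raw: bytes) -> bytes:
    """Decode with replacement characters, then encode again.

    Undecodable bytes become U+FFFD, so the original bytes are lost.
    """
    text = raw.decode("utf-8", errors="replace")
    return text.encode("utf-8")


def lossless_round_trip(raw: bytes) -> bytes:
    """Decode with surrogateescape, then encode again.

    Undecodable bytes are kept as lone surrogates and restored exactly.
    """
    text = raw.decode("utf-8", errors="surrogateescape")
    return text.encode("utf-8", errors="surrogateescape")


if __name__ == "__main__":
    print(lossy_round_trip(RAW_ARG) == RAW_ARG)     # path bytes changed
    print(lossless_round_trip(RAW_ARG) == RAW_ARG)  # path bytes preserved
```

A program that passes the argument through untouched as bytes, or uses a reversible decoding like the second function, still hands the kernel the exact name that exists on disk, which is presumably what "file" and "abiword" end up doing.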