Bug 17300 - can't open a file if its name contains strange chars
Status: CLOSED WONTFIX
Alias: None
Product: Branch 4.0
Classification: Distributions
Component: openoffice.org
Version: 4.0
Hardware: all Linux
Importance: P2 normal
Assignee: Valery Inozemtsev
QA Contact: Q.A. 4.0
Reported: 2008-09-23 17:31 MSD by Ivan Zakharyaschev
Modified: 2008-09-30 17:25 MSD

Attachments
encoding.tar (60.00 KB, application/octet-stream)
2008-09-23 17:31 MSD, Ivan Zakharyaschev

Description Ivan Zakharyaschev 2008-09-23 17:31:08 MSD
Created attachment 2945
encoding.tar

openoffice.org-2.3.1.1-alt4.M40.1 in Lite 4.0.3

I have a file with a strange name (I got it via rsync from a system with another locale). But whatever the name of a file is, a program must open it if the name is given as an argument. This is not the case with OOo:

$ echo *
óÐÉÓÏË.doc
$ ooffice óÐÉÓÏË.doc

An error message appears: "/home/imz/bugreports/encoding/??????.doc не существует." (that is, "does not exist"). The question marks are not ordinary question marks, but question marks in black diamonds, which is how the replacement character U+FFFD is usually drawn.

The environment:

$ locale
LANG=ru_RU.UTF-8
LC_CTYPE="ru_RU.UTF-8"
LC_NUMERIC="ru_RU.UTF-8"
LC_TIME="ru_RU.UTF-8"
LC_COLLATE="ru_RU.UTF-8"
LC_MONETARY="ru_RU.UTF-8"
LC_MESSAGES="ru_RU.UTF-8"
LC_PAPER="ru_RU.UTF-8"
LC_NAME="ru_RU.UTF-8"
LC_ADDRESS="ru_RU.UTF-8"
LC_TELEPHONE="ru_RU.UTF-8"
LC_MEASUREMENT="ru_RU.UTF-8"
LC_IDENTIFICATION="ru_RU.UTF-8"
LC_ALL=
$ 

Proof that well-behaved programs can open this file:

$ file óÐÉÓÏË.doc 
óÐÉÓÏË.doc: Microsoft Office Document
$ 

(How to reproduce: if you don't know how to create a file with such a name, unpack the attached encoding.tar.)
Comment 1 Mikhail Gusarov 2008-09-23 18:01:24 MSD
It looks like the garbled output of 'echo *' comes from your terminal emulator, which tries to display invalid UTF-8 sequences (you can check this with echo * | iconv -f UTF-8 -t UTF-8).

The ability to use non-UTF-8 names in a UTF-8 locale should be considered a feature, and refusing to open such files should not be treated as a bug.
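
As a rough illustration of that validity check, here is a Python sketch of the same test that iconv performs. The byte values are an assumption: they are what the name in the description would be if each displayed character stood for exactly one byte. The KOI8-R decoding is likewise only a guess about the locale of the system the file came from.

```python
# Assumed raw bytes of the name shown as "óÐÉÓÏË.doc" in the description
# (one byte per displayed character); the real bytes are in the attached tar.
raw_name = b"\xf3\xd0\xc9\xd3\xcf\xcb.doc"


def is_valid_utf8(name: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(is_valid_utf8(raw_name))     # False
# Guess: the name was written on a system using a KOI8-R locale.
print(raw_name.decode("koi8-r"))   # Список.doc
```

If the assumption about the bytes holds, the name is simply "Список.doc" (Russian for "List") written in KOI8-R, which is not a valid UTF-8 sequence.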
Comment 2 Valery Inozemtsev 2008-09-23 19:42:02 MSD
no comments
Comment 3 Ivan Zakharyaschev 2008-09-23 20:47:10 MSD
(In reply to comment #1)
> Looks like sort-of output of 'echo *' is a feature of your terminal emulator which can interpret invalid UTF-8 characters (you can check it
> by echo * | iconv -f UTF-8 -t UTF-8).

I don't care how it is displayed, but it's a real path, and it points to an existing file. If it were invalid, the filesystem should have refused to create it.

"file" can open it, "abiword" can open it:

$ abiword * -t txt -o a.txt; cat a.txt
a
$ 

but OOo can't. Why? Because the file opening or option parsing code is broken in OOo.
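
To illustrate the point that a path is just a byte string, here is a minimal Python sketch (not OOo's actual code) that creates and reopens a file whose name is not valid UTF-8. It assumes a Linux filesystem, which accepts arbitrary bytes in names apart from "/" and NUL. The byte values in raw_name and the helper read_by_raw_name are hypothetical, chosen only for this example.

```python
import os
import tempfile

# Hypothetical non-UTF-8 file name (assumed bytes for the name in this report).
raw_name = b"\xf3\xd0\xc9\xd3\xcf\xcb.doc"


def read_by_raw_name(directory: bytes, name: bytes) -> bytes:
    """Open a file by its raw byte path, without decoding the name."""
    path = os.path.join(directory, name)
    with open(path, "rb") as handle:
        return handle.read()


# Create the file in a scratch directory, then read it back by its byte name.
workdir = os.fsencode(tempfile.mkdtemp())
with open(os.path.join(workdir, raw_name), "wb") as handle:
    handle.write(b"a\n")
content = read_by_raw_name(workdir, raw_name)

# A command-line argument arrives as str; os.fsencode restores the exact bytes
# because undecodable bytes are carried through as surrogate escapes.
as_argument = os.fsdecode(raw_name)
round_trip = os.fsencode(as_argument)
print(content, round_trip == raw_name)
```

The design point is that the name is never converted to displayable text before the open call; decoding is needed only when the name has to be shown to the user.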