<?xml version="1.0" encoding="UTF-8" ?>

<bugzilla version="5.2"
          urlbase="https://bugzilla.altlinux.org/"
          
          maintainer="jenya@basealt.ru"
>

    <bug>
          <bug_id>16127</bug_id>
          
          <creation_ts>2008-06-21 13:23:11 +0400</creation_ts>
          <short_desc>broken UTF-8 handling while trimming field length</short_desc>
          <delta_ts>2009-01-22 06:51:52 +0300</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>2</classification_id>
          <classification>Infrastructure</classification>
          <product>Infrastructure</product>
          <component>bugzilla.altlinux.org</component>
          <version>unspecified</version>
          <rep_platform>all</rep_platform>
          <op_sys>Linux</op_sys>
          <bug_status>CLOSED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc>https://bugzilla.altlinux.org/buglist.cgi?query_format=advanced&amp;classification=Development&amp;product=Sisyphus&amp;component=udev&amp;component_type=equals&amp;bug_severity=critical&amp;bug_severity=major&amp;emailassigned_to1=1&amp;emailassigned_to2=1&amp;emailreporter2=1&amp;emailqa_contact2=1&amp;emailcc2=1&amp;chfieldto=Now&amp;cmdtype=doit&amp;order=Reuse%20same%20sort%20as%20last%20time</bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>enhancement</bug_severity>
          <target_milestone>---</target_milestone>
          <dependson>16711</dependson>
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Michael Shigorin">mike</reporter>
          <assigned_to name="Mikhail Gusarov">dottedmag</assigned_to>
          <cc>vitaly.fedrushkov</cc>
          
          <qa_contact name="Mikhail Gusarov">dottedmag</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>72936</commentid>
    <comment_count>0</comment_count>
    <who name="Michael Shigorin">mike</who>
    <bug_when>2008-06-21 13:23:11 +0400</bug_when>
    <thetext>It seems that field shortening (used at least in bug lists) is naïve about multibyte characters and counts bytes rather than characters. As a result, Cyrillic strings are cut too early (and Chinese text would keep even fewer characters):

udev depends on udev_static-addon instead of udev_static
не все правила отрабатывают пр�...

The second one also has its last character damaged, because the cut falls in the middle of a two-byte character.

In a perfect world there might be no need to cut strings at all; more realistically, strings are best cut on whitespace or punctuation boundaries.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>73504</commentid>
    <comment_count>1</comment_count>
    <who name="Mikhail Gusarov">dottedmag</who>
    <bug_when>2008-07-02 23:54:38 +0400</bug_when>
    <thetext>Yes, Bugzilla does a simple substr() on bytestrings. D&apos;oh.

I can put together a quick hack for our UTF-8 Bugzilla, but making it suitable for upstream means a lot of work (essentially converting all the internals from bytestrings to Unicode strings :)
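
To illustrate the difference between cutting bytes and cutting characters, here is a minimal sketch in Python (not Bugzilla's Perl, and not the actual upstream fix). The function names, the "..." suffix and the limits are assumptions made up for this example.

```python
def truncate_bytes(text, limit):
    """Byte-oriented cut: may split a multibyte UTF-8 character in two."""
    raw = text.encode("utf-8")[:limit]
    # A split character cannot be decoded and shows up as U+FFFD.
    return raw.decode("utf-8", errors="replace")


def truncate_chars(text, limit, ellipsis="..."):
    """Character-oriented cut that prefers a whitespace boundary.

    Hypothetical helper: the limit counts characters, and the
    ellipsis is appended on top of it.
    """
    if limit >= len(text):
        return text
    head = text[:limit]
    cut = head.rfind(" ")
    if cut != -1:
        # Back up to the last space so no word is cut in half.
        head = head[:cut]
    return head + ellipsis


if __name__ == "__main__":
    sample = "не все правила"
    print(repr(truncate_bytes(sample, 3)))
    print(repr(truncate_chars(sample, 9)))
```

The byte-oriented version reproduces the damaged last character from the report, while the character-oriented version counts characters and backs up to the nearest space.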
</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>82078</commentid>
    <comment_count>2</comment_count>
    <who name="Vitaly Fedrushkov">vitaly.fedrushkov</who>
    <bug_when>2008-12-02 12:26:20 +0300</bug_when>
    <thetext>https://bugzilla.mozilla.org/show_bug.cgi?id=363153 fixed in 3.2</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>84696</commentid>
    <comment_count>3</comment_count>
    <who name="Mikhail Gusarov">dottedmag</who>
    <bug_when>2009-01-22 06:51:52 +0300</bug_when>
    <thetext>Yep, fixed in 3.2.</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>