<?xml version="1.0" encoding="UTF-8" ?>

<bugzilla version="5.2"
          urlbase="https://bugzilla.altlinux.org/"
          
          maintainer="jenya@basealt.ru"
>

    <bug>
          <bug_id>39710</bug_id>
          
          <creation_ts>2021-02-21 14:42:47 +0300</creation_ts>
          <short_desc>emits gravely misaligned loops</short_desc>
          <delta_ts>2021-02-21 14:42:47 +0300</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>4</classification_id>
          <classification>Development</classification>
          <product>Sisyphus</product>
          <component>gcc10</component>
          <version>unstable</version>
          <rep_platform>x86_64</rep_platform>
          <op_sys>Linux</op_sys>
          <bug_status>NEW</bug_status>
          <resolution></resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P5</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter>alexey.tourbin</reporter>
          <assigned_to name="Gleb F-Malinovskiy">glebfm</assigned_to>
          <cc>glebfm</cc>
    
    <cc>ldv</cc>
          
          <qa_contact>qa-sisyphus</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>196420</commentid>
    <comment_count>0</comment_count>
      <attachid>9208</attachid>
    <who name="">alexey.tourbin</who>
    <bug_when>2021-02-21 14:42:47 +0300</bug_when>
    <thetext>Created attachment 9208
to reproduce misaligned loop

Gentlemen, I hope your distribution's independence from imports grows stronger by the day.

I wrote a tight loop that should run at 2.5 cycles per iteration, but it runs at 3.5 cycles per iteration.  I scratched my head for a while until I noticed that the compiler places the start of this loop at 12 mod 16, and moreover at 28 mod 32, and moreover at 60 mod 64!

I have attached the file bench-patch.c (from my pfor16.git repository), which reproduces the misaligned loop.

    while (e1--) {
        unsigned w = Bload16le(src);
        unsigned i = w &amp; Mask(7);
        unsigned x = w &gt;&gt; 7;
        src += 2;
        v[i] += x &lt;&lt; m;
    }

$ gcc -DPatch=patch128 -DM=2 -g -O2 -Wall bench-patch.c &amp;&amp;
objdump -S ./a.out | sed -n &apos;/while (e1--)/,/AGGR/p&apos;
    while (e1--) {
    12b1:       4c 89 c2                mov    %r8,%rdx
    12b4:       85 c9                   test   %ecx,%ecx
    12b6:       0f 84 84 00 00 00       je     1340 &lt;bench+0x1e0&gt;
    12bc:       0f b7 02                movzwl (%rdx),%eax
    12bf:       48 83 c2 02             add    $0x2,%rdx
    12c3:       49 89 c6                mov    %rax,%r14
    12c6:       c1 e8 05                shr    $0x5,%eax
    12c9:       41 83 e6 7f             and    $0x7f,%r14d
    12cd:       25 fc 07 00 00          and    $0x7fc,%eax
    12d2:       66 42 01 04 76          add    %ax,(%rsi,%r14,2)
    12d7:       4c 39 fa                cmp    %r15,%rdx
    12da:       75 e0                   jne    12bc &lt;bench+0x15c&gt;

Gentlemen, &quot;jne 12bc&quot;: what do you call that kind of alignment?  It is called the last four bytes of a cache line.  Something is off in your conservatory.
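
As a rough check of the arithmetic above, the loop head address 0x12bc from the listing can be reduced modulo each block size. This is only a sketch: the helper names offset_in_block and bytes_left_in_block are made up for illustration and are not part of bench-patch.c.

```c
/* Hypothetical helpers for illustration only; not part of bench-patch.c. */

/* Offset of addr within an aligned block of `block` bytes. */
unsigned long offset_in_block(unsigned long addr, unsigned long block)
{
    return addr % block;
}

/* Number of bytes from addr up to the end of its aligned block. */
unsigned long bytes_left_in_block(unsigned long addr, unsigned long block)
{
    return block - addr % block;
}
```

With the loop head at 0x12bc, offset_in_block gives 12, 28 and 60 for blocks of 16, 32 and 64 bytes, and bytes_left_in_block(0x12bc, 64) is 4.  The loop body runs from 12bc up to 12dc (the 2-byte jne at 12da is its last instruction), that is 32 bytes, so it starts in the last four bytes of one 64-byte line and spills into the next.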

$ gcc --version
x86_64-alt-linux-gcc (GCC) 10.2.1 20201125 (ALT Sisyphus 10.2.1-alt2)</thetext>
  </long_desc>
      
          <attachment
              isobsolete="0"
              ispatch="0"
              isprivate="0"
          >
            <attachid>9208</attachid>
            <date>2021-02-21 14:42:47 +0300</date>
            <delta_ts>2021-02-21 14:42:47 +0300</delta_ts>
            <desc>to reproduce misaligned loop</desc>
            <filename>bench-patch.c</filename>
            <type>text/x-csrc</type>
            <size>4801</size>
            <attacher>alexey.tourbin</attacher>
            
              <data encoding="base64">Ly8gQ29weXJpZ2h0IChjKSAyMDIxIEFsZXhleSBUb3VyYmluCi8vCi8vIFBlcm1pc3Npb24gaXMg
aGVyZWJ5IGdyYW50ZWQsIGZyZWUgb2YgY2hhcmdlLCB0byBhbnkgcGVyc29uIG9idGFpbmluZyBh
IGNvcHkKLy8gb2YgdGhpcyBzb2Z0d2FyZSBhbmQgYXNzb2NpYXRlZCBkb2N1bWVudGF0aW9uIGZp
bGVzICh0aGUgIlNvZnR3YXJlIiksIHRvIGRlYWwKLy8gaW4gdGhlIFNvZnR3YXJlIHdpdGhvdXQg
cmVzdHJpY3Rpb24sIGluY2x1ZGluZyB3aXRob3V0IGxpbWl0YXRpb24gdGhlIHJpZ2h0cwovLyB0
byB1c2UsIGNvcHksIG1vZGlmeSwgbWVyZ2UsIHB1Ymxpc2gsIGRpc3RyaWJ1dGUsIHN1YmxpY2Vu
c2UsIGFuZC9vciBzZWxsCi8vIGNvcGllcyBvZiB0aGUgU29mdHdhcmUsIGFuZCB0byBwZXJtaXQg
cGVyc29ucyB0byB3aG9tIHRoZSBTb2Z0d2FyZSBpcwovLyBmdXJuaXNoZWQgdG8gZG8gc28sIHN1
YmplY3QgdG8gdGhlIGZvbGxvd2luZyBjb25kaXRpb25zOgovLwovLyBUaGUgYWJvdmUgY29weXJp
Z2h0IG5vdGljZSBhbmQgdGhpcyBwZXJtaXNzaW9uIG5vdGljZSBzaGFsbCBiZSBpbmNsdWRlZCBp
bgovLyBhbGwgY29waWVzIG9yIHN1YnN0YW50aWFsIHBvcnRpb25zIG9mIHRoZSBTb2Z0d2FyZS4K
Ly8KLy8gVEhFIFNPRlRXQVJFIElTIFBST1ZJREVEICJBUyBJUyIsIFdJVEhPVVQgV0FSUkFOVFkg
T0YgQU5ZIEtJTkQsIEVYUFJFU1MgT1IKLy8gSU1QTElFRCwgSU5DTFVESU5HIEJVVCBOT1QgTElN
SVRFRCBUTyBUSEUgV0FSUkFOVElFUyBPRiBNRVJDSEFOVEFCSUxJVFksCi8vIEZJVE5FU1MgRk9S
IEEgUEFSVElDVUxBUiBQVVJQT1NFIEFORCBOT05JTkZSSU5HRU1FTlQuICBJTiBOTyBFVkVOVCBT
SEFMTCBUSEUKLy8gQVVUSE9SUyBPUiBDT1BZUklHSFQgSE9MREVSUyBCRSBMSUFCTEUgRk9SIEFO
WSBDTEFJTSwgREFNQUdFUyBPUiBPVEhFUgovLyBMSUFCSUxJVFksIFdIRVRIRVIgSU4gQU4gQUNU
SU9OIE9GIENPTlRSQUNULCBUT1JUIE9SIE9USEVSV0lTRSwgQVJJU0lORyBGUk9NLAovLyBPVVQg
T0YgT1IgSU4gQ09OTkVDVElPTiBXSVRIIFRIRSBTT0ZUV0FSRSBPUiBUSEUgVVNFIE9SIE9USEVS
IERFQUxJTkdTIElOIFRIRQovLyBTT0ZUV0FSRS4KCiNpbmNsdWRlIDxzdGRpby5oPgojaW5jbHVk
ZSA8c3RkaW50Lmg+CiNpbmNsdWRlIDxzdHJpbmcuaD4KCnN0YXRpYyBpbmxpbmUgdWludDE2X3Qg
QmxvYWQxNmxlKGNvbnN0IHZvaWQgKnApCnsKICAgIHVpbnQxNl90IHg7CiAgICBtZW1jcHkoJngs
IHAsIDIpOwogICAgcmV0dXJuIHg7Cn0KCiNpZiBkZWZpbmVkKF9faTM4Nl9fKSB8fCBkZWZpbmVk
KF9feDg2XzY0X18pCiNpbmNsdWRlIDx4ODZpbnRyaW4uaD4KI2RlZmluZSByZHRzYygpIF9fcmR0
c2MoKQojZWxzZQpzdGF0aWMgaW5saW5lIHVpbnQ2NF90IHJkdHNjKHZvaWQpCnsKICAgIHVpbnQ2
NF90IHQ7CiNpZiBkZWZpbmVkKF9fYWFyY2g2NF9fKQogICAgYXNtIHZvbGF0aWxlKCJtcnMgJTAs
IGNudHZjdF9lbDAiIDogIj1yIih0KSk7CiNlbGlmIGRlZmluZWQoX19wb3dlcnBjNjRfXykKICAg
IGFzbSB2b2xhdGlsZSgibWZzcHIgJTAsIDI2OCIgOiAiPXIiKHQpKTsKI2Vsc2UKI2Vycm9yICJy
ZHRzYyBub3Qgc3VwcG9ydGVkIgojZW5kaWYKICAgIHJldHVybiB0Owp9CiNlbmRpZgoKI2RlZmlu
ZSBNYXNrKGspICgoMVU8PChrKSktMSkKCnN0YXRpYyBpbmxpbmUgY29uc3QgdW5zaWduZWQgY2hh
ciAqcGF0Y2g2NChjb25zdCB1bnNpZ25lZCBjaGFyICpzcmMsIHVpbnQxNl90ICp2LAoJaW50IG0s
IHVuc2lnbmVkIGUwLCB1bnNpZ25lZCBlMSkKewogICAgd2hpbGUgKGUwLS0pIHsKCXVuc2lnbmVk
IHcwID0gQmxvYWQxNmxlKHNyYyArIDApOwoJdW5zaWduZWQgdzEgPSBCbG9hZDE2bGUoc3JjICsg
MSk7Cgl1bnNpZ25lZCBpMCA9IHcwICYgTWFzayg2KTsKCXVuc2lnbmVkIGkxID0gdzEgPj4gMTA7
Cgl1bnNpZ25lZCB4MCA9ICh3MCA+PiA2KSAmIE1hc2soNik7Cgl1bnNpZ25lZCB4MSA9ICh3MSA+
PiA0KSAmIE1hc2soNik7CglzcmMgKz0gMzsKCXZbaTBdICs9IHgwIDw8IG07Cgl2W2kxXSArPSB4
MSA8PCBtOwogICAgfQogICAgd2hpbGUgKGUxLS0pIHsKCXVuc2lnbmVkIHcgPSBCbG9hZDE2bGUo
c3JjKTsKCXVuc2lnbmVkIGkgPSB3ICYgTWFzayg2KTsKCXVuc2lnbmVkIHggPSB3ID4+IDY7Cglz
cmMgKz0gMjsKCXZbaV0gKz0geCA8PCBtOwogICAgfQogICAgcmV0dXJuIHNyYzsKfQoKc3RhdGlj
IGlubGluZSBjb25zdCB1bnNpZ25lZCBjaGFyICpwYXRjaDEyOChjb25zdCB1bnNpZ25lZCBjaGFy
ICpzcmMsIHVpbnQxNl90ICp2LAoJaW50IG0sIHVuc2lnbmVkIGUwLCB1bnNpZ25lZCBlMSkKewog
ICAgd2hpbGUgKGUwLS0pIHsKCXVuc2lnbmVkIHcwID0gQmxvYWQxNmxlKHNyYyArIDApOwoJdW5z
aWduZWQgdzEgPSBCbG9hZDE2bGUoc3JjICsgMSk7Cgl1bnNpZ25lZCBpMCA9IHcwICYgTWFzayg3
KTsKCXVuc2lnbmVkIGkxID0gdzEgPj4gOTsKCXVuc2lnbmVkIHgwID0gKHcwID4+IDcpICYgTWFz
ayg1KTsKCXVuc2lnbmVkIHgxID0gKHcxID4+IDQpICYgTWFzayg1KTsKCXNyYyArPSAzOwoJdltp
MF0gKz0geDAgPDwgbTsKCXZbaTFdICs9IHgxIDw8IG07CiAgICB9CiAgICB3aGlsZSAoZTEtLSkg
ewoJdW5zaWduZWQgdyA9IEJsb2FkMTZsZShzcmMpOwoJdW5zaWduZWQgaSA9IHcgJiBNYXNrKDcp
OwoJdW5zaWduZWQgeCA9IHcgPj4gNzsKCXNyYyArPSAyOwoJdltpXSArPSB4IDw8IG07CiAgICB9
CiAgICByZXR1cm4gc3JjOwp9CgpzdGF0aWMgaW5saW5lIGNvbnN0IHVuc2lnbmVkIGNoYXIgKnBh
dGNoMjU2KGNvbnN0IHVuc2lnbmVkIGNoYXIgKnNyYywgdWludDE2X3QgKnYsCglpbnQgbSwgdW5z
aWduZWQgZTAsIHVuc2lnbmVkIGUxKQp7CiAgICB3aGlsZSAoZTAtLSkgewoJdW5zaWduZWQgdzAg
PSBCbG9hZDE2bGUoc3JjICsgMCk7Cgl1bnNpZ25lZCB3MSA9IEJsb2FkMTZsZShzcmMgKyAxKTsK
CXVuc2lnbmVkIGkwID0gKHVpbnQ4X3QpIHcwOwoJdW5zaWduZWQgaTEgPSB3MSA+PiA4OwoJdW5z
aWduZWQgeDAgPSAodzAgPj4gOCkgJiBNYXNrKDQpOwoJdW5zaWduZWQgeDEgPSAodzEgPj4gNCkg
JiBNYXNrKDQpOwoJc3JjICs9IDM7Cgl2W2kwXSArPSB4MCA8PCBtOwoJdltpMV0gKz0geDEgPDwg
bTsKICAgIH0KICAgIHdoaWxlIChlMS0tKSB7Cgl1bnNpZ25lZCBpID0gc3JjWzBdOwoJdW5zaWdu
ZWQgeCA9IHNyY1sxXTsKCXNyYyArPSAyOwoJdltpXSArPSB4IDw8IG07CiAgICB9CiAgICByZXR1
cm4gc3JjOwp9CgovLyBMZWhtZXIgcmFuZG9tIG51bWJlciBnZW5lcmF0b3IKc3RhdGljIGlubGlu
ZSB1aW50MzJfdCByYW5kMzIodm9pZCkKewojZGVmaW5lIFIzMksgVUlOVDY0X0MoNjM2NDEzNjIy
Mzg0Njc5MzAwNSkKICAgIHN0YXRpYyB1aW50NjRfdCByYW5kMzJzdGF0ZSA9IFIzMks7CiAgICB1
aW50MzJfdCByZXQgPSByYW5kMzJzdGF0ZSA+PiAzMjsKICAgIHJhbmQzMnN0YXRlICo9IFIzMks7
CiAgICByZXR1cm4gcmV0Owp9CgojaWZuZGVmIE0KI2RlZmluZSBNIDUKI2VuZGlmCgojaWZuZGVm
IFBhdGNoCiNkZWZpbmUgUGF0Y2ggcGF0Y2gyNTYKI2VuZGlmCgojaWZuZGVmIEFHR1IKI2RlZmlu
ZSBBR0dSIDAKI2VuZGlmCgojaWZuZGVmIF9fY2xhbmdfXwpfX2F0dHJpYnV0ZV9fKChub2lwYSkp
CiNlbmRpZgpfX2F0dHJpYnV0ZV9fKChub2lubGluZSkpCnZvaWQgYmVuY2godm9pZCkKewogICAg
dWludDE2X3QgdlsyNTZdOwogICAgdW5zaWduZWQgY2hhciBzcmNbMzJdOwogICAgZm9yIChpbnQg
aSA9IDA7IGkgPCAyNTY7IGkrKykKCXZbaV0gPSByYW5kMzIoKTsKICAgIGZvciAoaW50IGkgPSAw
OyBpIDwgMzI7IGkrKykKCXNyY1tpXSA9IHJhbmQzMigpOwoKICAgIHVpbnQ2NF90IHRbMl1bMzJd
OwogICAgdWludDMyX3QgblsyXVszMl07CiAgICBtZW1zZXQodCwgMCwgc2l6ZW9mIHQpOwogICAg
bWVtc2V0KG4sIDAsIHNpemVvZiBuKTsKCiAgICB1aW50NjRfdCB0c3VtID0gMDsKICAgIGlmIChB
R0dSKQoJdHN1bSA9IHJkdHNjKCk7CiAgICBmb3IgKGludCBpID0gMDsgaSA8ICgxPDwyOCk7IGkr
KykgewoJdWludDMyX3QgciA9IHJhbmQzMigpOwoJdWludDMyX3QgZTAgPSAociA+PiAwKSAlIDMy
OwoJdWludDMyX3QgZTEgPSAociA+PiA5KSAlIDMyOwoJdWludDY0X3QgdDAsIHQxLCB0MjsKCWlm
ICghQUdHUikgdDAgPSByZHRzYygpOwoJUGF0Y2goc3JjLCB2LCBNLCBlMCwgMCk7CglpZiAoIUFH
R1IpIHQxID0gcmR0c2MoKTsKCVBhdGNoKHNyYywgdiwgTSwgMCwgZTEpOwoJaWYgKCFBR0dSKSB0
MiA9IHJkdHNjKCk7CglpZiAoIUFHR1IpIHRbMF1bZTBdICs9IHQxIC0gdDAsIG5bMF1bZTBdKys7
CglpZiAoIUFHR1IpIHRbMV1bZTFdICs9IHQyIC0gdDEsIG5bMV1bZTFdKys7CglpZiAoIUFHR1Ip
IHRzdW0gKz0gdDIgLSB0MDsKICAgIH0KICAgIGlmIChBR0dSKQoJdHN1bSA9IHJkdHNjKCkgLSB0
c3VtOwogICAgaWYgKCFBR0dSKSBmb3IgKGludCBpID0gMDsgaSA8IDMyOyBpKyspCglmcHJpbnRm
KHN0ZGVyciwgImUwPSVkICUuMWYgY3ljbGVzXG4iLCBpLCAoZG91YmxlKSB0WzBdW2ldIC8gblsw
XVtpXSk7CiAgICBpZiAoIUFHR1IpIGZvciAoaW50IGkgPSAwOyBpIDwgMzI7IGkrKykKCWZwcmlu
dGYoc3RkZXJyLCAiZTE9JWQgJS4xZiBjeWNsZXNcbiIsIGksIChkb3VibGUpIHRbMV1baV0gLyBu
WzFdW2ldKTsKICAgIGZwcmludGYoc3RkZXJyLCAiYXZnICUuMWYgY3ljbGVzXG4iLCAoZG91Ymxl
KSB0c3VtIC8gKDIgKiAoMTw8MjgpKSk7Cn0KCmludCBtYWluKCkKewogICAgYmVuY2goKTsKICAg
IHJldHVybiAwOwp9Cg==
</data>

          </attachment>
      

    </bug>

</bugzilla>