Currently make-initrd generates non-reproducible initrd images even when it runs in the same environment (the same kernel, the same installed packages, the same configuration, etc.). Initrd image builds within one environment should be reproducible.
Please explain what you mean by "reproducible".
Currently I get:
# make-initrd
...
# b2sum /boot/initrd-std-def.img
fe1a5a2fad9a6943982a5a88931c58aa1a753ea3b3287b13f275667d05972c2d75e74d107b133a81d415dd08d1e93ebd83d775fd4ae91ead362c82c62a4fd782 /boot/initrd-std-def.img
Let's regenerate it:
# make-initrd
...
# b2sum /boot/initrd-std-def.img
03fa3052bab13ae8b566bca21e67f9bc318a3a9924990172041eafc2fa3d565832046af018f06e34574d1ae4ccfda17b24c13eda81353740b19804e3c32bd5ea /boot/initrd-std-def.img
The images differ. Their content may be the same, but the files may be ordered differently, or the difference may be caused by a timestamp, or by something else.
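To illustrate the two suspected causes (file order and timestamps), here is a minimal sketch. It uses Python's `tarfile` module purely as a stand-in for the cpio archive that an initrd really is, and the file names and contents are made up for the example. It only shows that identical content can hash differently when the stored mtime or the member order changes.

```python
import hashlib
import io
import tarfile


def build_archive(files, mtime):
    """Pack (name, data) pairs into an in-memory tar archive, in the order given."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, data in files:
            info = tarfile.TarInfo(name)  # uid/gid/mode keep their fixed defaults
            info.size = len(data)
            info.mtime = mtime            # the only metadata we vary
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def b2sum(blob):
    # hashlib.blake2b defaults to a 64-byte digest (BLAKE2b-512)
    return hashlib.blake2b(blob).hexdigest()


# Hypothetical initrd content, for illustration only
files = [("init", b"#!/bin/sh\n"), ("etc/fstab", b"")]

baseline = b2sum(build_archive(files, mtime=0))
same = baseline == b2sum(build_archive(files, mtime=0))
time_differs = baseline != b2sum(build_archive(files, mtime=1))
order_differs = baseline != b2sum(build_archive(files[::-1], mtime=0))
print(same, time_differs, order_differs)
```

With the same inputs, order, and mtime the checksums match; changing either the mtime or the order alone is enough to change the checksum.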
About the order of the files, I probably agree. This might be a good thing. But I disagree about making the timestamps the same. I see no reason to preserve them.
I looked at the code for a bit. Some features generate new files for the initramfs image and do not just copy existing ones from the system. In this case, preserving the timestamp is not possible without a previous version of the initramfs. That said, I don't think it is possible to implement reproducible initramfs images.
> I looked at the code for a bit. Some features generate new files for the initramfs image and do not just copy existing ones from the system.
Sure, but the timestamp could be faked. Many projects fake timestamps to achieve reproducibility. While many of them use SOURCE_DATE_EPOCH [1] (make-initrd could support it too), I propose taking the mtime of the kernel image file for which the initrd image is generated.
[1] https://reproducible-builds.org/docs/source-date-epoch/
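A minimal sketch of the proposal above, assuming a helper (the name `pick_timestamp` is hypothetical and not part of make-initrd) that honours SOURCE_DATE_EPOCH when it is set and otherwise falls back to the mtime of the kernel image:

```python
import os


def pick_timestamp(kernel_image, environ=None):
    """Return the timestamp to stamp on files packed into the image.

    Prefers SOURCE_DATE_EPOCH; otherwise falls back to the mtime of the
    kernel image (the fallback proposed in this comment, not existing
    make-initrd behaviour).
    """
    environ = os.environ if environ is None else environ
    value = environ.get("SOURCE_DATE_EPOCH")
    if value:
        # The variable holds integer seconds since the Unix epoch
        return int(value)
    return int(os.stat(kernel_image).st_mtime)
```

Passing the environment as a parameter keeps the function easy to exercise without touching the real process environment.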
Of course I could fake the timestamp wherever files are created or copied. But the fact is that the code is not ready for this and I see no point in doing it for no good reason. You have not provided a good rationale for this. Moreover, it seems wrong to me to use the kernel's timestamp for the initramfs content. They are not related in any way. Why would they be the same? We can use 1-1-1970 or any other arbitrary date for the content with the same effect.
(In reply to Alexey Gladkov from comment #6)
> Of course I could fake the timestamp wherever files are created or copied. But the fact is that the code is not ready for this and I see no point in doing it for no good reason. You have not provided a good rationale for this.
Reproducibility is valuable by itself, but here's a more practical reason: to make installation and bootable images that contain an initrd reproducible, so that anyone can easily verify they were not infected by some extraneous software during the build. There are many steps to achieve this, but it is not possible without reproducible initrd images.
> Moreover, it seems wrong to me to use the kernel's timestamp for the initramfs content. They are not related in any way. Why would they be the same? We can use 1-1-1970 or any other arbitrary date for the content with the same effect.
Sure, *they are* related in some way: you build an initrd image for a particular kernel, and usually it contains some modules for that kernel. You can try to boot some kernel with an initrd built for another one, but it makes little sense: the modules packed in that initrd most likely cannot be loaded by this kernel. Anyway, we need to choose some timestamp, and at least there is some reasoning for the proposed one.
(In reply to Vladimir D. Seleznev from comment #7)
> Reproducibility is valuable by itself, but here's a more practical reason: to make installation and bootable images that contain an initrd reproducible, so that anyone can easily verify they were not infected by some extraneous software during the build. There are many steps to achieve this, but it is not possible without reproducible initrd images.
To verify that the initramfs was not infected, you need to take a checksum after _each_ initramfs creation and check it on every boot. This checksum will be the same between rebuilds. Reproducibility has nothing to do with this problem.
I still haven't heard any arguments for implementing this. I'm still not convinced of the need to implement it.
Please give me a real-life use case that cannot be solved without reproducible initramfs.
> Sure, *they are* related in some way: you build an initrd image for a particular kernel, and usually it contains some modules for that kernel. You can try to boot some kernel with an initrd built for another one, but it makes little sense: the modules packed in that initrd most likely cannot be loaded by this kernel. Anyway, we need to choose some timestamp, and at least there is some reasoning for the proposed one.
No, they are not related. Or you don't understand the essence of ctime/mtime. We create another new image with _new_ files (not only ones copied from the system). ctime should reflect the actual creation time and not mislead anyone into thinking that the initramfs was created when the kernel was compiled. That's just wrong. The more I think about it, the less I like this idea in general.
BTW, I have a feature request to generate an initramfs with modules for multiple kernels. In this case, it is not at all clear which timestamp to use. I definitely won't use the kernel as a source of the timestamp.
P.S. Please stop reopening this bug until you convince me, or I'll just stop responding.
(In reply to Alexey Gladkov from comment #8)
> (In reply to Vladimir D. Seleznev from comment #7)
> > Reproducibility is valuable by itself, but here's a more practical reason: to make installation and bootable images that contain an initrd reproducible, so that anyone can easily verify they were not infected by some extraneous software during the build. There are many steps to achieve this, but it is not possible without reproducible initrd images.
> To verify that the initramfs was not infected, you need to take a checksum after _each_ initramfs creation and check it on every boot. This checksum will be the same between rebuilds. Reproducibility has nothing to do with this problem.
No, you have been misled: I was talking about the reproducibility of installation and bootable images, which cannot be achieved without initrd image reproducibility. In fact, if someone could infect my initrd, she could probably also modify my checksum program so that I would not notice the infection.
The reproducibility of installation images, on the other hand, is very convenient for testing different build environments for suspicious side effects, for example.
> I still haven't heard any arguments for implementing this. I'm still not convinced of the need to implement it.
I gave it above.
> Please give me a real-life use case that cannot be solved without reproducible initramfs.
For now you cannot build a reproducible bootable/installation image, because you also need to build an initramfs for it.
> > Sure, *they are* related in some way: you build an initrd image for a particular kernel, and usually it contains some modules for that kernel. You can try to boot some kernel with an initrd built for another one, but it makes little sense: the modules packed in that initrd most likely cannot be loaded by this kernel. Anyway, we need to choose some timestamp, and at least there is some reasoning for the proposed one.
> No, they are not related. Or you don't understand the essence of ctime/mtime. We create another new image with _new_ files (not only ones copied from the system). ctime should reflect the actual creation time and not mislead anyone into thinking that the initramfs was created when the kernel was compiled. That's just wrong. The more I think about it, the less I like this idea in general.
To be clear: I'm not talking about the ctime/mtime of the initramfs itself; it's all about the timestamps that are packed inside of it. The [cm]time of the initramfs does not relate to reproducibility.
> BTW, I have a feature request to generate an initramfs with modules for multiple kernels. In this case, it is not at all clear which timestamp to use. I definitely won't use the kernel as a source of the timestamp.
For that you can pick the newest one.
> P.S. Please stop reopening this bug until you convince me, or I'll just stop responding.
Ok, but how will I know that I've convinced you?
(In reply to Vladimir D. Seleznev from comment #9)
> > To verify that the initramfs was not infected, you need to take a checksum after _each_ initramfs creation and check it on every boot. This checksum will be the same between rebuilds. Reproducibility has nothing to do with this problem.
> No, you have been misled: I was talking about the reproducibility of installation and bootable images, which cannot be achieved without initrd image reproducibility. In fact, if someone could infect my initrd, she could probably also modify my checksum program so that I would not notice the infection.
> The reproducibility of installation images, on the other hand, is very convenient for testing different build environments for suspicious side effects, for example.
You are reinventing the wheel. We already have a mechanism to prevent initramfs and kernel spoofing: secure boot. The entire chain, starting from the BIOS, will be certified. GRUB 2 can check signatures [1] if you do it for yourself. And this solution does not require any additional make-initrd changes. I'm not an expert in secure boot, but kernel and initramfs verification should be done at a higher level: the bootloader.
And again, this has nothing to do with reproducibility. To check whether the generated image has been modified, you do not need to recreate it at all. You need a checksum of the content, and you need to keep this checksum separately in a safe place.
[1] https://www.gnu.org/software/grub/manual/grub/html_node/Using-digital-signatures.html#Using-digital-signatures
> > Please give me a real-life use case that cannot be solved without reproducible initramfs.
> For now you cannot build a reproducible bootable/installation image, because you also need to build an initramfs for it.
You need to generate the initramfs for this particular hardware configuration anyway. Regenerating the initramfs for validation doesn't make sense to me.
> > > Sure, *they are* related in some way: you build an initrd image for a particular kernel, and usually it contains some modules for that kernel. You can try to boot some kernel with an initrd built for another one, but it makes little sense: the modules packed in that initrd most likely cannot be loaded by this kernel. Anyway, we need to choose some timestamp, and at least there is some reasoning for the proposed one.
> > No, they are not related. Or you don't understand the essence of ctime/mtime. We create another new image with _new_ files (not only ones copied from the system). ctime should reflect the actual creation time and not mislead anyone into thinking that the initramfs was created when the kernel was compiled. That's just wrong. The more I think about it, the less I like this idea in general.
> To be clear: I'm not talking about the ctime/mtime of the initramfs itself; it's all about the timestamps that are packed inside of it. The [cm]time of the initramfs does not relate to reproducibility.
I know how to implement SOURCE_DATE_EPOCH even for out-of-tree features, but I don't see any use cases for this.
> Ok, but how will I know that I've convinced you?
It will be when I agree, and I'll write about it.
(In reply to Alexey Gladkov from comment #10)
> (In reply to Vladimir D. Seleznev from comment #9)
> > > To verify that the initramfs was not infected, you need to take a checksum after _each_ initramfs creation and check it on every boot. This checksum will be the same between rebuilds. Reproducibility has nothing to do with this problem.
> > No, you have been misled: I was talking about the reproducibility of installation and bootable images, which cannot be achieved without initrd image reproducibility. In fact, if someone could infect my initrd, she could probably also modify my checksum program so that I would not notice the infection.
> > The reproducibility of installation images, on the other hand, is very convenient for testing different build environments for suspicious side effects, for example.
> You are reinventing the wheel. We already have a mechanism to prevent initramfs and kernel spoofing: secure boot. The entire chain, starting from the BIOS, will be certified. GRUB 2 can check signatures [1] if you do it for yourself. And this solution does not require any additional make-initrd changes. I'm not an expert in secure boot, but kernel and initramfs verification should be done at a higher level: the bootloader.
No, I'm not reinventing the wheel: the boot process is not my concern here, as I already wrote. I'm trying to solve a different task.
> And again, this has nothing to do with reproducibility. To check whether the generated image has been modified, you do not need to recreate it at all. You need a checksum of the content, and you need to keep this checksum separately in a safe place.
No, my concern is about reproducibility, but of the generated *installation* images, such as ISO/USB flash images, if that makes it clear. They contain generated initramfs images, which SHOULD be reproducible in order to make the installation ISO/USB images reproducible too.
> [1] https://www.gnu.org/software/grub/manual/grub/html_node/Using-digital-signatures.html#Using-digital-signatures
> > > Please give me a real-life use case that cannot be solved without reproducible initramfs.
> > For now you cannot build a reproducible bootable/installation image, because you also need to build an initramfs for it.
> You need to generate the initramfs for this particular hardware configuration anyway. Regenerating the initramfs for validation doesn't make sense to me.
> > > > Sure, *they are* related in some way: you build an initrd image for a particular kernel, and usually it contains some modules for that kernel. You can try to boot some kernel with an initrd built for another one, but it makes little sense: the modules packed in that initrd most likely cannot be loaded by this kernel. Anyway, we need to choose some timestamp, and at least there is some reasoning for the proposed one.
> > > No, they are not related. Or you don't understand the essence of ctime/mtime. We create another new image with _new_ files (not only ones copied from the system). ctime should reflect the actual creation time and not mislead anyone into thinking that the initramfs was created when the kernel was compiled. That's just wrong. The more I think about it, the less I like this idea in general.
> > To be clear: I'm not talking about the ctime/mtime of the initramfs itself; it's all about the timestamps that are packed inside of it. The [cm]time of the initramfs does not relate to reproducibility.
> I know how to implement SOURCE_DATE_EPOCH even for out-of-tree features, but I don't see any use cases for this.
Yes, SOURCE_DATE_EPOCH would be nice, and there is a use case, which I already described: making installation/bootable ISO/USB images reproducible. Because they contain initramfs images too. Which are not reproducible now. Sadly.
> > Ok, but how will I know that I've convinced you?
> It will be when I agree, and I'll write about it.
And again: the boot process *is not* my concern here.
(In reply to Vladimir D. Seleznev from comment #11)
> No, my concern is about reproducibility, but of the generated *installation* images, such as ISO/USB flash images, if that makes it clear. They contain generated initramfs images, which SHOULD be reproducible in order to make the installation ISO/USB images reproducible too.
To check the ISO/USB image itself you _don't_ need to rebuild it from scratch. You need to check the image checksum. I don't understand where "SHOULD" came from. It doesn't follow from anything. I can assume that someone from the government insists on the reproducibility of the images you provide, but that's another story. Such a narrow task is easy to solve when creating such a _special_ image. It's special because without knowing the timestamp (SOURCE_DATE_EPOCH), you will never reproduce exactly the same image.
> Yes, SOURCE_DATE_EPOCH would be nice, and there is a use case, which I already described: making installation/bootable ISO/USB images reproducible. Because they contain initramfs images too. Which are not reproducible now. Sadly.
This is not true. You can always repack the image with any timestamp you need. This can be part of the code that will reproduce the image. You will need such a general script anyway.
After an off-list discussion with Dmitry V. Levin and Gleb Fotengauer-Malinovskiy, I agreed that the date is not important in the initramfs image and doesn't need to be stored. It can be 0. That makes sense to me. It does not require heuristics or extra variables and is easy to implement.
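As a rough illustration of what "the date can be 0" means at the archive level, here is a minimal sketch that writes a cpio archive in the "newc" format with every mtime field fixed at 0 and entries emitted in sorted order. It is not make-initrd's implementation; the function names, the sequential inode numbers, uid/gid 0, and the regular-files-only mode are all assumptions made for the example.

```python
def newc_entry(name, data, mode, ino):
    """Encode one entry in the cpio "newc" format with mtime forced to 0."""
    name_bytes = name.encode() + b"\0"
    fields = [
        ino,              # c_ino (assumed: simple sequential numbering)
        mode,             # c_mode
        0,                # c_uid (assumed root)
        0,                # c_gid (assumed root)
        1,                # c_nlink
        0,                # c_mtime: always 0, i.e. 01-01-1970
        len(data),        # c_filesize
        0, 0,             # c_devmajor, c_devminor
        0, 0,             # c_rdevmajor, c_rdevminor
        len(name_bytes),  # c_namesize, including the trailing NUL
        0,                # c_check (unused by newc)
    ]
    out = b"070701" + b"".join(b"%08X" % f for f in fields) + name_bytes
    out += b"\0" * (-len(out) % 4)  # pad header + name to a 4-byte boundary
    out += data
    out += b"\0" * (-len(out) % 4)  # pad file data to a 4-byte boundary
    return out


def build_image(files):
    """files: dict mapping path -> bytes; only regular files are handled."""
    out = bytearray()
    for ino, name in enumerate(sorted(files), start=1):
        out += newc_entry(name, files[name], 0o100644, ino)
    out += newc_entry("TRAILER!!!", b"", 0, 0)  # end-of-archive marker
    return bytes(out)
```

Because nothing in the output depends on the clock or on the order in which files were collected, two calls with the same content yield byte-identical archives.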
make-initrd-2.23.0-alt1 -> sisyphus:
Sat Sep 11 2021 Alexey Gladkov <legion@altlinux.ru> 2.23.0-alt1
- New version (2.23.0).
- Feature ucode: The absence of the firmware file is not an error (ALT#40790).
- Set mtime of all initramfs files and directories to 01-01-1970 (ALT#40873).