9 Commits

Author SHA1 Message Date
Harald Sitter
2c88a6d04a mtimer: use structured logging and log the debuggy messages at debug level
should reduce verbosity substantially
2025-12-10 23:54:34 +01:00
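
For 2c88a6d04a, a minimal sketch of how structured logging can cut verbosity by putting the chatty per-file messages on the debug level. It assumes Go's standard `log/slog`; the commit does not say which logging library mtimer uses, and `newLogger`, `run`, `lines` and the message texts are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log/slog"
	"strings"
)

// newLogger builds a text logger that drops every record below level.
func newLogger(w io.Writer, level slog.Level) *slog.Logger {
	return slog.New(slog.NewTextHandler(w, &slog.HandlerOptions{Level: level}))
}

// run emits one chatty per-file message and one summary, and returns
// whatever made it through the level filter.
func run(verbose bool) string {
	level := slog.LevelInfo
	if verbose {
		level = slog.LevelDebug
	}
	var buf bytes.Buffer
	log := newLogger(&buf, level)
	log.Debug("restored mtime", "path", "/usr/bin/foo", "mtime", 1000)
	log.Info("done", "restored", 1)
	return buf.String()
}

// lines counts the log records in the output (one record per line).
func lines(s string) int {
	return strings.Count(s, "\n")
}

func main() {
	fmt.Print(run(false)) // only the summary
	fmt.Print(run(true))  // per-file message plus summary
	fmt.Println(lines(run(false)), lines(run(true)))
}
```

With the default level only the summary record is written; the per-file record shows up once the level is lowered to debug.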
Harald Sitter
a28807e832 mtimer: apply stat info mtimes when called for
should prevent files from going 1970 on us
2025-08-27 04:34:44 +02:00
Harald Sitter
26c1640492 mtimer: links need special handling
I completely ignored the fact that symlinks can point to directories.
this would then make our checksum logic fall over because you can't
checksum a directory. instead introduce completely bespoke handling for
symlinks. it's increeeedibly similar to files but ever so slightly
different so we have type assurances.

may be worth creating an Analyzer interface in the future so we don't
have to dupe things so much
2025-08-26 15:06:42 +02:00
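
For 26c1640492, a minimal sketch of why symlinks want their own code path: the link is never opened, so it does not matter whether it points at a file or a directory. Hashing the target string is an assumption of this sketch, the commit does not say what mtimer fingerprints for a link; `linkChecksum` and `demo` are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
	"path/filepath"
)

// linkChecksum hashes the link's target string, never the thing it points
// at, so a symlink to a directory is as easy to fingerprint as one to a file.
func linkChecksum(path string) (string, error) {
	info, err := os.Lstat(path) // Lstat does not follow the link
	if err != nil {
		return "", err
	}
	if info.Mode()&os.ModeSymlink == 0 {
		return "", fmt.Errorf("%s is not a symlink", path)
	}
	target, err := os.Readlink(path)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256([]byte(target))
	return hex.EncodeToString(sum[:]), nil
}

// demo creates a symlink to a directory and returns its checksum next to
// the checksum of the bare target string.
func demo() (string, string, error) {
	dir, err := os.MkdirTemp("", "links")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)
	if err := os.Mkdir(filepath.Join(dir, "target"), 0o755); err != nil {
		return "", "", err
	}
	link := filepath.Join(dir, "link")
	if err := os.Symlink("target", link); err != nil {
		return "", "", err
	}
	got, err := linkChecksum(link)
	if err != nil {
		return "", "", err
	}
	want := sha256.Sum256([]byte("target"))
	return got, hex.EncodeToString(want[:]), nil
}

func main() {
	got, want, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got == want)
}
```

Note that `os.Chtimes` follows symlinks, so putting a link's own mtime back needs something like `utimensat` with `AT_SYMLINK_NOFOLLOW` (for example `unix.UtimesNanoAt` from `golang.org/x/sys/unix`), which this sketch leaves out.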
Harald Sitter
f494ed4a60 mtimer: don't skip symlinks
after some stunts with dump.erofs I am led to believe that the remaining
delta chunks are in fact symlinks (or their metadata?)

specifically I am seeing a non-matching chunk

`░ -- start: 3626777409 len: 82685`

which seems to cover the following extent in erofs

```
Path : /usr/share/factory/var/lib/flatpak/runtime/org.kde.Platform/x86_64/6.9/f930fae18cfc829f51db18b9324905a3bebee0ec7e9d4d62afbb17f696fb20d0/files/share/icons/breeze-dark/status/22/rotation-locked-landscape-symbolic.svg
Size: 29  On-disk size: 29  symlink file
NID: 113336534   Links: 1   Layout: 2   Compression ratio: 100.00%
Inode size: 64   Xattr size: 0
Uid: 0   Gid: 0  Access: 0777/rwxrwxrwx
Timestamp: 2025-08-26 12:33:18.806496943

 Ext:   logical offset   |  length :     physical offset    |  length
   0:        0..      29 |      29 : 3626769152..3626769181 |      29
```

the trouble is that the chunk is so large that it's hard to tell which
actual change causes the delta. considering the symlink's mtime
definitely is the build time, the mtime is my only guess right now
2025-08-26 14:29:09 +02:00
Harald Sitter
a42b386db4 mtimer: always restore time of dir
doesn't really make sense to guard this since we always want to set the
stable time. also allows us to get rid of the extra stat
2025-08-26 12:11:51 +02:00
Harald Sitter
8e3cafae79 mtimer: start the search for a dir time with unix(0)
if the dir is newer than the files then we still want to force it to a
value derived from the files, so it stays consistent

most notably this should prevent a whole host of dirs from having an
mtime that is the package unpack time, which is obviously changing
between builds
2025-08-26 07:35:39 +02:00
Harald Sitter
bf5f2dd160 give dirs a stable time
with file mtimes stabilized, we now have dirs lighting up like a
christmas tree in my diff scripts. give them a stable mtime to get
consistency between builds.

the idea here is that if we set the mtime of all dirs to their latest
content's mtime we'll implicitly stabilize the dirs through stabilizing
the files

somewhat unfortunately we need to do this in a single thread because
otherwise we'd have to segment deep trees and I really don't want to
venture there for such an otherwise simple program

a future option might be to also put dirs in our json but realistically
that only makes a difference for empty dirs (since they have no content
from which to derive the mtime). so let's see where we get with this. we
can always add dir records in the json later
2025-08-26 06:35:24 +02:00
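
For bf5f2dd160, a minimal sketch of deriving a directory's mtime from its content in a single-threaded, depth-first walk. `stabilizeDir` and `demo` are hypothetical names. Seeding the search with the epoch follows the later commit 8e3cafae79; that an empty dir therefore ends up at 1970 is a property of this sketch, not something the commits confirm.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// stabilizeDir sets the mtime of dir, and of every directory below it, to
// the latest mtime among its entries and returns that time. Children are
// settled before their parent, hence the plain recursion.
func stabilizeDir(dir string) (time.Time, error) {
	latest := time.Unix(0, 0) // seed: ignore the dir's current mtime
	entries, err := os.ReadDir(dir)
	if err != nil {
		return latest, err
	}
	for _, e := range entries {
		var t time.Time
		if e.IsDir() {
			t, err = stabilizeDir(filepath.Join(dir, e.Name()))
		} else {
			var info os.FileInfo
			if info, err = e.Info(); err == nil { // symlinks are not followed
				t = info.ModTime()
			}
		}
		if err != nil {
			return latest, err
		}
		if t.After(latest) {
			latest = t
		}
	}
	return latest, os.Chtimes(dir, latest, latest)
}

// demo builds root/a (mtime 3000) and root/sub/b (mtime 2000), stabilizes
// the tree and returns the resulting mtimes of root and root/sub.
func demo() (int64, int64, error) {
	root, err := os.MkdirTemp("", "dirs")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(root)
	sub := filepath.Join(root, "sub")
	if err := os.Mkdir(sub, 0o755); err != nil {
		return 0, 0, err
	}
	files := map[string]int64{filepath.Join(root, "a"): 3000, filepath.Join(sub, "b"): 2000}
	for path, sec := range files {
		if err := os.WriteFile(path, nil, 0o644); err != nil {
			return 0, 0, err
		}
		if err := os.Chtimes(path, time.Unix(sec, 0), time.Unix(sec, 0)); err != nil {
			return 0, 0, err
		}
	}
	if _, err := stabilizeDir(root); err != nil {
		return 0, 0, err
	}
	rootInfo, err := os.Stat(root)
	if err != nil {
		return 0, 0, err
	}
	subInfo, err := os.Stat(sub)
	if err != nil {
		return 0, 0, err
	}
	return rootInfo.ModTime().Unix(), subInfo.ModTime().Unix(), nil
}

func main() {
	rootTime, subTime, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(rootTime, subTime)
}
```

Because the directory times depend only on file times, stabilizing the files is enough to stabilize the whole tree, which is the point the commit makes.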
Harald Sitter
806f85a40a marshal condensed json
preciously recovers a byte or two
2025-08-26 06:35:23 +02:00
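
For 806f85a40a, a minimal sketch of what condensed marshalling saves, assuming Go's `encoding/json` and assuming the earlier output was indented (the commit states neither). The `Record` shape and `encode` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Record is a hypothetical shape for one entry; the real schema is not
// shown in the commit.
type Record struct {
	MTime    int64  `json:"mtime"`
	Checksum string `json:"checksum"`
}

// encode returns the condensed and the indented encoding of the same data.
func encode(records map[string]Record) ([]byte, []byte, error) {
	compact, err := json.Marshal(records)
	if err != nil {
		return nil, nil, err
	}
	pretty, err := json.MarshalIndent(records, "", "  ")
	if err != nil {
		return nil, nil, err
	}
	return compact, pretty, nil
}

func main() {
	records := map[string]Record{"/usr/bin/foo": {MTime: 1756137600, Checksum: "abc123"}}
	compact, pretty, err := encode(records)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("compact: %d bytes, indented: %d bytes\n", len(compact), len(pretty))
	fmt.Println(string(compact))
}
```

The saving per record is only the whitespace, so it matters mostly when the json carries one entry per file of a whole image.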
Harald Sitter
9719237988 try to produce more consistent mtimes
this is a bit of a shot in the dark, but I believe we may have
unnecessary delta in our images caused by the rebuilding of software on
a daily basis. this would result in mtimes changing when the files
actually do not.

a tiny mtimer tool is meant to work around that by consuming an input
json file of mtimes+checksums. if a file's mtime has changed it will
checksum the file to verify that the content actually changed as well;
if the content is unchanged the old mtime gets restored.
assuming reproducible builds this should result in far less delta in the
erofs and by extension the delta download
2025-08-25 18:06:43 +02:00
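
For 9719237988, a minimal sketch of the mtimer idea: when a file's mtime differs from the recorded one but its checksum does not, put the recorded mtime back. The `Record` fields, the use of sha256 and the helper names are assumptions for illustration; the commit names neither the json schema nor the hash.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"time"
)

// Record is a hypothetical json entry: what a file looked like last build.
type Record struct {
	MTime    int64  `json:"mtime"`    // seconds since the epoch
	Checksum string `json:"checksum"` // hex sha256 of the content
}

// checksum returns the hex sha256 of the file's content.
func checksum(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// stabilize puts the recorded mtime back when only the mtime changed.
// It reports whether the old mtime was restored.
func stabilize(path string, old Record) (bool, error) {
	info, err := os.Stat(path)
	if err != nil {
		return false, err
	}
	if info.ModTime().Unix() == old.MTime {
		return false, nil // mtime unchanged, nothing to verify
	}
	sum, err := checksum(path)
	if err != nil {
		return false, err
	}
	if sum != old.Checksum {
		return false, nil // content really changed, keep the new mtime
	}
	t := time.Unix(old.MTime, 0)
	if err := os.Chtimes(path, t, t); err != nil {
		return false, err
	}
	return true, nil
}

// demo writes a file "now", then stabilizes it against an older record
// that carries the same checksum.
func demo() (bool, int64, error) {
	dir, err := os.MkdirTemp("", "mtimer")
	if err != nil {
		return false, 0, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "file")
	if err := os.WriteFile(path, []byte("same content"), 0o644); err != nil {
		return false, 0, err
	}
	sum, err := checksum(path)
	if err != nil {
		return false, 0, err
	}
	restored, err := stabilize(path, Record{MTime: 1000, Checksum: sum})
	if err != nil {
		return false, 0, err
	}
	info, err := os.Stat(path)
	if err != nil {
		return false, 0, err
	}
	return restored, info.ModTime().Unix(), nil
}

func main() {
	restored, mtime, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("restored:", restored, "mtime:", mtime)
}
```

The checksum is only computed for files whose mtime moved, so an unchanged tree costs one stat per file.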