serenity/Userland/Libraries/LibPDF/DocumentParser.cpp
Nico Weber 536d27fbe6 LibPDF: Don't fail on files where object 0 is not in the xref table
History:

* 72f693e9ed from #6974 added the initial XRefTable. Here, -1
  was used for byte_offset of invalid entries. has_object() compared
  byte_offset to -1.

* e23bfd7252 from #7675 added invalid_byte_offset (equal to
  LONG_MAX) and initialized byte_offset with it, but forgot to update
  has_object(). has_object() still compared to -1, so from then on
  has_object() never returned false.

* d1bc89e30b from #16150 added validate_xref_table_and_fix_if_necessary,
  which used `byte_offset_for_object(index) == invalid_byte_offset` to
  detect if an object was in the xref table. `byte_offset_for_object`
  internally did `VERIFY(has_object(index))`, which due to the previous
  bullet was always true. It ran this for all object numbers from 0
  up to the first object with byte_offset != invalid_byte_offset.

* d458471e09 from #24099 updated has_object() to check against
  invalid_byte_offset instead of -1, making it work again. As a side
  effect, the `VERIFY(has_object(index))` in `byte_offset_for_object`
  could now fail, so validate_xref_table_and_fix_if_necessary()
  hit that VERIFY if object 0 was not in the xref table.

* The fix is to make validate_xref_table_and_fix_if_necessary() call
  has_object() to find out if an object exists in the xref table.
  (When validate_xref_table_and_fix_if_necessary() was added, that
  didn't work, because has_object() was broken then.)
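
As a rough illustration of the fix described above (not the actual LibPDF
code), the sketch below uses a simplified, hypothetical stand-in for
XRefTable. The struct layout and the helper first_valid_object_index() are
assumptions made for this example only. The point is that the scan asks
has_object() whether an entry exists instead of calling an accessor that
asserts the entry exists.

```cpp
#include <climits>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for LibPDF's XRefTable (not the real class).
constexpr long invalid_byte_offset = LONG_MAX;

struct XRefEntry {
    // Entries start out invalid, as in the history described above.
    long byte_offset { invalid_byte_offset };
};

struct XRefTable {
    std::vector<XRefEntry> entries;

    // An object exists only if it is in range and has a valid byte offset.
    bool has_object(size_t index) const
    {
        return index < entries.size()
            && entries[index].byte_offset != invalid_byte_offset;
    }
};

// Hypothetical helper: returns the index of the first object that is present
// in the table, or entries.size() if there is none. It only calls
// has_object(), so a missing object 0 is skipped rather than asserted on.
size_t first_valid_object_index(XRefTable const& table)
{
    size_t index = 0;
    while (index < table.entries.size() && !table.has_object(index))
        ++index;
    return index;
}
```

With a table whose objects 0 and 1 are missing, the scan simply moves on to
object 2 and no assertion is involved.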

This is hit 4 times in my 1000-file test set. For three of these files,
the xref is valid:

* 0000200.pdf
* 0000567.pdf
* 0000651.pdf

For 0000900.pdf, validate_xref_table_and_fix_if_necessary() actually
fixes up the xref table.

Fixes #25079.
2024-10-18 21:54:38 -04:00
