- Renamed local string library (string.c/h) to custom_string.c/h
to avoid conflicts with system <string.h>.
- Updated include directives for the renamed string library.
- Ensured custom_string.c includes custom_string.h.
- Adjusted feature test macros in epub2txt.c (_POSIX_C_SOURCE, _GNU_SOURCE).
- Added an explicit prototype for asprintf in epub2txt.c to resolve
undeclared function errors on macOS/Clang.
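
As a rough illustration of the last two items, the sketch below defines `_GNU_SOURCE` before any system header and then declares `asprintf` explicitly. It is not the actual code from epub2txt.c; the helper `chapter_label` and its format string are hypothetical and exist only to exercise the call.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* must appear before any system header */
#endif

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Explicit prototype. It is redundant when <stdio.h> already declares
 * asprintf, and supplies the declaration when the header hides it. */
int asprintf(char **strp, const char *fmt, ...);

/* Hypothetical helper: returns a heap-allocated label, or NULL on failure.
 * The caller frees the result. */
static char *chapter_label(int n)
{
    assert(n >= 0);

    char *label = NULL;
    if (asprintf(&label, "chapter-%03d", n) < 0)
        return NULL;            /* on failure the contents of label are undefined */
    return label;
}
```

The `#ifndef` guard avoids a redefinition warning when the macro is also passed on the compiler command line.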
Reformat ruby annotations in braces.
Before: `01day.01month.1999year`
After: `01.01.1999(daymonthyear)`
This is mainly intended for Asian languages such as Japanese, where the current handling of ruby interleaves words with their readings. With this patch, readings are placed in braces after the words, improving readability.
Paths specified in href attributes inside an EPUB could potentially
point outside the EPUB container (e.g. href="../../../../outside").
Make sure this does not happen: abort parsing if the rootfile points
outside the EPUB container and skip parsing files with invalid paths,
printing a warning.
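
A minimal sketch of such a check, assuming the href has already been resolved to a path relative to the container root. The function name `path_is_inside_container` is hypothetical and this is not the code from the patch. The check is purely lexical: it does not decode percent-escapes or follow symlinks.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not taken from the epub2txt sources.
 * Returns true if `path`, interpreted relative to the container root,
 * stays inside it. Directory depth is counted component by component and
 * the path is rejected as soon as ".." would climb above the root. */
static bool path_is_inside_container(const char *path)
{
    assert(path != NULL);

    if (path[0] == '/')
        return false;                   /* absolute paths are rejected outright */

    int depth = 0;
    const char *p = path;
    while (*p != '\0') {
        size_t len = strcspn(p, "/");   /* length of the current component */

        if (len == 2 && p[0] == '.' && p[1] == '.') {
            if (--depth < 0)
                return false;           /* climbed above the container root */
        } else if (len > 0 && !(len == 1 && p[0] == '.')) {
            depth++;                    /* ordinary component: one level deeper */
        }
        /* empty components ("//") and "." leave the depth unchanged */

        p += len;
        if (*p == '/')
            p++;                        /* skip the separator */
    }
    return true;
}
```

With this sketch, `OEBPS/../META-INF/container.xml` is accepted because it never leaves the root, while `../../../../outside` is rejected at its first component.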
While easier to use, system() passes its argument as is to a shell,
so the parameter needs proper escaping and validation to avoid
malformed commands and possible execution of unwanted commands
(command injection).
Add a utility function for running helper commands that uses fork()
+ execvp() + wait() instead of system(). Using execvp() has the
advantage of not needing any escaping on the parameters, and the
interface is also easier to use as there is no need to construct
commands as a single string using sprintf() or similar functions.
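
The approach can be sketched roughly as follows. The name `run_helper` and its return convention are assumptions for illustration, not the function added by the patch, and the sketch uses waitpid() rather than wait() so that it waits for the specific child it started.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with the given NULL-terminated argument
 * vector. Returns the child's exit status, or -1 if the child could not be
 * started or did not exit normally. */
static int run_helper(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }

    if (pid == 0) {
        /* Child: each argv element reaches the program unmodified,
         * no shell is involved, so no quoting or escaping is needed. */
        execvp(argv[0], argv);
        perror(argv[0]);        /* only reached if execvp() failed */
        _exit(127);
    }

    /* Parent: wait for this child, retrying if interrupted by a signal. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            perror("waitpid");
            return -1;
        }
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller builds an array such as `{ "helper", "-o", filename, NULL }` (hypothetical arguments); a `filename` containing spaces or shell metacharacters is still passed as a single argument.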
In epub2txt_do_file(), the environment variables TMP and TMPDIR are
used as a base to construct the path for a temporary directory to
extract the EPUB contents into. A long enough value for TMP or TMPDIR
will overrun the fixed-size buffers allocated on the stack of the
function and break the program.
Dynamically allocate strings to avoid buffer overruns, and return early
when mkdir() fails, as there isn't much else to do.
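
A minimal sketch of the idea, assuming a directory name built from the process ID. The helper names, the lookup order of TMPDIR before TMP, and the "/tmp" fallback are all assumptions, not the code in epub2txt_do_file().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: build "<base>/<name>" in a heap buffer sized to fit,
 * so an arbitrarily long base cannot overrun anything.
 * Returns NULL on allocation failure; the caller frees the result. */
static char *join_path(const char *base, const char *name)
{
    assert(base != NULL && name != NULL);

    size_t size = strlen(base) + 1 + strlen(name) + 1;  /* '/' and '\0' */
    char *out = malloc(size);
    if (out == NULL)
        return NULL;
    snprintf(out, size, "%s/%s", base, name);
    return out;
}

/* Assumed lookup order: TMPDIR, then TMP, then "/tmp". */
static const char *tmp_base(void)
{
    const char *base = getenv("TMPDIR");
    if (base == NULL || base[0] == '\0')
        base = getenv("TMP");
    if (base == NULL || base[0] == '\0')
        base = "/tmp";
    return base;
}

/* Create a per-process directory under `base`. Returns its path, or NULL
 * if allocation or mkdir() fails, so the caller can return early. */
static char *create_extract_dir(const char *base)
{
    char name[32];              /* bounded: "epub2txt" plus a process ID */
    snprintf(name, sizeof name, "epub2txt%ld", (long)getpid());

    char *dir = join_path(base, name);
    if (dir == NULL)
        return NULL;
    if (mkdir(dir, 0700) != 0) {
        perror(dir);
        free(dir);
        return NULL;
    }
    return dir;
}
```

A caller would use `create_extract_dir(tmp_base())` and bail out when it returns NULL.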
so that `epub2txt` builds in a reproducible way
in spite of nondeterministic filesystem readdir order
See https://reproducible-builds.org/ for why this is good.