anything-llm

mirror of https://github.com/Mintplex-Labs/anything-llm synced 2026-04-25 17:15:37 +02:00

Files

Marcello Fitton f7b90571be Fetch, Parse, and Create Documents for Statically Hosted Files (#4398)

* Add the ability for the web scraping feature to download and parse statically hosted files when creating documents

* lint

* Remove unneeded comment

* Simplify the process by validating the response content type against the keys of ACCEPTED_MIMES, which unlocks all supported file types

* Add TODO comments for future implementation of asDoc.js to handle standard MS Word files in constants.js

* Return captureAs argument to be exposed by scrapeGenericUrl and passed into getPageContent | Return explicit argument of captureAs into scrapeGenericUrl in processLink fn

* Return debug log for scrapeGenericUrl

* Change conditional to a guard clause.

* Add error handling, validation, and JSDoc to getContentType helper fn

* remove unneeded comments

* Simplify URL validation by reusing module

* Rename downloadFileToHotDir to downloadURIToFile and move it up to a global module | Add URL validation to downloadURIToFile

* refactor

* add support for webp

* remove unused imports

---------

Co-authored-by: timothycarambat <rambat1010@gmail.com>

2025-10-01 15:49:05 -07:00
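
The commit above describes validating a response's content type against the keys of ACCEPTED_MIMES and validating URLs before download. The following is a minimal sketch of that idea, not the repository's actual code: the contents of `ACCEPTED_MIMES` shown here and the helper names `normalizeContentType`, `isAcceptedContentType`, and `isValidHttpUrl` are hypothetical and chosen only for illustration.

```javascript
// Hypothetical subset of a MIME-type-to-extensions map. The real
// ACCEPTED_MIMES constant in the repository may contain different entries.
const ACCEPTED_MIMES = {
  "text/plain": [".txt", ".md"],
  "application/pdf": [".pdf"],
  "image/webp": [".webp"],
};

/**
 * Strip parameters such as "; charset=utf-8" from a Content-Type header value.
 * (Hypothetical helper; not the repository's getContentType.)
 * @param {string|null|undefined} headerValue - Raw Content-Type header.
 * @returns {string|null} Lowercased MIME type, or null if unusable.
 */
function normalizeContentType(headerValue) {
  if (typeof headerValue !== "string") return null;
  const mime = headerValue.split(";")[0].trim().toLowerCase();
  return mime.length > 0 ? mime : null;
}

/**
 * Check whether a Content-Type header names a MIME type that appears
 * as a key of the accepted map.
 * @param {string|null|undefined} headerValue - Raw Content-Type header.
 * @param {Record<string, string[]>} accepted - MIME-to-extensions map.
 * @returns {boolean}
 */
function isAcceptedContentType(headerValue, accepted = ACCEPTED_MIMES) {
  const mime = normalizeContentType(headerValue);
  return mime !== null && Object.prototype.hasOwnProperty.call(accepted, mime);
}

/**
 * Accept only well-formed http(s) URLs before attempting a download.
 * @param {string} candidate - URL string to validate.
 * @returns {boolean}
 */
function isValidHttpUrl(candidate) {
  try {
    const url = new URL(candidate);
    return url.protocol === "http:" || url.protocol === "https:";
  } catch {
    return false;
  }
}
```

Checking against the map's keys means a newly supported file type only has to be registered in one place for the download path to accept it.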

index.js

Fetch, Parse, and Create Documents for Statically Hosted Files (#4398)

2025-10-01 15:49:05 -07:00