
AnythingLLM: The all-in-one AI app you were looking for.
Chat with your docs, use AI Agents, hyper-configurable, multi-user, & no frustrating setup required.

Discord | License | Docs | Hosted Instance

English · 简体中文 · 日本語

👉 AnythingLLM for desktop (Mac, Windows, & Linux)! Download Now

Chat with your docs. Automate complex workflows with AI Agents. Hyper-configurable, multi-user ready, battle-tested—and runs locally by default with zero setup friction.

Watch the demo!

Watch the video

Product Overview

AnythingLLM is the all-in-one AI application that lets you build a private, fully-featured ChatGPT—without compromises. Connect your favorite local or cloud LLM, ingest your documents, and start chatting in minutes. Out of the box you get built-in agents, multi-user support, vector databases, and document pipelines — no extra configuration required.

AnythingLLM also supports multiple users, so you can control access and experience per user without compromising the security or privacy of the instance or your intellectual property.

Cool features of AnythingLLM

  • Intelligent Skill Selection: enable unlimited tools for your models while reducing token usage by up to 80% per query
  • No-code AI Agent builder
  • Full MCP-compatibility
  • Multi-modal support (both closed and open-source LLMs!)
  • Custom AI Agents
  • 👤 Multi-user instance support and permissioning (Docker version only)
  • 🦾 Agents inside your workspace (browse the web, etc)
  • 💬 Custom Embeddable Chat widget for your website (Docker version only)
  • 📖 Multiple document type support (PDF, TXT, DOCX, etc)
  • Intuitive chat UI with drag-and-drop uploads and source citations.
  • Production-ready for any cloud deployment.
  • Works with all popular closed and open-source LLM providers.
  • Built-in optimizations for large document sets—lower costs and faster responses than other chat UIs.
  • Full Developer API for custom integrations!
  • ...and much more—install in minutes and see for yourself.

Supported LLMs, Embedder Models, Speech models, and Vector Databases

Large Language Models (LLMs):

Embedder models:

Audio Transcription models:

TTS (text-to-speech) support:

STT (speech-to-text) support:

  • Native Browser Built-in (default)

Vector Databases:

Technical Overview

This monorepo consists of six main sections:

  • frontend: A viteJS + React frontend that you can run to easily create and manage all your content the LLM can use.
  • server: A NodeJS express server to handle all the interactions and do all the vectorDB management and LLM interactions.
  • collector: NodeJS express server that processes and parses documents from the UI.
  • docker: Docker instructions and build process + information for building from source.
  • embed: Submodule for generation & creation of the web embed widget.
  • browser-extension: Submodule for the chrome browser extension.

🛳 Self-Hosting

Mintplex Labs & the community maintain a number of deployment methods, scripts, and templates that you can use to run AnythingLLM locally. Refer to the table below to read how to deploy on your preferred environment or to automatically deploy.

Docker | AWS | GCP | Digital Ocean | Render.com
Railway | RepoCloud | Elestio | Northflank

or set up a production AnythingLLM instance without Docker →

How to set up for development

  • Run yarn setup to fill in the required .env files you'll need in each of the application sections (from root of repo).
    • Fill those out before proceeding. Make sure server/.env.development is filled in, or things won't work right.
  • Run yarn dev:server to boot the server locally (from root of repo).
  • Run yarn dev:frontend to boot the frontend locally (from root of repo).
  • Then run yarn dev:collector to start the document collector (from root of repo).

Learn about documents

Telemetry & Privacy

AnythingLLM by Mintplex Labs Inc contains a telemetry feature that collects anonymous usage information.

More about Telemetry & Privacy for AnythingLLM

Why?

We use this information to help us understand how AnythingLLM is used, to help us prioritize work on new features and bug fixes, and to help us improve AnythingLLM's performance and stability.

Opting out

Set DISABLE_TELEMETRY in your server or docker .env settings to "true" to opt out of telemetry. You can also do this in-app by going to the sidebar > Privacy and disabling telemetry.
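
If you want to confirm that the opt-out is present in an .env file before booting, a quick check could look like the sketch below. It uses a deliberately simplified .env reader; parseEnv and telemetryOptedOut are hypothetical names for illustration and are not part of AnythingLLM, and the server's own handling of the variable may differ.

```javascript
// Sketch only: a simplified .env reader, not the parser AnythingLLM itself uses.
function parseEnv(text) {
  const vars = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and comments.
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue;
    const key = line.slice(0, eq).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const value = line.slice(eq + 1).trim().replace(/^(['"])(.*)\1$/, "$2");
    vars[key] = value;
  }
  return vars;
}

// Hypothetical helper: reports whether the opt-out described above is set.
function telemetryOptedOut(envText) {
  return parseEnv(envText).DISABLE_TELEMETRY === "true";
}
```

For example, passing the contents of your server .env file (read with fs.readFileSync) to telemetryOptedOut tells you whether the flag is set to "true" in that file.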

What do you explicitly track?

We will only track usage details that help us make product and roadmap decisions, specifically:

  • Type of your installation (Docker or Desktop)

  • When a document is added or removed. No information about the document. Just that the event occurred. This gives us an idea of use.

  • Type of vector database in use. This helps us prioritize changes when updates arrive for that provider.

  • Type of LLM provider & model tag in use. This helps us prioritize changes when updates arrive for that provider or model, or a combination thereof, e.g. reasoning vs. regular models, multi-modal models, etc.

  • When a chat is sent. This is the most regular "event" and gives us an idea of the daily activity of this project across all installations. Again, only the event is sent; we have no information on the nature or content of the chat itself.

You can verify these claims by finding all locations where Telemetry.sendTelemetry is called. Additionally, if enabled, these events are written to the output log, so you can also see the specific data that was sent. No IP or other identifying information is collected. The telemetry provider is PostHog, an open-source telemetry collection service.

We take privacy very seriously, and we hope you understand that we want to learn how our tool is used, without using annoying popup surveys, so we can build something worth using. The anonymous data is never shared with third parties, ever.

[View all telemetry events in source code](https://github.com/search?q=repo%3AMintplex-Labs%2Fanything-llm%20.sendTelemetry%28&type=code)
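
To run the same check against a local clone rather than through GitHub search, a small Node script along these lines may help. This is a sketch under assumptions: findCallSites is a hypothetical helper, it only scans .js files, and it skips node_modules and .git. The default search string mirrors the .sendTelemetry( query used in the link above.

```javascript
const fs = require("fs");
const path = require("path");

// Recursively collect every line in .js files that contains the search string.
// Hypothetical helper for illustration; not part of the AnythingLLM codebase.
function findCallSites(rootDir, needle = ".sendTelemetry(") {
  const skip = new Set(["node_modules", ".git"]);
  const hits = [];
  for (const entry of fs.readdirSync(rootDir, { withFileTypes: true })) {
    const fullPath = path.join(rootDir, entry.name);
    if (entry.isDirectory()) {
      // Descend into subdirectories, except dependency and VCS folders.
      if (!skip.has(entry.name)) hits.push(...findCallSites(fullPath, needle));
    } else if (entry.isFile() && entry.name.endsWith(".js")) {
      const lines = fs.readFileSync(fullPath, "utf8").split(/\r?\n/);
      lines.forEach((text, index) => {
        if (text.includes(needle)) {
          hits.push({ file: fullPath, line: index + 1, text: text.trim() });
        }
      });
    }
  }
  return hits;
}
```

Calling findCallSites("server") from the root of the repo returns one entry per matching line, with the file path, 1-based line number, and the trimmed source text, which you can then print or compare against the list of tracked events above.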

👋 Contributing

💖 Sponsors

Premium Sponsors

DCS DIGITAL

All Sponsors

Jascha · KickAss · ShadowArcanist · Atlas · Predrag Stojadinović · Diego Spinola · Kyle · Giulio De Pasquale · MacStadium · Dennis · Michael Hamilton, Ph.D. · TernaryLabs · Daniel Cela · Alesso · Rune Mathisen · Alan · Damien Peters · DCS Digital · Paul Mcilreavy · Til Wolf · Leopoldo Crhistian Riverin Gomez · AJEsau · Steven VanOmmeren · Casey Boettcher · Avineet · Chris · mirko · Tim Champ · Peter Mathisen · Ed di Girolamo · Wojciech Miłkowski · ADS Fund · arc46 GmbH · Li Yin · SylphAI

🌟 Contributors

🔗 More Products

  • VectorAdmin: An all-in-one GUI & tool-suite for managing vector databases.
  • OpenAI Assistant Swarm: Turn your entire library of OpenAI assistants into one single army commanded from a single agent.


Copyright © 2026 Mintplex Labs.
This project is MIT licensed.
