mirror of
https://github.com/Mintplex-Labs/anything-llm
synced 2026-04-25 17:15:37 +02:00
feat: Add multilingual support for ocr module (#3325)
* Add multilingual support for OCR module
* Add OCR language as server var that is passed into Collector
  Support all valid tesseract language codes
  Filter and parse only valid codes with fallbacks
* persist TARGET_OCR_LANG
* update docker example env

---------

Co-authored-by: Timothy Carambat <rambat1010@gmail.com>
@@ -321,3 +321,8 @@ GID='1000'
# Enable simple SSO passthrough to pre-authenticate users from a third party service.
# See https://docs.anythingllm.com/configuration#simple-sso-passthrough for more information.
# SIMPLE_SSO_ENABLED=1
# Specify the target languages for when using OCR to parse images and PDFs.
# This is a comma separated list of language codes as a string. Unsupported languages will be ignored.
# Default is English. See https://tesseract-ocr.github.io/tessdoc/Data-Files-in-different-versions.html for a list of valid language codes.
# TARGET_OCR_LANG=eng,deu,ita,spa,fra,por,rus,nld,tur,hun,pol,ita,spa,fra,por,rus,nld,tur,hun,pol
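The comments above describe a comma-separated value in which unsupported codes are ignored and English is the default. A minimal Python sketch of that behavior follows. It is an illustration only, not the project's actual parser: the function name `parse_ocr_langs`, the small `VALID_OCR_LANGS` set, and the choice to drop repeated codes are all assumptions made for this example.

```python
# Hypothetical subset of Tesseract language codes, for illustration only.
# The full list is on the Tesseract data-files page linked in the comments above.
VALID_OCR_LANGS = {
    "eng", "deu", "ita", "spa", "fra", "por", "rus", "nld", "tur", "hun", "pol",
}

# English is the documented default.
DEFAULT_OCR_LANGS = ["eng"]


def parse_ocr_langs(raw):
    """Turn a TARGET_OCR_LANG-style string into a list of usable codes.

    Unknown codes are ignored. Repeated codes are kept once (an assumption
    of this sketch). An empty or fully invalid value falls back to English.
    """
    if not raw:
        return list(DEFAULT_OCR_LANGS)

    langs = []
    for code in raw.split(","):
        code = code.strip()
        if code in VALID_OCR_LANGS and code not in langs:
            langs.append(code)

    # Fall back to the default when nothing valid survived filtering.
    return langs or list(DEFAULT_OCR_LANGS)
```

Under these assumptions, `parse_ocr_langs("eng, deu,xx")` keeps only `eng` and `deu`, and an unset variable yields the English default.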