8019 Commits

Author SHA1 Message Date
f047b47283 Merge pull request 'feat: add Snapchat extractor, improve browser auth and XenForo' (#3) from feat/snapchat-extractor-and-improvements into master
Reviewed-on: #3
2026-02-25 16:33:57 +01:00
ca342ee3a3 feat: add Snapchat extractor, improve browser auth and XenForo support
- Add new Snapchat story extractor with spotlight and user story support
- Expand browser cookie extraction to support Zen Browser and multi-platform profiles
- Significantly enhance XenForo extractor with gallery, media, and attachment support
- Add APPDATA-based profile discovery for Windows browsers
- Update main.rs with new extractor wiring and improved CLI handling

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 16:29:16 +01:00
ea038e60c2 Merge pull request 'docs: add project README' (#2) from add-readme into master
Reviewed-on: #2
2026-02-25 12:15:25 +01:00
7b92eb523d docs: add project README
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 11:53:44 +01:00
e4dae6de12 Merge pull request 'feat: complete rust parity and remove legacy Python codebase' (#1) from rust-parity-remove-python into master
Reviewed-on: #1
2026-02-25 11:23:02 +01:00
8d9ab11892 feat: complete rust parity and remove legacy Python codebase 2026-02-25 11:06:59 +01:00
9666aaac3f Add dozens of extractors in Rust 2026-02-17 11:12:29 +01:00
51c19d9743 Fix: Wire OAuth config to extractors via set_oauth method
- Add set_oauth() method to Extractor trait (base.rs)
- Implement set_oauth() in PixivExtractor (pixiv.rs)
- Implement set_oauth() in DeviantArtExtractor (deviantart.rs)
- Wire OAuth config to extractors in main.rs (was placeholder)

This fixes the gap found during Phase 6 verification where OAuth
tokens from config were loaded but never passed to extractors.
2026-02-16 10:43:09 +01:00
9c125e08ce docs(06-04): complete Wire Simulate, Destination & OAuth Config plan
- Created summary for plan 06-04
- Updated STATE.md with new plan completion and metrics
- Implemented --simulate dry-run mode, --destination wiring, and OAuth config support
2026-02-16 10:30:53 +01:00
6c560ca5f5 feat(06-04): add OAuth config support for extractors
- Added OauthConfig struct with access_token, refresh_token, client_id, client_secret
- Added oauth field to ExtractorConfig (HashMap<String, OauthConfig>)
- Config file can now specify OAuth tokens per extractor (e.g., pixiv, deviantart)
- Merges OAuth config when loading multiple config files
- Added OAuth config lookup in main.rs for Pixiv and DeviantArt
2026-02-16 10:28:02 +01:00
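The per-extractor OAuth map and its merge across config files could look roughly like the sketch below. The struct and field names come from the commit message. The `Option<String>` field types, the `merge_oauth` helper, and the "later file wins" rule are assumptions made for illustration, not the project's confirmed behavior.

```rust
use std::collections::HashMap;

/// OAuth credentials for one extractor. Field names follow the commit
/// message; making every field optional is an assumption of this sketch.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct OauthConfig {
    pub access_token: Option<String>,
    pub refresh_token: Option<String>,
    pub client_id: Option<String>,
    pub client_secret: Option<String>,
}

/// Hypothetical merge helper: an entry in `overlay` replaces the entry with
/// the same extractor name in `base`; every other entry is kept unchanged.
pub fn merge_oauth(
    mut base: HashMap<String, OauthConfig>,
    overlay: HashMap<String, OauthConfig>,
) -> HashMap<String, OauthConfig> {
    // HashMap::extend overwrites values for keys that already exist.
    base.extend(overlay);
    base
}
```

With this rule, a `pixiv` entry in a later config file would replace the whole earlier `pixiv` entry rather than being merged field by field.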
31012324a8 feat(06-04): wire --destination to downloads
- Use CLI --destination > config.downloader.directory > default (.)
- Create destination directory if it doesn't exist before downloading
- Supports both CLI argument and config file directory setting
2026-02-16 10:26:49 +01:00
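The precedence described in this commit (CLI `--destination`, then `config.downloader.directory`, then `.`) can be expressed as a short chain over two optional values. The function name `resolve_destination` and its signature are hypothetical; only the ordering is taken from the commit message.

```rust
use std::path::PathBuf;

/// Pick the download directory: CLI value first, then the config value,
/// then the current directory. Name and signature are illustrative only.
pub fn resolve_destination(cli: Option<PathBuf>, config: Option<PathBuf>) -> PathBuf {
    cli.or(config).unwrap_or_else(|| PathBuf::from("."))
}
```

A caller would then typically pass the result to `std::fs::create_dir_all` before downloading, which matches the "create destination directory if it doesn't exist" step.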
3268ceb07e feat(06-04): implement --simulate dry-run mode
- Added simulate mode check before download loop
- When --simulate is set, prints URLs that would be downloaded but skips actual download
- Added destination directory creation if it doesn't exist
- Uses --destination argument for download location
2026-02-16 10:25:43 +01:00
62048028f4 docs: update STATE.md for plan 06-03 completion 2026-02-16 10:23:29 +01:00
0d68e348b0 docs(06-03): complete CLI args and cookie wiring plan 2026-02-16 10:21:52 +01:00
1cda24bf4f feat(06-03): wire cookies to extractors in main.rs
- Added cookie loading from --cookies and --cookies-from-browser CLI args
- Added set_cookies() method to Extractor trait (default no-op)
- Implemented set_cookies() for TwitterExtractor and InstagramExtractor
- Extractors now receive cookies during initialization for authenticated requests
- All 145 library tests pass
2026-02-16 10:20:29 +01:00
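A default no-op trait method lets most extractors ignore cookies while a few override it. The sketch below shows that pattern in isolation. The `name()` method, the two example types, and the `HashMap<String, String>` cookie representation are assumptions, not the project's actual `Extractor` trait.

```rust
use std::collections::HashMap;

/// Minimal stand-in for the extractor trait; only `set_cookies` is named
/// in the commit message, the rest is illustrative.
pub trait Extractor {
    fn name(&self) -> &str;

    /// Default no-op: extractors that do not need cookies inherit this.
    fn set_cookies(&mut self, _cookies: HashMap<String, String>) {}
}

/// Hypothetical extractor that relies on the default no-op.
pub struct PlainExtractor;

impl Extractor for PlainExtractor {
    fn name(&self) -> &str {
        "plain"
    }
}

/// Hypothetical extractor that stores cookies for authenticated requests.
pub struct AuthExtractor {
    pub cookies: HashMap<String, String>,
}

impl Extractor for AuthExtractor {
    fn name(&self) -> &str {
        "auth"
    }

    fn set_cookies(&mut self, cookies: HashMap<String, String>) {
        self.cookies = cookies;
    }
}
```

Because the default body does nothing, the wiring code can call `set_cookies` on every extractor without checking which ones support authentication.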
1e73893a4b feat(06-03): add input-file URL reading to main.rs
- Added load_urls_from_file() function to parse URLs from input files
- Supports comments (lines starting with #) and empty lines
- Combines CLI URLs with input file URLs for processing
- Input files processed in order, duplicates allowed
2026-02-16 10:17:30 +01:00
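The input-file rules listed in this commit (skip `#` comments and empty lines, keep order, keep duplicates) can be sketched as a filter over lines. The helper names below are hypothetical, the sketch works on already-loaded text rather than a file path, and placing CLI URLs before file URLs is an assumption.

```rust
/// Extract URLs from input-file text: trims whitespace, skips blank lines
/// and lines starting with '#', and keeps order and duplicates.
pub fn parse_url_lines(contents: &str) -> Vec<String> {
    contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .map(str::to_owned)
        .collect()
}

/// Combine CLI URLs with input-file URLs. CLI-first ordering is an
/// assumption of this sketch.
pub fn combine_urls(cli_urls: Vec<String>, file_urls: Vec<String>) -> Vec<String> {
    let mut all = cli_urls;
    all.extend(file_urls);
    all
}
```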
3bae765656 feat(06-03): export extract_browser_cookies in lib.rs
- Added extract_browser_cookies to public re-exports
- Auth module now fully accessible from library
2026-02-16 10:16:45 +01:00
465b2146e1 docs(06-02): complete Browser Cookie Extraction plan
- Summary documents browser cookie extraction implementation
- Firefox and Chrome profile detection and cookie extraction working
- Updated STATE.md with progress and metrics
2026-02-16 10:14:41 +01:00
e9650c23ea fix(06-02): fix Chrome cookie extraction and borrow checker issue
- Simplified Chrome extraction to avoid duplicate code
- Fixed borrow of moved value in Firefox extraction match arms
- Removed unused temp_path shadowing in Chrome extraction
- All 145 library tests pass
2026-02-16 10:12:49 +01:00
e463d17404 feat(06-02): export browser extraction functions in auth module
- Added pub mod browser to expose browser extraction module
- Re-exported extract_browser_cookies, extract_firefox_cookies, extract_chrome_cookies
- Re-exported find_firefox_profile and find_chrome_profile
- Added tempfile as runtime dependency for cookie database copying
2026-02-16 10:08:46 +01:00
43f1f8d87a feat(06-02): add browser cookie extraction module
- Implements find_firefox_profile() to locate Firefox profile directories
- Implements extract_firefox_cookies() to read cookies from Firefox SQLite database
- Implements find_chrome_profile() to locate Chrome profile directories
- Implements extract_chrome_cookies() to read cookies from Chrome SQLite database
- Handles encrypted Chrome cookies gracefully with warnings
- Implements extract_browser_cookies() as unified API for both browsers
- Uses tempfile to avoid database locking issues
- Adds comprehensive error handling and logging
2026-02-16 10:07:59 +01:00
09675aa49a docs(06-01): complete Cookie File Support plan
- Create 06-01-SUMMARY.md
- Update STATE.md with completed plan progress
- Phase 6 Auth & CLI: 1/6 plans complete
2026-02-16 10:05:02 +01:00
4d2ae7efbc feat(06-01): add --cookies and --cookies-from-browser CLI args
- Add --cookies <FILE> for Netscape-format cookie files
- Add --cookies-from-browser <BROWSER> for browser cookie extraction
- Add CLI tests for both arguments
- 140 tests pass
2026-02-16 10:02:15 +01:00
724df70a9c feat(06-01): implement Netscape cookie file parser
- Create src/auth/cookies.rs with parse_netscape_cookies()
- Implement load_cookies_from_file() for file-based cookie loading
- Support Netscape HTTP Cookie File format (tab-separated)
- Add CookieError type for error handling
- Add tests for parsing, loading, and roundtrip operations
- Export auth module in lib.rs
- 137 tests pass
2026-02-16 10:00:55 +01:00
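For orientation, a Netscape cookie file has seven tab-separated fields per line: domain, subdomain flag, path, secure flag, expiry, name, value. The sketch below reads only the name and value. The function name `parse_cookie_lines`, the `HashMap<String, String>` return type, and the choice to skip malformed lines silently are assumptions; the project's `parse_netscape_cookies()` may differ, for example by returning a `CookieError`.

```rust
use std::collections::HashMap;

/// Parse Netscape-format cookie text into a name -> value map.
/// Malformed lines are skipped rather than reported (a simplification).
pub fn parse_cookie_lines(contents: &str) -> HashMap<String, String> {
    let mut cookies = HashMap::new();
    for line in contents.lines() {
        // curl marks HttpOnly cookies with a "#HttpOnly_" prefix on the
        // domain; strip it so those lines are not mistaken for comments.
        let line = line.strip_prefix("#HttpOnly_").unwrap_or(line);
        if line.trim().is_empty() || line.starts_with('#') {
            continue;
        }
        let fields: Vec<&str> = line.split('\t').collect();
        if fields.len() != 7 {
            continue;
        }
        // Fields: domain, subdomain flag, path, secure, expiry, name, value.
        cookies.insert(fields[5].to_string(), fields[6].to_string());
    }
    cookies
}
```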
af93966260 feat(06-01): create auth module structure
- Create src/auth/mod.rs with module declarations
- Export cookies submodule
- Re-export HashMap for use in extractors
2026-02-16 09:58:44 +01:00
8222afde1c docs(roadmap): mark Phase 6 plans as created 2026-02-16 09:54:21 +01:00
e9a8c41232 docs(phase-6): add auth and CLI plans (4 plans, 3 waves) 2026-02-16 09:53:23 +01:00
4c7d3eb2e2 docs(06-auth-cli): create phase plans for authentication and CLI features
- Plan 06-01: Cookie file support (--cookies CLI + Netscape parser)
- Plan 06-02: Browser cookie extraction (Firefox, Chrome)
- Plan 06-03: Main integration (wire cookies, OAuth, simulate, input-file, destination)
2026-02-16 09:47:01 +01:00
53300997b6 docs(roadmap): mark Phase 5 complete 2026-02-16 09:36:21 +01:00
a78d1bdc8f docs(phase-5): complete verification - all 6 truths verified 2026-02-16 09:36:08 +01:00
796011bad0 fix(05): implement actual ZIP file collection in ZipPostProcessor 2026-02-16 09:35:10 +01:00
419addf0d7 docs(05-03): complete download archive plan
- Added SqliteArchive with DownloadArchive trait
- CLI options --download-archive and --download-archive-skip-duplicates
- Archive integration in DownloadManager
- All 129 tests pass
2026-02-16 09:27:38 +01:00
2117d5d6fe feat(05-03): implement SQLite download archive for duplicate detection
- Added rusqlite dependency with bundled feature
- Created src/archive/mod.rs with DownloadArchive trait and SqliteArchive
- Added --download-archive CLI option for archive database path
- Added --download-archive-skip-duplicates flag with default path
- Integrated archive checking in DownloadManager before download
- Records successful downloads to archive after completion
- All 129 tests pass
2026-02-16 09:25:23 +01:00
2eeb8f7d6b docs(05-02): complete custom command execution plan
- Created 05-02-SUMMARY.md with execution details
- Updated STATE.md with plan 2 completion
- All 125 tests pass
2026-02-16 09:15:24 +01:00
976db71505 feat(05-02): add custom command execution post-processor
- Create ExecPostProcessor for running commands on downloaded files
- Add --exec CLI option for specifying custom commands
- Support {} placeholder for file path replacement
- Set environment variables: FILE_PATH, FILE_NAME, FILE_DIR, FILE_SIZE, FILE_URL
- Command failures are logged but don't stop download pipeline
- All 125 tests pass
2026-02-16 09:12:41 +01:00
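The `{}` placeholder and the environment variables listed in this commit can be illustrated without spawning a process. The helper names below are hypothetical, and the sketch does no shell quoting, so paths containing spaces would need extra handling in a real command runner.

```rust
use std::path::Path;

/// Replace every "{}" in the command template with the file path.
pub fn expand_command(template: &str, file: &Path) -> String {
    template.replace("{}", &file.display().to_string())
}

/// Build the environment variables named in the commit message for one
/// downloaded file. The return shape is an assumption of this sketch.
pub fn exec_env(file: &Path, size: u64, url: &str) -> Vec<(&'static str, String)> {
    let name = file
        .file_name()
        .map(|n| n.to_string_lossy().into_owned())
        .unwrap_or_default();
    let dir = file
        .parent()
        .map(|p| p.display().to_string())
        .unwrap_or_default();
    vec![
        ("FILE_PATH", file.display().to_string()),
        ("FILE_NAME", name),
        ("FILE_DIR", dir),
        ("FILE_SIZE", size.to_string()),
        ("FILE_URL", url.to_string()),
    ]
}
```

The pairs returned by `exec_env` could be passed to `std::process::Command::envs` when the command is actually run.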
fc499cb2d6 docs(05-01): complete post-processing foundation plan
- Created SUMMARY.md with phase details
- Updated STATE.md with current position
- Added decisions and new files to tracking
2026-02-16 09:06:06 +01:00
e441915ab9 feat(05-01): add CLI options and export postprocess module
- Added CLI options: --zip, --metadata, --zip-compress
- Added parse_zip_compression() helper function
- Exported postprocess module in lib.rs
- Re-exported PostProcessor, PostProcessorConfig, DownloadMetadata,
  ZipPostProcessor, MetadataPostProcessor, and config types
- Added postprocess module to Cargo.toml (chrono dependency)
- All 112 tests pass
2026-02-16 09:01:41 +01:00
1e01cffa94 feat(05-01): implement MetadataPostProcessor for JSON sidecar files
- Created src/postprocess/metadata.rs with MetadataPostProcessor struct
- Implements PostProcessor trait with process() and finalize() methods
- Writes .metadata.json files next to downloaded files
- Includes ExtendedMetadata for additional structured fields
- Added tests for metadata path generation and JSON writing
- All 106 tests pass
2026-02-16 08:58:28 +01:00
1b6dfeec8f feat(05-01): implement ZipPostProcessor for archive creation
- Created src/postprocess/zip.rs with ZipPostProcessor struct
- Implements PostProcessor trait with process() and finalize() methods
- Supports compression (deflate) and storage modes
- Added StreamingZipWriter for large archive handling
- Added tests for compression methods
- All 106 tests pass
2026-02-16 08:57:50 +01:00
14938697b3 feat(05-01): create postprocess module with PostProcessor trait
- Added src/postprocess/mod.rs with PostProcessor trait
- DownloadMetadata struct for tracking file download information
- PostProcessorConfig enum for configuring post-processors
- PostProcessorBuilder for composing multiple post-processors
- ZipConfig and MetadataConfig for specific configurations
- Added chrono dependency for timestamp support
2026-02-16 08:56:18 +01:00
ca7f287a00 feat(05-01): add zip and walkdir dependencies
- Added zip = { version = "8.0", features = ["deflate"] } for ZIP archive creation
- Added walkdir = "2.5" for directory traversal
- Dependencies resolve without conflicts
2026-02-16 08:55:11 +01:00
1432d7564b docs(roadmap): mark Phase 5 plans as created 2026-02-16 08:52:31 +01:00
271efbd9ac docs(phase-5): add post-processing and archive plans (3 plans, 1 wave) 2026-02-16 08:52:05 +01:00
938b9da740 docs(phase-5): research post-processing & archive features 2026-02-16 08:40:26 +01:00
512949b2f1 docs(phase-4): complete download pipeline - all 4 truths verified 2026-02-16 08:32:46 +01:00
04abae0fd9 fix(04): wire --jobs flag to DownloadWorker for parallel downloads 2026-02-16 08:31:28 +01:00
ae507ac7cd docs(04-04): complete file filtering plan
- Added FileFilter struct with min/max size and MIME type filtering
- Added CLI options: --filter-size-min, --filter-size-max, --filter-type
- Integrated filtering into DownloadManager and main.rs
- All Phase 4 features complete: progress, resume, concurrency, templates, filtering
2026-02-16 08:24:49 +01:00
8b07ae8770 feat(04-04): integrate filtering into main.rs
- Added FileFilter export to lib.rs
- Updated main.rs to parse filter CLI arguments
- Added build_filter() function to convert CLI args to FileFilter
- Added verbose logging of filter configuration
- Added summary output showing filter is configured
- All 106 tests pass
2026-02-16 08:19:46 +01:00
51c95c70c6 feat(04-04): add filtering CLI options
- Added --filter-size-min option for minimum file size
- Added --filter-size-max option for maximum file size
- Added --filter-type option for allowed file types
- Added parse_size() utility to convert sizes like '1kb', '1mb', '1gb' to bytes
- Added tests for parse_size() function
- All 106 tests pass
2026-02-16 08:16:43 +01:00
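A size parser for inputs such as '1kb', '1mb', and '1gb' might look like the sketch below. The function name matches the commit message, but the signature, the 1024-based multipliers, the case-insensitive suffixes, and the handling of bare numbers are all assumptions; the project may use decimal units or return an error type instead of `Option`.

```rust
/// Convert a human-readable size to bytes. Assumes binary multiples
/// (1kb = 1024 bytes); returns None on invalid input or overflow.
pub fn parse_size(input: &str) -> Option<u64> {
    let s = input.trim().to_ascii_lowercase();
    // Check the two-letter suffixes before the bare "b" suffix.
    let (digits, multiplier) = if let Some(n) = s.strip_suffix("kb") {
        (n, 1024u64)
    } else if let Some(n) = s.strip_suffix("mb") {
        (n, 1024 * 1024)
    } else if let Some(n) = s.strip_suffix("gb") {
        (n, 1024 * 1024 * 1024)
    } else if let Some(n) = s.strip_suffix('b') {
        (n, 1)
    } else {
        (s.as_str(), 1)
    };
    digits.trim().parse::<u64>().ok()?.checked_mul(multiplier)
}
```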
5f3024efad feat(04-04): add file filtering to DownloadOptions
- Added FileFilter struct with min_size, max_size, and allowed_types
- Added filter() method to check if download should proceed based on size/type
- Added Filtered error variant to DownloadError
- Updated DownloadManager::download() to apply filters before downloading
- Filter checks Content-Type header and Content-Length against criteria
- All 105 tests pass
2026-02-16 08:15:25 +01:00
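The size and type checks described in this commit can be sketched as a pure function over the `Content-Length` and `Content-Type` values. The field names follow the commit message; the method name `allows`, its signature, and the policy choices (unknown length passes the size check, missing type is rejected when a type list is set) are assumptions rather than the project's documented behavior.

```rust
/// Filter criteria; field names follow the commit message.
#[derive(Debug, Default)]
pub struct FileFilter {
    pub min_size: Option<u64>,
    pub max_size: Option<u64>,
    pub allowed_types: Vec<String>,
}

impl FileFilter {
    /// Decide whether a download should proceed given response headers.
    pub fn allows(&self, content_length: Option<u64>, content_type: Option<&str>) -> bool {
        // Size checks apply only when the server reported a length.
        if let Some(len) = content_length {
            if self.min_size.map_or(false, |min| len < min) {
                return false;
            }
            if self.max_size.map_or(false, |max| len > max) {
                return false;
            }
        }
        // An empty type list means "accept any type".
        if self.allowed_types.is_empty() {
            return true;
        }
        match content_type {
            Some(ct) => {
                // Drop parameters such as "; charset=utf-8" before comparing.
                let mime = ct.split(';').next().unwrap_or("").trim();
                self.allowed_types.iter().any(|t| t.eq_ignore_ascii_case(mime))
            }
            None => false,
        }
    }
}
```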