8019 Commits

Author SHA1 Message Date
6f5d575273 docs(04-03): complete concurrent downloads and path templates plan
- Created worker pool with tokio::sync::Semaphore
- Created path template parser with {placeholder} support
- Added --jobs CLI flag
- Integrated both into DownloadManager
- All 105 tests pass
2026-02-16 08:11:47 +01:00
240a670fc6 feat(04-03): export new download types in library API
- Exported PathTemplate and TemplateError for template parsing
- Exported DownloadItem, DownloadWorker, DownloadWorkerResult for batch downloads
- Exported download_batch function for concurrent downloads
- All 105 tests pass
2026-02-16 08:08:16 +01:00
b1daa0f5fb feat(04-03): integrate worker pool and templates into DownloadManager
- Added download_with_template() method for template-based downloads
- Exported PathTemplate, TemplateError, DownloadItem, DownloadWorker types
- Exported download_batch function for concurrent downloads
- All 105 tests pass
2026-02-16 08:07:32 +01:00
b4735c3fc3 feat(04-03): add --jobs flag to CLI
- Added -j/--jobs flag for concurrent download threads
- Default value is 1 (sequential downloads)
- All CLI tests pass
2026-02-16 08:06:53 +01:00
e52fafab7b feat(04-03): create path template parser
- Created PathTemplate struct for parsing {key} patterns
- Supports placeholders: {num}, {title}, {extension}, {filename}, {id}, {date}
- Renders templates with HashMap of values
- Sanitizes paths to prevent directory traversal (.., /, \)
- Added templates and worker modules to download mod.rs
- All 105 tests pass
2026-02-16 08:06:06 +01:00
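The template rendering and path sanitizing described in commit e52fafab7b could be sketched roughly as follows. This is a minimal, std-only illustration, not the project's PathTemplate implementation: the function names `render_template` and `sanitize_component`, the `String` error type, and the exact sanitizing rules (drop "..", turn separators into "_") are assumptions made for the example.

```rust
use std::collections::HashMap;

/// Neutralize values that could escape the target directory.
/// Assumption: ".." sequences are dropped and path separators become '_'.
fn sanitize_component(value: &str) -> String {
    value
        .replace("..", "")
        .replace(|c: char| c == '/' || c == '\\', "_")
}

/// Replace each {key} in `template` with its sanitized value from `values`.
/// Literal text in the template (including '/') is copied through unchanged.
fn render_template(template: &str, values: &HashMap<&str, &str>) -> Result<String, String> {
    let mut out = String::new();
    let mut rest = template;
    while let Some(open) = rest.find('{') {
        out.push_str(&rest[..open]);
        let after = &rest[open + 1..];
        let close = after
            .find('}')
            .ok_or_else(|| "unclosed placeholder".to_string())?;
        let key = &after[..close];
        let value = values
            .get(key)
            .ok_or_else(|| format!("unknown placeholder: {key}"))?;
        out.push_str(&sanitize_component(value));
        rest = &after[close + 1..];
    }
    out.push_str(rest);
    Ok(out)
}

fn main() {
    let mut values = HashMap::new();
    values.insert("title", "a/../b");
    values.insert("num", "001");
    values.insert("extension", "jpg");
    println!("{:?}", render_template("{title}/{num}.{extension}", &values));
}
```

Only substituted values are sanitized here, so a separator written in the template itself still creates a subdirectory while a separator smuggled in through a title does not.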
6675dde1cf feat(04-03): create concurrent download worker pool
- Created DownloadWorker struct with bounded semaphore pattern
- Added download_batch() function for concurrent downloads
- Uses tokio::sync::Semaphore to limit concurrent downloads
- DownloadItem and DownloadWorkerResult structs for batch operations
- All tests pass
2026-02-16 08:02:48 +01:00
bc4a4b9162 docs(04-02): complete resume support plan
- Created summary documenting resume implementation
- Updated STATE.md with plan completion
- 2 tasks completed, 3 files modified
- All tests pass (92 tests)
2026-02-16 08:00:40 +01:00
787060d605 feat(04-02): export resume module in public API
- Re-export resume functions and types from lib.rs
- Includes: check_resume_support, get_resume_offset, get_partial_path,
  ResumeSupport, PART_EXTENSION, ResumeError
2026-02-16 07:58:34 +01:00
c60e1d2617 feat(04-02): integrate resume with .part file support
- Add pub mod resume to expose resume functionality
- Add resume::ResumeError to DownloadError enum
- Update DownloadManager.download() to use .part files:
  - Uses resume module's get_resume_offset() to check server support
  - Creates .part file during download, renames on success
  - Handles 416 Range Not Satisfiable and other resume errors
  - Properly integrates Accept-Ranges header checking
- Add type annotation for File variable
- Fix missing PathBuf import in resume.rs
- All tests pass (92 tests)
2026-02-16 07:58:01 +01:00
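The .part workflow from commit c60e1d2617 (write to a partial file, resume from its current size, rename on success) might look like this in outline. It is a std-only sketch under stated assumptions: `partial_path`, `local_offset`, `range_header`, and `finalize` are hypothetical helpers, not the signatures of the resume module, and the server-side Accept-Ranges / 206 checks mentioned in the log are left out.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Suffix appended to the final file name while a download is in progress.
const PART_SUFFIX: &str = ".part";

/// "out/image.jpg" -> "out/image.jpg.part"
fn partial_path(target: &Path) -> PathBuf {
    let mut name = target.as_os_str().to_os_string();
    name.push(PART_SUFFIX);
    PathBuf::from(name)
}

/// Bytes already on disk for this download; 0 if no partial file exists.
fn local_offset(part: &Path) -> u64 {
    fs::metadata(part).map(|m| m.len()).unwrap_or(0)
}

/// Value for the HTTP Range request header, or None for a fresh download.
fn range_header(offset: u64) -> Option<String> {
    (offset > 0).then(|| format!("bytes={offset}-"))
}

/// Promote the finished partial file to its final name.
fn finalize(part: &Path, target: &Path) -> io::Result<()> {
    fs::rename(part, target)
}

fn main() {
    let target = Path::new("out/image.jpg");
    let part = partial_path(target);
    let offset = local_offset(&part);
    println!("part file: {}", part.display());
    println!("range header: {:?}", range_header(offset));
    // finalize(&part, target) would run only after the body was fully written.
    let _ = finalize;
}
```

A caller would send the Range header only when the server has advertised range support, and would restart from zero if the response is not 206 Partial Content.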
0206672743 feat(04-02): expose resume module in download
- Add pub mod resume to export resume functionality
- resume.rs already exists with Range header support
- Provides check_resume_support() and get_resume_offset()
2026-02-16 07:56:10 +01:00
6d5d1a3e4f feat(04-02): add resume module with Range header support
- Created src/download/resume.rs with ResumeSupport struct
- Implemented check_resume_support() to verify server Accept-Ranges header
- Added get_resume_offset() to combine server check with partial file detection
- Added PART_EXTENSION constant (.part)
- Handles 416 Range Not Satisfiable errors
- Verifies 206 Partial Content response before claiming resume works
- Tests for path utilities pass
2026-02-16 07:53:48 +01:00
ae53149058 docs(04-01): complete download manager plan
- Created 04-01-SUMMARY.md with full execution details
- Updated STATE.md to reflect Phase 4 in progress (1/6 plans)
2026-02-16 07:51:28 +01:00
8a48a778ef feat(04-01): register download module in lib.rs
- Added 'pub mod download' to src/lib.rs
- Exported DownloadManager, DownloadOptions, DownloadResult, DownloadError
- Users can now access via gallery_dl::DownloadManager
2026-02-16 07:49:04 +01:00
57f356c32d feat(04-01): create download module with streaming and progress
- Created src/download/mod.rs with DownloadOptions, DownloadResult, DownloadManager
- Uses reqwest streaming (bytes_stream()) to avoid buffering entire file
- Uses tokio::fs for async file I/O with resume support via Range headers
- Created src/download/progress.rs with DownloadProgress using indicatif
- Batches progress updates every 100ms to avoid flickering
- Added 'stream' feature to reqwest for streaming support
2026-02-16 07:48:51 +01:00
85f74efec8 feat(04-01): create progress tracking module with indicatif
- Created src/download/progress.rs with DownloadProgress struct
- Uses indicatif for professional-looking progress bars
- Batches updates every 100ms to avoid flickering (per research)
- Provides update(), finish(), set_message() methods
- Template: spinner + bar + position + ETA
2026-02-16 07:45:22 +01:00
32d4dbd547 feat(04-01): create download module with DownloadManager
- Created src/download/mod.rs with DownloadOptions, DownloadResult structs
- Implemented DownloadManager that wraps HttpClient for HTTP operations
- Uses reqwest streaming (bytes_stream()) to avoid buffering entire file
- Uses tokio::fs for async file I/O
- Supports resume from existing file using Range headers
- Provides builder-style API for download options
2026-02-16 07:44:31 +01:00
331bc4136c chore(04-01): add indicatif and futures dependencies
- Added indicatif 0.18 for progress bars
- Added futures 0.3 for async stream handling
2026-02-16 07:43:55 +01:00
70e216d0b4 docs(roadmap): mark Phase 4 plans as created 2026-02-16 07:41:59 +01:00
b2ac0a82a6 docs(phase-4): add download pipeline plans (4 plans, 4 waves) 2026-02-16 07:41:42 +01:00
045d09ef53 docs(04): create download pipeline phase plans 2026-02-16 07:37:57 +01:00
4fcc3fce79 docs(phase-4): research download pipeline implementation 2026-02-16 07:32:46 +01:00
ff3ecb37c0 fix(twitter): implement real GraphQL API for tweets and users 2026-02-15 22:05:16 +01:00
390cf67b9a fix(03): implement Instagram API calls for posts, profiles, stories 2026-02-15 22:01:52 +01:00
15560e9bd9 fix(03): implement DeviantArt API calls for gallery, artwork, subdomain 2026-02-15 21:59:27 +01:00
56a9b9a903 fix(03): implement Pixiv API calls for user works, artwork, series 2026-02-15 21:57:30 +01:00
4400042f7c docs(03-03): complete Pixiv/DeviantArt extractors plan
- Created SUMMARY.md with task commits and metrics
- Created USER-SETUP.md with OAuth configuration steps
- Updated STATE.md with progress (3/6 plans complete)

Task commits:
- 9279a0c: feat(03-03): create Pixiv extractor with OAuth auth
- dcfa62d1: feat(03-03): create DeviantArt extractor with OAuth auth
- 371d4233: feat(03-03): register Pixiv and DeviantArt extractors
2026-02-15 21:44:57 +01:00
371d42330b feat(03-03): register Pixiv and DeviantArt extractors
- Added mod declarations for pixiv and deviantart modules
- Registered both extractors in register_all()
- Order: artstation -> instagram -> twitter -> pixiv -> deviantart -> example -> generic
- All 6 platform-specific extractors now registered (artstation, instagram, twitter, pixiv, deviantart, generic)
2026-02-15 21:42:04 +01:00
dcfa62d163 feat(03-03): create DeviantArt extractor with OAuth auth
- Created DeviantArtExtractor implementing Extractor trait
- Supports user subdomains and artwork URL patterns
- OAuth token handling with client credentials support
- Uses DeviantArt API v1 and Eclipse API
- Pattern: https://username.deviantart.com, /username/art/title
- 7 unit tests for pattern matching, URL parsing, auth checks
- Fixed regex to properly distinguish subdomain vs artwork URLs
2026-02-15 21:41:59 +01:00
9279a0c408 feat(03-03): create Pixiv extractor with OAuth auth
- Created PixivExtractor implementing Extractor trait
- Supports user, artwork, and series URL patterns
- OAuth token handling with refresh token support
- Uses Pixiv mobile App API (app-api.pixiv.net)
- Pattern: https://www.pixiv.net/users/{id}, /artworks/{id}, /series/{id}
- 10 unit tests for pattern matching, URL parsing, auth checks
2026-02-15 21:41:52 +01:00
d5e802d261 docs(03-02): complete Instagram and Twitter extractor plan
- Created 03-02-SUMMARY.md with execution details
- Created 03-02-USER-SETUP.md with cookie configuration instructions
- Updated STATE.md with plan completion status (2/6)
2026-02-15 21:28:34 +01:00
2beca80eb6 feat(03-02): register Instagram and Twitter extractors in global registry
- Add mod declarations for instagram and twitter modules
- Register InstagramExtractor and TwitterExtractor in register_all()
- Extractors registered before generic fallback for proper routing
- All 72 tests pass
2026-02-15 21:25:57 +01:00
efd7b6d493 feat(03-02): create Twitter/X extractor with cookie-based auth
- Implements Extractor trait for Twitter/X URLs
- Supports user profile and tweet status URL patterns
- Cookie-based authentication via auth_token cookie
- CSRF token extraction from ct0 cookie
- GraphQL API response structures for parsing tweets
- Regex pattern matching for both twitter.com and x.com
2026-02-15 21:21:36 +01:00
b3514e931d feat(03-02): create Instagram extractor with cookie-based auth
- Implements Extractor trait for Instagram URLs
- Supports profile, post, story, and highlight URL patterns
- Cookie-based authentication via sessionid cookie
- GraphQL API response structures for parsing media
- Regex pattern matching for URL routing
- Rate limiting support (6-12 second intervals recommended)
2026-02-15 21:20:27 +01:00
1ea38ff72b docs(03-01): complete ArtStation and Generic extractors plan
- Create SUMMARY.md with execution details
- Update STATE.md with progress and decisions
- Phase 3 now 1/6 plans complete
2026-02-15 21:18:13 +01:00
7b48ecea9a feat(03-01): register ArtStation and Generic extractors
- Add module declarations for artstation and generic
- Register artstation extractor before generic for pattern priority
- Generic extractor registered last as catch-all fallback
- Add HttpClientError to ExtractorError conversion
2026-02-15 21:16:16 +01:00
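The registration order noted in commit 7b48ecea9a matters because lookup takes the first extractor whose pattern matches. A minimal sketch of that idea, assuming a simplified `Extractor` trait with `name` and `matches` methods and plain substring checks in place of regexes (all hypothetical, not the project's trait):

```rust
/// Simplified stand-in for the project's Extractor trait (assumption).
trait Extractor {
    fn name(&self) -> &'static str;
    fn matches(&self, url: &str) -> bool;
}

struct ArtStation;
struct Generic;

impl Extractor for ArtStation {
    fn name(&self) -> &'static str {
        "artstation"
    }
    fn matches(&self, url: &str) -> bool {
        url.contains("artstation.com")
    }
}

impl Extractor for Generic {
    fn name(&self) -> &'static str {
        "generic"
    }
    // Catch-all: accepts any HTTP/HTTPS URL.
    fn matches(&self, url: &str) -> bool {
        url.starts_with("http://") || url.starts_with("https://")
    }
}

/// Site-specific extractors first, the generic fallback last.
fn register_all() -> Vec<Box<dyn Extractor>> {
    let mut registry: Vec<Box<dyn Extractor>> = Vec::new();
    registry.push(Box::new(ArtStation));
    registry.push(Box::new(Generic));
    registry
}

/// Return the first registered extractor that accepts the URL.
fn find<'a>(registry: &'a [Box<dyn Extractor>], url: &str) -> Option<&'a dyn Extractor> {
    registry.iter().find(|e| e.matches(url)).map(|e| &**e)
}

fn main() {
    let registry = register_all();
    for url in ["https://www.artstation.com/artist", "https://example.org/page"] {
        let name = find(&registry, url).map(|e| e.name());
        println!("{url} -> {name:?}");
    }
}
```

If the generic extractor were registered first, it would also accept ArtStation URLs and the site-specific extractor would never be reached.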
0cf972e31c feat(03-01): create generic fallback extractor
- Implements catch-all Extractor for any HTTP/HTTPS URL
- Extracts images from img src, srcset, and link hrefs
- Converts relative URLs to absolute using base URL
- Filters by common image extensions and URL patterns
- Serves as last-resort fallback in registry
2026-02-15 21:16:11 +01:00
e2b593ccc3 feat(03-01): create ArtStation extractor
- Implements Extractor trait for artstation.com URLs
- Supports profile, project, and artwork URL patterns
- Extracts images via HTML parsing with src, srcset, and data-src
- Includes 2-second rate limiting delay
- Adds HttpClientError conversion to ExtractorError
2026-02-15 21:16:06 +01:00
cc4becf132 docs(phase-3): create 3 plans for major site extractors
- Plan 1: ArtStation + Generic Fallback (no auth)
- Plan 2: Instagram + Twitter/X (cookie auth)
- Plan 3: Pixiv + DeviantArt (OAuth auth)

Each plan implements site-specific extractors following the existing
Extractor trait pattern. ArtStation is simplest (no auth required),
others require varying levels of authentication.
2026-02-15 21:07:39 +01:00
828923ca7d docs(03): research major site extractors implementation 2026-02-15 21:03:01 +01:00
125bee8b7b docs(phase-2): complete extraction framework phase 2026-02-15 20:57:50 +01:00
c5d07ab221 docs(02-04): complete extractor initialization flow plan
- Created SUMMARY.md with task details and metrics
- Updated STATE.md to reflect Plan 4 completion (4/5 complete)
- Added key decision: Arc::make_mut pattern for mutable extractor access
2026-02-15 20:55:12 +01:00
b04102f02e fix(02-04): implement extractor initialization flow in main.rs
- Fixed ExtractorMatch to use optional regex_match (removes 'static lifetime issue)
- Added Arc::make_mut to get mutable extractor access
- Added initialize() call with ExtractorMatch containing URL
- Added items() call to extract messages after initialization
- CLI now outputs extracted URLs when run with matching URL

Verification: cargo run -- https://example.com/gallery/123 outputs 3 sample URLs
2026-02-15 20:53:40 +01:00
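The Arc::make_mut step in commit b04102f02e can be illustrated with a small stand-in type. This sketch assumes a cloneable `DemoExtractor` struct and an `initialized_handle` helper, both hypothetical; the real registry stores Arc<Box<dyn Extractor>> and clones through clone_extractor().

```rust
use std::sync::Arc;

/// Stand-in for a registered extractor (assumption for illustration).
#[derive(Clone, Debug, Default)]
struct DemoExtractor {
    url: Option<String>,
}

impl DemoExtractor {
    fn initialize(&mut self, url: &str) {
        self.url = Some(url.to_string());
    }
}

/// Take a handle to the shared extractor and initialize it for one URL.
/// Arc::make_mut clones the inner value when other handles exist, so the
/// copy held by the registry is left untouched.
fn initialized_handle(shared: &Arc<DemoExtractor>, url: &str) -> Arc<DemoExtractor> {
    let mut handle = Arc::clone(shared);
    Arc::make_mut(&mut handle).initialize(url);
    handle
}

fn main() {
    let registered = Arc::new(DemoExtractor::default());
    let ready = initialized_handle(&registered, "https://example.com/gallery/123");
    println!("registered: {:?}", registered.url);
    println!("ready: {:?}", ready.url);
}
```

Because the registry keeps its own handle, make_mut behaves as copy-on-write here: each run initializes a private copy and the registered extractor stays in its pristine state.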
eaa10ef64e fix(02-04): gap closure - fix extractor initialization flow in main.rs 2026-02-15 20:50:17 +01:00
2d67fff3d3 docs(02-03): complete CLI integration plan
- Created SUMMARY.md documenting 4 tasks completed
- Updated STATE.md to reflect 3/5 plans complete in Phase 2
- Added key decisions and completed work to context
2026-02-15 20:43:47 +01:00
7ccee61865 feat(02-03): verify framework end-to-end
- Built release binary successfully
- Tested CLI with example URL: extractor found correctly
- Tested CLI with unknown URL: shows helpful error with supported patterns
- All tests pass
2026-02-15 20:42:12 +01:00
cecc39fa3e feat(02-03): integrate extractor into CLI main
- Added Extractor trait clone_extractor() method for Box<dyn Extractor> clone support
- Added Clone and clone_extractor to ExampleExtractor
- Fixed extractor registry to return shared extractors via Arc
- Updated main.rs to register extractors and find extractor by URL
- Added CLI integration showing found extractor or available patterns
- Added extract_url_array standalone function for tests
2026-02-15 20:38:47 +01:00
e034639a1d fix(02-03): fix extractor registry to return shared extractors
- Changed RegisteredExtractor to use Arc<Box<dyn Extractor>> for shared access
- Added SharedExtractor type alias
- Updated find() to return Result<SharedExtractor, ExtractorError>
- Added get_extractor() as convenience function
- Updated lib.rs exports to include SharedExtractor and get_extractor
2026-02-15 20:29:57 +01:00
6232f67b8f feat(02-03): export extractor module in lib.rs
- Added public re-exports for all key extractor types:
  - Extractor, ExtractorError, ExtractorMatch, DynExtractor
  - Message, MessageKind
  - HttpClient, HttpClientError
  - ExtractorRegistry
  - find, extract, register_all functions
  - HtmlParser, JsonExtractor
- Library users can now access extraction framework directly
2026-02-15 20:26:27 +01:00
f54e6439c1 feat(02-03): create example extractors module
- Added src/extractor/extractors/mod.rs with register_all() function
- Added src/extractor/extractors/example.rs with ExampleExtractor
- ExampleExtractor implements Extractor trait for example.com/gallery URLs
- Demonstrates pattern matching, initialization, and items() extraction
- Updated src/extractor/mod.rs to declare extractors module and export register_all
2026-02-15 20:26:05 +01:00
5bc639008a docs(02-02): complete HTML and JSON extraction utilities plan
- Created HtmlParser with CSS selector support
- Created JsonExtractor with path-based extraction
- Both modules exported from extractor crate
- Updated STATE.md with progress (2/5 plans in Phase 2)
2026-02-15 20:22:16 +01:00