24 Commits

Author SHA1 Message Date
Alessio b9e025589e Update CHANGELOG.txt for v0.6.16 2024-09-03 18:36:23 -07:00
Alessio c45b8e7ad8 Add --delay flag to force a delay between requests in a large paginated scrape 2024-08-19 18:20:12 -07:00
Alessio 4990e7913d Fix lint error 2024-08-19 17:29:54 -07:00
Alessio e2ca9a975a Remove scraper singleton pattern entirely 2024-08-19 17:27:54 -07:00
Alessio 91f722b7fa Scraper requests now report invalidated or expired sessions 2024-08-18 16:22:37 -07:00
Alessio 24129c4852 REFACTOR: reduce technical debt, particularly that caused by singleton pattern in pkg/scraper
- ensure all scraper functions have an `api.XYZ` version and a package-level convenience function
	- isolate `the_api` to top-level convenience functions, in preparation for removal
- move a bunch of scraper functions around to be near their related functions
- new ErrLoginRequired
- remove obsolete APIv1 stuff (Feed, TweetDetail)
- rename scraper function GetUserFeedGraphqlFor => GetUserFeed
- fix go.mod Go version incorrectly claiming it's compatible with Go 1.16 (should be Go 1.17)
2024-08-09 19:48:50 -07:00
Alessio 8aca12695b Handle media download 404s gracefully 2024-07-28 12:50:00 -07:00
Alessio ef15e8a306 Handle guest token / session initialization when not connected to internet 2024-07-14 13:20:44 -07:00
Alessio 42bf8ec06a Fix a bug sending empty POST bodies 2024-05-11 10:58:33 -07:00
Alessio f927507089 Enable marking DMs as read 2024-05-10 22:09:48 -07:00
Alessio 1b3c5d0ed3 Add timeout error handling for scraper requests to the request body download as well (rather than just headers) 2024-04-13 16:10:23 -07:00
ca7cf613f9 Remove unnecessary import 2024-03-18 21:16:38 -07:00
3967367eed Fix more lint errors 2024-03-18 21:15:27 -07:00
69e0a35e57 Handle HTTP request timeouts 2024-03-16 19:55:05 -07:00
73c5803a47 Add downloading of DM embedded images, videos and links 2024-03-11 21:12:38 -07:00
0ad3cf8fb8 Fix lint errors 2024-03-11 14:08:07 -07:00
aa05708e20 Move media downloader from persistence to scraper package; add 429 Rate Limited error type 2024-03-11 12:57:58 -07:00
1ba4f91463 REFACTOR: replace 'log.Debug(fmt.Sprintf(...))' with 'log.Debugf(...)' and remove 'scraper.' prefix in utils_test.go 2024-03-10 19:14:27 -07:00
73ffb90f63 Move API login flow to its own file; add support for secondary verification challenges 2024-03-02 15:43:02 -08:00
8aca7d4ebe Add manual re-scrape for user feeds and quote-tweets stat on tweets 2023-08-27 22:55:40 -03:00
655a47ec21 Remove debugging panic 2023-08-27 22:00:58 -03:00
8349ca4ae3 Add background scraping of the logged-in user's home timeline 2023-08-27 21:05:09 -03:00
eaa01a2360 Fix fetching users and search
- Add is_deleted field on Users
- Fix fetching of tombstoned users including deleted users
- Fix "verified" bluechecks not being scraped anymore
- Fix search to use new graphql endpoint (old one got taken down)
2023-08-22 20:07:32 -03:00
a061decd0f REFACTOR: Rename go module to 'gitlab.com/offline-twitter/twitter_offline_engine' in accordance with 'go get' conventions
- also restructure project to use a 'pkg' directory for reusable packages
2023-07-30 14:20:07 -03:00