- Added BoardGameGeek back using the new API endpoint suggested by @ppfeister
- Uses https://api.geekdo.com/api/accounts/validate/username?username={} for detection
- errorMsg checks for '"isValid":true' to detect valid usernames
- This approach avoids the previous issues with:
  * HTML parsing returning false positives
  * User API returning JSON with '[]' substrings that caused detection problems
- Successfully tested with both valid (blue) and invalid usernames
Thanks @ppfeister for the API suggestion and @akh7177 for the initial guidance
- Adds docker-build-test job to regression.yml
- Runs on push/merge to master and release branches
- Extracts VERSION_TAG from pyproject.toml for build
- Tests that Docker image builds and runs successfully
- Resolves dockerfile syntax warnings
- Resolves #2196
Threads was showing false positives for non-existent users because
the error message detection was incorrect.
Updated errorMsg:
- Old: "<title>Threads</title>" (generic, matches valid pages too)
- New: "<title>Threads • Log in</title>" (specific to non-existent users)
When a user doesn't exist, Threads redirects to a login page with the
title "Threads • Log in". Valid user profiles have titles like
"Username (@username) • Threads, Say more".
Tested with:
- Invalid user (impossibleuser12345): Correctly not found
- Valid user (zuck): Correctly found
This fixes the false positive issue where non-existent Threads profiles
were being reported as found.
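A minimal sketch of the check this errorMsg performs, assuming plain substring matching on the response body. The function name threads_user_found is hypothetical and is not part of Sherlock.

```python
# Title served on the login redirect for non-existent users (from this commit).
THREADS_ERROR_MSG = "<title>Threads • Log in</title>"


def threads_user_found(html: str) -> bool:
    """Treat the profile as found unless the login-redirect title is present."""
    return THREADS_ERROR_MSG not in html
```

A page titled "Username (@username) • Threads, Say more" does not contain the error string, so it counts as found.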
BoardGameGeek cannot be reliably detected with Sherlock's current capabilities:
- Original HTML detection: Returns false positives
- API endpoint approach: The API returns status 200 for both valid and invalid users
  - Invalid user: Returns exactly '[]'
  - Valid user: Returns JSON containing '[]' substrings (e.g., "adminBadges":[])
Since Sherlock's 'message' errorType uses substring matching, it incorrectly
identifies valid users as "not found" when checking for '[]' in the response.
The site's API response format is fundamentally incompatible with Sherlock's
detection methods (message/status_code/response_url), so removal is the only
viable solution to prevent false positives and false negatives.
Addresses false positive issue originally reported in testing.
Using the API endpoint suggested by akh7177:
https://api.geekdo.com/api/users?username={}
However, there's an edge case where valid users contain empty arrays
in their JSON response (adminBadges[], userMicrobadges[], supportYears[])
which causes Sherlock's substring matching to incorrectly flag them
as 'not found' when looking for the '[]' error pattern.
The API correctly returns:
- Valid user: JSON object with user data (but contains [] substrings)
- Invalid user: Exactly '[]' (2 characters total)
This needs further refinement to distinguish between the exact '[]'
response vs JSON containing '[]' substrings.
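To make the edge case concrete, here is a small sketch contrasting substring matching with an exact comparison on the parsed JSON. Both function names are hypothetical, the sample bodies are made up to mirror the shapes described above, and the exact-match variant is not something Sherlock's 'message' errorType is known to support.

```python
import json


def substring_says_not_found(body: str, error_msg: str = "[]") -> bool:
    """Mimic substring matching: flag 'not found' if error_msg occurs anywhere."""
    return error_msg in body


def exact_says_not_found(body: str) -> bool:
    """Flag 'not found' only when the whole response parses to an empty list."""
    try:
        return json.loads(body) == []
    except json.JSONDecodeError:
        return False


valid_body = '{"username": "blue", "adminBadges": []}'  # illustrative shape only
invalid_body = "[]"

# Substring matching cannot tell the two responses apart.
print(substring_says_not_found(valid_body), substring_says_not_found(invalid_body))
# Exact matching on the parsed value can.
print(exact_says_not_found(valid_body), exact_says_not_found(invalid_body))
```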
BoardGameGeek returns identical pages for both existing and non-existing
users, making reliable username detection impossible with HTTP-based
methods. The site likely uses JavaScript to load user-specific content
dynamically.
BoardGameGeek changed from /user/{} to /profile/{} URL structure.
Also updated from message to status_code detection as the site
no longer returns clear error messages for non-existent users.
This fix addresses a critical security vulnerability where HTTP requests
could hang indefinitely, potentially causing denial of service.
Changes:
- Added 10-second timeout to version check API call
- Added 10-second timeout to GitHub pull request API call
- Added 30-second timeout to data file downloads (a longer limit for data transfers)
- Added 10-second timeout to exclusions list download
Impact:
- Prevents infinite hangs that could freeze the application
- Improves user experience with predictable response times
- Fixes security issue flagged by Bandit static analysis (B113)
- Makes the application more robust in poor network conditions
The timeouts are conservative enough to work with slow connections
while preventing indefinite blocking that could be exploited.
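A rough sketch of the pattern, written against the standard library's urllib so it runs on its own. The project's actual HTTP client and call sites are not shown in this message. The fetch helper and the TIMEOUTS keys are hypothetical. Only the timeout values come from the change list above.

```python
from urllib.request import urlopen

# Timeout values in seconds, taken from the change list; the keys are made-up labels.
TIMEOUTS = {
    "version_check": 10,
    "pull_request": 10,
    "data_file": 30,
    "exclusions": 10,
}


def fetch(url: str, kind: str, opener=urlopen) -> bytes:
    """Fetch url, always passing an explicit timeout so the call cannot block forever."""
    with opener(url, timeout=TIMEOUTS[kind]) as response:
        return response.read()
```

Passing the opener as a parameter is only there so the helper can be exercised without network access.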