441 Commits

Author SHA1 Message Date
Paul Pfeister 8f1308b90d
Merge pull request #2758 from Aaditya-Chunekar/patch-2
Add Credly data to JSON resource
2025-12-29 19:54:44 -08:00
Paul Pfeister e856b05c2c
Merge pull request #2636 from simplyNour/Bug/fix-gradle-false-pos-test-failure
Bug: Fix local variable scoping issue affecting false-pos test output
2025-12-29 18:56:30 -08:00
Aaditya fe9e750dab
Add Credly data to JSON resource 2025-11-14 09:27:07 +05:30
Paul Pfeister 842ae1f754
Merge pull request #2733 from Aaditya-Chunekar/patch-1
Add Nothing Community data to data.json
2025-10-29 16:34:10 -07:00
Paul Pfeister 339634f7bc
Merge pull request #2737 from Nolanp123/fix-minecraft-regex
Fix Minecraft False Positives for Long Usernames
2025-10-28 20:47:32 -07:00
Nolan Parker c1632693bb Add regexCheck to Minecraft to prevent false positives for long usernames 2025-10-28 20:39:53 -05:00
Aaditya e19cb32009
Add Nothing Community data to data.json 2025-10-27 11:20:30 +05:30
Paul Pfeister b69c8ef940
Merge pull request #2710 from Aaditya-Chunekar/add-sites
hacktoberfest: Added sites support
2025-10-26 00:16:29 -07:00
Aaditya-Chunekar 2724711060 feat: add tmdb 2025-10-26 09:49:31 +05:30
Paul Pfeister 0a68ab7f4c
Merge pull request #2709 from Aaditya-Chunekar/add-topmate
hacktoberfest: Add topmate.io support
2025-10-24 20:15:02 -07:00
Paul Pfeister 8675178be1
Merge pull request #2705 from Aaditya-Chunekar/add-site-seoforum
hacktoberfest: Add SEO Forum Support
2025-10-24 20:12:50 -07:00
Aaditya-Chunekar 9bafb8a280 feat: add n8n, HackerSploit, Arduino Forum 2025-10-24 09:37:40 +05:30
Aaditya-Chunekar 8e5549862a feat: add topmate.io 2025-10-24 09:14:42 +05:30
Aaditya-Chunekar 8797fcd517 feat: add SEOForum 2025-10-24 08:46:23 +05:30
Paul Pfeister 0995d4d669
chore: reformat 2025-10-23 19:39:05 -04:00
Paul Pfeister 6c0c273a0b
Merge pull request #2695 from simplyNour/Bug/urls-are-not-clickable-in-excel-file
Make urls clickable when saved to excel
2025-10-23 16:25:17 -07:00
Paul Pfeister 3eeba790fd
Merge pull request #2722 from VivekGaddam/Twitch_Added
Added Twitch Platform Support to Sherlock
2025-10-23 15:28:01 -07:00
Paul Pfeister 61a29ec373
Merge pull request #2723 from imhiteshgarg/adding_lemmy
adding lemmy
2025-10-23 15:26:57 -07:00
Paul Pfeister 9fbbbf7c73
Merge pull request #2724 from obiwan04kanobi/feat/add-codolio
feat: add Codolio to supported sites
2025-10-23 15:26:16 -07:00
obiwan04kanobi 331b68d909 feat: add Codolio to supported sites
Add Codolio (coding portfolio tracker) as a new site target for username detection.

Detection method: Message-based using title tag differences
- Existing profiles: '<title>Username | Codolio</title>'
- Non-existing profiles: '<title>Page Not Found | Codolio</title>'

Tested with multiple usernames to confirm accurate detection.
2025-10-23 22:42:06 +05:30
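The title-based check described in the Codolio commit can be sketched roughly as follows. This is only an illustration, assuming the two title strings quoted in the commit message; the helper names `page_title` and `codolio_profile_exists` are hypothetical and are not part of Sherlock.

```python
import re
from typing import Optional

# Title reported in the commit message for non-existing profiles.
NOT_FOUND_TITLE = "Page Not Found | Codolio"

TITLE_RE = re.compile(r"<title>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def page_title(html: str) -> Optional[str]:
    """Return the text of the first <title> tag, or None if there is none."""
    match = TITLE_RE.search(html)
    return match.group(1).strip() if match else None


def codolio_profile_exists(html: str) -> bool:
    """Hypothetical helper: a profile counts as existing unless the
    not-found title is present (a missing title is not treated as an error)."""
    return page_title(html) != NOT_FOUND_TITLE


claimed = "<html><head><title>Username | Codolio</title></head></html>"
missing = "<html><head><title>Page Not Found | Codolio</title></head></html>"
print(codolio_profile_exists(claimed), codolio_profile_exists(missing))
```

As in message-based detection generally, the check keys on the error marker rather than on the profile title, since the profile title varies per user.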
Hitesh Garg 8c3e093561 adding lemmy
adding lemmy
2025-10-23 21:38:18 +05:30
vivekgaddam e35e5e3af1 corrected Twitch 2025-10-23 19:41:00 +05:30
vivekgaddam 906287b305 added twitch 2025-10-23 19:18:31 +05:30
Matheus Felipe 0dbb6abcc5
Fix Minor Capitalization Issue in README.md (#2716) 2025-10-23 09:08:29 -03:00
Matheus Felipe 03e097cc82
Reorder Terraria Forums to correct alphabetical position (#2700) 2025-10-23 08:53:50 -03:00
Matheus Felipe 91c1964918
Add GameFaqs support (#2721)
Co-authored-by: Maquinero123456 <jimenanavarrodavid@uma.es>
2025-10-23 08:04:41 -03:00
Matheus Felipe 373f3d389a
Added support for Trovo (#2720) 2025-10-23 06:17:28 -03:00
SirAzako 828c47109d
Added support for Trovo 2025-10-23 06:10:20 -03:00
Matheus Felipe 94245b25df
Add OpenGameArt support (#2719)
Co-authored-by: Horațiu Mlendea <Horatiu.Mlendea@ProtonMail.com>
2025-10-23 05:03:35 -03:00
Matheus Felipe 734542f0af
Add mstdn.social (#2718) 2025-10-23 04:19:10 -03:00
Matheus Felipe 1f8166ba9f
Remove unclaimed username entry for mstdn.social 2025-10-23 03:41:21 -03:00
MagicLike 6f1ddaa615
Added mstdn.social
Added another Mastodon instance: mstdn.social
2025-10-23 03:32:54 -03:00
Nolan Parker 7ee2891517 Fix Minor Capitalization Issue in README.md 2025-10-22 22:16:13 -05:00
Paul Pfeister b893e4aa20
Merge pull request #2711 from imhiteshgarg/add_observablehq
Adding ObservableHQ site
2025-10-21 23:04:24 -07:00
Hitesh Garg eff869906a Adding ObservableHQ site
Adding ObservableHQ site
2025-10-22 10:58:31 +05:30
Paul Pfeister 2a0107e189
Merge pull request #2702 from ABSCP4/patch-1
Update README.md
2025-10-20 15:33:36 -07:00
ABSCP4 5d8c4de212
Update README.md
fixed typo
2025-10-20 11:01:32 -07:00
Nolan Parker 1f9d7e8373 Reorder Terraria Forums to correct alphabetical position 2025-10-19 15:53:09 -05:00
Paul Pfeister 184470f871
Merge pull request #2699 from Nolanp123/fix-codesandbox-name
Fix site name formatting for CodeSandbox
2025-10-19 13:14:14 -07:00
Nolan Parker 342dbc85cc Fix site name formatting for CodeSandbox 2025-10-19 14:44:47 -05:00
Paul Pfeister 457e16e84f
Merge pull request #2670 from simplyNour/Bug/fix-false-positive-for-topcoder
fix: false positive for Topcoder
2025-10-18 23:47:34 -07:00
Paul Pfeister 43b3736b75
Merge pull request #2697 from raman1236/add-odysee-support
Add Odysee support
2025-10-18 23:06:15 -07:00
Paul Pfeister 64a49ffe17
Merge pull request #2698 from KaiAllAlone/KaiAllAlone-warframe-market
Add Warframe Market support
2025-10-18 22:48:00 -07:00
rvasikarla 0afd2006c6 Add Odysee support
- Add Odysee platform to sherlock database
- Uses canonical link detection for non-existent users
- URL pattern: https://odysee.com/@{username}
- Detects error via canonical redirect to main site
2025-10-18 16:47:27 -05:00
rvasikarla 3c270173a7 Add Odysee support
- Add Odysee platform to sherlock database
- Uses canonical link detection for non-existent users
- URL pattern: https://odysee.com/@{username}
- Detects error via canonical redirect to main site
2025-10-18 16:44:10 -05:00
rvasikarla 8d73f9ef4c Add Odysee support
- Add Odysee platform to sherlock database
- Uses canonical link detection for non-existent users
- URL pattern: https://odysee.com/@{username}
- Detects error via canonical redirect to main site
2025-10-18 16:37:31 -05:00
Debanuj Roy 472c086805
Update data.json
fixed syntax error
2025-10-19 03:06:25 +05:30
Debanuj Roy 400c277f24
more robust 2025-10-19 03:00:43 +05:30
Debanuj Roy e759564550
Update data.json
update matching logic
2025-10-19 02:55:33 +05:30
Debanuj Roy deebe7137c
Added Warframe Market 2025-10-19 02:45:07 +05:30
nour cb14ccbaaf Make urls clickable when saved to excel 2025-10-18 15:21:36 +03:00
Paul Pfeister eb892795e9
Merge pull request #2683 from 403Code/patch-1
Add: Cfx.re Forum
2025-10-15 10:52:32 -07:00
Rizey (Nantaaaaaaaaaa) 09de90066b
Update data.json 2025-10-15 13:39:44 +07:00
Rizey (Nantaaaaaaaaaa) cd1f27c12b
Update data.json 2025-10-15 13:29:42 +07:00
Rizey (Nantaaaaaaaaaa) b837de8358
Add Cfx.re Forum 2025-10-15 13:22:09 +07:00
Paul Pfeister 7a70f35883
Merge pull request #2680 from bjornmorten/add/norwegian-forums
Add Norwegian forum sites (diskusjon.no & forum.kvinneguiden.no)
2025-10-14 11:25:31 -07:00
bjornmorten 4b17dae385
fix: regex max length for kvinneguiden 2025-10-14 19:48:02 +02:00
Paul Pfeister efefe3f54a
Merge pull request #2682 from bjornmorten/add/cryptohack
Add: CryptoHack
2025-10-14 10:41:41 -07:00
Paul Pfeister 4b70a1fc25
Merge pull request #2681 from bjornmorten/add/hackmd
Add: HackMD
2025-10-14 10:41:31 -07:00
bjornmorten a7893f399e add: CryptoHack 2025-10-14 19:28:53 +02:00
bjornmorten 1cb6c12851 add: HackMD 2025-10-14 19:21:36 +02:00
bjornmorten c4f7485ecf fix: alphabetical ordering 2025-10-14 19:10:57 +02:00
bjornmorten 228f50413e add: diskusjon.no and forum.kvinneguiden.no 2025-10-14 19:08:35 +02:00
Paul Pfeister d1867b1b51
Merge pull request #2679 from aryanj10/fix-fasle-positive-for-lesswrong
Fix LessWrong detection Issue #2634
2025-10-14 09:58:56 -07:00
Aryan Jain 6d2497582e Fix LessWrong detection Issue #2634 2025-10-14 11:04:15 -04:00
Paul Pfeister 885c43b8af
Merge pull request #2677 from spmedia/patch-9
Add: BreachSta.rs Forum
2025-10-13 16:12:36 -07:00
Edmond Major III 8ad47b0b23
Update data.json 2025-10-13 17:23:10 -05:00
Edmond Major III e93af99424
Update data.json
remix based off title instead of text in body
2025-10-13 17:20:50 -05:00
Edmond Major III 5862ab4f92
Update data.json
Add in BreachSta.rs forum - a popular cybercrime forum

https://breachsta.rs/profile/Sleepybubble - returns valid profile

https://breachsta.rs/profile/asdfasdfasdf - returns "Not found
This page doesn't exist"
2025-10-13 17:15:26 -05:00
Paul Pfeister 4110cac45c
Merge pull request #2661 from KaiAllAlone/terraria-forums
Site Added:Terraria forums
2025-10-13 15:07:31 -07:00
Paul Pfeister d66b18e8ae
Merge pull request #2676 from spmedia/patch-8
Add: Patched.sh
2025-10-13 14:53:19 -07:00
Edmond Major III b532fc6a38
Add: Patched.sh
Add Patched, a popular cybercrime forum.

https://patched.sh/User/blue = valid user

https://patched.sh/User/khjasjkdhfa38a = not a valid user and displays "The member you specified is either invalid or doesn't exist."
2025-10-13 13:20:03 -05:00
Paul Pfeister 99cf073835
Merge pull request #2674 from spmedia/patch-6
Add: Cracked.sh
2025-10-13 10:41:46 -07:00
Edmond Major III ec7e1b8b81
Update data.json
Trailing / was the issue so removed it
2025-10-13 12:30:50 -05:00
Edmond Major III a4aab38901
Update data.json
Remove www
2025-10-13 12:24:02 -05:00
Edmond Major III 5202900618
Update data.json
Updated error msg on no user
2025-10-13 12:16:09 -05:00
Edmond Major III 26444a98ad
Update data.json
Add Cracked.sh - a popular skid hacker website

Examples of profiles:

Claimed: https://cracked.sh/Blue - gives status code of 200

Unclaimed: https://cracked.sh/noonewouldeverusethis7 - gives status code of 404
2025-10-13 12:12:43 -05:00
Paul Pfeister bced3242f3
Merge pull request #2668 from simplyNour/Bug/fix-false-positive-for-hackerearth
fix: false positive for hackerearth
2025-10-13 10:03:00 -07:00
Paul Pfeister 08aabdad76
Merge pull request #2673 from simplyNour/Deprecate/pepper-site-is-no-longer-operating
Deprecate: Pepper.it closed its doors in August 2025
2025-10-13 10:00:45 -07:00
Paul Pfeister 170ee0b928
Merge branch 'master' into Deprecate/pepper-site-is-no-longer-operating 2025-10-13 09:58:47 -07:00
Paul Pfeister 2c9a54438a
Merge pull request #2672 from simplyNour/Feature/add-pepper-global-sites
Feat: Add pepper stores worldwide websites
2025-10-13 09:57:36 -07:00
nour 84f4886809 Feat: Add pepper stores worldwide websites 2025-10-13 17:46:38 +03:00
nour e26fd6b643 Fix: false positive for topcoder due to invalid regex 2025-10-13 16:27:02 +03:00
Paul Pfeister ce5de20f80
Merge pull request #2659 from faizan842/re-enable-opencollective-powershell-realmeye
Re-enable OpenCollective and Realmeye
2025-10-12 19:01:46 -07:00
Paul Pfeister 3ff2d135b5
Merge branch 'master' into re-enable-opencollective-powershell-realmeye 2025-10-12 18:58:04 -07:00
Paul Pfeister 1e65b4a209
Merge pull request #2657 from KaiAllAlone/patch-1
Add Pokemon Forums
2025-10-12 18:55:13 -07:00
Debanuj Roy db3545b7b0
Added more robust message 2025-10-12 16:31:27 +05:30
Debanuj Roy 1898a0c4a9
Add Terraria Forums 2025-10-12 16:27:30 +05:30
Faizan Habib 0d32357b10 Re-enable OpenCollective and Realmeye
- Updated OpenCollective to use status_code detection (previously used message detection)
- Added Realmeye with message detection

Both sites were previously removed due to false positives but have been verified to work correctly now:
- OpenCollective: Returns 200 for existing profiles, 404 for non-existent
- Realmeye: Shows 'Sorry, but we either:' error message for non-existent players

Tested with known usernames:
- OpenCollective: sindresorhus
- Realmeye: rotmg

Note: PowerShell Gallery was initially included but removed after discovering their /profiles/ endpoint no longer works.
2025-10-12 13:57:22 +05:30
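The two detection modes named in the commit above (status_code for OpenCollective, message for Realmeye) could be sketched like this. It is a simplified illustration, not Sherlock's actual implementation: the `errorType` and `errorMsg` keys mirror the field names used in these commit messages, while the `SITES` dictionary, the `is_claimed` function, and the "any 2xx means claimed" rule are assumptions made for the example.

```python
# Hypothetical, trimmed-down site entries based on the commit description.
SITES = {
    "OpenCollective": {"errorType": "status_code"},
    "Realmeye": {"errorType": "message", "errorMsg": "Sorry, but we either:"},
}


def is_claimed(site: dict, status_code: int, body: str) -> bool:
    """Decide whether a username looks claimed for one site entry."""
    if site["errorType"] == "status_code":
        # Assumption: any 2xx response means the profile exists (404 does not).
        return 200 <= status_code < 300
    if site["errorType"] == "message":
        # The profile exists unless the known error text appears in the body.
        return site["errorMsg"] not in body
    raise ValueError(f"unsupported errorType: {site['errorType']!r}")


print(is_claimed(SITES["OpenCollective"], 200, ""))
print(is_claimed(SITES["Realmeye"], 200, "Sorry, but we either: ..."))
```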
Debanuj Roy 1be2abb056
Resolved wrong urlMain 2025-10-12 13:39:55 +05:30
Debanuj Roy fb392534ef
Add Pokemon Forums 2025-10-12 08:03:23 +05:30
Paul Pfeister bd49aac9d1
Merge pull request #2606 from Fandroid745/fix/babyru-false-positive
fix: Add error messages to BabyRu to prevent false positives
2025-10-11 18:10:54 -04:00
Matheus Felipe 94838863fd
Cleanup site-list.py (#2307) 2025-10-11 15:30:08 -03:00
Matheus Felipe 79973a58ea
Update file handling to include encoding and correct comments 2025-10-11 15:21:36 -03:00
Fandroid745 b9a72b55ca fix: use Unicode escape sequences for BabyRu error messages 2025-10-11 23:14:43 +05:30
Paul Pfeister ef55f7ddd3
chore: reformat json 2025-10-11 13:34:45 -04:00
Paul Pfeister 28b78e7ddd
Merge pull request #2633 from VivekGaddam/add-tiktok-support
Add TikTok (tiktok.com) to supported sites
2025-10-11 13:33:39 -04:00
Paul Pfeister d2072e2cac
chore: rem tiktok for improved rev 2025-10-11 13:32:51 -04:00
Paul Pfeister 3edb73cb23
Merge pull request #2650 from Nirzak/patch-1
Added classifiers for supported python version
2025-10-11 13:30:20 -04:00
Paul Pfeister 6d1280ee9d
Merge pull request #2651 from aryanj10/add-tiktok-pinterest
Added support for TikTok & Pinterest
2025-10-11 13:12:13 -04:00
Dhanush Sugganahalli 0c457e590a
Merge branch 'master' into fix/babyru-false-positive 2025-10-11 21:24:18 +05:30
Aryan Jain dc307fc0fd feat: add TikTok and Pinterest site detection support 2025-10-11 10:34:48 -04:00
Nirjas Jakilim d6256e9fc6
classifiers for supported python version 2025-10-11 20:27:27 +06:00
Aryan Jain 1645828527 Add TikTok site support 2025-10-11 09:25:00 -04:00
Matheus Felipe e774b08dc5
Add imood.com support (#2647) 2025-10-11 09:28:06 -03:00
Matheus Felipe 99067b2e59
Add imood.com support
resolve #2646
2025-10-11 09:23:52 -03:00
nour f039b50c4e Deprecate: Pepper closed its doors on August 14th 2025. 2025-10-11 08:29:32 +03:00
nour 7d5bd97142 fix: false positive for hackerearth 2025-10-11 07:17:01 +03:00
vivekgaddam 70b5055631 corrected india F+ prevent 2025-10-11 08:54:40 +05:30
Paul Pfeister 1be25e70df
Merge pull request #2621 from MaxwellOldshein/feat/validate-remote-manifest-with-local-schema-before-validate-target-test-suite
feat: GitHub Actions - Validate Remote Manifest Against Local Schema Before Running Validate Modified Targets Test Suite
2025-10-10 20:41:58 -04:00
Paul Pfeister 9000575f7c
Merge pull request #2631 from simplyNour/Add-Vjudge-Support-to-Sherlock
Add Vjudge to the sites source
2025-10-10 20:38:16 -04:00
Paul Pfeister 220ebf935c
Merge pull request #2640 from sctech-tr/patch-1
add status cafe (status.cafe)
2025-10-10 20:22:44 -04:00
sctech 959c4a2b26
change method for status.cafe 2025-10-10 20:38:08 +03:00
sctech 443d43df21
add status cafe 2025-10-10 20:09:45 +03:00
Paul Pfeister 80080cd57c
Merge pull request #2638 from simplyNour/Bug/fix-false-positive-for-kaskus 2025-10-10 12:51:15 -04:00
nour 80922a93fa fix: false positive for kaskus 2025-10-10 18:53:28 +03:00
nour 45494fc74b bug: fix local variable scoping issue in test validate targets 2025-10-10 06:29:55 +03:00
nour d92e2339a1 feat: add vjudge 2025-10-10 05:28:28 +03:00
vivekgaddam 659bf92d99 corrected the errorMsg 2025-10-09 19:50:43 +05:30
vivekgaddam 3e4d9bcd85 Add TikTok support to Sherlock 2025-10-09 17:57:15 +05:30
Matheus Felipe d3076cdfe0
Add Ifunny (#2632) 2025-10-09 09:16:41 -03:00
Derick Kunz 51436cefe8
Add Ifunny 2025-10-09 08:51:13 -03:00
Paul Pfeister 08a8177286
Merge pull request #2610 from eslteacher902010/add-musescore-clean 2025-10-09 06:19:35 -04:00
Paul Pfeister e6d5fd64e0
Merge pull request #2622 from akh7177/Add-support-for-Discord.bio
Add support for Discord.bio
2025-10-08 13:03:57 -04:00
Abhyuday K Hegde ac9f3a7fd5
Add support for Discord.bio 2025-10-08 11:21:53 +05:30
Paul Pfeister 289ab28b98
Merge pull request #2576 from obiwan04kanobi/add-aws-skills-profile-site
Add AWS Skills Profile site to Sherlock
2025-10-07 19:46:54 -04:00
Maxwell Oldshein 46ad6c9a5e Fix whitespace. 2025-10-07 14:53:47 -04:00
Maxwell Oldshein d20dcbe8db Retain original whitespace 2025-10-07 14:52:53 -04:00
Maxwell Oldshein 70c3c84196 Update validation logic placement in workflow 2025-10-07 14:50:54 -04:00
Dhanush Sugganahalli 53840c6a98
Merge branch 'master' into fix/babyru-false-positive 2025-10-07 14:41:12 +05:30
Fandroid745 068fff8711 fix:Remove regexCheck field and changed encoding to UTF-8 2025-10-07 14:33:32 +05:30
Maxwell Oldshein 5735d01804 Validate remote manifest against local schema 2025-10-06 23:52:14 -04:00
Paul Pfeister f60de0d8f8
Merge pull request #2616 from akh7177/Add-new-sites-to-data.json 2025-10-06 13:39:04 -04:00
Paul Pfeister cb3ab91492
Merge pull request #2485 from manjushsh/code-sandbox 2025-10-06 13:30:10 -04:00
paul_kniaz 4eea79ed6a MuseScore: use GET for status_code via request_method to avoid 403 on HEAD 2025-10-06 13:07:45 -04:00
Abhyuday K Hegde 03c051a525
Add new sites to Sherlock 2025-10-06 18:47:38 +05:30
Aniket eccdf80b95
Add Pronouns.page (#2419)
* Add support for Pronouns.page (#2418)

* Update the url
2025-10-06 09:52:56 -03:00
Manjush Shetty eb51bf9b1a misc: remove isnsfw from hive 2025-10-06 17:15:44 +05:30
Manjush Shetty 5d7b438fd6 add urlProbe 2025-10-06 17:11:50 +05:30
Manjush Shetty ef0b97fb57 chore: try with api instead 2025-10-06 16:54:07 +05:30
Manjush Shetty c6c3522159 chore: add custom regex for codesandbox usernames 2025-10-06 16:45:53 +05:30
Manjush Shetty 2908c8eaa8 chore: try with different message 2025-10-06 16:40:59 +05:30
Manjush S f05b8e0ed6
Merge branch 'sherlock-project:master' into code-sandbox 2025-10-06 16:21:40 +05:30
Fandroid745 01bca6b39f fix: corrected the regexCheck field value to an empty string 2025-10-06 08:57:11 +05:30
Paul Pfeister d2835e56a4
Merge pull request #2568 from shreyasNaik0101/fix/remediate-blitztactics
fix(sites): Remediate false positive for Blitz Tactics
2025-10-05 14:17:43 -04:00
shreyasNaik0101 0cf110e69e
Merge branch 'master' into fix/remediate-blitztactics 2025-10-05 22:56:59 +05:30
Paul Pfeister a88adb0488
Merge pull request #2559 from frogtheastronaut/master
Removed duplicate Bluesky entry in data.json
2025-10-05 13:23:53 -04:00
Fandroid745 4010a58dde fix: changed the username_claimed to example placeholder 2025-10-05 22:23:17 +05:30
Paul Pfeister b9e28b9b23
Merge pull request #2588 from shreyasNaik0101/fix/correct-ci-diff
fix(ci): Use merge-base for correct target validation
2025-10-05 12:49:58 -04:00
Paul Pfeister d0e005da23
Merge pull request #2609 from akh7177/Add-support-for-WakaTime
Add support for WakaTime
2025-10-05 12:30:24 -04:00
paul_kniaz 7a4f19e6b3 Fix MuseScore URL endpoint 2025-10-05 12:27:30 -04:00
paul_kniaz f958e7b96f update MuseScore username_claimed to arrangeme (valid profile) 2025-10-05 12:13:37 -04:00
paul_kniaz 4c99bf3b75 Add MuseScore site (clean version) 2025-10-05 10:44:55 -04:00
Fandroid745 e3066a1d7a fix:added the username_claimed field 2025-10-05 18:59:04 +05:30
Abhyuday K Hegde f0510a169a
Add support for WakaTime 2025-10-05 15:52:56 +05:30
manjushsh 738df6c362 chore: add error message to the codesandbox 2025-10-05 15:22:37 +05:30
Paul Pfeister 83a38db110
Merge pull request #2582 from dollaransh17/fix/boardgamegeek-false-positive
fix(sites): Update BoardGameGeek URL structure and detection method
2025-10-05 02:39:29 -04:00
dollaransh17 9e3448d992 fix(sites): Implement BoardGameGeek using username validation API
- Added BoardGameGeek back using the new API endpoint suggested by @ppfeister
- Uses https://api.geekdo.com/api/accounts/validate/username?username={} for detection
- errorMsg checks for '"isValid":true' to detect valid usernames
- This approach avoids the previous issues with:
  * HTML parsing returning false positives
  * User API returning JSON with '[]' substrings that caused detection problems
- Successfully tested with both valid (blue) and invalid usernames

Thanks @ppfeister for the API suggestion and @akh7177 for the initial guidance
2025-10-05 11:59:41 +05:30
shreyasNaik0101 70e3c0ddd8 fix(ci): Address review feedback for correctness and efficiency 2025-10-05 11:00:14 +05:30
Fandroid745 017c08a45d fix: Add error messages to BabyRu to prevent false positives 2025-10-05 10:53:59 +05:30
Paul Pfeister f32f4ffaee
Merge pull request #2595 from obiwan04kanobi/feature/issue-2196-ci-docker-build-test
Add Docker build test to CI workflow (#2196)
2025-10-04 21:09:04 -04:00
Paul Pfeister 7379ba7b19
Merge branch 'remove-tor' 2025-10-04 20:52:40 -04:00
Paul Pfeister 3aeb6d6356
Merge pull request #2602 from sherlock-project/feat/no-txt
chore: make default --no-txt
2025-10-04 20:36:33 -04:00
Paul Pfeister 4246a7b16f
chore: make default --no-txt
Workflows where a txt file is still required should use --txt
2025-10-04 20:32:16 -04:00
Paul Pfeister e44fe49c8f
Merge pull request #2601 from sherlock-project/feat/graceful-skip
feat: gracefully skip sites with invalid errorType
2025-10-04 20:23:07 -04:00
Paul Pfeister 52cd5fdfc1
feat: gracefully skip sites with invalid errorType 2025-10-04 20:22:34 -04:00
Paul Pfeister 947f1ad2b6
Merge pull request #2574 from dollaransh17/fix/http-request-timeouts
Security Fix: Add timeout parameters to HTTP requests
2025-10-04 18:42:13 -04:00
shreyasNaik0101 4d00884d8c fix(ci): Implement secure diff logic per feedback 2025-10-05 03:00:21 +05:30
Paul Pfeister cfcc82aaca
Merge pull request #2597 from sherlock-project/feat/multiple-types
Support multiple errorType checks
2025-10-04 17:21:26 -04:00
Paul Pfeister 0794e02b52
feat: support multiple errorTypes 2025-10-04 16:53:30 -04:00
Paul Pfeister 975965abed
Merge pull request #2589 from dollaransh17/fix/threads-false-positive
fix(sites): Fix Threads false positive detection
2025-10-04 15:44:04 -04:00
Paul Pfeister a678bed154
Merge pull request #2587 from akh7177/remediate-cyberdefenders-fp
fix(sites): Remediate False Positives for CyberDefenders
2025-10-04 15:43:48 -04:00
Paul Pfeister 4ec6f1eec0
Merge pull request #2585 from akh7177/remediate-slideshare-fp
fix(sites): Remediate False Positive for SlideShare
2025-10-04 15:43:36 -04:00
Paul Pfeister d1527376e7
Merge pull request #2584 from akh7177/remediate-roblox-fp
fix(sites): Remediate False Positive for Roblox
2025-10-04 15:43:29 -04:00
obiwan04kanobi b99719ce60 Add Docker build test to CI workflow
- Adds docker-build-test job to regression.yml
- Runs on push/merge to master and release branches
- Extracts VERSION_TAG from pyproject.toml for build
- Tests that Docker image builds and runs successfully
- Resolves dockerfile syntax warnings
- Resolves #2196
2025-10-05 00:22:12 +05:30
dollaransh17 dc869852bc fix(sites): Fix Threads false positive detection
Threads was showing false positives for non-existent users because
the error message detection was incorrect.

Updated errorMsg:
- Old: "<title>Threads</title>" (generic, matches valid pages too)
- New: "<title>Threads • Log in</title>" (specific to non-existent users)

When a user doesn't exist, Threads redirects to a login page with the
title "Threads • Log in". Valid user profiles have titles like
"Username (@username) • Threads, Say more".

Tested with:
- Invalid user (impossibleuser12345): Correctly not found
- Valid user (zuck): Correctly found

This fixes the false positive issue where non-existent Threads profiles
were being reported as found.
2025-10-04 17:22:50 +05:30
shreyasNaik0101 3079e7a218 fix(ci): Use merge-base for correct target validation 2025-10-04 15:25:30 +05:30
Abhyuday K Hegde 5cd769c2f4
Remediate False Positives for CyberDefenders 2025-10-04 15:12:20 +05:30
Abhyuday K Hegde 977ad5c1a4
Remediate False Positive for SlideShare 2025-10-04 14:48:37 +05:30
Abhyuday K Hegde 57a0ccef38
Remediate False Positive for Roblox 2025-10-04 14:30:40 +05:30
dollaransh17 94c013886a fix(sites): Remove BoardGameGeek due to incompatible detection
BoardGameGeek cannot be reliably detected with Sherlock's current capabilities:

- Original HTML detection: Returns false positives
- API endpoint approach: The API returns status 200 for both valid and invalid users
  - Invalid user: Returns exactly '[]'
  - Valid user: Returns JSON containing '[]' substrings (e.g., "adminBadges":[])

Since Sherlock's 'message' errorType uses substring matching, it incorrectly
identifies valid users as "not found" when checking for '[]' in the response.

The site's API response format is fundamentally incompatible with Sherlock's
detection methods (message/status_code/response_url), so removal is the only
viable solution to prevent false positives and false negatives.

Addresses false positive issue originally reported in testing.
2025-10-04 11:33:27 +05:30
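The substring problem described in the commit above can be reproduced in a few lines. The response bodies below are made up to match the shapes the commit describes (exactly '[]' for an invalid user, JSON containing empty arrays for a valid one). The exact-match variant is shown only for contrast; the commit does not mention Sherlock supporting it.

```python
import json

# Made-up bodies matching the shapes described in the commit message.
invalid_body = "[]"
valid_body = json.dumps({"username": "blue", "adminBadges": []})


def substring_says_missing(body: str, marker: str = "[]") -> bool:
    """Substring matching, as the commit describes for the 'message' errorType."""
    return marker in body


def exact_says_missing(body: str) -> bool:
    """Hypothetical exact comparison that would separate the two cases."""
    return body.strip() == "[]"


# The valid user's JSON also contains '[]', so substring matching misfires.
print(substring_says_missing(invalid_body), substring_says_missing(valid_body))
print(exact_says_missing(invalid_body), exact_says_missing(valid_body))
```

Substring matching flags both bodies as "not found", which is the false negative the commit reports for valid users.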
dollaransh17 c5e209d78e fix(sites): Implement BoardGameGeek API detection as suggested
Using the API endpoint suggested by akh7177:
https://api.geekdo.com/api/users?username={}

However, there's an edge case where valid users contain empty arrays
in their JSON response (adminBadges[], userMicrobadges[], supportYears[])
which causes Sherlock's substring matching to incorrectly flag them
as 'not found' when looking for the '[]' error pattern.

The API correctly returns:
- Valid user: JSON object with user data (but contains [] substrings)
- Invalid user: Exactly '[]' (2 characters total)

This needs further refinement to distinguish between the exact '[]'
response vs JSON containing '[]' substrings.
2025-10-04 11:23:55 +05:30
dollaransh17 3e653c46b0 fix(sites): Remove BoardGameGeek - unreliable detection
BoardGameGeek returns identical pages for both existing and non-existing
users, making reliable username detection impossible with HTTP-based
methods. The site likely uses JavaScript to load user-specific content
dynamically.
2025-10-04 03:12:47 +05:30
dollaransh17 91f3b16993 fix(sites): Update BoardGameGeek URL structure and detection method
BoardGameGeek changed from /user/{} to /profile/{} URL structure.
Also updated from message to status_code detection as the site
no longer returns clear error messages for non-existent users.
2025-10-04 02:55:57 +05:30
obiwan04kanobi 0f3df0f4da **PR description:**
This PR adds AWS Skills Profile to Sherlock’s supported sites in data.json. The configuration uses a unique substring (`shareProfileAccepted":false`) for reliable detection of non-existent usernames, addressing the challenge of JavaScript-rendered error messages.
- Site details and detection logic follow Sherlock’s contributing guidelines and Code of Conduct.
- No changes to core logic; only a new site entry.
- Reviewed for schema compliance and duplicate key cleanup as noted.
2025-10-03 13:46:53 +05:30
dollaransh17 0e7219b191 Security Fix: Add timeout parameters to HTTP requests
This fix addresses a critical security vulnerability where HTTP requests
could hang indefinitely, potentially causing denial of service.

Changes:
- Added 10-second timeout to version check API call
- Added 10-second timeout to GitHub pull request API call
- Added 30-second timeout to data file downloads (larger timeout for data)
- Added 10-second timeout to exclusions list download

Impact:
- Prevents infinite hangs that could freeze the application
- Improves user experience with predictable response times
- Fixes security issue flagged by Bandit static analysis (B113)
- Makes the application more robust in poor network conditions

The timeouts are conservative enough to work with slow connections
while preventing indefinite blocking that could be exploited.
2025-10-03 13:41:43 +05:30
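A minimal sketch of the per-request timeouts listed in the commit above. It uses the standard library's `urllib.request.urlopen` so that it stays self-contained; the `TIMEOUTS` table, its keys, and the `fetch` helper are hypothetical names chosen for the example and are not taken from the Sherlock code base. Only the 10 and 30 second values come from the commit message.

```python
import urllib.request

# Timeout values (seconds) as listed in the commit; the keys are illustrative.
TIMEOUTS = {
    "version_check": 10,
    "pull_request": 10,
    "data_file": 30,   # larger budget for the bigger data download
    "exclusions": 10,
}


def fetch(url: str, kind: str, opener=urllib.request.urlopen) -> bytes:
    """Fetch a URL with an explicit timeout so the call cannot hang forever.

    `opener` is injectable so the function can be exercised without a network.
    """
    with opener(url, timeout=TIMEOUTS[kind]) as response:
        return response.read()
```

Without the `timeout` argument, a stalled connection can block indefinitely, which is the condition Bandit's B113 check flags for HTTP calls made without a timeout.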
Paul Pfeister 1d2c4b134f
Merge pull request #2570 from shreyasNaik0101/fix/remediate-applediscussions
fix(sites): Remediate false positive for Apple Discussions
2025-10-02 20:30:57 -04:00
shreyasNaik0101 b245c462c9 fix(sites): Remediate false positive for Apple Discussions 2025-10-03 05:56:52 +05:30
shreyasNaik0101 876e58b159 fix(sites): Remediate false positive for Blitz Tactics 2025-10-03 05:45:43 +05:30
Paul Pfeister 66d9733da7
Merge pull request #2565 from shreyasNaik0101/fix/remediate-mydramalist
fix(sites): Remediate false positive for Mydramalist
2025-10-02 19:40:47 -04:00
Paul Pfeister c55deab3a2
Merge pull request #2561 from shreyasNaik0101/fix/remediate-deviantart
fix(sites): Remediate false positive for DeviantArt
2025-10-02 19:37:00 -04:00
Paul Pfeister edcb697793
Merge pull request #2564 from shreyasNaik0101/fix/remediate-allmylinks
fix(sites): Remediate false positive for AllMyLinks
2025-10-02 19:36:43 -04:00
shreyasNaik0101 d314d75db1 fix(sites): Remediate false positive for Mydramalist 2025-10-03 04:43:05 +05:30
shreyasNaik0101 c89a52caf7 fix(sites): Remediate false positive for AllMyLinks 2025-10-03 04:25:46 +05:30
Paul Pfeister 9c18cfe273
Merge pull request #2563 from sherlock-project/chore/update-co
chore: update code owners
2025-10-02 18:25:59 -04:00
shreyasNaik0101 779d4c33f4 fix: Remove username_unclaimed as requested 2025-10-03 03:55:03 +05:30
Paul Pfeister 072c24687b
Merge pull request #2558 from hanjm-github/master
feat: Add some popular website in Korea
2025-10-02 18:22:42 -04:00
Paul Pfeister b811b2bd47
chore: update code owners 2025-10-02 18:21:20 -04:00
shreyasNaik0101 355bfbd328 fix(sites): Remediate false positive for DeviantArt 2025-10-03 00:42:07 +05:30
JongMyeong HAN 7b3632bdad
Add comment to site 'namuwiki'
Co-authored-by: Paul Pfeister <code@pfeister.dev>
2025-10-03 04:00:41 +09:00
Ethan Zhang 4fe41f09ff Removed duplicate Bluesky entry in data.json 2025-10-02 12:42:47 +10:00
JongMyeong HAN cd7c52e4fa
Feat: Add tistory 2025-10-01 00:44:55 +09:00
JongMyeong HAN 86140af50e
feat: Add SOOP 2025-10-01 00:44:02 +09:00
JongMyeong HAN e5cd5e5bfe
feat: Add namuwiki 2025-10-01 00:43:21 +09:00
JongMyeong HAN dc89f1cd27
feat: Add dcinside 2025-10-01 00:41:23 +09:00
Paul Pfeister 388a1e06d4
Merge pull request #2459 from kareemeldahshoury/Issue#2442
Fix Issue #2442: Added support for Aparat
2025-09-20 20:47:37 -04:00
Paul Pfeister 61eeeb7876
Merge branch 'master' into Issue#2442 2025-09-20 20:45:09 -04:00
Paul Pfeister df7da4288c
fix(ci): scoping 2025-09-20 20:44:38 -04:00
Paul Pfeister 70896f1da4
Merge branch 'master' into Issue#2442 2025-09-20 20:26:14 -04:00
Paul Pfeister 0a38cad926
fix(ci): issue write permission 2025-09-20 20:24:41 -04:00
Paul Pfeister 1e38fb6f7b
Merge branch 'master' into Issue#2442 2025-09-20 20:21:48 -04:00
Paul Pfeister 9b3dc3e581
fix(ci): issue write permission 2025-09-20 20:21:28 -04:00
Paul Pfeister 37b30602fd
Merge branch 'master' into Issue#2442 2025-09-20 20:12:21 -04:00
Paul Pfeister 7afdee4c58
fix: incorrect method 2025-09-20 20:09:44 -04:00
Paul Pfeister d4d8e01e31
chore: remove dead site
Fixes: #2433
2025-09-20 19:45:34 -04:00
Paul Pfeister e5e0da00fe
Merge pull request #2549 from sherlock-project/add/instapaper
feat: add instapaper
2025-09-20 18:13:30 -04:00
Paul Pfeister dc61cdc7a4
chore: set request method 2025-09-20 18:10:33 -04:00
Paul Pfeister 0fa2e1afc7
chore: cleanup everything 2025-09-20 18:09:44 -04:00
Paul Pfeister 7ca90ba728
ci: test result summarization 2025-09-20 18:06:25 -04:00
Paul Pfeister cd6fa5bb30
ci: fix the thing 2025-09-20 18:04:42 -04:00
Paul Pfeister fa05641661
ci: improve validation 2025-09-20 17:43:00 -04:00
Paul Pfeister 97ba4e8616
fix(ci): validation issue 2025-09-20 15:39:01 -04:00
Paul Pfeister 9882478fb5
feat: add instapaper 2025-09-20 15:05:44 -04:00
Paul Pfeister 9f5b7e1846
fix(validation ci): parsing and presentation 2025-09-20 15:02:43 -04:00
Paul Pfeister 05afac7082
Merge pull request #2548 from sherlock-project/feature/automatic-testing
Automatically test modified targets upon PR
2025-09-20 14:47:38 -04:00
Paul Pfeister ae362b0f02
ci: automatically validate modified targets on pr 2025-09-20 14:44:19 -04:00
Paul Pfeister 435540606e
chore: add typedef 2025-09-20 13:49:29 -04:00
Paul Pfeister 96aa12c140
Merge pull request #2546 from rezocrypt/add-laracast-support
Added Laracast support
2025-09-20 13:38:21 -04:00
My Name 9560355a7c Added Laracast support 2025-09-18 10:23:09 +04:00
Paul Pfeister b44ac231c1
chore: move SSOT to pyproject.toml
Co-authored-by: ByteXenon <125568681+ByteXenon@users.noreply.github.com>
2025-09-17 17:47:45 -04:00
Paul Pfeister 7ff3924f0b
ci(exclusions): ensure unstaging and removal of tmp 2025-09-17 17:17:49 -04:00
Paul Pfeister 39c3729524
ci(exclusions): fix loss of untracked list 2025-09-17 14:09:15 -04:00
Paul Pfeister faddcbd15f
ci(exclusions): fix loss of untracked list 2025-09-17 14:03:51 -04:00
Paul Pfeister 78a2d309d1
ci(exclusions): fix loss of untracked list 2025-09-17 13:55:42 -04:00
Paul Pfeister 35940e7584
fix: ignore exclusions list on parameterization for false positive tests 2025-09-17 13:44:02 -04:00
Paul Pfeister 524415b5d5
chore: bump to 0.16.0 2025-09-15 22:03:23 -04:00
Paul Pfeister 8882310450
feat: honor automatic exclusions list 2025-09-15 21:56:54 -04:00
Paul Pfeister 6d15f1319e
ci: fix exclusions updater (again) 2025-09-15 21:29:20 -04:00
Paul Pfeister 69d3308c71
ci: fix exclusions updater 2025-09-15 21:24:10 -04:00
Paul Pfeister 5c57b20936
ci: fix exclusions updater 2025-09-15 21:17:09 -04:00
Paul Pfeister e09319f29f
Merge pull request #2536 from sherlock-project/feature/username_fuzz
Return support for F+/F- detection via fuzzing
2025-09-15 21:05:35 -04:00
Paul Pfeister b15242881e
ci: automatically update exclusions list 2025-09-15 21:03:17 -04:00
Paul Pfeister e02507e5a1
test: set upper bound on f+ fuzz 2025-09-15 20:31:26 -04:00
Paul Pfeister 284662e156
Merge pull request #2501 from Davis-3450/add/platzi
add: platzi.com
2025-09-14 17:24:49 -04:00
Davis 1b9f823cef
Merge branch 'master' into add/platzi 2025-09-14 16:12:09 -05:00
Moshi f0f37d841c bugfix: update platzi
- "username_claimed" is now set to "freddier" (the most popular user, just in case)
- error code and request method are now explicit.
- added trailing slash for consistency
2025-09-14 16:03:32 -05:00
Pierre-Yves Lapersonne 58b20db9f1
feat: add `outgress.com` in supported web sites (#2426)
Closes #2426

Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info>
2025-09-14 02:21:33 -04:00
Pierre-Yves Lapersonne a98a113a4b
feat: add `opencollective.com` in supported web sites (#2430)
Closes #2430

Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info>
2025-09-14 02:21:32 -04:00
Pierre-Yves Lapersonne 164d01d163
feat: add `linuxfr.org` in supported web sites (#2427)
Closes #2427

Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info>
2025-09-14 02:21:32 -04:00
Pierre-Yves Lapersonne ddd94474b8
feat: add `pixelfed.social` in supported web sites (#2425)
Closes #2425

Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info>
2025-09-14 02:21:32 -04:00
Pierre-Yves Lapersonne 541b023b7f
feat: add `mamot.fr` in supported web sites (#2424)
Closes #2424

Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info>
2025-09-14 02:21:32 -04:00
Pierre-Yves Lapersonne 9b502d9245
feat: new targets (9), minor cleanup
Closes #2421 (added support for site)
Closes #2422 (added support for site)
Closes #2423 (added support for site)
Closes #2424 (added support for site)
Closes #2425 (added support for site)
Closes #2426 (added support for site)
Closes #2427 (added support for site)
Closes #2429 (added support for site)
Closes #2430 (added support for site)

Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info>
Singed-off-by: Paul Pfeister <code@pfeister.dev>
2025-09-14 02:16:16 -04:00
Pierre-Yves Lapersonne b9c352fb7c
style: clean file by removing useless whitespace
Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info>
2025-09-14 02:14:53 -04:00
Pierre-Yves Lapersonne 48ef668e1e
feat: add `write.as` in supported web sites (#2422)
Closes #2422

Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info>
2025-09-14 02:14:53 -04:00
Pierre-Yves Lapersonne 481c39ace3
feat: add `speakerdeck.com` in supported web sites
Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info>
2025-09-14 02:14:52 -04:00
Pierre-Yves Lapersonne 6b9305250d
feat: add `framapiaf.org` in supported web sites
Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info>
2025-09-14 02:14:52 -04:00
Pierre-Yves Lapersonne 87bd15f927
style: remove useless empty line
Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info>
2025-09-14 02:14:52 -04:00
Paul Pfeister db23ae933f
Merge pull request #2417 from jasontenpenny/master
Sort Bluesky Alphabetically
2025-09-14 01:42:14 -04:00
Paul Pfeister ad76b3685f
chore: simplify test names 2025-09-14 01:39:37 -04:00
Paul Pfeister 34cb23bc6e
test: itemize f+/f- 2025-09-14 01:36:21 -04:00
Paul Pfeister 702bfee988
chore: deprecate 3.8, 3.9 2025-09-14 01:10:52 -04:00
Paul Pfeister dfe8b1599d
test: prepare false negative detection base 2025-09-14 00:57:55 -04:00
Paul Pfeister ca094d8264
test: prepare false positive detection base 2025-09-14 00:39:35 -04:00
Paul Pfeister 5113dcfb36
Merge pull request #2493 from MR-VL/master
Fix Issue #2492 and #2494 | Added support for Pychess | Remove TorrentGalaxy
2025-09-13 21:21:00 -04:00
MR-VL d3f4c65459
remove trailing comma for cashapp breaking TOX 2025-09-13 18:15:23 -05:00
MR-VL 2504f238e5
Merge branch 'master' into master 2025-09-13 17:39:57 -05:00
Paul Pfeister 9646055560
fix(manifest): schema non-compliance 2025-09-13 18:30:08 -04:00
Paul Pfeister 80d4abae34
Merge pull request #2446 from S1lvus/master
Add support for CashApp
2025-09-13 18:25:55 -04:00
Paul Pfeister 19ae05d68a
Merge pull request #2460 from rskbansal/fix_minecraft
Fixed Minecraft
2025-09-13 18:08:53 -04:00
Paul Pfeister 5c62b2ab1b
Merge pull request #2483 from MaxwellOldshein/feat/add-playstrategy-support
Fix Issue #2475: Added support for Playstrategy
2025-09-13 18:06:26 -04:00
Paul Pfeister 6cc4d9e0c7
Merge branch 'master' into feat/add-playstrategy-support 2025-09-13 18:04:18 -04:00
Paul Pfeister 1ddfc08d7d
Merge pull request #2484 from MaxwellOldshein/feat/add-blitz-tactics-support
Fix Issue #2474: Added support for Blitz Tactics
2025-09-13 18:02:12 -04:00
Paul Pfeister cca68bb9ab
Merge pull request #2513 from akamayu-ouo/site/plurk
Add support for Plurk
2025-09-13 17:49:31 -04:00
Paul Pfeister d6db0f7d79
Merge #2516 (Tumblr) 2025-09-13 17:48:19 -04:00
Craig London d60562130c readmefixes: Fix typo 2025-09-13 17:27:25 -04:00
Craig London aa1945b017 readmefixes: HTML fixes 2025-09-13 17:27:25 -04:00
[Tulsi Shetty] dafcaec192 feat: Tumblr added 2025-08-16 18:42:19 +05:30
akamayu-ouo 3c9eda75e9
Add Plurk 2025-08-09 14:08:56 +00:00
Moshi 8635d68864 add: platzi.com 2025-07-16 14:48:16 -05:00
MR-VL 6e7b3cecb8 Syntax Fixes in README.md
remove unmatched closing </a> tag, and fix indent
add quotes to <p align=center> LN:1 to make it valid HTML
fix type from programmaticaly to programmatically
2025-07-06 18:02:24 -05:00
MR-VL 1e12c3f7a6 Remove TorrentGalaxy 2025-07-06 16:35:31 -05:00
MR-VL 9e40e0a0f4 Add support for Pychess 2025-07-06 16:27:59 -05:00
manjushsh 4706323976 data: add hive blog 2025-06-27 20:05:01 +05:30
manjushsh 4721c7f553 data: Add code sandboxio 2025-06-27 19:42:23 +05:30
Maxwell Oldshein c82c00650a Add Blitz Tactics support 2025-06-26 15:26:57 -04:00
Maxwell Oldshein 9e54e68da5 Add Playstrategy support 2025-06-26 15:10:44 -04:00
Matheus Felipe 4423230c11
Fix stray character typo in README (#2462) 2025-05-06 06:55:10 -03:00
Kotaro Fujii a04fbe6ccc Fix stray character typo in README 2025-05-06 18:29:52 +09:00
Rhythm Bansal f599ae5ff1
fixed minecraft 2025-04-30 02:23:06 +05:30
kareemeldahshoury de81f38622 Fix Issue #2442: Added support for Aparat 2025-04-29 15:25:31 -05:00
Siddharth Dushantha a40944d336
Merge pull request #2435 from apify/actorize
Apify will sponsor your project: Sherlock Actor on Apify infrastructure
2025-04-23 09:48:52 +02:00
S1lvus e0f184f263
Removed extra spaces 2025-04-07 20:31:17 -04:00
S1lvus 6c1623a3ad
Added CashApp into the site list.
This adds username search for the CashApp financial platform.
2025-04-07 20:28:28 -04:00
Matheus Felipe 4428b15162
Add star history section in README.md (#2438) 2025-03-20 21:53:09 -03:00
Matheus Felipe 2adc96833a
docs: add star history section in README.md
This is to have the section listed in the GitHub index
2025-03-20 21:50:08 -03:00
Adam Kliment b7ce20b2ca feat: Actor definition, Actor usage to README 2025-03-20 11:54:57 +01:00
Jason Tenpenny 5e3828882e
Sort Bluesky Alphabetically
moved the Bluesky config to its proper alphabetical location so it can be found easier
2025-03-02 22:40:32 -06:00
Paul Pfeister 78cba6b7ca
chore: add info regarding experienced problem 2025-02-17 01:07:07 -05:00
Paul Pfeister 9be92b9834
Merge pull request #2404 from MR-VL/master
Remove ask.fm + fix instagram  middleware
2025-02-17 00:13:21 -05:00
Paul Pfeister 53cbd332ca
chore: cleanup dead targets 2025-02-17 00:07:18 -05:00
Paul Pfeister 2ff2836159
chore: document unusual behavior
Target will fail to provide a response for unclaimed queries rather than an error
2025-02-16 23:54:29 -05:00
Paul Pfeister 0d008b109e
fix: v0 f+ 2025-02-16 23:33:04 -05:00
MR-VL a29faa8288
Fix instagram 2025-02-04 21:37:13 -06:00
MR-VL 809f8ba6c4
Remove askfm 2025-02-04 21:26:04 -06:00
Paul Pfeister 1912cbdea4
fix: manifest encoding error on Windows 2025-02-03 03:38:17 -05:00
Paul Pfeister b1fb7ac2ff
chore: add PR note to --json help 2025-02-03 03:02:22 -05:00
Paul Pfeister b5726e5edf
Merge pull request #2286 from sk337/patch-1
add v0.dev to data.json
2025-02-03 03:01:06 -05:00
Paul Pfeister 9eb100c819
Merge branch 'master' into patch-1 2025-02-03 03:00:53 -05:00
Paul Pfeister 86387d0baf
fix: validation error message 2025-02-03 03:00:04 -05:00
Paul Pfeister c6f9e2eac9
fix: validation method typo 2025-02-03 02:54:12 -05:00
Paul Pfeister 73df548532
fix: twitter f+ 2025-02-03 02:53:35 -05:00
Paul Pfeister ae87699824
fix: fiverr/babyru f+ 2025-02-03 02:51:22 -05:00
Paul Pfeister 8568ef7d99
fix: twitch f+ 2025-02-03 02:37:06 -05:00
Paul Pfeister c6f7a99b1c
fix: tldr legal f+ 2025-02-03 02:01:14 -05:00
Paul Pfeister b5bd536e6b
fix: slideshare f+ 2025-02-03 01:55:46 -05:00
Paul Pfeister d029af3e89
fix: shpock f+ 2025-02-03 01:51:54 -05:00
Paul Pfeister af2bb98901
fix: producthunt f+
Co-authored-by: Regan Bell <reganbell@gmail.com>
2025-02-03 00:51:04 -05:00
Paul Pfeister 68c4edf8b6
fix: giphy f+ 2025-02-03 00:40:05 -05:00
Paul Pfeister 300d6eda21
fix: airpilot f+
Seems to be defunct.
2025-02-03 00:35:58 -05:00
Paul Pfeister 33b567d453
fix: 8tracks f+ 2025-02-03 00:35:12 -05:00
Paul Pfeister c779d21c13
Merge pull request #2292 from Pasanlaksitha/master
Add Hugging Face data.json
2025-02-03 00:26:17 -05:00
Paul Pfeister d818c5ebf2
Merge pull request #2312 from SOGeKING-NUL/master
Adding DigitalSpy
2025-02-03 00:22:26 -05:00
Paul Pfeister 072b581f98
Merge pull request #2394 from ibnaleem/bluesky
Add Bluesky Support
2025-02-02 23:48:46 -05:00
Paul Pfeister 2de353d8d6
fix: over-restrictive fail cond 2025-02-02 23:45:27 -05:00
Paul Pfeister ca2f19ae52
fix: EOF 2025-02-02 23:43:50 -05:00
Paul Pfeister b8bdfd8601
Merge pull request #2381 from brantonb/master
Add support for omg.lol
2025-02-02 23:42:42 -05:00
Paul Pfeister a985a0891e
Merge pull request #2383 from brantonb/strava
Fix Strava false positives
2025-02-02 23:38:30 -05:00
Paul Pfeister a688e268b3
Merge pull request #2393 from joeyagreco/joey/fix-pypi-url
Fix PyPi URL
2025-02-02 23:32:43 -05:00
Paul Pfeister 3a7384e5f1
fix: pypi user-friendly url 2025-02-02 23:27:57 -05:00
ibnaleem ca17c39172
add bsky.app 2025-01-20 15:02:43 +00:00
joeyagreco 55f0628c2b fixed pypi url 2025-01-19 14:39:45 -05:00
joeyagreco 276167be9c Revert "fixed pypi url"
This reverts commit d87f4f2b60.
2025-01-19 14:37:36 -05:00
joeyagreco d87f4f2b60 fixed pypi url 2025-01-19 14:35:08 -05:00
Branton Boehm 1684fbf866 Fix Strava false positives 2024-12-25 18:21:44 -08:00
Branton Boehm c0c5d829e2 Add support for omg.lol 2024-12-25 12:34:41 -08:00
Utsav Jana 0a0e4fe606 add the regexCheck for DigitalSpy 2024-12-06 19:16:53 +05:30
Utsav Jana 979f17cf3b add the regexCheck for DigitalSpy 2024-12-06 19:14:39 +05:30
Utsav Jana fe6e2e57c3 Adding DigitalSpy 2024-12-06 19:13:25 +05:30
Paul Pfeister 2c303a2869
fix: WAF hit list 2024-11-13 16:53:59 -05:00
Paul Pfeister 0f395d037b
fix: F+s 2024-11-13 16:53:40 -05:00
Paul Pfeister 839eab1384
chore: add cloudfront waf hit 2024-11-11 22:25:47 -05:00
Paul Pfeister 98fbd525ee
Fix #2355
(Regression introduced by #1520)
2024-11-11 20:33:27 -05:00
Paul Pfeister 046c2957f3
chore: expand WAF hit list 2024-11-11 20:05:20 -05:00
Paul Pfeister 18bae485ae
Merge pull request #2287 from ntexe/master
fix #2242
2024-11-11 20:03:17 -05:00
Paul Pfeister 46023a86b6
Merge pull request #2285 from rsb-23/master
Fixed false positives #2273
2024-11-11 19:48:37 -05:00
Paul Pfeister 6f3b89c98a
Merge branch 'master' into master 2024-11-11 19:45:32 -05:00
Paul Pfeister 0b7d925b50
fix: F-/F+ Generic 2024-11-11 19:44:14 -05:00
Paul Pfeister 785346c12d
Merge pull request #2277 from sherlock-project/2275-PEP-561
Comply with PEP 561
2024-11-11 17:19:19 -05:00
Paul Pfeister a998ec309c
fix: missing Optional typing import 2024-11-11 17:16:31 -05:00
Paul Pfeister 557394dc56
Merge pull request #2278 from sherlock-project/adjust-readme
docs: update readme for fedora, parrot, 24.04
2024-11-11 17:10:12 -05:00
Paul Pfeister 5990cf1e8e
docs: cleanup install ref 2024-11-11 17:06:46 -05:00
Paul Pfeister cf393b8fec
Merge pull request #2339 from bytexenon/archive_org_fix
fix: add additional error message check for archive.org downtime
2024-11-11 16:55:50 -05:00
Paul Pfeister 662d80e1a6
Merge pull request #2356 from bytexenon/pr-option
Overload `--json` to fetch via PR number
2024-11-11 16:47:07 -05:00
ByteXenon 270fbf6473 Overload `--json` to accept pull request data and remove `--pull-request` parameter 2024-11-06 00:26:14 -07:00
Paul Pfeister 06b062c122
Update test to use still-present target 2024-11-04 20:49:22 -05:00
Paul Pfeister 6fa603981d
Remove bodybuilding[.]com forum
Forum defunct. Not added to removed list as the site will no longer
exist.
2024-11-04 20:46:03 -05:00
Paul Pfeister 8f5d601758
Merge pull request #2267 from sherlock-project/2266-deprecate-support-for-python-38
Deprecate Python 3.8
2024-11-04 20:33:42 -05:00
Paul Pfeister 08aad5a755
Merge pull request #2357 from bytexenon/version-issue-templates
Add "Package version" field to issue templates
2024-11-04 18:30:39 -05:00
ByteXenon 3ffb514f71 Make 'version' field required in Bug Report template 2024-11-04 04:49:57 -07:00
ByteXenon 24f64b3e32 Add version number field to Bug Report issue template 2024-11-04 04:46:59 -07:00
ByteXenon e84c5fce37 Add `--pull-request` [`-pr`] parameter 2024-11-04 02:22:05 -07:00
Paul Pfeister e94e00af53
Revert "Merge pull request #2340 from mikebgrep/master"
This reverts commit 185478cf8e, reversing
changes made to 3804fd9a91.

Some patterns seem to be incorrect
2024-11-01 20:33:39 -04:00
Paul Pfeister 185478cf8e
Merge pull request #2340 from mikebgrep/master
Fix Invalid usernames for number of pages
2024-11-01 20:24:27 -04:00
Paul Pfeister 98d8120ccd
Merge branch 'master' into master 2024-11-01 20:21:31 -04:00
Paul Pfeister 3804fd9a91
Merge pull request #2335 from rskbansal/master
Fixed Twitter(X)
2024-11-01 19:57:37 -04:00
Paul Pfeister bd46baa639
fix: 8tracks
Use username availability endpoint from regflow with predictable
response language (en-us).

Referenced by #2318
Fixes #2332
Closes #2333 (removes target rather than fixes)
2024-11-01 19:44:48 -04:00
Paul Pfeister c64e795447
Merge pull request #2291 from Suramyavns/master
fixed speedrun site support #2288
2024-11-01 05:02:35 -04:00
Paul Pfeister 0e5769154c
Merge pull request #2323 from Nuung/master
Add support for velog
2024-11-01 04:50:06 -04:00
Paul Pfeister d4b57510f1
Merge pull request #2328 from yuzicodes/add-rarible
Add site rarible.com
2024-11-01 04:42:54 -04:00
Aalim Sheikh b06fb4e425
Update sherlock_project/resources/data.json
Co-authored-by: Paul Pfeister <code@pfeister.dev>
2024-11-01 14:09:04 +05:30
Paul Pfeister 1c2e99a5b3
Merge pull request #2325 from anujatappeta/corrected-function-calling
commiting to correct function call
2024-11-01 04:33:58 -04:00
Paul Pfeister 43e543acae
Merge pull request #2326 from alokranjan609/librarything-detection-fix
Change the errorMsg for Librarything
2024-11-01 04:29:58 -04:00
Paul Pfeister 3f1f2534a3
Update tests/sherlock_interactives.py 2024-11-01 04:29:23 -04:00
Paul Pfeister 821062bb81
Merge pull request #2310 from nktkhndlwl/vlr-support
add VLR.gg support
2024-11-01 04:05:39 -04:00
Paul Pfeister 7cd9f2acb0
Merge pull request #2283 from gtkacz/2248-exophase_support
Adding support for exophase via `data.json` for #2248
2024-11-01 04:00:47 -04:00
Paul Pfeister 7b7a0d2c8e
Merge pull request #2316 from MR-VL/master
Add support for Atcoder
2024-11-01 03:39:28 -04:00
Paul Pfeister f50d0e6c41
Merge pull request #2349 from NOMADE55/add-topcoder-bgg-small-ecommerce
Add BoardGameGeek and Small Ecommerce Platforms
2024-11-01 03:22:33 -04:00
Lucas Terracino bbe9e93164 Add support for Tiendanube 2024-10-22 14:38:09 -03:00
Lucas Terracino beb57d2e49 Add support for Empretienda 2024-10-22 14:38:01 -03:00
Lucas Terracino a03aa3157f Add support for BoardGameGeek 2024-10-22 14:28:14 -03:00
Lucas Terracino 4deba5f147 Add support for Topcoder 2024-10-22 14:11:51 -03:00
Bytexenon af4c08a08b
fix: remove "Other" from archive.org downtime message check 2024-10-20 07:03:08 -07:00
mikebgrep deb1936027 add regexCheck for the pages that does not have related to the issue with long not valid username 2024-10-20 16:12:37 +03:00
Bytexenon fb52343aa3
fix: add additional error message check for archive.org downtime 2024-10-20 05:27:10 -07:00
Rhythm Bansal fdf3655e63
fixed `urlProbe` for Twitter by adding an alternative endpoint 2024-10-17 22:18:52 +05:30
Aalim Sheikh d83e7c1652
Add site rarible.com 2024-10-11 20:35:29 +05:30
Alok 8e0c7eff17 Change the errorMsg for Librarything 2024-10-11 11:02:57 +05:30
anuja b7406919dc commiting to correct function call 2024-10-11 07:53:59 +05:30
nktkhndlwl 656abbbbf8 fix errorType 2024-10-10 20:46:54 +05:30
Nuung ef751d34f2 modify: velog's username_claimed value 2024-10-09 19:11:51 +09:00
Nuung 4ef9e6b0de modify: I forgot to adding @ 2024-10-09 18:52:36 +09:00
Nuung ecd59455b0 Add support for velog 2024-10-09 18:31:06 +09:00
MR-VL 15e6924338
Add support for Atcoder 2024-10-05 21:47:45 -05:00
Utsav Jana ad86a8b954 Adding DigitalSpy 2024-10-02 22:25:26 +05:30
Niket Khandelwal 61fdb6e206 add VLR.gg support 2024-10-02 00:50:33 +05:30
Pallavi Kathait 193de54b6d
Update site-list.py
These changes improve readability and maintain the functionality of the original code.
2024-09-29 21:31:19 +05:30
Pasan Laksitha b6c33d2901 Add Hugging Face platform details to data.json for Sherlock project. 2024-09-11 10:16:25 +05:30
suramyavns b65b03fe63 removed duplication 2024-09-11 07:49:29 +05:30
suramyavns 5193ab8a97 fixed speedrun site support 2024-09-10 14:18:46 +05:30
sk337 84965712f6
Update data.json 2024-09-05 13:09:41 -07:00
sk337 5f0d55bcfa
fix requested changes 2024-09-05 13:07:13 -07:00
ntexe 277d19816e fix #2242 2024-09-05 16:03:08 +03:00
sk337 a7b370bc3d
add unclaimed username to v0.dev 2024-09-04 13:11:30 -07:00
sk337 efd765eba7
Update data.json 2024-09-04 13:06:22 -07:00
rsb-23 192e2c333e Fixed false positives #2273
- Updated user-agent in header and removed duplicate
-
2024-09-03 21:04:10 +05:30
gtkacz 89b4cec3cb Adding support for exophase via `data.json` for #2248 2024-09-02 15:16:57 -03:00
Paul Pfeister 4660afb7d8
Fix implicit optional (PEP 484)
Co-authored-by: GuardianWang <31812793+GuardianWang@users.noreply.github.com>
2024-08-30 01:21:08 -04:00
Paul Pfeister e9eb7d32ce
docs: update readme for fedora, parrot, 24.04 2024-08-28 00:25:55 -04:00
ntexe f7075e1b64 Restore Fanpop
Cleaned up commit for sherlock-project/sherlock#2269
2024-08-27 23:08:40 -04:00
Paul Pfeister f32fdaa93a
Merge pull request #2272 from PeterDaveHello/patch-1
Remove not needed `apt-get update` in Dockerfile
2024-08-27 22:51:59 -04:00
Paul Pfeister 1c8e3f8142
Merge branch 'add-kaskus' 2024-08-27 22:45:01 -04:00
Paul Pfeister 298161114b
fix: indentation 2024-08-27 22:44:50 -04:00
Paul Pfeister 0d0335bca0
Comply with PEP 561 2024-08-27 22:32:48 -04:00
paperbenni 1e2e380876 add site Gitea.com 2024-08-27 19:37:35 +02:00
L0mbart bceb625984
Update data.json
add kaskus site
2024-08-27 13:17:00 +07:00
Peter Dave Hello a5dda7ae91
Remove not needed `apt-get update` in Dockerfile
There's no need to wait and waste the time and bandwidth to wait for `apt-get update` for `pip3 install` ;)
2024-08-27 02:37:23 +08:00
Paul Pfeister 9e111a334b
Merge pull request #2259 from Txbias/codeforce-support
Remove Codeforces from removed-sites.md
2024-08-23 01:37:43 -04:00
Paul Pfeister 74a3576132
Merge remote-tracking branch 'ntexe/master' 2024-08-23 01:31:41 -04:00
Paul Pfeister 0646063509
fix: file not found 2024-08-23 01:21:49 -04:00
Paul Pfeister c6c1f3eef7
Deprecate Python 3.8
Python 3.8 is nearing EOL, and it's being deprecated here to allow for
more ready dependency resolution between pandas and numpy, avoiding a
fatal import. Resolves #2266.
2024-08-23 01:15:47 -04:00
Siddharth Dushantha 47ab466d85
Merge pull request #2265 from MR-VL/master
remove ICQ.com
2024-08-22 16:04:01 -04:00
MR-VL 378967c2a5
remove ICQ 2024-08-21 15:10:04 -05:00
ntexe 2cc854bd6b You can now disable creation of a txt file 2024-08-21 14:01:22 +03:00
Txbias 4d83f057ac Removed Codeforces from the removed-sites.md file 2024-08-15 16:27:55 +02:00
Siddharth Dushantha 573ae6c488
Merge pull request #2254 from sherlock-project/fix-sync-json-data
updated sherlock path
2024-08-11 06:09:43 -04:00
Siddharth Dushantha fce4347a3c updated sherlock path 2024-08-11 12:07:39 +02:00
Siddharth Dushantha 7b2076c113
Merge pull request #2225 from Netail/feat/threads
feat: add Threads
2024-08-11 06:03:00 -04:00
Maikel 7e18e0eb4c
Merge branch 'sherlock-project:master' into feat/threads 2024-08-08 00:11:47 +02:00
Siddharth Dushantha 22100ceed3 fix merge conflict 2024-08-07 17:31:39 +02:00
Siddharth Dushantha 40102be04a
Merge pull request #2239 from sherlock-project/rm-old-dir
removed sherlock dir
2024-08-07 17:28:39 +02:00
Siddharth Dushantha 201ab43631 removed old sherlock dir 2024-08-04 13:41:48 +02:00
Maikel defd1740b8
Merge branch 'sherlock-project:master' into feat/threads 2024-08-02 09:43:58 +02:00
Netail 4544ddc219 fix: use Sec-Fetch-Mode 2024-07-29 21:37:41 +02:00
Paul Pfeister 7e87a88d71
chore: discord check via unauthed reg flow check 2024-07-28 00:14:57 -04:00
Maikel db4bb5ada6
fix: feedback 2024-07-22 19:37:00 +02:00
Paul Pfeister 09b324f7d4
chore: deactivate alik.cz 2024-07-22 00:33:31 -04:00
Paul Pfeister 35773d43da
Accomodate legacy client version checks 2024-07-09 20:15:52 -04:00
Netail eeda506990 feat: add Threads 2024-07-08 20:26:55 +02:00
Paul Pfeister 2016892e64
Remove torrequest dep
Not sure why it's not in my patch file, but I was removing via sed in my spec instead.
2024-06-28 23:39:38 -04:00
Paul Pfeister 44ad8f506a
Lint 2024-06-28 23:38:44 -04:00
Siddharth Dushantha cfa4097df9 removed support for tor 2024-06-26 21:57:11 +02:00
33 changed files with 1902 additions and 3404 deletions

19
.actor/Dockerfile Normal file

@ -0,0 +1,19 @@
FROM sherlock/sherlock as sherlock
# Install Node.js
RUN apt-get update; apt-get install curl gpg -y
RUN mkdir -p /etc/apt/keyrings
RUN curl -fsSL https://deb.nodesource.com/gpgkey/nodesource-repo.gpg.key | gpg --dearmor -o /etc/apt/keyrings/nodesource.gpg
RUN echo "deb [signed-by=/etc/apt/keyrings/nodesource.gpg] https://deb.nodesource.com/node_20.x nodistro main" | tee /etc/apt/sources.list.d/nodesource.list
RUN apt-get update && apt-get install -y curl bash git jq jo xz-utils nodejs
# Install Apify CLI (node.js) for the Actor Runtime
RUN npm -g install apify-cli
# Install Dependencies for the Actor Shell Script
RUN apt-get update && apt-get install -y bash jq jo xz-utils nodejs
# Copy Actor dir with the actorization shell script
COPY .actor/ .actor
ENTRYPOINT [".actor/actor.sh"]

93
.actor/README.md Normal file

@ -0,0 +1,93 @@
# Sherlock Actor on Apify
[![Sherlock Actor](https://apify.com/actor-badge?actor=netmilk/sherlock)](https://apify.com/netmilk/sherlock?fpr=sherlock)
This Actor wraps the [Sherlock Project](https://sherlockproject.xyz/) to provide serverless username reconnaissance across social networks in the cloud. It helps you find usernames across multiple social media platforms without installing and running the tool locally.
## What are Actors?
[Actors](https://docs.apify.com/platform/actors?fpr=sherlock) are serverless microservices running on the [Apify Platform](https://apify.com/?fpr=sherlock). They are based on the [Actor SDK](https://docs.apify.com/sdk/js?fpr=sherlock) and can be found in the [Apify Store](https://apify.com/store?fpr=sherlock). Learn more about Actors in the [Apify Whitepaper](https://whitepaper.actor?fpr=sherlock).
## Usage
### Apify Console
1. Go to the Apify Actor page
2. Click "Run"
3. In the input form, fill in **Username(s)** to search for
4. The Actor will run and store its output in the default dataset
### Apify CLI
```bash
apify call YOUR_USERNAME/sherlock --input='{
"usernames": ["johndoe", "janedoe"]
}'
```
### Using Apify API
```bash
curl --request POST \
--url "https://api.apify.com/v2/acts/YOUR_USERNAME~sherlock/runs" \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_API_TOKEN' \
--data '{
"usernames": ["johndoe", "janedoe"]
}'
```
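The same call can be sketched in Python using only the standard library. This is a minimal sketch, assuming the Actor is started by a POST to `https://api.apify.com/v2/acts/<actor>/runs`; `build_run_request` is a hypothetical helper name, and `YOUR_USERNAME` and `YOUR_API_TOKEN` are placeholders as in the `curl` example. The sketch only builds the request, so nothing is sent until you uncomment the last lines.

```python
import json
import urllib.request


def build_run_request(actor_id: str, token: str, usernames: list[str]) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts an Actor run.

    Hypothetical helper for illustration; the endpoint path is an assumption.
    """
    payload = json.dumps({"usernames": usernames}).encode("utf-8")
    return urllib.request.Request(
        url=f"https://api.apify.com/v2/acts/{actor_id}/runs",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_run_request(
        "YOUR_USERNAME~sherlock", "YOUR_API_TOKEN", ["johndoe", "janedoe"]
    )
    print(request.get_method(), request.full_url)
    # Sending is left commented out so the sketch has no side effects:
    # with urllib.request.urlopen(request) as response:
    #     print(json.load(response))
```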
## Input Parameters
The Actor accepts a JSON schema with the following structure:
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `usernames` | array | Yes | - | List of usernames to search for |
| `usernames[]` | string | Yes | - | Username to search for |
### Example Input
```json
{
"usernames": ["techuser", "designuser"]
}
```
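Before submitting, the input shape above can be checked locally. The following is a minimal sketch, not part of the Actor: `parse_actor_input` is a hypothetical helper, and rejecting an empty list or empty strings is an assumption that goes beyond what the input schema itself states.

```python
import json


def parse_actor_input(raw: str) -> list[str]:
    """Parse a JSON input document and return its list of usernames.

    Hypothetical helper; the non-empty checks are assumptions, not schema rules.
    """
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("input must be a JSON object")
    usernames = data.get("usernames")
    if not isinstance(usernames, list) or not usernames:
        raise ValueError("'usernames' must be a non-empty array")
    if not all(isinstance(name, str) and name for name in usernames):
        raise ValueError("every username must be a non-empty string")
    return usernames


if __name__ == "__main__":
    print(parse_actor_input('{"usernames": ["techuser", "designuser"]}'))
```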
## Output
The Actor stores one record per searched username in its default dataset:
### Dataset Record
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `username` | string | Yes | Username the search was conducted for |
| `links` | array | Yes | Array with found links to the social media |
| `links[]` | string | No | URL to the account |
### Example Dataset Item (JSON)
```json
{
"username": "johndoe",
"links": [
"https://github.com/johndoe"
]
}
```
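The script in `.actor/actor.sh` builds each record by dropping the last line of Sherlock's `<username>.txt` report (`sed '$d'`) and collecting the remaining lines as `links`. A minimal Python sketch of that transformation follows, assuming the final line of the report is a summary rather than a URL; `record_from_report` is a hypothetical name and is not part of Sherlock or the Apify CLI.

```python
import json


def record_from_report(username: str, report_text: str) -> dict:
    """Turn the text of a Sherlock report into a dataset record.

    Mirrors `sed '$d'`: every line except the last one is kept as a link.
    """
    lines = report_text.splitlines()
    return {"username": username, "links": lines[:-1]}


if __name__ == "__main__":
    # The sample report below is illustrative, not captured Sherlock output.
    sample = "https://github.com/johndoe\nsummary line\n"
    print(json.dumps(record_from_report("johndoe", sample), indent=2))
```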
## Performance & Resources
- **Memory Requirements**:
- Minimum: 512 MB RAM
- Recommended: 1 GB RAM for multiple usernames
- **Processing Time**:
- Single username: ~1-2 minutes
- Multiple usernames: 2-5 minutes
- Varies based on number of sites checked and response times
For more help, check the [Sherlock Project documentation](https://github.com/sherlock-project/sherlock) or raise an issue in the Actor's repository.

13
.actor/actor.json Normal file

@ -0,0 +1,13 @@
{
"actorSpecification": 1,
"name": "sherlock",
"version": "0.0",
"buildTag": "latest",
"environmentVariables": {},
"dockerFile": "./Dockerfile",
"dockerContext": "../",
"input": "./input_schema.json",
"storages": {
"dataset": "./dataset_schema.json"
}
}

14
.actor/actor.sh Executable file

@ -0,0 +1,14 @@
#!/bin/bash
INPUT=$(apify actor:get-input | jq -r '.usernames[]' | xargs echo)
echo "INPUT: $INPUT"
sherlock $INPUT
for username in $INPUT; do
# escape the special meaning leading characters
# https://github.com/jpmens/jo/blob/master/jo.md#description
safe_username=$(printf '%s\n' "$username" | sed -e 's/^@/\\@/' -e 's/^:/\\:/' -e 's/%/\\%/')
echo "pushing results for username: $username, content:"
cat "$username.txt"
sed '$d' "$username.txt" | jo -a | jo username="$safe_username" links:=- | apify actor:push-data
done


@ -0,0 +1,45 @@
{
"actorSpecification": 1,
"fields":{
"title": "Sherlock actor input",
"description": "This is actor input schema",
"type": "object",
"schemaVersion": 1,
"properties": {
"links": {
"title": "Links to accounts",
"type": "array",
"description": "A list of social media accounts found for the username"
},
"username": {
"title": "Lookup username",
"type": "string",
"description": "Username the lookup was performed for"
}
},
"required": [
"username",
"links"
]
},
"views": {
"overview": {
"title": "Overview",
"transformation": {
"fields": [
"username",
"links"
]
},
"display": {
"component": "table",
"links": {
"label": "Links"
},
"username":{
"label": "Username"
}
}
}
}
}

18
.actor/input_schema.json Normal file
View File

@ -0,0 +1,18 @@
{
"title": "Sherlock actor input",
"description": "This is actor input schema",
"type": "object",
"schemaVersion": 1,
"properties": {
"usernames": {
"title": "Usernames to hunt down",
"type": "array",
"description": "A list of usernames to be checked for existence across social media",
"editor": "stringList",
"prefill": ["johndoe"]
}
},
"required": [
"usernames"
]
}

2
.github/CODEOWNERS vendored

@ -1,5 +1,5 @@
### REPOSITORY
/.github/CODEOWNERS @sdushantha
/.github/CODEOWNERS @sdushantha @ppfeister
/.github/FUNDING.yml @sdushantha
/LICENSE @sdushantha


@ -19,6 +19,15 @@ body:
- Other (indicate below)
validations:
required: true
- type: input
id: package-version
attributes:
label: Package version
description: |
Knowing the version of the package you are using can help us diagnose your issue more quickly.
You can find the version by running `sherlock --version`.
validations:
required: true
- type: textarea
id: description
attributes:

89
.github/workflows/exclusions.yml vendored Normal file

@ -0,0 +1,89 @@
name: Exclusions Updater
on:
schedule:
#- cron: '0 5 * * 0' # Runs at 05:00 every Sunday
- cron: '0 5 * * *' # Runs at 05:00 every day
workflow_dispatch:
jobs:
update-exclusions:
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v5
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.13'
- name: Install Poetry
uses: abatilo/actions-poetry@v4
with:
poetry-version: 'latest'
- name: Install dependencies
run: |
poetry install --no-interaction --with dev
- name: Run false positive tests
run: |
$(poetry env activate)
pytest -q --tb no -m validate_targets_fp -n 20 | tee fp_test_results.txt
deactivate
- name: Parse false positive detections by desired categories
run: |
grep -oP '(?<=test_false_pos\[)[^\]]+(?=\].*result was Claimed)' fp_test_results.txt \
| sort -u > false_positive_exclusions.txt
grep -oP '(?<=test_false_pos\[)[^\]]+(?=\].*result was WAF)' fp_test_results.txt \
| sort -u > waf_hits.txt
- name: Detect if exclusions list changed
id: detect_changes
run: |
git fetch origin exclusions || true
if git show origin/exclusions:false_positive_exclusions.txt >/dev/null 2>&1; then
# If the exclusions branch and file exist, compare
if git diff --quiet origin/exclusions -- false_positive_exclusions.txt; then
echo "exclusions_changed=false" >> "$GITHUB_OUTPUT"
else
echo "exclusions_changed=true" >> "$GITHUB_OUTPUT"
fi
else
# If the exclusions branch or file do not exist, treat as changed
echo "exclusions_changed=true" >> "$GITHUB_OUTPUT"
fi
- name: Quantify and display results
run: |
FP_COUNT=$(wc -l < false_positive_exclusions.txt | xargs)
WAF_COUNT=$(wc -l < waf_hits.txt | xargs)
echo ">>> Found $FP_COUNT false positives and $WAF_COUNT WAF hits."
echo ">>> False positive exclusions:" && cat false_positive_exclusions.txt
echo ">>> WAF hits:" && cat waf_hits.txt
- name: Commit and push exclusions list
if: steps.detect_changes.outputs.exclusions_changed == 'true'
run: |
git config user.name "Paul Pfeister (automation)"
git config user.email "code@pfeister.dev"
mv false_positive_exclusions.txt false_positive_exclusions.txt.tmp
git add -f false_positive_exclusions.txt.tmp # -f required to override .gitignore
git stash push -m "stash false positive exclusion list" -- false_positive_exclusions.txt.tmp
git fetch origin exclusions || true # Allows creation of branch if deleted
git checkout -B exclusions origin/exclusions || (git checkout --orphan exclusions && git rm -rf .)
git stash pop || true
mv false_positive_exclusions.txt.tmp false_positive_exclusions.txt
git rm -f false_positive_exclusions.txt.tmp || true
git add false_positive_exclusions.txt
git commit -m "auto: update exclusions list" || echo "No changes to commit"
git push origin exclusions


@ -11,6 +11,7 @@ on:
- '**/*.py'
- '**/*.ini'
- '**/*.toml'
- 'Dockerfile'
push:
branches:
- master
@ -21,11 +22,13 @@ on:
- '**/*.py'
- '**/*.ini'
- '**/*.toml'
- 'Dockerfile'
jobs:
tox-lint:
# Linting is run through tox to ensure that the same linter is used by local runners
runs-on: ubuntu-latest
# Linting is run through tox to ensure that the same linter
# is used by local runners
steps:
- uses: actions/checkout@v4
- name: Set up linting environment
@ -41,7 +44,8 @@ jobs:
tox-matrix:
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false # We want to know what specific versions it fails on
# We want to know what specific versions it fails on
fail-fast: false
matrix:
os: [
ubuntu-latest,
@ -49,11 +53,10 @@ jobs:
macos-latest,
]
python-version: [
'3.8',
'3.9',
'3.10',
'3.11',
'3.12',
'3.13',
]
steps:
- uses: actions/checkout@v4
@ -68,3 +71,22 @@ jobs:
pip install tox-gh-actions
- name: Run tox
run: tox
docker-build-test:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Get version from pyproject.toml
id: get-version
run: |
VERSION=$(grep -m1 'version = ' pyproject.toml | cut -d'"' -f2)
echo "version=$VERSION" >> "$GITHUB_OUTPUT"
- name: Build Docker image
run: |
docker build \
--build-arg VERSION_TAG=${{ steps.get-version.outputs.version }} \
-t sherlock-test:latest .
- name: Test Docker image runs
run: docker run --rm sherlock-test:latest --version

View File

@ -0,0 +1,126 @@
name: Modified Target Validation
on:
pull_request_target:
branches:
- master
paths:
- "sherlock_project/resources/data.json"
jobs:
validate-modified-targets:
runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: write
steps:
- name: Checkout repository
uses: actions/checkout@v5
with:
# Checkout the base branch but fetch all history to avoid a second fetch call
ref: ${{ github.base_ref }}
fetch-depth: 0
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: "3.13"
- name: Install Poetry
uses: abatilo/actions-poetry@v4
with:
poetry-version: "latest"
- name: Install dependencies
run: |
poetry install --no-interaction --with dev
- name: Prepare JSON versions for comparison
run: |
# Fetch only the PR's branch head (single network call in this step)
git fetch origin pull/${{ github.event.pull_request.number }}/head:pr
# Find the merge-base commit between the target branch and the PR branch
MERGE_BASE=$(git merge-base origin/${{ github.base_ref }} pr)
echo "Comparing PR head against merge-base commit: $MERGE_BASE"
# Safely extract the file from the PR's head and the merge-base commit
git show pr:sherlock_project/resources/data.json > data.json.head
git show $MERGE_BASE:sherlock_project/resources/data.json > data.json.base
# CRITICAL FIX: Overwrite the checked-out data.json with the one from the PR
# This ensures that pytest runs against the new, updated file.
cp data.json.head sherlock_project/resources/data.json
- name: Discover modified targets
id: discover-modified
run: |
CHANGED=$(
python - <<'EOF'
import json
import sys
try:
with open("data.json.base") as f: base = json.load(f)
with open("data.json.head") as f: head = json.load(f)
except FileNotFoundError as e:
print(f"Error: Could not find {e.filename}", file=sys.stderr)
sys.exit(1)
except json.JSONDecodeError as e:
print(f"Error: Could not decode JSON from a file - {e}", file=sys.stderr)
sys.exit(1)
changed = []
for k, v in head.items():
if k not in base or base[k] != v:
changed.append(k)
print(",".join(sorted(changed)))
EOF
)
# Preserve changelist
echo -e ">>> Changed targets: \n$(echo $CHANGED | tr ',' '\n')"
echo "changed_targets=$CHANGED" >> "$GITHUB_OUTPUT"
- name: Validate remote manifest against local schema
if: steps.discover-modified.outputs.changed_targets != ''
run: |
poetry run pytest tests/test_manifest.py::test_validate_manifest_against_local_schema
# --- The rest of the steps below are unchanged ---
- name: Validate modified targets
if: steps.discover-modified.outputs.changed_targets != ''
continue-on-error: true
run: |
poetry run pytest -q --tb no -rA -m validate_targets -n 20 \
--chunked-sites "${{ steps.discover-modified.outputs.changed_targets }}" \
--junitxml=validation_results.xml
- name: Prepare validation summary
if: steps.discover-modified.outputs.changed_targets != ''
id: prepare-summary
run: |
summary=$(
poetry run python devel/summarize_site_validation.py validation_results.xml || echo "Failed to generate summary of test results"
)
echo "$summary" > validation_summary.md
- name: Announce validation results
if: steps.discover-modified.outputs.changed_targets != ''
uses: actions/github-script@v8
with:
script: |
const fs = require('fs');
const body = fs.readFileSync('validation_summary.md', 'utf8');
await github.rest.issues.createComment({
issue_number: context.payload.pull_request.number,
owner: context.repo.owner,
repo: context.repo.repo,
body: body,
});
- name: This step shows as ran when no modifications are found
if: steps.discover-modified.outputs.changed_targets == ''
run: |
echo "No modified targets found"
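
The "Discover modified targets" step above can be read as a small pure function. The following is a minimal sketch under that reading; the function name `changed_targets` and the sample manifests are illustrative and do not appear in the workflow. Like the inline script, it reports targets that are new or whose entry differs, and it does not report targets that were removed from the head manifest.

```python
def changed_targets(base: dict, head: dict) -> list[str]:
    """Return the sorted names of targets that are new or modified in head."""
    return sorted(
        name
        for name, entry in head.items()
        if name not in base or base[name] != entry
    )


# Illustrative manifests (hypothetical data, not taken from data.json)
base = {"A": {"url": "https://a.example/{}"}, "B": {"url": "https://b.example/{}"}}
head = {
    "A": {"url": "https://a.example/{}"},
    "B": {"url": "https://b.example/u/{}"},  # modified entry
    "C": {"url": "https://c.example/{}"},    # new entry
}

print(",".join(changed_targets(base, head)))
```

Joining with commas mirrors the format the workflow writes to `changed_targets` and later passes to `--chunked-sites`.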

View File

@ -2,13 +2,12 @@
# 1. Update the version tag in the Dockerfile to match the version in sherlock/__init__.py
# 2. Update the VCS_REF tag to match the tagged version's FULL commit hash
# 3. Build image with BOTH latest and version tags
# i.e. `docker build -t sherlock/sherlock:0.15.0 -t sherlock/sherlock:latest .`
# i.e. `docker build -t sherlock/sherlock:0.16.0 -t sherlock/sherlock:latest .`
FROM python:3.12-slim-bullseye as build
FROM python:3.12-slim-bullseye AS build
WORKDIR /sherlock
RUN apt-get update \
pip3 install --no-cache-dir --upgrade pip
RUN pip3 install --no-cache-dir --upgrade pip
FROM python:3.12-slim-bullseye
WORKDIR /sherlock

View File

@ -1,36 +1,45 @@
#!/usr/bin/env python
# This module generates the listing of supported sites which can be found in
# sites.md. It also organizes all the sites in alphanumeric order
# sites.mdx. It also organizes all the sites in alphanumeric order
import json
import os
DATA_REL_URI: str = "sherlock_project/resources/data.json"
DEFAULT_ENCODING = "utf-8"
# Read the data.json file
with open("sherlock/resources/data.json", "r", encoding="utf-8") as data_file:
with open(DATA_REL_URI, "r", encoding=DEFAULT_ENCODING) as data_file:
data: dict = json.load(data_file)
# Removes schema-specific keywords for proper processing
social_networks: dict = dict(data)
social_networks = data.copy()
social_networks.pop('$schema', None)
# Sort the social networks in alphanumeric order
social_networks: list = sorted(social_networks.items())
social_networks = sorted(social_networks.items())
# Make output dir where the site list will be written
os.mkdir("output")
# Write the list of supported sites to sites.md
with open("output/sites.mdx", "w") as site_file:
site_file.write("---\ntitle: 'List of supported sites'\nsidebarTitle: 'Supported sites'\nicon: 'globe'\ndescription: 'Sherlock currently supports **400+** sites'\n---\n\n")
# Write the list of supported sites to sites.mdx
with open("output/sites.mdx", "w", encoding=DEFAULT_ENCODING) as site_file:
site_file.write("---\n")
site_file.write("title: 'List of supported sites'\n")
site_file.write("sidebarTitle: 'Supported sites'\n")
site_file.write("icon: 'globe'\n")
site_file.write("description: 'Sherlock currently supports **400+** sites'\n")
site_file.write("---\n\n")
for social_network, info in social_networks:
url_main = info["urlMain"]
is_nsfw = "**(NSFW)**" if info.get("isNSFW") else ""
site_file.write(f"1. [{social_network}]({url_main}) {is_nsfw}\n")
# Overwrite the data.json file with sorted data
with open("sherlock/resources/data.json", "w") as data_file:
with open(DATA_REL_URI, "w", encoding=DEFAULT_ENCODING) as data_file:
sorted_data = json.dumps(data, indent=2, sort_keys=True)
data_file.write(sorted_data)
data_file.write("\n")
data_file.write("\n") # Keep the newline after writing data
print("Finished updating supported site listing!")

View File

@ -0,0 +1,72 @@
#!/usr/bin/env python
# This module summarizes the results of site validation tests queued by
# workflow validate_modified_targets for presentation in Issue comments.
from defusedxml import ElementTree as ET
import sys
from pathlib import Path
def summarize_junit_xml(xml_path: Path) -> str:
tree = ET.parse(xml_path)
root = tree.getroot()
suite = root.find('testsuite')
pass_message: str = ":heavy_check_mark: &nbsp; Pass"
fail_message: str = ":x: &nbsp; Fail"
if suite is None:
raise ValueError("Invalid JUnit XML: No testsuite found")
summary_lines: list[str] = []
summary_lines.append("#### Automatic validation of changes\n")
summary_lines.append("| Target | F+ Check | F- Check |")
summary_lines.append("|---|---|---|")
failures = int(suite.get('failures', 0))
errors_detected: bool = False
results: dict[str, dict[str, str]] = {}
for testcase in suite.findall('testcase'):
test_name = testcase.get('name').split('[')[0]
site_name = testcase.get('name').split('[')[1].rstrip(']')
failure = testcase.find('failure')
error = testcase.find('error')
if site_name not in results:
results[site_name] = {}
if test_name == "test_false_neg":
results[site_name]['F- Check'] = pass_message if failure is None and error is None else fail_message
elif test_name == "test_false_pos":
results[site_name]['F+ Check'] = pass_message if failure is None and error is None else fail_message
if error is not None:
errors_detected = True
for result in results:
summary_lines.append(f"| {result} | {results[result].get('F+ Check', 'Error!')} | {results[result].get('F- Check', 'Error!')} |")
if failures > 0:
summary_lines.append("\n___\n" +
"\nFailures were detected on at least one updated target. Commits containing accuracy failures" +
" will often not be merged (unless a rationale is provided, such as false negatives due to regional differences).")
if errors_detected:
summary_lines.append("\n___\n" +
"\n**Errors were detected during validation. Please review the workflow logs.**")
return "\n".join(summary_lines)
if __name__ == "__main__":
if len(sys.argv) != 2:
print("Usage: summarize_site_validation.py <junit-xml-file>")
sys.exit(1)
xml_path: Path = Path(sys.argv[1])
if not xml_path.is_file():
print(f"Error: File '{xml_path}' does not exist.")
sys.exit(1)
summary: str = summarize_junit_xml(xml_path)
print(summary)
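
The summary script splits each JUnit test case name of the form `test_false_neg[SiteName]` on the first `[` and strips trailing `]` characters. A site name that itself contains brackets would be truncated by that approach, and a name with no brackets would raise an `IndexError`. The following is a possible alternative, offered only as a sketch; `split_test_id` is a hypothetical helper, not part of the script.

```python
def split_test_id(name: str) -> tuple[str, str]:
    """Split 'test_name[param]' into (test_name, param).

    Only the first '[' separates the test name from the parameter, and only
    the single closing ']' is dropped, so brackets inside the parameter
    survive. A name without a parameter yields an empty site name.
    """
    test_name, _, rest = name.partition("[")
    site_name = rest[:-1] if rest.endswith("]") else rest
    return test_name, site_name


print(split_test_id("test_false_neg[GitHub]"))
print(split_test_id("test_false_pos[Site [beta]]"))
```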

View File

@ -1,6 +1,6 @@
<p align=center>
<p align="center">
<br>
<a href="https://sherlock-project.github.io/" target="_blank"><img src="images/sherlock-logo.png"/></a>
<a href="https://sherlock-project.github.io/" target="_blank"><img src="images/sherlock-logo.png" alt="sherlock"/></a>
<br>
<span>Hunt down social media accounts by username across <a href="https://sherlockproject.xyz/sites">400+ social networks</a></span>
<br>
@ -15,25 +15,27 @@
</p>
<p align="center">
<img width="70%" height="70%" src="images/demo.png"/>
</a>
<img width="70%" height="70%" src="images/demo.png" alt="demo"/>
</p>
## Installation
> [!WARNING]
> Packages for ParrotOS and Ubuntu 24.04, maintained by a third party, appear to be __broken__.
> Users of these systems should defer to pipx/pip or Docker.
| | Command | Notes |
| - | - | - |
| PyPI | `pipx install sherlock-project` | `pip` may be used in place of `pipx` |
| Docker | `docker pull sherlock/sherlock` | |
| Debian family | `apt install sherlock` | Kali, Parrot, Debian Testing and Sid |
| BlackArch | `pacman -S sherlock` | |
| Homebrew | `brew install sherlock` | |
| Method | Notes |
| - | - |
| `pipx install sherlock-project` | `pip` may be used in place of `pipx` |
| `docker run -it --rm sherlock/sherlock` | |
| `dnf install sherlock-project` | |
Community-maintained packages are available for Debian (>= 13), Ubuntu (>= 22.10), Homebrew, Kali, and BlackArch. These packages are not directly supported or maintained by the Sherlock Project.
See all alternative installation methods [here](https://sherlockproject.xyz/installation)
## Usage
## General usage
To search for only one user:
```bash
@ -95,15 +97,35 @@ optional arguments:
--local, -l Force the use of the local data.json file.
--nsfw Include checking of NSFW sites from default list.
```
## Apify Actor Usage [![Sherlock Actor](https://apify.com/actor-badge?actor=netmilk/sherlock)](https://apify.com/netmilk/sherlock?fpr=sherlock)
<a href="https://apify.com/netmilk/sherlock?fpr=sherlock"><img src="https://apify.com/ext/run-on-apify.png" alt="Run Sherlock Actor on Apify" width="176" height="39" /></a>
You can run Sherlock in the cloud without installation using the [Sherlock Actor](https://apify.com/netmilk/sherlock?fpr=sherlock) on [Apify](https://apify.com?fpr=sherlock) free of charge.
``` bash
$ echo '{"usernames":["user123"]}' | apify call -so netmilk/sherlock
[{
"username": "user123",
"links": [
"https://www.1337x.to/user/user123/",
...
]
}]
```
Read more about the [Sherlock Actor](../.actor/README.md), including how to use it programmatically via the Apify [API](https://apify.com/netmilk/sherlock/api?fpr=sherlock), [CLI](https://docs.apify.com/cli/?fpr=sherlock) and [JS/TS and Python SDKs](https://docs.apify.com/sdk?fpr=sherlock).
## Credits
Thank you to everyone who has contributed to Sherlock! ❤️
<a href="https://github.com/sherlock-project/sherlock/graphs/contributors">
<img src="https://contrib.rocks/image?&columns=25&max=10000&&repo=sherlock-project/sherlock" noZoom />
<img src="https://contrib.rocks/image?&columns=25&max=10000&&repo=sherlock-project/sherlock" alt="contributors"/>
</a>
## Star History
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=sherlock-project/sherlock&type=Date&theme=dark" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=sherlock-project/sherlock&type=Date" />

View File

@ -84,22 +84,6 @@ As of 2020-02-23, all usernames are reported as not existing.
},
```
## Fanpop
As of 2020-02-23, all usernames are reported as not existing.
```json
"fanpop": {
"errorType": "response_url",
"errorUrl": "http://www.fanpop.com/",
"rank": 9454,
"url": "http://www.fanpop.com/fans/{}",
"urlMain": "http://www.fanpop.com/",
"username_claimed": "blue",
"username_unclaimed": "noonewould_everusethis7"
},
```
## Canva
As of 2020-02-23, all usernames are reported as not existing.
@ -1273,19 +1257,6 @@ As of 2022-05-1, FanCentro returns false positives. Will later in new version of
},
```
## Codeforces
As of 2022-05-01, Codeforces returns false positives
```json
"Codeforces": {
"errorType": "response_url",
"errorUrl": "https://codeforces.com/",
"url": "https://codeforces.com/profile/{}",
"urlMain": "https://www.codeforces.com/",
"username_claimed": "tourist",
"username_unclaimed": "noonewouldeverusethis789"
},
```
## Smashcast
As of 2022-05-01, Smashcast is down
```json
@ -1919,3 +1890,108 @@ __2024-06-10 :__ Http request returns 403 forbidden, and tries to verify the con
"username_claimed": "JennyKrafts"
}
```
## Alik.cz
__2024-07-21 :__ Target is now BLACKLISTED from the default manifest due to the site receiving unnecessarily high traffic from Sherlock (by request of the site owners). This target is not permitted to be reactivated. Inclusion in unrelated manifests is not impacted, but it is discouraged.
## 8tracks
__2025-02-02 :__ Might be dead again. Nobody knows for sure.
```json
"8tracks": {
"errorType": "message",
"errorMsg": "\"available\":true",
"headers": {
"Accept-Language": "en-US,en;q=0.5"
},
"url": "https://8tracks.com/{}",
"urlProbe": "https://8tracks.com/users/check_username?login={}&format=jsonh",
"urlMain": "https://8tracks.com/",
"username_claimed": "blue"
}
```
## Shpock
__2025-02-02 :__ Can likely be added back with a new endpoint (source username availability endpoint from mobile app reg flow?)
```json
"Shpock": {
"errorType": "status_code",
"url": "https://www.shpock.com/shop/{}/items",
"urlMain": "https://www.shpock.com/",
"username_claimed": "user"
}
```
## Twitch
__2025-02-02 :__
```json
"Twitch": {
"errorType": "message",
"errorMsg": "components.availability-tracking.warn-unavailable.component",
"url": "https://www.twitch.tv/{}",
"urlMain": "https://www.twitch.tv/",
"urlProbe": "https://m.twitch.tv/{}",
"username_claimed": "jenny"
}
```
## Fiverr
__2025-02-02 :__ Fiverr added CSRF protections that messed with this test
```json
"Fiverr": {
"errorMsg": "\"status\":\"success\"",
"errorType": "message",
"headers": {
"Content-Type": "application/json",
"Accept-Language": "en-US,en;q=0.9"
},
"regexCheck": "^[A-Za-z][A-Za-z\\d_]{5,14}$",
"request_method": "POST",
"request_payload": {
"username": "{}"
},
"url": "https://www.fiverr.com/{}",
"urlMain": "https://www.fiverr.com/",
"urlProbe": "https://www.fiverr.com/validate_username",
"username_claimed": "blueman"
}
```
## BabyRU
__2025-02-02 :__ Just being problematic (possibly related to errorMsg encoding?)
```json
"babyRU": {
"errorMsg": [
"\u0421\u0442\u0440\u0430\u043d\u0438\u0446\u0430, \u043a\u043e\u0442\u043e\u0440\u0443\u044e \u0432\u044b \u0438\u0441\u043a\u0430\u043b\u0438, \u043d\u0435 \u043d\u0430\u0439\u0434\u0435\u043d\u0430",
"Доступ с вашего IP-адреса временно ограничен"
],
"errorType": "message",
"url": "https://www.baby.ru/u/{}/",
"urlMain": "https://www.baby.ru/",
"username_claimed": "blue"
}
```
## v0.dev
__2025-02-16 :__ Unsure if any way to view profiles exists now
```json
"v0.dev": {
"errorType": "message",
"errorMsg": "<title>v0 by Vercel</title>",
"url": "https://v0.dev/{}",
"urlMain": "https://v0.dev",
"username_claimed": "t3dotgg"
}
```
## TorrentGalaxy
__2025-07-06 :__ Site appears to have gone offline in March and hasn't come back
```json
"TorrentGalaxy": {
"errorMsg": "<title>TGx:Can't show details</title>",
"errorType": "message",
"regexCheck": "^[A-Za-z0-9]{3,15}$",
"url": "https://torrentgalaxy.to/profile/{}",
"urlMain": "https://torrentgalaxy.to/",
"username_claimed": "GalaxyRG"
},
```

View File

@ -8,8 +8,7 @@ source = "init"
[tool.poetry]
name = "sherlock-project"
# single source of truth for version is __init__.py
version = "0"
version = "0.16.0"
description = "Hunt down social media accounts by username across social networks"
license = "MIT"
authors = [
@ -30,6 +29,10 @@ classifiers = [
"Natural Language :: English",
"Operating System :: OS Independent",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
"Topic :: Security"
]
homepage = "https://sherlockproject.xyz/"
@ -40,23 +43,26 @@ repository = "https://github.com/sherlock-project/sherlock"
"Bug Tracker" = "https://github.com/sherlock-project/sherlock/issues"
[tool.poetry.dependencies]
python = "^3.8"
python = "^3.9"
certifi = ">=2019.6.16"
colorama = "^0.4.1"
PySocks = "^1.7.0"
requests = "^2.22.0"
requests-futures = "^1.0.0"
stem = "^1.8.0"
torrequest = "^0.1.0"
# pandas can likely be bumped up to ^2.0.0 after fc39 EOL
pandas = ">=1.0.0,<3.0.0"
pandas = "^2.2.1"
openpyxl = "^3.0.10"
[tool.poetry.extras]
tor = ["torrequest"]
tomli = "^2.2.1"
[tool.poetry.group.dev.dependencies]
jsonschema = "^4.0.0"
rstr = "^3.2.2"
pytest = "^8.4.2"
pytest-xdist = "^3.8.0"
[tool.poetry.group.ci.dependencies]
defusedxml = "^0.7.1"
[tool.poetry.scripts]
sherlock = 'sherlock_project.sherlock:main'

View File

@ -1,4 +1,7 @@
[pytest]
addopts = --strict-markers
addopts = --strict-markers -m "not validate_targets"
markers =
online: mark tests as requiring internet access.
validate_targets: mark tests for sweeping manifest validation (sends many requests).
validate_targets_fp: validate_targets, false positive tests only.
validate_targets_fn: validate_targets, false negative tests only.

File diff suppressed because it is too large

View File

@ -1,80 +0,0 @@
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"title": "Sherlock Target Manifest",
"description": "Social media targets to probe for the existence of known usernames",
"type": "object",
"properties": {
"$schema": { "type": "string" }
},
"patternProperties": {
"^(?!\\$).*?$": {
"type": "object",
"description": "Target name and associated information (key should be human readable name)",
"required": [ "url", "urlMain", "errorType", "username_claimed" ],
"properties": {
"url": { "type": "string" },
"urlMain": { "type": "string" },
"urlProbe": { "type": "string" },
"username_claimed": { "type": "string" },
"regexCheck": { "type": "string" },
"isNSFW": { "type": "boolean" },
"headers": { "type": "object" },
"request_payload": { "type": "object" },
"__comment__": {
"type": "string",
"description": "Used to clarify important target information if (and only if) a commit message would not suffice.\nThis key should not be parsed anywhere within Sherlock."
},
"tags": {
"oneOf": [
{ "$ref": "#/$defs/tag" },
{ "type": "array", "items": { "$ref": "#/$defs/tag" } }
]
},
"request_method": {
"type": "string",
"enum": [ "GET", "POST", "HEAD", "PUT" ]
},
"errorType": {
"type": "string",
"enum": [ "message", "response_url", "status_code" ]
},
"errorMsg": {
"oneOf": [
{ "type": "string" },
{ "type": "array", "items": { "type": "string" } }
]
},
"errorCode": {
"oneOf": [
{ "type": "integer" },
{ "type": "array", "items": { "type": "integer" } }
]
},
"errorUrl": { "type": "string" },
"response_url": { "type": "string" }
},
"dependencies": {
"errorMsg": {
"properties" : { "errorType": { "const": "message" } }
},
"errorUrl": {
"properties": { "errorType": { "const": "response_url" } }
},
"errorCode": {
"properties": { "errorType": { "const": "status_code" } }
}
},
"if": { "properties": { "errorType": { "const": "message" } } },
"then": { "required": [ "errorMsg" ] },
"else": {
"if": { "properties": { "errorType": { "const": "response_url" } } },
"then": { "required": [ "errorUrl" ] }
},
"additionalProperties": false
}
},
"additionalProperties": false,
"$defs": {
"tag": { "type": "string", "enum": [ "adult", "gaming" ] }
}
}

View File

@ -5,11 +5,26 @@ networks.
"""
from importlib.metadata import version as pkg_version, PackageNotFoundError
import pathlib
import tomli
def get_version() -> str:
"""Fetch the version number of the installed package."""
try:
return pkg_version("sherlock_project")
except PackageNotFoundError:
pyproject_path: pathlib.Path = pathlib.Path(__file__).resolve().parent.parent / "pyproject.toml"
with pyproject_path.open("rb") as f:
pyproject_data = tomli.load(f)
return pyproject_data["tool"]["poetry"]["version"]
# This variable is only used to check for ImportErrors induced by users running as script rather than as module or package
import_error_test_var = None
__shortname__ = "Sherlock"
__longname__ = "Sherlock: Find Usernames Across Social Networks"
__version__ = "0.15.0"
__version__ = get_version()
forge_api_latest_release = "https://api.github.com/repos/sherlock-project/sherlock/releases/latest"
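
The fallback branch of `get_version()` reads the version straight from `pyproject.toml` when the package metadata is unavailable. The sketch below isolates that read so it can be tried against a temporary file. It assumes that `tomllib` from the standard library (Python 3.11+) may be used where available, with `tomli` as the backport otherwise; `read_poetry_version` is a hypothetical name and the file contents are illustrative.

```python
import tempfile
from pathlib import Path

try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # backport exposing the same load() API


def read_poetry_version(pyproject_path: Path) -> str:
    """Return the [tool.poetry] version declared in a pyproject.toml file."""
    # TOML parsers require the file to be opened in binary mode
    with pyproject_path.open("rb") as f:
        data = tomllib.load(f)
    return data["tool"]["poetry"]["version"]


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "pyproject.toml"
    path.write_text(
        '[tool.poetry]\nname = "sherlock-project"\nversion = "0.16.0"\n',
        encoding="utf-8",
    )
    print(read_poetry_version(path))
```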

View File

@ -14,8 +14,8 @@ if __name__ == "__main__":
# Check if the user is using the correct version of Python
python_version = sys.version.split()[0]
if sys.version_info < (3, 8):
print(f"Sherlock requires Python 3.8+\nYou are using Python {python_version}, which is not supported by Sherlock.")
if sys.version_info < (3, 9):
print(f"Sherlock requires Python 3.9+\nYou are using Python {python_version}, which is not supported by Sherlock.")
sys.exit(1)
from sherlock_project import sherlock

View File

File diff suppressed because it is too large

View File

@ -10,7 +10,7 @@
"^(?!\\$).*?$": {
"type": "object",
"description": "Target name and associated information (key should be human readable name)",
"required": [ "url", "urlMain", "errorType", "username_claimed" ],
"required": ["url", "urlMain", "errorType", "username_claimed"],
"properties": {
"url": { "type": "string" },
"urlMain": { "type": "string" },
@ -32,11 +32,22 @@
},
"request_method": {
"type": "string",
"enum": [ "GET", "POST", "HEAD", "PUT" ]
"enum": ["GET", "POST", "HEAD", "PUT"]
},
"errorType": {
"oneOf": [
{
"type": "string",
"enum": [ "message", "response_url", "status_code" ]
"enum": ["message", "response_url", "status_code"]
},
{
"type": "array",
"items": {
"type": "string",
"enum": ["message", "response_url", "status_code"]
}
}
]
},
"errorMsg": {
"oneOf": [
@ -55,26 +66,84 @@
},
"dependencies": {
"errorMsg": {
"properties" : { "errorType": { "const": "message" } }
"oneOf": [
{ "properties": { "errorType": { "const": "message" } } },
{
"properties": {
"errorType": {
"type": "array",
"contains": { "const": "message" }
}
}
}
]
},
"errorUrl": {
"properties": { "errorType": { "const": "response_url" } }
"oneOf": [
{ "properties": { "errorType": { "const": "response_url" } } },
{
"properties": {
"errorType": {
"type": "array",
"contains": { "const": "response_url" }
}
}
}
]
},
"errorCode": {
"properties": { "errorType": { "const": "status_code" } }
"oneOf": [
{ "properties": { "errorType": { "const": "status_code" } } },
{
"properties": {
"errorType": {
"type": "array",
"contains": { "const": "status_code" }
}
}
}
]
}
},
"if": { "properties": { "errorType": { "const": "message" } } },
"then": { "required": [ "errorMsg" ] },
"else": {
"if": { "properties": { "errorType": { "const": "response_url" } } },
"then": { "required": [ "errorUrl" ] }
"allOf": [
{
"if": {
"anyOf": [
{ "properties": { "errorType": { "const": "message" } } },
{
"properties": {
"errorType": {
"type": "array",
"contains": { "const": "message" }
}
}
}
]
},
"then": { "required": ["errorMsg"] }
},
{
"if": {
"anyOf": [
{ "properties": { "errorType": { "const": "response_url" } } },
{
"properties": {
"errorType": {
"type": "array",
"contains": { "const": "response_url" }
}
}
}
]
},
"then": { "required": ["errorUrl"] }
}
],
"additionalProperties": false
}
},
"additionalProperties": false,
"$defs": {
"tag": { "type": "string", "enum": [ "adult", "gaming" ] }
"tag": { "type": "string", "enum": ["adult", "gaming"] }
}
}
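
With this schema change, `errorType` may be either a single string or an array of strings, and the `if`/`then` rules require `errorMsg` when `message` is present and `errorUrl` when `response_url` is present. The following is a rough sketch of how a consumer might normalize the value and check those two rules by hand. The helper names are hypothetical, and this does not replace validation against the schema itself.

```python
# Fields the schema's if/then rules require for a given error type.
# status_code has no required companion field in the schema.
REQUIRED_FIELD = {"message": "errorMsg", "response_url": "errorUrl"}


def normalize_error_type(value) -> list[str]:
    """Return errorType as a list, whether it was given as a string or an array."""
    return [value] if isinstance(value, str) else list(value)


def missing_fields(target: dict) -> list[str]:
    """List the companion fields that the target's error types require but lack."""
    error_types = normalize_error_type(target["errorType"])
    return [
        field
        for error_type, field in REQUIRED_FIELD.items()
        if error_type in error_types and field not in target
    ]


# Illustrative target entries (hypothetical, not taken from data.json)
print(missing_fields({"errorType": ["message", "status_code"], "errorMsg": "Not found"}))
print(missing_fields({"errorType": "response_url"}))
```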

View File

@ -24,6 +24,7 @@ import re
from argparse import ArgumentParser, RawDescriptionHelpFormatter
from json import loads as json_loads
from time import monotonic
from typing import Optional
import requests
from requests_futures.sessions import FuturesSession
@ -167,15 +168,13 @@ def multiple_usernames(username):
def sherlock(
username,
site_data,
username: str,
site_data: dict[str, dict[str, str]],
query_notify: QueryNotify,
tor: bool = False,
unique_tor: bool = False,
dump_response: bool = False,
proxy=None,
timeout=60,
):
proxy: Optional[str] = None,
timeout: int = 60,
) -> dict[str, dict[str, str | QueryResult]]:
"""Run Sherlock Analysis.
Checks for existence of username on various social media sites.
@ -187,8 +186,6 @@ def sherlock(
query_notify -- Object with base type of QueryNotify().
This will be used to notify the caller about
query results.
tor -- Boolean indicating whether to use a tor circuit for the requests.
unique_tor -- Boolean indicating whether to use a new tor circuit for each request.
proxy -- String indicating the proxy URL
timeout -- Time in seconds to wait before timing out request.
Default is 60 seconds.
@ -209,32 +206,9 @@ def sherlock(
# Notify caller that we are starting the query.
query_notify.start(username)
# Create session based on request methodology
if tor or unique_tor:
try:
from torrequest import TorRequest # noqa: E402
except ImportError:
print("Important!")
print("> --tor and --unique-tor are now DEPRECATED, and may be removed in a future release of Sherlock.")
print("> If you've installed Sherlock via pip, you can include the optional dependency via `pip install 'sherlock-project[tor]'`.")
print("> Other packages should refer to their documentation, or install it separately with `pip install torrequest`.\n")
sys.exit(query_notify.finish())
print("Important!")
print("> --tor and --unique-tor are now DEPRECATED, and may be removed in a future release of Sherlock.")
# Requests using Tor obfuscation
try:
underlying_request = TorRequest()
except OSError:
print("Tor not found in system path. Unable to continue.\n")
sys.exit(query_notify.finish())
underlying_session = underlying_request.session
else:
# Normal requests
underlying_session = requests.session()
underlying_request = requests.Request()
# Limit number of workers to 20.
# This is probably vastly overkill.
@ -261,7 +235,7 @@ def sherlock(
# A user agent is needed because some sites don't return the correct
# information since they think that we are bots (Which we actually are...)
headers = {
"User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/116.0",
"User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:129.0) Gecko/20100101 Firefox/129.0",
}
if "headers" in net_info:
@ -358,15 +332,10 @@ def sherlock(
# Store future in data for access later
net_info["request_future"] = future
# Reset identify for tor (if needed)
if unique_tor:
underlying_request.reset_identity()
# Add this site's results into final dictionary with all the other results.
results_total[social_network] = results_site
# Open the file containing account links
# Core logic: If tor requests, make them here. If multi-threaded requests, wait for responses
for social_network, net_info in site_data.items():
# Retrieve results again
results_site = results_total.get(social_network)
@ -380,6 +349,8 @@ def sherlock(
# Get the expected error type
error_type = net_info["errorType"]
if isinstance(error_type, str):
error_type: list[str] = [error_type]
# Retrieve future and ensure it has finished
future = net_info["request_future"]
@ -412,8 +383,10 @@ def sherlock(
# be highly targeted. Comment at the end of each fingerprint to
# indicate target and date fingerprinted.
WAFHitMsgs = [
'.loading-spinner{visibility:hidden}body.no-js .challenge-running{display:none}body.dark{background-color:#222;color:#d9d9d9}body.dark a{color:#fff}body.dark a:hover{color:#ee730a;text-decoration:underline}body.dark .lds-ring div{border-color:#999 transparent transparent}body.dark .font-red{color:#b20f03}body.dark', # 2024-05-13 Cloudflare
'{return l.onPageView}}),Object.defineProperty(r,"perimeterxIdentifiers",{enumerable:' # 2024-04-09 PerimeterX / Human Security
r'.loading-spinner{visibility:hidden}body.no-js .challenge-running{display:none}body.dark{background-color:#222;color:#d9d9d9}body.dark a{color:#fff}body.dark a:hover{color:#ee730a;text-decoration:underline}body.dark .lds-ring div{border-color:#999 transparent transparent}body.dark .font-red{color:#b20f03}body.dark', # 2024-05-13 Cloudflare
r'<span id="challenge-error-text">', # 2024-11-11 Cloudflare error page
r'AwsWafIntegration.forceRefreshToken', # 2024-11-11 Cloudfront (AWS)
r'{return l.onPageView}}),Object.defineProperty(r,"perimeterxIdentifiers",{enumerable:' # 2024-04-09 PerimeterX / Human Security
]
if error_text is not None:
@ -422,7 +395,12 @@ def sherlock(
elif any(hitMsg in r.text for hitMsg in WAFHitMsgs):
query_status = QueryStatus.WAF
elif error_type == "message":
else:
if any(errtype not in ["message", "status_code", "response_url"] for errtype in error_type):
error_context = f"Unknown error type '{error_type}' for {social_network}"
query_status = QueryStatus.UNKNOWN
else:
if "message" in error_type:
# error_flag True denotes no error found in the HTML
# error_flag False denotes error found in the HTML
error_flag = True
@ -447,7 +425,8 @@ def sherlock(
query_status = QueryStatus.CLAIMED
else:
query_status = QueryStatus.AVAILABLE
elif error_type == "status_code":
if "status_code" in error_type and query_status is not QueryStatus.AVAILABLE:
error_codes = net_info.get("errorCode")
query_status = QueryStatus.CLAIMED
@ -459,7 +438,8 @@ def sherlock(
query_status = QueryStatus.AVAILABLE
elif r.status_code >= 300 or r.status_code < 200:
query_status = QueryStatus.AVAILABLE
elif error_type == "response_url":
if "response_url" in error_type and query_status is not QueryStatus.AVAILABLE:
# For this detection method, we have turned off the redirect.
# So, there is no need to check the response URL: it will always
# match the request. Instead, we will ensure that the response
@ -469,11 +449,6 @@ def sherlock(
query_status = QueryStatus.CLAIMED
else:
query_status = QueryStatus.AVAILABLE
else:
# It should be impossible to ever get here...
raise ValueError(
f"Unknown Error Type '{error_type}' for " f"site '{social_network}'"
)
if dump_response:
print("+++++++++++++++++++++")
@ -504,7 +479,7 @@ def sherlock(
print("+++++++++++++++++++++")
# Notify caller about results of query.
result = QueryResult(
result: QueryResult = QueryResult(
username=username,
site_name=social_network,
site_url_user=url,
@ -593,22 +568,6 @@ def main():
dest="output",
help="If using single username, the output of the result will be saved to this file.",
)
parser.add_argument(
"--tor",
"-t",
action="store_true",
dest="tor",
default=False,
help="Make requests over Tor; increases runtime; requires Tor to be installed and in system path.",
)
parser.add_argument(
"--unique-tor",
"-u",
action="store_true",
dest="unique_tor",
default=False,
help="Make requests over Tor with new Tor circuit after each request; increases runtime; requires Tor to be installed and in system path.",
)
parser.add_argument(
"--csv",
action="store_true",
@ -653,7 +612,7 @@ def main():
metavar="JSON_FILE",
dest="json_file",
default=None,
help="Load data from a JSON file or an online, valid, JSON file.",
help="Load data from a JSON file or an online, valid, JSON file. Upstream PR numbers also accepted.",
)
parser.add_argument(
"--timeout",
@ -716,6 +675,32 @@ def main():
help="Include checking of NSFW sites from default list.",
)
# TODO deprecated in favor of --txt, retained for workflow compatibility, to be removed
# in future release
parser.add_argument(
"--no-txt",
action="store_true",
dest="no_txt",
default=False,
help="Disable creation of a txt file - WILL BE DEPRECATED",
)
parser.add_argument(
"--txt",
action="store_true",
dest="output_txt",
default=False,
help="Enable creation of a txt file",
)
parser.add_argument(
"--ignore-exclusions",
action="store_true",
dest="ignore_exclusions",
default=False,
help="Ignore upstream exclusions (may return more false positives)",
)
args = parser.parse_args()
# If the user presses CTRL-C, exit gracefully without throwing errors
@ -723,7 +708,7 @@ def main():
# Check for newer version of Sherlock. If it exists, let the user know about it
try:
latest_release_raw = requests.get(forge_api_latest_release).text
latest_release_raw = requests.get(forge_api_latest_release, timeout=10).text
latest_release_json = json_loads(latest_release_raw)
latest_remote_tag = latest_release_json["tag_name"]
@ -736,22 +721,10 @@ def main():
except Exception as error:
print(f"A problem occurred while checking for an update: {error}")
# Argument check
# TODO regex check on args.proxy
if args.tor and (args.proxy is not None):
raise Exception("Tor and Proxy cannot be set at the same time.")
# Make prompts
if args.proxy is not None:
print("Using the proxy: " + args.proxy)
if args.tor or args.unique_tor:
print("Using Tor to make requests")
print(
"Warning: some websites might refuse connecting over Tor, so note that using this option might increase connection errors."
)
if args.no_color:
# Disable color output.
init(strip=True, convert=False)
@ -773,10 +746,32 @@ def main():
try:
if args.local:
sites = SitesInformation(
os.path.join(os.path.dirname(__file__), "resources/data.json")
os.path.join(os.path.dirname(__file__), "resources/data.json"),
honor_exclusions=False,
)
else:
sites = SitesInformation(args.json_file)
json_file_location = args.json_file
if args.json_file:
# If --json parameter is a number, interpret it as a pull request number
if args.json_file.isnumeric():
pull_number = args.json_file
pull_url = f"https://api.github.com/repos/sherlock-project/sherlock/pulls/{pull_number}"
pull_request_raw = requests.get(pull_url, timeout=10).text
pull_request_json = json_loads(pull_request_raw)
# Check if it's a valid pull request
if "message" in pull_request_json:
print(f"ERROR: Pull request #{pull_number} not found.")
sys.exit(1)
head_commit_sha = pull_request_json["head"]["sha"]
json_file_location = f"https://raw.githubusercontent.com/sherlock-project/sherlock/{head_commit_sha}/sherlock_project/resources/data.json"
sites = SitesInformation(
data_file_path=json_file_location,
honor_exclusions=not args.ignore_exclusions,
do_not_exclude=args.site_list,
)
except Exception as error:
print(f"ERROR: {error}")
sys.exit(1)
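The `--json` handling above treats a purely numeric argument as an upstream pull request number and points the loader at the data.json of that PR's head commit. A minimal sketch of that resolution step, assuming the pull request metadata has already been fetched and parsed. `resolve_manifest_location` and `RAW_TEMPLATE` are hypothetical names for illustration, not part of Sherlock:

```python
from typing import Optional

# URL layout copied from the diff above; the constant name is illustrative.
RAW_TEMPLATE = (
    "https://raw.githubusercontent.com/sherlock-project/sherlock/"
    "{sha}/sherlock_project/resources/data.json"
)


def resolve_manifest_location(
    json_arg: Optional[str],
    pull_request_json: Optional[dict] = None,
) -> Optional[str]:
    """Return the manifest location implied by a --json style argument."""
    # Paths, URLs, and a missing argument pass through unchanged.
    if json_arg is None or not json_arg.isnumeric():
        return json_arg
    # In the diff above, a "message" key in the API response is treated
    # as "pull request not found".
    if pull_request_json is None or "message" in pull_request_json:
        raise LookupError(f"Pull request #{json_arg} not found.")
    return RAW_TEMPLATE.format(sha=pull_request_json["head"]["sha"])


print(resolve_manifest_location("2758", {"head": {"sha": "abc123"}}))
```

Keeping the network call outside the function is a choice made for this sketch only: it lets the mapping from PR metadata to URL be exercised without contacting the API.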
@ -830,8 +825,6 @@ def main():
username,
site_data,
query_notify,
tor=args.tor,
unique_tor=args.unique_tor,
dump_response=args.dump_response,
proxy=args.proxy,
timeout=args.timeout,
@ -847,6 +840,7 @@ def main():
else:
result_file = f"{username}.txt"
if args.output_txt:
with open(result_file, "w", encoding="utf-8") as file:
exists_counter = 0
for website_name in results:
@ -931,8 +925,8 @@ def main():
{
"username": usernames,
"name": names,
"url_main": url_main,
"url_user": url_user,
"url_main": [f'=HYPERLINK(\"{u}\")' for u in url_main],
"url_user": [f'=HYPERLINK(\"{u}\")' for u in url_user],
"exists": exists,
"http_status": http_status,
"response_time_s": response_time_s,
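The `url_main` and `url_user` change above wraps each URL in a spreadsheet `HYPERLINK` formula so that the cells are clickable. A small sketch of the same idea follows. `as_excel_hyperlinks` is a hypothetical helper, and the doubling of embedded double quotes is an assumption added here; the diff interpolates the URL unchanged:

```python
def as_excel_hyperlinks(urls: list[str]) -> list[str]:
    """Wrap each URL in an Excel HYPERLINK formula."""
    # Doubling embedded double quotes keeps the formula string well-formed.
    # This escaping is an addition for this sketch, not part of the diff.
    return ['=HYPERLINK("{}")'.format(url.replace('"', '""')) for url in urls]


links = as_excel_hyperlinks(["https://example.com/user/blue"])
print(links[0])  # =HYPERLINK("https://example.com/user/blue")
```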


@ -7,6 +7,10 @@ import json
import requests
import secrets
MANIFEST_URL = "https://raw.githubusercontent.com/sherlock-project/sherlock/master/sherlock_project/resources/data.json"
EXCLUSIONS_URL = "https://raw.githubusercontent.com/sherlock-project/sherlock/refs/heads/exclusions/false_positive_exclusions.txt"
class SiteInformation:
def __init__(self, name, url_home, url_username_format, username_claimed,
information, is_nsfw, username_unclaimed=secrets.token_urlsafe(10)):
@ -72,7 +76,12 @@ class SiteInformation:
class SitesInformation:
def __init__(self, data_file_path=None):
def __init__(
self,
data_file_path: str|None = None,
honor_exclusions: bool = True,
do_not_exclude: list[str] = [],
):
"""Create Sites Information Object.
Contains information about all supported websites.
@ -110,7 +119,7 @@ class SitesInformation:
# The default data file is the live data.json in the GitHub repo. We use it instead of the local
# copy so that the user has the most up-to-date data. This prevents users from opening issues about
# false positives that have already been fixed, or from working with outdated data.
data_file_path = "https://raw.githubusercontent.com/sherlock-project/sherlock/master/sherlock_project/resources/data.json"
data_file_path = MANIFEST_URL
# Ensure that specified data file has correct extension.
if not data_file_path.lower().endswith(".json"):
@ -120,7 +129,7 @@ class SitesInformation:
if data_file_path.lower().startswith("http"):
# Reference is to a URL.
try:
response = requests.get(url=data_file_path)
response = requests.get(url=data_file_path, timeout=30)
except Exception as error:
raise FileNotFoundError(
f"Problem while attempting to access data file URL '{data_file_path}': {error}"
@ -155,6 +164,28 @@ class SitesInformation:
site_data.pop('$schema', None)
if honor_exclusions:
try:
response = requests.get(url=EXCLUSIONS_URL, timeout=10)
if response.status_code == 200:
exclusions = response.text.splitlines()
exclusions = [exclusion.strip() for exclusion in exclusions]
for site in do_not_exclude:
if site in exclusions:
exclusions.remove(site)
for exclusion in exclusions:
# pop() with a default never raises KeyError, so no try/except is needed
site_data.pop(exclusion, None)
except Exception:
# If there was any problem loading the exclusions, just continue without them
print("Warning: Could not load exclusions, continuing without them.")
honor_exclusions = False
self.sites = {}
# Add all site information from the json file to internal site list.
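The exclusion step above drops every site named in the upstream exclusions list, except those the caller explicitly asked to keep. A minimal sketch of that filtering as a pure function, assuming the exclusions text has already been downloaded. `apply_exclusions` is a hypothetical name, and unlike the diff it returns a new dict instead of mutating `site_data` in place:

```python
from typing import Iterable, Optional


def apply_exclusions(
    site_data: dict,
    exclusions_text: str,
    do_not_exclude: Optional[Iterable[str]] = None,
) -> dict:
    """Return site_data without the sites named in exclusions_text."""
    # One site name per line; surrounding whitespace and blank lines are ignored.
    excluded = {line.strip() for line in exclusions_text.splitlines() if line.strip()}
    # Sites the caller explicitly requested are never excluded.
    excluded -= set(do_not_exclude or [])
    return {name: info for name, info in site_data.items() if name not in excluded}


sites = {"SiteA": {}, "SiteB": {}, "SiteC": {}}
print(sorted(apply_exclusions(sites, "SiteA\n SiteB \n", do_not_exclude=["SiteB"])))
# ['SiteB', 'SiteC']
```

Using `None` as the default for `do_not_exclude` avoids sharing one mutable list between calls.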


@ -4,6 +4,11 @@ import urllib
import pytest
from sherlock_project.sites import SitesInformation
def fetch_local_manifest(honor_exclusions: bool = True) -> dict[str, dict[str, str]]:
sites_obj = SitesInformation(data_file_path=os.path.join(os.path.dirname(__file__), "../sherlock_project/resources/data.json"), honor_exclusions=honor_exclusions)
sites_iterable: dict[str, dict[str, str]] = {site.name: site.information for site in sites_obj}
return sites_iterable
@pytest.fixture()
def sites_obj():
sites_obj = SitesInformation(data_file_path=os.path.join(os.path.dirname(__file__), "../sherlock_project/resources/data.json"))
@ -11,9 +16,7 @@ def sites_obj():
@pytest.fixture(scope="session")
def sites_info():
sites_obj = SitesInformation(data_file_path=os.path.join(os.path.dirname(__file__), "../sherlock_project/resources/data.json"))
sites_iterable = {site.name: site.information for site in sites_obj}
yield sites_iterable
yield fetch_local_manifest()
@pytest.fixture(scope="session")
def remote_schema():
@ -21,3 +24,28 @@ def remote_schema():
with urllib.request.urlopen(schema_url) as remoteschema:
schemadat = json.load(remoteschema)
yield schemadat
def pytest_addoption(parser):
parser.addoption(
"--chunked-sites",
action="store",
default=None,
help="For tests utilizing chunked sites, include only the (comma-separated) site(s) specified.",
)
def pytest_generate_tests(metafunc):
if "chunked_sites" in metafunc.fixturenames:
sites_info = fetch_local_manifest(honor_exclusions=False)
# Ingest and apply site selections
site_filter: str | None = metafunc.config.getoption("--chunked-sites")
if site_filter:
selected_sites: list[str] = [site.strip() for site in site_filter.split(",")]
sites_info = {
site: data for site, data in sites_info.items()
if site in selected_sites
}
params = [{name: data} for name, data in sites_info.items()]
ids = list(sites_info.keys())
metafunc.parametrize("chunked_sites", params, ids=ids)
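The `pytest_generate_tests` hook above turns the manifest into one test parameter per site, optionally narrowed by the comma-separated `--chunked-sites` option. The selection logic can be sketched on its own, without pytest. `select_sites` is a hypothetical helper name used only for this illustration:

```python
from typing import Optional


def select_sites(
    sites_info: dict, site_filter: Optional[str] = None
) -> tuple[list[dict], list[str]]:
    """Build (params, ids) for per-site parametrization."""
    if site_filter:
        # "A, B" -> {"A", "B"}; unknown names simply select nothing.
        selected = {name.strip() for name in site_filter.split(",")}
        sites_info = {n: d for n, d in sites_info.items() if n in selected}
    # Each parameter is a single-site manifest, so each test case
    # queries exactly one target.
    params = [{name: data} for name, data in sites_info.items()]
    ids = list(sites_info)
    return params, ids


manifest = {"GitHub": {"url": "u1"}, "GitLab": {"url": "u2"}}
params, ids = select_sites(manifest, "GitLab, Missing")
print(ids)  # ['GitLab']
```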


@ -7,8 +7,8 @@ class Interactives:
def run_cli(args:str = "") -> str:
"""Pass arguments to Sherlock as a normal user on the command line"""
# Adapt for platform differences (Windows likes to be special)
if platform.system == "Windows":
command:str = f"py -m sherlock {args}"
if platform.system() == "Windows":
command:str = f"py -m sherlock_project {args}"
else:
command:str = f"sherlock {args}"
@ -20,8 +20,7 @@ class Interactives:
raise InteractivesSubprocessError(e.output.decode())
# -> list[str] is preferred, but will require deprecation of support for Python 3.8
def walk_sherlock_for_files_with(pattern: str) -> list:
def walk_sherlock_for_files_with(pattern: str) -> list[str]:
"""Check all files within the Sherlock package for matching patterns"""
pattern:re.Pattern = re.compile(pattern)
matching_files:list[str] = []


@ -44,7 +44,7 @@ class TestLiveTargets:
# Known positives should only use sites trusted to be reliable and unchanging
@pytest.mark.parametrize('site,username',[
('BodyBuilding', 'blue'),
('Keybase', 'blue'),
('devRant', 'blue'),
])
def test_known_positives_via_response_url(self, sites_info, site, username):


@ -0,0 +1,100 @@
import pytest
import re
import rstr
from sherlock_project.sherlock import sherlock
from sherlock_project.notify import QueryNotify
from sherlock_project.result import QueryResult, QueryStatus
FALSE_POSITIVE_ATTEMPTS: int = 2 # Since the usernames are randomly generated, it's POSSIBLE that a real username can be hit
FALSE_POSITIVE_QUANTIFIER_UPPER_BOUND: int = 15 # If a pattern uses quantifiers such as `+` `*` or `{n,}`, limit the upper bound (0 to disable)
FALSE_POSITIVE_DEFAULT_PATTERN: str = r'^[a-zA-Z0-9]{7,20}$' # Used in absence of a regexCheck entry
def set_pattern_upper_bound(pattern: str, upper_bound: int = FALSE_POSITIVE_QUANTIFIER_UPPER_BOUND) -> str:
"""Set upper bound for regex patterns that use quantifiers such as `+` `*` or `{n,}`."""
def replace_upper_bound(match: re.Match) -> str: # type: ignore
lower_bound: int = int(match.group(1)) if match.group(1) else 0 # type: ignore
nonlocal upper_bound
upper_bound = upper_bound if lower_bound < upper_bound else lower_bound # type: ignore # noqa: F823
return f'{{{lower_bound},{upper_bound}}}'
pattern = re.sub(r'(?<!\\)\{(\d+),\}', replace_upper_bound, pattern) # {n,} # type: ignore
pattern = re.sub(r'(?<!\\)\+', f'{{1,{upper_bound}}}', pattern) # +
pattern = re.sub(r'(?<!\\)\*', f'{{0,{upper_bound}}}', pattern) # *
return pattern
def false_positive_check(sites_info: dict[str, dict[str, str]], site: str, pattern: str) -> QueryStatus:
"""Check if a site is likely to produce false positives."""
status: QueryStatus = QueryStatus.UNKNOWN
for _ in range(FALSE_POSITIVE_ATTEMPTS):
query_notify: QueryNotify = QueryNotify()
username: str = rstr.xeger(pattern)
result: QueryResult | str = sherlock(
username=username,
site_data=sites_info,
query_notify=query_notify,
)[site]['status']
if not hasattr(result, 'status'):
raise TypeError(f"Result for site {site} does not have 'status' attribute. Actual result: {result}")
if type(result.status) is not QueryStatus: # type: ignore
raise TypeError(f"Result status for site {site} is not of type QueryStatus. Actual type: {type(result.status)}") # type: ignore
status = result.status # type: ignore
if status in (QueryStatus.AVAILABLE, QueryStatus.WAF):
return status
return status
def false_negative_check(sites_info: dict[str, dict[str, str]], site: str) -> QueryStatus:
"""Check if a site is likely to produce false negatives."""
status: QueryStatus = QueryStatus.UNKNOWN
query_notify: QueryNotify = QueryNotify()
result: QueryResult | str = sherlock(
username=sites_info[site]['username_claimed'],
site_data=sites_info,
query_notify=query_notify,
)[site]['status']
if not hasattr(result, 'status'):
raise TypeError(f"Result for site {site} does not have 'status' attribute. Actual result: {result}")
if type(result.status) is not QueryStatus: # type: ignore
raise TypeError(f"Result status for site {site} is not of type QueryStatus. Actual type: {type(result.status)}") # type: ignore
status = result.status # type: ignore
return status
@pytest.mark.validate_targets
@pytest.mark.online
class Test_All_Targets:
@pytest.mark.validate_targets_fp
def test_false_pos(self, chunked_sites: dict[str, dict[str, str]]):
"""Iterate through all sites in the manifest to discover possible false-positive inducing targets."""
pattern: str
for site in chunked_sites:
try:
pattern = chunked_sites[site]['regexCheck']
except KeyError:
pattern = FALSE_POSITIVE_DEFAULT_PATTERN
if FALSE_POSITIVE_QUANTIFIER_UPPER_BOUND > 0:
pattern = set_pattern_upper_bound(pattern)
result: QueryStatus = false_positive_check(chunked_sites, site, pattern)
assert result is QueryStatus.AVAILABLE, f"{site} produced false positive with pattern {pattern}, result was {result}"
@pytest.mark.validate_targets_fn
def test_false_neg(self, chunked_sites: dict[str, dict[str, str]]):
"""Iterate through all sites in the manifest to discover possible false-negative inducing targets."""
for site in chunked_sites:
result: QueryStatus = false_negative_check(chunked_sites, site)
assert result is QueryStatus.CLAIMED, f"{site} produced false negative, result was {result}"
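`set_pattern_upper_bound` in the file above caps open-ended quantifiers so that randomly generated usernames stay reasonably short. A standalone sketch of the same rewriting follows, runnable without `rstr` or Sherlock. `bound_quantifiers` is a hypothetical name, and unlike the version in the diff it does not carry a raised upper bound over from one `{n,}` match to later substitutions:

```python
import re


def bound_quantifiers(pattern: str, upper_bound: int = 15) -> str:
    """Replace unescaped `{n,}`, `+` and `*` with bounded quantifiers."""

    def replace_open_range(match: re.Match) -> str:
        lower = int(match.group(1))
        # Never emit an upper bound below the pattern's own lower bound.
        return f"{{{lower},{max(lower, upper_bound)}}}"

    # The (?<!\\) lookbehind skips quantifier characters escaped with a backslash.
    pattern = re.sub(r"(?<!\\)\{(\d+),\}", replace_open_range, pattern)  # {n,}
    pattern = re.sub(r"(?<!\\)\+", f"{{1,{upper_bound}}}", pattern)  # +
    pattern = re.sub(r"(?<!\\)\*", f"{{0,{upper_bound}}}", pattern)  # *
    return pattern


print(bound_quantifiers(r"^[a-z]+$"))   # ^[a-z]{1,15}$
print(bound_quantifiers(r"^\w{3,}$"))   # ^\w{3,15}$
print(bound_quantifiers(r"^a{20,}$"))   # ^a{20,20}$
```

Like the original, this is a textual rewrite, not a regex parser: an unescaped `+` or `*` inside a character class such as `[a-z+]` would also be replaced, which changes what that class matches.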


@ -7,8 +7,6 @@ envlist =
py312
py311
py310
py39
py38
[testenv]
description = Attempt to build and install the package
@ -16,6 +14,7 @@ deps =
coverage
jsonschema
pytest
rstr
allowlist_externals = coverage
commands =
coverage run --source=sherlock_project --module pytest -v
@ -37,8 +36,7 @@ commands =
[gh-actions]
python =
3.13: py313
3.12: py312
3.11: py311
3.10: py310
3.9: py39
3.8: py38