I have noticed that many websites now use anti-bot measures to block scraping. Most of them use technology from fingerprintjs.com, which uses the browser's user agent as one of the signals to decide whether a visitor is a robot or a human.
I use Selenium WebDriver to scrape a few websites. When I browse such a website normally, it shows the content, but when I load it through Selenium it shows empty content.
You can test fingerprintjs here to check whether it detects you as a human or a robot.
When Selenium drives Chrome or Firefox, fingerprintjs detects it as a robot. When Selenium drives Safari, however, it is detected as a human.
When Selenium launches Safari, it is in incognito mode by default, which may make it harder for fingerprintjs.com to detect.
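To see what a site actually receives from each browser, the request headers can be captured with a small local server. The sketch below is one possible way to do that, not necessarily how the headers in this post were collected; the `Host: localhost` rows only suggest that a local server was involved. The names `HeaderEchoHandler`, `start_server` and `captured` are made up for this example, and it uses only the Python standard library.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

captured = []  # one dict of request headers per incoming GET request


class HeaderEchoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: records the request headers and echoes them back."""

    def do_GET(self):
        headers = {name: value for name, value in self.headers.items()}
        captured.append(headers)
        body = "\n".join(f"{k}: {v}" for k, v in headers.items()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def start_server(port=0):
    """Start the echo server in a background thread (port 0 = any free port)."""
    server = HTTPServer(("127.0.0.1", port), HeaderEchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


if __name__ == "__main__":
    # Self-check with urllib standing in for a browser.
    server = start_server()
    url = f"http://127.0.0.1:{server.server_address[1]}/"
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url) as response:
        print(response.read().decode("utf-8"))
    server.shutdown()
    server.server_close()
```

To compare browsers, one would keep the server running on a fixed port (for example `start_server(8000)`), open `http://localhost:8000/` once in the normal browser and once through Selenium's `driver.get(...)`, and then compare the two entries in `captured`.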
Below is a comparison of the request headers (including the user agent) sent by Chrome, Firefox and Safari when run normally and when run under Selenium.
Legend:
Yellow-colored rows indicate a difference between the request headers sent under Selenium and those sent by the normal browser.
Chrome Browser
No | Chrome - Normal Browser Request Headers | Chrome - Selenium Request Headers |
---|---|---|
1 | Host: localhost | Host: localhost |
2 | Connection: keep-alive | Connection: keep-alive |
3 | Cache-Control: max-age=0 | |
4 | Upgrade-Insecure-Requests: 1 | Upgrade-Insecure-Requests: 1 |
5 | User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36 | User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36 |
6 | Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9 | Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9 |
7 | Sec-Fetch-Site: none | Sec-Fetch-Site: none |
8 | Sec-Fetch-Mode: navigate | Sec-Fetch-Mode: navigate |
9 | Sec-Fetch-User: ?1 | Sec-Fetch-User: ?1 |
10 | Sec-Fetch-Dest: document | Sec-Fetch-Dest: document |
11 | Accept-Encoding: gzip, deflate, br | Accept-Encoding: gzip, deflate, br |
12 | Accept-Language: en-US,en;q=0.9,id;q=0.8,ms;q=0.7,fr;q=0.6 | Accept-Language: en-GB,en-US;q=0.9,en;q=0.8 |
13 | Cookie: chat_uuid=1458159213.1598025760; gr_session=17411c1fd31-2125f7fe5bbd95cd; gr_reco=17411c1fd70-baa36285123101b1; utag_main=v_id:017411c1fb650017fec4094270bb03079001a07100942$_sn:4$_ss:1$_st:1598096705870$_pn:1%3Bexp-session$ses_id:1598094905870%3Bexp-session | |
Firefox Browser
No | Firefox - Normal Browser Request Headers | Firefox - Selenium Request Headers |
---|---|---|
1 | Host: localhost | Host: localhost |
2 | User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:79.0) Gecko/20100101 Firefox/79.0 | User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:79.0) Gecko/20100101 Firefox/79.0 |
3 | Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8 | Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8 |
4 | Accept-Language: en-US,en;q=0.5 | Accept-Language: en-US,en;q=0.5 |
5 | Accept-Encoding: gzip, deflate | Accept-Encoding: gzip, deflate |
6 | Connection: keep-alive | Connection: keep-alive |
7 | Cookie: _ga=GA1.1.1739912853.1548032065; gr_reco=1688e9dab95-b5d644815084ad40; utag_main=v_id:01688e9dad18001ff21f0bc7b4a800052009100f0093c$_sn:7$_ss:1$_st:1594889778490$_pn:1%3Bexp-session$ses_id:1594887978490%3Bexp-session; uuid=092deff09bc14cec8f32915260380536; cto_lwid=3ff13ed0-ccf0-46a0-b3d7-8a2e1cadcb3f; cto_idcpy=752176f4-9133-4eb6-a23d-8aacb880eaee; __gads=ID=258eba7044613b5a-22a30c3f84c20074:T=1594886702:S=ALNI_MYCzBeDMjB7doBDh_F5BZztWvpLvw; chat_uuid=407415995.1568209940; cto_bundle=aJ5M6V9JZCUyRjlLVVRRMG5ucUg0RUMlMkJtYk53UXpDJTJGY0Z3b09EU1RTZHpyY0xER1o0ektySUZZdU9iQ3d5UWJsOXFqUFZ4UDlZaGF4WmR4eThta2Q2S2o1NTVFd2Ruc1lBcVNJVW1IQ1ZlN1hYWTZ1VGpCZXJIZjRnRnhnZ0lEc3VuTkg3dCUyQmtTc0ZkajQ2WnJ6JTJGR0w5T3ElMkJEa1NlbjNCZlZzc2k0eVJYVU5oOEtPTjAlM0Q; _ga_SDQJRXDGN4=GS1.1.1596669834.9.0.1596669837.0; __atuvc=0%7C28%2C1%7C29%2C0%7C30%2C0%7C31%2C1%7C32; _fbp=fb.0.1586054074356.121762323; mudahHash=ec9f3f6655e4ccc0814a616dd3ee38f75d7ce494; cto_bidid=wxF3T19jb2hMYzJ1V0dITEZHRjRxMURWMWZrSTNmaDIlMkZ3RzBxTllBczB0amZMMWFOTDg2JTJGdU8xMTQzWDNzMm9TRzh0cmJHUEZFRTR0aUJWWDZQbHA1ZFhUd042cDMlMkIxdkdpOEt6S3ZqWlZ6YUNySSUzRA; _gcl_au=1.1.919768330.1594886698; gr_session=17356a7b66b-a9389a87a3e2a370 | |
8 | Upgrade-Insecure-Requests: 1 | Upgrade-Insecure-Requests: 1 |
Safari Browser
No | Safari - Normal Browser Request Headers | Safari - Selenium Request Headers |
---|---|---|
1 | Host: localhost | Host: localhost |
2 | Upgrade-Insecure-Requests: 1 | Upgrade-Insecure-Requests: 1 |
3 | Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 | Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 |
4 | User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Safari/605.1.15 | User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Safari/605.1.15 |
5 | Accept-Language: en-us | Accept-Language: en-us |
6 | Accept-Encoding: gzip, deflate | Accept-Encoding: gzip, deflate |
7 | Connection: keep-alive | Connection: keep-alive |
Conclusion
Chrome shows the most differences, followed by Firefox. Safari shows no difference between the request headers sent under Selenium and those sent by the normal browser.
Does this mean it is much easier to get past anti-bot checks using Safari?
Browser | Chrome | Firefox | Safari |
---|---|---|---|
Differences | 3 | 1 | 0 |
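Counts like these can be produced by diffing two captured header sets instead of comparing rows by eye. The following is a minimal sketch, assuming each capture is available as a plain dictionary; `diff_headers` is a hypothetical helper, and the sample data is only a small subset of the Chrome rows above, so its count is not the full-table count.

```python
def diff_headers(normal, selenium):
    """Return {name: (normal_value, selenium_value)} for headers that differ.

    Header names are compared case-insensitively. A header that is missing
    on one side is reported with None for that side.
    """
    normal_l = {k.lower(): v for k, v in normal.items()}
    selenium_l = {k.lower(): v for k, v in selenium.items()}
    differences = {}
    for name in sorted(normal_l.keys() | selenium_l.keys()):
        a, b = normal_l.get(name), selenium_l.get(name)
        if a != b:
            differences[name] = (a, b)
    return differences


# Subset of the Chrome rows, for illustration only.
chrome_normal = {
    "Connection": "keep-alive",
    "Cache-Control": "max-age=0",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "en-US,en;q=0.9,id;q=0.8,ms;q=0.7,fr;q=0.6",
}
chrome_selenium = {
    "Connection": "keep-alive",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "en-GB,en-US;q=0.9,en;q=0.8",
}

if __name__ == "__main__":
    for name, (a, b) in diff_headers(chrome_normal, chrome_selenium).items():
        print(f"{name}: normal={a!r} selenium={b!r}")
```

Note that some differences reflect the browser profile rather than automation itself: a fresh Selenium profile has no cookies and may use a different language preference, so a difference in `Cookie` or `Accept-Language` does not by itself prove that a site is detecting Selenium.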