Can a website detect when you are using Selenium with chromedriver?

Background:

I've been testing Selenium with Chromedriver, and I noticed that some pages can detect that you're using Selenium even when nothing is being automated. Even when I browse manually, using Chrome launched through Selenium and Xephyr, I often get a page saying that suspicious activity was detected. I've checked my user agent and my browser fingerprint, and both are identical to those of the normal Chrome browser.

When I browse to these sites in normal Chrome everything works fine, but the moment I use Selenium I'm detected.

In theory, chromedriver and Chrome should look literally exactly the same to any web server, but somehow they can detect it.

If you want some test code, try this:

from pyvirtualdisplay import Display
from selenium import webdriver

# Start a visible virtual display to host the browser window.
display = Display(visible=1, size=(1600, 902))
display.start()

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--disable-extensions')
chrome_options.add_argument('--profile-directory=Default')
chrome_options.add_argument('--incognito')
chrome_options.add_argument('--disable-plugins-discovery')
chrome_options.add_argument('--start-maximized')

# Current Selenium releases take `options=`; `chrome_options=` is the older spelling.
driver = webdriver.Chrome(options=chrome_options)
driver.delete_all_cookies()
driver.set_window_size(800, 800)
driver.set_window_position(0, 0)
print('arguments done')
driver.get('http://stubhub.com')

If you browse around stubhub you'll get redirected and 'blocked' within one or two requests. I've been investigating this and I can't figure out how they can tell that a user is using Selenium.

How do they do it?

I installed the Selenium IDE plugin in Firefox, and I got banned when I went to stubhub.com in the normal Firefox browser with only that additional plugin.

When I use Fiddler to view the HTTP requests being sent back and forth, I notice that the 'fake browser's' requests often come back with 'no-cache' in the response header.

Results such as "Is there a way to detect that I'm in a Selenium Webdriver page from JavaScript?" suggest that there should be no way to detect when you are using a webdriver. But this evidence suggests otherwise.

The site uploads a fingerprint to their servers, but I checked and the fingerprint of Selenium is identical to the fingerprint when using Chrome.

This is one of the fingerprint payloads that they send to their servers:

{"appName":"Netscape","platform":"Linuxx86_64","cookies":1,"syslang":"en-US","userlang":"en-US","cpu":"","productSub":"20030107","setTimeout":1,"setInterval":1,
"plugins":{"0":"ChromePDFViewer","1":"ShockwaveFlash","2":"WidevineContentDecryptionModule","3":"NativeClient","4":"ChromePDFViewer"},
"mimeTypes":{"0":"application/pdf","1":"ShockwaveFlashapplication/x-shockwave-flash","2":"FutureSplashPlayerapplication/futuresplash","3":"WidevineContentDecryptionModuleapplication/x-ppapi-widevine-cdm","4":"NativeClientExecutableapplication/x-nacl","5":"PortableNativeClientExecutableapplication/x-pnacl","6":"PortableDocumentFormatapplication/x-google-chrome-pdf"},
"screen":{"width":1600,"height":900,"colorDepth":24},
"fonts":{"0":"monospace","1":"DejaVuSerif","2":"Georgia","3":"DejaVuSans","4":"TrebuchetMS","5":"Verdana","6":"AndaleMono","7":"DejaVuSansMono","8":"LiberationMono","9":"NimbusMonoL","10":"CourierNew","11":"Courier"}}

It's identical in Selenium and in Chrome.
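One way to check that claim rather than eyeballing it is to diff the two captured payloads key by key. The sketch below is only an illustration: the function name `fingerprint_diff` and the two trimmed sample payloads are hypothetical, and it compares top-level keys only.

```python
import json

def fingerprint_diff(payload_a: str, payload_b: str) -> dict:
    """Return {key: (value_a, value_b)} for top-level keys whose values differ."""
    a, b = json.loads(payload_a), json.loads(payload_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }

# Trimmed, illustrative payloads (not the full fingerprint shown earlier).
chrome_payload = '{"appName":"Netscape","platform":"Linuxx86_64","screen":{"width":1600,"height":900}}'
selenium_payload = '{"appName":"Netscape","platform":"Linuxx86_64","screen":{"width":1600,"height":900}}'

print(fingerprint_diff(chrome_payload, selenium_payload))  # {} when identical
```

An empty result only means the uploaded payload matches; it says nothing about checks the page's JavaScript may run without uploading them in this payload.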

VPNs work for a single use, but they get detected after I load the first page. Clearly some JavaScript code is being run to detect Selenium.

Solution:

Replacing cdc_ string

You can use Vim or Perl to replace the cdc_ string in chromedriver. See the answer by @Erti-Chris Eelmaa to learn more about that string and how it's a detection point.

Using Vim or Perl prevents you from having to recompile source code or use a hex editor.

Make sure to make a copy of the original chromedriver before attempting to edit it.

Our goal is to alter the cdc_ string, which looks something like $cdc_lasutopfhvcZLmcfl.
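Before editing, it can help to see which such strings a given binary actually contains. The following is a rough sketch, not part of the original answer: the regular expression is an assumption about the string's shape (an optional `$`, the `cdc_` prefix, then letters, digits, or underscores), and `list_markers` is a hypothetical helper name.

```python
import re

# Assumed shape: optional "$", the "cdc_" prefix, then letters, digits or underscores.
MARKER_RE = re.compile(rb"\$?cdc_[A-Za-z0-9_]+")

def list_markers(data: bytes) -> list:
    """Return the distinct marker strings found in `data`, sorted."""
    return sorted({m.group().decode("ascii") for m in MARKER_RE.finditer(data)})

# Stand-in bytes for a driver binary; the marker is the example quoted above.
sample = b"\x00\x01$cdc_lasutopfhvcZLmcfl\x00junk\x00$cdc_lasutopfhvcZLmcfl\x00"
print(list_markers(sample))  # ['$cdc_lasutopfhvcZLmcfl']
```

To inspect a real driver, pass the file's bytes instead of `sample`, for example the result of `open('/path/to/chromedriver', 'rb').read()`.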

The methods below were tested on chromedriver version 2.41.578706.


Using Vim

vim -b /path/to/chromedriver

After running the line above, you'll probably see a bunch of gibberish. Do the following:

  1. Replace all instances of cdc_ with dog_ by typing :%s/cdc_/dog_/g.
    • dog_ is just an example. You can choose anything as long as it has the same number of characters as the search string (e.g., cdc_); otherwise the chromedriver will fail.
  2. To save the changes and quit, type :wq! and press return.
    • If you need to quit without saving changes, type :q! and press return.

The -b option tells vim upfront to open the file as a binary, so it won't mess with things like (missing) line endings (especially at the end of the file).


Using Perl

The line below replaces all cdc_ occurrences with dog_. Credit to Vic Seedoubleyew:

perl -pi -e 's/cdc_/dog_/g' /path/to/chromedriver

Make sure that the replacement string (e.g., dog_) has the same number of characters as the search string (e.g., cdc_), otherwise the chromedriver will fail.
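If neither Vim nor Perl is at hand, the same byte-for-byte substitution can be sketched in Python. This is an assumed equivalent rather than something from the original answer: `replace_marker` and `patch_driver` are hypothetical names, the `.bak` backup suffix is an arbitrary choice, and the length check simply enforces the same-length rule stated above.

```python
import shutil
from pathlib import Path

def replace_marker(data: bytes, old: bytes = b"cdc_", new: bytes = b"dog_") -> bytes:
    """Replace every occurrence of `old` with `new`, refusing to change the length."""
    if len(old) != len(new):
        raise ValueError("replacement must be the same length as the search string")
    return data.replace(old, new)

def patch_driver(path: str, old: bytes = b"cdc_", new: bytes = b"dog_") -> int:
    """Back up the file, patch it in place, and return the number of replacements."""
    target = Path(path)
    data = target.read_bytes()
    patched = replace_marker(data, old, new)  # raises before anything is written
    # Keep a copy of the original next to it (hypothetical ".bak" suffix).
    shutil.copy2(target, target.with_name(target.name + ".bak"))
    target.write_bytes(patched)
    return data.count(old)

# Usage (placeholder path, as in the commands above):
# patch_driver("/path/to/chromedriver")
```

Reading the whole file into memory keeps the sketch short; the driver binary is small enough for that to be reasonable.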


Wrapping Up

To verify that all occurrences of cdc_ were replaced:

grep "cdc_" /path/to/chromedriver

If no output was returned, the replacement was successful.
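The same check can be scripted, which is handy if the patch is part of a setup step. A minimal sketch, assuming the hypothetical helper names `count_marker` and `is_clean`:

```python
from pathlib import Path

def count_marker(data: bytes, marker: bytes = b"cdc_") -> int:
    """Return how many times `marker` appears in `data`."""
    return data.count(marker)

def is_clean(path: str, marker: bytes = b"cdc_") -> bool:
    """True if the file at `path` no longer contains `marker`."""
    return count_marker(Path(path).read_bytes(), marker) == 0

# Stand-in bytes for a patched and an unpatched binary.
print(count_marker(b"\x00$dog_lasutopfhvcZLmcfl\x00"))  # 0
print(count_marker(b"\x00$cdc_lasutopfhvcZLmcfl\x00"))  # 1
```

Calling `is_clean` with the driver's path plays the role of the grep command: `True` corresponds to grep printing nothing.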

Go to the altered chromedriver and double-click on it. A terminal window should open up. If you don't see killed in the output, you've successfully altered the driver.

Make sure that the name of the altered chromedriver binary is chromedriver, and that the original binary is either moved from its original location or renamed.


My Experience With This Method

I was previously being detected on a website while trying to log in, but after replacing cdc_ with an equal-sized string, I was able to log in. As others have said, though, if you've already been detected, you might get blocked for a plethora of other reasons even after using this method. So you may have to try accessing the site that was detecting you using a VPN, a different network, etc.
