Learning Python Web Scraping: Customizing Request Headers
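Request headers provide information about the request, the response, or other sending entities.
Below, following a tutorial from a book, we look up the correct request headers.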
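Open Chrome's "Inspect" command and click the Network tab. In the resource list on the left, find the page you want to request and click it; the Headers panel then shows the details under Request Headers.
If the request does not show up, refresh the page. The useful fields are the ones highlighted in yellow.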
Request URL: http://www.santostang.com/
Request Method: GET
Status Code: 200 OK
Remote Address: 118.25.212.192:80
Referrer Policy: strict-origin-when-cross-origin

Response Headers
Cache-Control: no-store, no-cache, must-revalidate
Connection: keep-alive
Content-Encoding: gzip
Content-Type: text/html; charset=UTF-8
Date: Mon, 06 Dec 2021 11:35:17 GMT
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Pragma: no-cache
Server: nginx
Transfer-Encoding: chunked
Vary: Accept-Encoding

Request Headers
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Accept-Encoding: gzip, deflate
Accept-Language: zh-CN,zh;q=0.9
Cache-Control: max-age=0
Connection: keep-alive
Cookie: trc_cookie_storage=taboola%2520global%253Auser-id%3Dbd60449e-17eb-4a23-8440-d31ae8024b66-tuct8a6187d; PHPSESSID=038ojd9o1c1fi0egnva0t29oc7; Hm_lvt_752e310cec7906ba7afeb24cd7114c48=1638699739,1638787830; Hm_lpvt_752e310cec7906ba7afeb24cd7114c48=1638790509
Host: www.santostang.com
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.54 Safari/537.36
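Extract the important parts of the request headers; the code can then be modified to match the version in the first post of this series.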
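The code from the first post is not reproduced here, so the following is only a minimal sketch of what passing custom headers might look like. It assumes that User-Agent and Host are the parts worth keeping (the original highlighting is not available in this text), and it uses the standard library's urllib.request rather than whatever client the first post used. The names HEADERS, build_request, and fetch are hypothetical helpers introduced for illustration.

```python
import urllib.request

# Assumption: these two fields are treated as the "important parts" of the
# request headers captured in Chrome. Accept-Encoding is left out on purpose,
# because urllib does not decompress gzip responses automatically.
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/94.0.4606.54 Safari/537.36"
    ),
    "Host": "www.santostang.com",
}


def build_request(url):
    """Return a GET request object carrying the custom headers (no network I/O)."""
    return urllib.request.Request(url, headers=HEADERS)


def fetch(url, timeout=10):
    """Send the request and return the response body decoded as text."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        # Fall back to UTF-8 if the server does not declare a charset.
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

Calling fetch("http://www.santostang.com/") would send the request with these headers and return the page HTML. If the requests library is used instead, the same dictionary can be passed as requests.get(url, headers=HEADERS).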