Simulate Login
March 13, 2019
1. A session is a series of actions during which state is not lost.
2. Cookies store the credentials used to identify a user.
3. By analogy with a person using a browser: a session lasts from opening the browser to closing it, and cookies are what remember the login state.
4. Sessions and cookies differ in how long they last and where they live: session data is stored on the server, while cookies are stored locally on the client. Cookies are the means by which a session is maintained across requests.
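To make the relationship between the two concrete, here is a minimal sketch using `requests.Session`. The cookie value is a placeholder (an assumption, not a real credential), and the requests are only prepared, not sent, so the outgoing `Cookie` header can be inspected without touching the network.

```python
import requests

BASE_URL = 'https://movie.douban.com/subject/27041389/comments'

# A Session keeps one cookie jar for its whole lifetime, much like a
# browser window that stays open.
session = requests.Session()

# Placeholder credential: a real value would be copied from a logged-in browser.
session.cookies.update({'dbcl2': 'PLACEHOLDER'})


def prepare_page(start):
    """Build (but do not send) the request for one page of comments."""
    request = requests.Request('GET', BASE_URL, params={'start': str(start)})
    return session.prepare_request(request)


first = prepare_page(0)
second = prepare_page(20)

# Both requests carry the same cookie because they share the session's jar.
print(first.headers.get('Cookie'))
print(second.headers.get('Cookie'))
```

Calling `session.send(first)` would actually issue the request; any `Set-Cookie` headers in the response would then be stored in the same jar and sent with later requests.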
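Take Douban as an example: without logging in you can only view the first 200 entries; after logging in you can view more.
Here is one of the simplest ways to log in, using cookies:
Log in to Douban, press F12 to open the developer tools, copy the request as cURL, and paste it into a cURL-to-Python converter site. The result looks like this: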
import requests

cookies = {
    'll': '108296',
    'bid': 'oGqwhS0dpJk',
    '__utmc': '30149280',
    '__utmz': '30149280.1551882083.1.1.utmcsr=baidu|utmccn=(organic)|utmcmd=organic',
    'ap_v': '0,6.0',
    '__yadk_uid': 'W0kMrgh9ZlpqPNnT5FL0kfgxuyD6kXMZ',
    'ct': 'y',
    'push_noty_num': '0',
    'push_doumail_num': '0',
    '__utmv': '30149280.19294',
    '_vwo_uuid_v2': 'D33A2E1A29CC9FCA6BE884FB805B368F0|ad06a8cafd8eb39e2599bc4b2f56acc2',
    '_pk_ref.100001.4cf6': '%5B%22%22%2C%22%22%2C1551886463%2C%22https%3A%2F%2Fwww.baidu.com%2Flink%3Furl%3DwfLH0GK-RPusGtUtWxQNSYPrD6JxOH86WqM9u4baCxH1DjNZaqclMIA2JNf2Od9Q%26wd%3D%26eqid%3Dae4bc6990000c8cf000000045c7fd762%22%5D',
    '_pk_ses.100001.4cf6': '*',
    '__utma': '30149280.1469262366.1551882083.1551882083.1551886511.2',
    '__utmb': '223695111.0.10.1551886511',
    'dbcl2': '192945996:dp12fKwI4l0',
    'ck': 'LfPJ',
    '__utmt': '1',
    '_pk_id.100001.4cf6': '342d14aef8265ed7.1551882083.2.1551887369.1551884165.',
}

headers = {
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36',
    'Accept': '*/*',
    'Referer': 'https://movie.douban.com/subject/27041389/comments?start=240&limit=20&sort=new_score&status=P',
    'X-Requested-With': 'XMLHttpRequest',
    'Connection': 'keep-alive',
}

params = (
    ('start', '240'),
    ('limit', '20'),
    ('sort', 'new_score'),
    ('status', 'P'),
    #('comments_only', '1'),
)

response = requests.get('https://movie.douban.com/subject/27041389/comments', headers=headers, params=params, cookies=cookies)
print(response.text)
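If no converter is at hand, the same cookies dictionary can be built from the raw `Cookie` header shown in the developer tools. The sketch below is one possible way to do that; the helper names `parse_cookie_header`, `page_params`, and `build_page_request` are hypothetical, and the cookie values are placeholders. The request is prepared but not sent, so the result can be checked offline.

```python
import requests

COMMENTS_URL = 'https://movie.douban.com/subject/27041389/comments'


def parse_cookie_header(raw):
    """Turn 'a=1; b=2' (as copied from the browser) into {'a': '1', 'b': '2'}."""
    cookies = {}
    for pair in raw.split(';'):
        # Split on the first '=' only, since cookie values may contain '='.
        name, sep, value = pair.strip().partition('=')
        if sep and name:
            cookies[name] = value
    return cookies


def page_params(start, limit=20):
    """Query parameters for one page of comments, mirroring the request above."""
    return {'start': str(start), 'limit': str(limit),
            'sort': 'new_score', 'status': 'P'}


def build_page_request(raw_cookie_header, start):
    """Prepare (but do not send) the request for the page beginning at `start`."""
    request = requests.Request(
        'GET',
        COMMENTS_URL,
        params=page_params(start),
        cookies=parse_cookie_header(raw_cookie_header),
    )
    return request.prepare()


# Placeholder values; real ones come from a logged-in browser session.
prepared = build_page_request('ck=PLACEHOLDER; dbcl2=PLACEHOLDER', start=240)
print(prepared.url)
print(prepared.headers['Cookie'])
```

To fetch the page for real, the same pieces can be passed straight to `requests.get(COMMENTS_URL, params=page_params(240), cookies=parse_cookie_header(raw))`, along with the headers from the example above.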