Simulate Login
March 13, 2019
1. A session is a series of actions during which state is not lost.
2. Cookies store the credentials that are used to identify a user.
3. By analogy with a person using a browser: a session lasts from opening the browser to closing it, and cookies are what remember the login state.
4. Sessions and cookies differ in how long they last and where they live: cookies are stored locally on the client, sessions are stored on the server, and cookies are the means by which a session is maintained.
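The relationship in the list above can be sketched in a few lines of plain Python. This is only a toy model, not how any real site is implemented: the `Server` class, its `login` and `whoami` methods, and the `session_id` cookie name are all made up for illustration. The point it shows is that the session state lives on the server, while the client only holds a cookie that points at it.

```python
import secrets


class Server:
    """Toy server (hypothetical): session state is kept on the server side."""

    def __init__(self):
        self.sessions = {}  # session_id -> state

    def login(self, username):
        # Create server-side state and hand the client an identifier for it.
        session_id = secrets.token_hex(8)
        self.sessions[session_id] = {"user": username}
        return {"session_id": session_id}  # delivered to the client as a cookie

    def whoami(self, cookies):
        # Look up the session that the client's cookie refers to, if any.
        state = self.sessions.get(cookies.get("session_id"))
        return state["user"] if state else None


server = Server()
browser_cookies = {}                            # stored locally, on the client
browser_cookies.update(server.login("alice"))   # the "browser" keeps the cookie

print(server.whoami(browser_cookies))  # alice
print(server.whoami({}))               # None (no cookie, so not logged in)
```

Sending a request with cookies copied from a logged-in browser works for the same reason: the server sees a cookie it already knows and treats the request as part of that login.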
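Take Douban as an example: without logging in you can only view the first 200 comments, and you can view more only after logging in.
Here is one of the simplest ways to log in using cookies:
Log in to Douban, press F12 to open the developer tools, copy the request as cURL, and paste it into a cURL-to-Python conversion website.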
import requests

# Cookies copied from a logged-in browser session. They belong to that login
# and expire, so substitute the values from your own session.
cookies = {
    'll': '108296',
    'bid': 'oGqwhS0dpJk',
    '__utmc': '30149280',
    '__utmz': '30149280.1551882083.1.1.utmcsr=baidu|utmccn=(organic)|utmcmd=organic',
    'ap_v': '0,6.0',
    '__yadk_uid': 'W0kMrgh9ZlpqPNnT5FL0kfgxuyD6kXMZ',
    'ct': 'y',
    'push_noty_num': '0',
    'push_doumail_num': '0',
    '__utmv': '30149280.19294',
    '_vwo_uuid_v2': 'D33A2E1A29CC9FCA6BE884FB805B368F0|ad06a8cafd8eb39e2599bc4b2f56acc2',
    '_pk_ref.100001.4cf6': '%5B%22%22%2C%22%22%2C1551886463%2C%22https%3A%2F%2Fwww.baidu.com%2Flink%3Furl%3DwfLH0GK-RPusGtUtWxQNSYPrD6JxOH86WqM9u4baCxH1DjNZaqclMIA2JNf2Od9Q%26wd%3D%26eqid%3Dae4bc6990000c8cf000000045c7fd762%22%5D',
    '_pk_ses.100001.4cf6': '*',
    '__utma': '30149280.1469262366.1551882083.1551882083.1551886511.2',
    '__utmb': '223695111.0.10.1551886511',
    'dbcl2': '192945996:dp12fKwI4l0',
    'ck': 'LfPJ',
    '__utmt': '1',
    '_pk_id.100001.4cf6': '342d14aef8265ed7.1551882083.2.1551887369.1551884165.',
}

# Request headers copied from the browser, so the request looks like the original one.
headers = {
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36',
    'Accept': '*/*',
    'Referer': 'https://movie.douban.com/subject/27041389/comments?start=240&limit=20&sort=new_score&status=P',
    'X-Requested-With': 'XMLHttpRequest',
    'Connection': 'keep-alive',
}
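
# Query-string parameters. start=240 asks for comments beyond the first 200,
# which are only available when logged in.
params = (
    ('start', '240'),
    ('limit', '20'),
    ('sort', 'new_score'),
    ('status', 'P'),
    # ('comments_only', '1'),
)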

# Send the request with the copied cookies attached.
response = requests.get(
    'https://movie.douban.com/subject/27041389/comments',
    headers=headers,
    params=params,
    cookies=cookies,
)
print(response.text)
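
If you would rather not go through a conversion site, the developer tools also show the request's `Cookie` header as a single string, which can be turned into the same kind of dict by hand. The helper below is a minimal sketch; `parse_cookie_header` is a name made up for this example, and it assumes the usual `name=value; name=value` layout of that header. The sample string reuses three of the cookie values from the code above.

```python
def parse_cookie_header(header):
    """Turn a raw 'Cookie' header string into a dict (hypothetical helper)."""
    cookies = {}
    for pair in header.split(";"):
        # Split on the first '=' only, because values may contain '=' themselves.
        name, sep, value = pair.strip().partition("=")
        if sep and name:
            cookies[name] = value
    return cookies


raw = "ll=108296; bid=oGqwhS0dpJk; ck=LfPJ"
print(parse_cookie_header(raw))
# {'ll': '108296', 'bid': 'oGqwhS0dpJk', 'ck': 'LfPJ'}
```

The resulting dict can be passed as the `cookies=` argument of `requests.get`, exactly like the hand-written `cookies` dict above.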