How to Batch-Query Baidu Index
How do you query Baidu Index for many keywords at once? If you know Python, Selenium can handle it.
Preparation:
Install the selenium and pyquery libraries.
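Both libraries are normally installed with pip (pip install selenium pyquery). As a minimal sketch, the check below reports whether they can be imported in the current environment. The helper name missing_packages is hypothetical and only for illustration.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(["selenium", "pyquery"])
    if missing:
        print("Missing packages; install with: pip install " + " ".join(missing))
    else:
        print("selenium and pyquery are both importable.")
```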
Code example:
import random
import time
import urllib.parse

from pyquery import PyQuery as pq
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

word_indexs = []
options = webdriver.ChromeOptions()
options.add_argument('--headless')  # run without a browser window; the author reports errors otherwise
options.add_argument('user-agent=Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36')
options.add_argument(r"--user-data-dir=C:\Users\hp\AppData\Local\Google\Chrome\User Data")  # reuse the cookies saved after logging in
browser = webdriver.Chrome(options=options)  # Selenium 4 keyword; older releases used chrome_options=
wait = WebDriverWait(browser, 5)

with open('keywords.txt', encoding='utf-8-sig') as keyword_file:
    for kw in keyword_file:
        kw = kw.rstrip()
        word = urllib.parse.quote(kw)
        newurl = 'http://index.baidu.com/v2/main/index.html#/trend/{}?words={}'.format(word, word)
        browser.get(newurl)
        time.sleep(random.uniform(0.5, 1.5))  # short random pause between requests
        try:
            # wait until the results table is visible
            wait.until(EC.visibility_of_element_located((By.TAG_NAME, 'tbody')))
            html = browser.page_source
            doc = pq(html)
            indexs = doc('.veui-table .veui-table-column-right').text().split()
            # assumption: the first value is the overall index, the second the mobile index
            total_index = indexs[0]
            mobile_index = indexs[1]
        except Exception:
            # timeout or missing data: record zeros for this keyword
            total_index = 0
            mobile_index = 0
        index = '{}\t{}\t{}'.format(kw, total_index, mobile_index)
        word_indexs.append(index + '\n')
        print(index)

browser.quit()

# write one tab-separated line per keyword to the results file
with open('百度指数查询结果.txt', 'w', encoding='utf-8') as f:
    f.writelines(word_indexs)
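The two pure-Python steps in the script, building the trend URL and splitting the scraped cell text, can be tried without a browser. The sketch below is illustrative only: the helper names build_trend_url and parse_indexes are hypothetical, and the assumption that the first two values are the overall and mobile index reflects the variable names in the script rather than anything documented by Baidu.

```python
from urllib.parse import quote

# URL template taken from the script; the keyword appears twice.
TREND_URL = 'http://index.baidu.com/v2/main/index.html#/trend/{}?words={}'


def build_trend_url(keyword):
    """Percent-encode the keyword and place it in both slots of the template."""
    word = quote(keyword)
    return TREND_URL.format(word, word)


def parse_indexes(cell_text):
    """Split whitespace-separated cell text into (total, mobile).

    Assumption: the first value is the overall index and the second the
    mobile index. Returns (0, 0) when fewer than two values are present,
    mirroring the script's fallback.
    """
    parts = cell_text.split()
    if len(parts) < 2:
        return 0, 0
    return parts[0], parts[1]


if __name__ == "__main__":
    print(build_trend_url('百度'))
    print(parse_indexes('1,234 567'))
```

Keeping these steps in small functions makes it possible to check the URL encoding and the fallback behaviour separately from the Selenium session.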
Parameter notes:
--user-data-dir: change this to the path of the Chrome User Data folder on your computer
keywords.txt: the file that holds the keywords, one per line
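As a small illustration of the keywords.txt format, the sketch below reads one keyword per line with the utf-8-sig codec, which drops the byte-order mark that some Windows editors put at the start of a file. The function name load_keywords is hypothetical. Unlike the script, this version also skips blank lines, which is an added assumption about how empty lines should be handled.

```python
def load_keywords(path):
    """Return the non-empty lines of `path`, stripped of surrounding whitespace.

    The 'utf-8-sig' codec removes a leading BOM if one is present and
    otherwise behaves like plain UTF-8.
    """
    with open(path, encoding='utf-8-sig') as f:
        return [line.strip() for line in f if line.strip()]


if __name__ == "__main__":
    # Assumes keywords.txt exists in the current directory.
    for keyword in load_keywords('keywords.txt'):
        print(keyword)
```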