Prologue

When in doubt, just reach for Selenium!

The upside is that it works on almost any site; the downside is that it is slow, very slow.

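In the end I solved the problem by analyzing the site's API instead. Many endpoints simply follow a predictable pattern, so look at the requests carefully.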
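As a rough illustration of "endpoints follow a pattern": once the browser's network panel shows the request behind a page, the URLs can often be generated directly without a browser. The host, path, and parameter names below are hypothetical placeholders, not taken from any real site.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; replace it with the one seen in the network panel.
BASE_URL = "https://example.com/api/search"


def build_search_urls(keyword: str, pages: int, page_size: int = 20) -> list[str]:
    """Generate one URL per result page.

    Assumes the API paginates with `page` and `size` query parameters,
    which is an assumption made for this sketch.
    """
    urls = []
    for page in range(1, pages + 1):
        query = urlencode({"q": keyword, "page": page, "size": page_size})
        urls.append(f"{BASE_URL}?{query}")
    return urls


if __name__ == "__main__":
    for url in build_search_urls("selenium", pages=3):
        print(url)
```

Each generated URL can then be fetched with a plain HTTP client, which is usually far faster than driving a browser.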

Preparation

  • Install selenium
  • Download the ChromeDriver that matches your Chrome version
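A mismatched driver is a common reason a session fails to start, so a quick check can help. The sketch below compares major version numbers taken from version strings such as those printed by `chromedriver --version`; the helper names and the sample strings are made up for illustration.

```python
import re


def major_version(version_text: str) -> int:
    """Return the major number of the first four-part version in the text."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_text)
    if match is None:
        raise ValueError(f"no version number found in {version_text!r}")
    return int(match.group(1))


def driver_matches_browser(driver_text: str, browser_text: str) -> bool:
    """True when driver and browser share the same major version."""
    return major_version(driver_text) == major_version(browser_text)


if __name__ == "__main__":
    # Illustrative strings, not real output captured from a machine.
    print(driver_matches_browser("ChromeDriver 114.0.5735.90",
                                 "Google Chrome 114.0.5735.110"))
```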

Initialization template

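import os
import shutil
import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

options = Options()
# warnings.simplefilter('ignore', ResourceWarning)
# Block image loading to speed pages up. 'permissions.default.stylesheet' looks like
# a Firefox preference and probably has no effect in Chrome; kept from the original.
prefs = {"profile.managed_default_content_settings.images": 2, 'permissions.default.stylesheet': 2}
options.add_experimental_option("prefs", prefs)
# Selenium 4 takes the driver path through Service; executable_path was deprecated
# in Selenium 4 and removed in later 4.x releases.
driver = webdriver.Chrome(service=Service('chromedriver.exe'), options=options)

driver.maximize_window()  # maximize the browser window
print('Browser window maximized')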
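The template never closes the browser, so a crashed script can leave Chrome and chromedriver processes behind. One possible way to guard against that is a small context manager; `managed_driver` is a hypothetical helper written for this sketch, not part of Selenium.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver with `factory()` and always call quit() afterwards."""
    driver = factory()
    try:
        yield driver
    finally:
        # quit() closes the browser and ends the driver session,
        # even when the body of the `with` block raised an exception.
        driver.quit()


# Possible usage with the template's options object:
# with managed_driver(lambda: webdriver.Chrome(options=options)) as driver:
#     driver.get("https://example.com")
```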

Explicit wait

WebDriverWait(driver, 100).until(EC.presence_of_element_located((By.XPATH, '/html/body/div[1]/app-root/section/app-search/div/div[2]/div[2]/div[1]/search-result-card/div/a')))
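Conceptually, an explicit wait polls a condition until it returns something truthy or the timeout expires. The helper below is a simplified stand-in written to show that idea, not Selenium's actual implementation; `wait_until` and its parameters are hypothetical names.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call `condition()` repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)
```

In Selenium itself, `WebDriverWait(...).until(...)` likewise returns the condition's result, so the located element can be assigned straight from the call, and it raises `TimeoutException` when the wait runs out.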

Locating by XPath

# i is the 1-based position of the result card in the list
element = driver.find_element(By.XPATH, f"/html/body/div[1]/app-root/section/app-search/div/div[2]/div[2]/div[{i}]/search-result-card/div/a")
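The `{i}` in the XPath above is what makes it usable in a loop. A minimal sketch of generating one XPath per result card follows; `card_link_xpath` is a hypothetical helper name, and the path is the one from this post.

```python
# Absolute XPath from the post, with the card position left as a placeholder.
CARD_LINK_XPATH = (
    "/html/body/div[1]/app-root/section/app-search/div/div[2]/div[2]"
    "/div[{index}]/search-result-card/div/a"
)


def card_link_xpath(index: int) -> str:
    """Return the XPath of the link inside the index-th result card (1-based)."""
    if index < 1:
        raise ValueError("XPath positions start at 1, not 0")
    return CARD_LINK_XPATH.format(index=index)


if __name__ == "__main__":
    for i in range(1, 4):
        print(card_link_xpath(i))
```

XPath positions are 1-based, so a loop written as `range(0, n)` would miss the last card and fail on the first.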

Getting an attribute

href = element.get_attribute("href")  # element is a WebElement returned by find_element
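When many links are needed, the same call can be applied to a list of elements. The sketch below assumes the elements come from something like `driver.find_elements(...)`; `collect_hrefs` is a hypothetical helper, and it skips elements whose `href` is missing, since `get_attribute` returns `None` in that case.

```python
def collect_hrefs(elements):
    """Return the href of each element, skipping elements without one."""
    hrefs = []
    for element in elements:
        href = element.get_attribute("href")  # None when the attribute is absent
        if href:
            hrefs.append(href)
    return hrefs


# Possible usage; the relative XPath here is an assumption, not from the post:
# links = collect_hrefs(driver.find_elements(By.XPATH, "//search-result-card/div/a"))
```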