[Web/Crawling] BeautifulSoup

BeautifulSoup

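BeautifulSoup is a Python library that parses HTML, organizing the document into a structure from which the data you want is easy to pull out. It is used for static crawling, that is, for pages whose content is already present in the HTML the server returns.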
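
To make "parsing" concrete, here is a minimal sketch that parses a small HTML string held in memory. The HTML content, the id "title", and the class "item" are made up for illustration and do not come from any real page.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML used only for this illustration
html = """
<html><body>
  <h1 id="title">Daily report</h1>
  <ul>
    <li class="item">apple</li>
    <li class="item">banana</li>
  </ul>
</body></html>
"""

# Build a searchable tree using Python's built-in HTML parser
soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching tag; select() returns every CSS match
heading = soup.find("h1", id="title").get_text(strip=True)
items = [li.get_text(strip=True) for li in soup.select("li.item")]

print(heading)
print(items)
```

Once the string is parsed, tags can be looked up by name, attribute, or CSS selector instead of by searching the raw text.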

Installation

pip install requests
pip install beautifulsoup4

Run the commands above to install both packages.

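What is the requests library?
requests is an HTTP library for Python. It sends an HTTP request to a website and returns the response, including the HTML document, as plain text. BeautifulSoup then turns that text into a tree of objects that can be searched and navigated.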
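
The division of labor between the two libraries can be sketched as two small functions. The names fetch_html and to_soup and the 10-second timeout are assumptions chosen for this illustration, not part of either library.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url, timeout=10.0):
    """Send a GET request and return the response body as text.

    fetch_html is a hypothetical helper; the timeout value is an assumption.
    """
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise an exception on 4xx/5xx responses
    return response.text


def to_soup(html):
    """Parse raw HTML text into a BeautifulSoup tree (hypothetical helper)."""
    return BeautifulSoup(html, "html.parser")


# Parsing needs no network access, so it can be tried on a literal string
soup = to_soup("<html><head><title>Hello</title></head></html>")
print(soup.title.get_text())
```

For a real page, the two steps chain together as to_soup(fetch_html(url)). Keeping them separate makes it possible to test the parsing logic on saved HTML without sending a request each time.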

Usage example

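from bs4 import BeautifulSoup as BS
import requests as req

# Request the page and parse the returned HTML
url = "https://finance.naver.com/sise/lastsearch2.naver"
res = req.get(url, timeout=10)
res.raise_for_status()  # stop early on an HTTP error status
soup = BS(res.text, "html.parser")

# Walk every row of the table with class "type_5"
for tr in soup.select("table.type_5 tr"):
    # Rows without a stock link (headers, spacers) are skipped.
    # The link class is spelled "tltle" in the selector.
    link = tr.select_one("a.tltle")
    if link is None:
        continue
    title = link.get_text(strip=True)
    # nth-child(n) picks the cell that is the n-th child of its row
    price = tr.select("td.number:nth-child(4)")[0].get_text(strip=True)
    change = tr.select("td.number:nth-child(6)")[0].get_text(strip=True)

    print(title, price, change)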
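
The same row-by-row extraction can be tried offline against a saved or hand-written table, which is useful when the live page changes or is unreachable. In the sketch below, the sample HTML, the company names, the numbers, and the function name parse_rows are all made up for illustration; only the selectors are taken from the loop above.

```python
from bs4 import BeautifulSoup

# Hypothetical table that mimics the layout the selectors expect
SAMPLE_HTML = """
<table class="type_5">
  <tr><th>No</th><th>Name</th><th>Ratio</th><th>Price</th><th>Diff</th><th>Change</th></tr>
  <tr>
    <td class="no">1</td>
    <td><a class="tltle" href="#">Alpha Corp</a></td>
    <td class="number">1.23%</td>
    <td class="number">10,000</td>
    <td class="number">100</td>
    <td class="number">+1.01%</td>
  </tr>
  <tr><td colspan="6"></td></tr>
  <tr>
    <td class="no">2</td>
    <td><a class="tltle" href="#">Beta Inc</a></td>
    <td class="number">0.98%</td>
    <td class="number">25,500</td>
    <td class="number">300</td>
    <td class="number">-1.16%</td>
  </tr>
</table>
"""


def parse_rows(html):
    """Return one dict per table row that contains a stock link."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.select("table.type_5 tr"):
        link = tr.select_one("a.tltle")
        if link is None:
            continue  # header or spacer row
        # select_one returns None instead of raising when a cell is missing
        price = tr.select_one("td.number:nth-child(4)")
        change = tr.select_one("td.number:nth-child(6)")
        rows.append({
            "title": link.get_text(strip=True),
            "price": price.get_text(strip=True) if price else None,
            "change": change.get_text(strip=True) if change else None,
        })
    return rows


rows = parse_rows(SAMPLE_HTML)
for row in rows:
    print(row["title"], row["price"], row["change"])
```

Returning a list of dicts rather than printing inside the loop keeps the scraped values available for later steps such as saving to a file.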

Reference

https://hi-guten-tag.tistory.com/5
