[Python] BeautifulSoup4


skokieh 2021. 3. 25. 18:29


import re

import requests
from bs4 import BeautifulSoup


# Data structures: lists that will hold the titles, ratings, and reviews
movie_title = []
movie_point = []
movie_review = []

# range(1, 11) runs from 1 (inclusive) to 11 (exclusive), so 10 pages are fetched
for n in range(1, 11):
    req = requests.get('https://movie.naver.com/movie/point/af/list.nhn?page=' + str(n))
    html = req.text
    soup = BeautifulSoup(html, 'html.parser')

    # Pull the wanted data out of the DOM tree; in a selector, . (dot) means class and # means id
    titles = soup.select('.movie')
    points = soup.select('td.title > div > em')
    reviews = soup.select('td.title')

    # Append the extracted values to the lists
    for dom in titles:
        movie_title.append(dom.text)
    for dom in points:
        movie_point.append(dom.text)
    for dom in reviews:
        print(dom)
        content = dom.contents[6]                # counting whitespace text nodes from index 0, index 6 (the 7th child) holds the review text
        content = re.sub("[\n\t]", "", content)  # use a regular expression (re) to strip newline and tab characters
        content = re.sub("신고", "", content)     # drop the unneeded "신고" (report) link text
        movie_review.append(content)



# Finally, display the results
for i in range(len(movie_title)):
    print('Title: ', movie_title[i])
    print('Rating: ', movie_point[i])
    print('Review: ', movie_review[i])
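
The cleanup applied to each review can be tried offline on a plain string. This is a minimal sketch: the helper name `clean_review` and the sample text are made up for illustration, and the final `strip()` is an extra step that the script above does not perform.

```python
import re


def clean_review(raw: str) -> str:
    """Remove newlines, tabs, and the literal '신고' label from a raw review string."""
    text = re.sub(r"[\n\t]", "", raw)   # same pattern as in the script above
    text = re.sub("신고", "", text)      # drop the report-link label
    return text.strip()                 # assumption: also trim leftover spaces


# Hypothetical sample, shaped like a text node surrounded by layout whitespace
sample = "\n\t\t\tGreat movie, would watch again \n\t\t\t신고\n"
print(clean_review(sample))
```

Keeping the cleanup in one small function makes it easy to check the regular expressions without sending any requests.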




#Error
I hit an HTTPConnectionError: a typo in the request URL caused the max retries to be exceeded. After fixing the URL, the script ran again.
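
One way to make this kind of failure easier to spot is to catch the connection error around the request. The sketch below is only an illustration: `fetch_page`, its `get` parameter, and the 5-second timeout are assumptions, not part of the original script. It relies on `requests.exceptions.ConnectionError`, which `requests` raises when a connection cannot be established.

```python
import requests


def fetch_page(url, get=requests.get, timeout=5):
    """Return the response body as text, or None if the connection fails.

    `get` defaults to requests.get; it is a parameter only so the function
    can be exercised without network access (an assumption of this sketch).
    """
    try:
        response = get(url, timeout=timeout)
    except requests.exceptions.ConnectionError as exc:
        # A mistyped host name typically ends up here
        print(f"Connection failed for {url}: {exc}")
        return None
    return response.text
```

Calling `fetch_page(url)` with a mistyped host would print a short message and return `None` instead of ending the script with a traceback.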

I also learned for the first time that there is whitespace between one tag (element/node/object) and the next.
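
The whitespace between tags can be seen by listing the children of a parsed element. The HTML below is a made-up fragment, not the actual markup of the scraped page, so the indexes here differ from the `contents[6]` used in the script.

```python
from bs4 import BeautifulSoup, NavigableString

# Hypothetical fragment: each line break between tags becomes its own text node
html = """<td class="title">
<a class="movie">Sample Movie</a>
<div><em>10</em></div>
Sample review text
</td>"""

soup = BeautifulSoup(html, "html.parser")
cell = soup.select_one("td.title")

# .contents lists every direct child, including whitespace-only strings
for index, child in enumerate(cell.contents):
    kind = "text" if isinstance(child, NavigableString) else "tag"
    print(index, kind, repr(str(child)))
```

Because the text nodes count as children, the position of the review text depends on how many tags and line breaks come before it, which is why the index has to be found by inspecting the real page.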

License: Attribution-ShareAlike (CC BY-SA)
