import requests
from bs4 import BeautifulSoup
import pandas as pd

result = []  # article titles
jour = []    # news outlet for each article
for page in range(10):
    # Results are paginated with the `start` parameter: 1, 11, 21, ...
    raw = requests.get('https://search.naver.com/search.naver?&where=news&query=%EC%82%BC%EC%84%B1%EC%A0%84%EC%9E%90&start=' + str(page * 10 + 1), headers={'User-Agent': 'Mozilla/5.0'}).text
    html = BeautifulSoup(raw, 'html.parser')
    # These selectors depend on Naver's page markup and may need updating
    articles = html.select('.type01 > li')

    for article in articles:
        journal = article.select_one('span._sp_each_source').text
        title = article.select_one('a._sp_each_title').text
        result.append(title)
        jour.append(journal)

    print('Next page')

Next page
Next page
Next page
Next page
Next page
Next page
Next page
Next page
Next page
Next page

# Combine the collected titles and outlets into one table
a = pd.DataFrame({'title': result, 'journal': jour})
print(len(a))
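
The loop computes `page * 10 + 1` for each page, which gives the offsets 1, 11, 21, and so on. Below is a minimal sketch of building those page URLs separately with the standard library, so they can be checked before any request is sent. The helper name `build_page_urls` and the `per_page` argument are hypothetical, and it assumes the offset is passed through a `start` query parameter.

```python
from urllib.parse import urlencode

BASE_URL = "https://search.naver.com/search.naver"


def build_page_urls(query, pages=10, per_page=10):
    """Return one news-search URL per result page (hypothetical helper)."""
    urls = []
    for page in range(pages):
        params = {
            "where": "news",
            "query": query,  # urlencode percent-encodes the Korean text as UTF-8
            # Assumption: `start` selects the first result shown on the page
            "start": page * per_page + 1,
        }
        urls.append(f"{BASE_URL}?{urlencode(params)}")
    return urls


urls = build_page_urls("삼성전자", pages=3)
for url in urls:
    print(url)
```

Letting `urlencode` build the query string avoids hand-concatenating the offset onto the URL, which makes a missing parameter name easier to spot.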


[Web Crawling] Fetching Samsung Stock News with Python
