Crawling Yogiyo

gw_kim 2022. 3. 31. 21:58


At a coworker's request, I tried crawling Yogiyo data (in a hurry).

(Warning: there is no error handling, and the code is messy.)

# Import libraries
from bs4 import BeautifulSoup
import requests
import time 
import random 
from tqdm import tqdm 
from collections import defaultdict 
import json
import pandas as pd

# Set the parameters and related variables
# You need the latitude and longitude of the target address!

url = 'https://www.yogiyo.co.kr/api/v1/restaurants-geo/'
# The key can be found by inspecting the web page's source code.
# This is not an official procedure, so the value is left blank here.
api_key = ''

# Query the "치킨" (chicken) category at a specific latitude/longitude
parameters = {
    "category": "치킨",
    # "items": 20,
    "lat": 37.5565050755347,
    "lng": 126.939656244325,
    # "order": 'review_count',  # sort criterion
    "page": 0,
}
headers = {'X-ApiSecret': api_key, 'X-ApiKey': "iphoneap"}
res = requests.get(url, params=parameters, headers=headers)
info = json.loads(res.content)
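
Since the request above has no error handling, here is a minimal sketch of what a guarded version could look like. `fetch_restaurants` and its `session` and `timeout` parameters are hypothetical names that are not part of the original script; `session` is assumed to be a `requests.Session` (or any object with a compatible `get` method).

```python
def fetch_restaurants(session, url, params, headers, timeout=10):
    """Request one page of results and return the decoded JSON payload.

    Hypothetical helper, not part of the original script.
    `session` is assumed to be a requests.Session.
    """
    res = session.get(url, params=params, headers=headers, timeout=timeout)
    # Raise an exception on 4xx/5xx instead of trying to parse an error page
    res.raise_for_status()
    info = res.json()
    # The original code indexes info["restaurants"], so check that it exists
    if not isinstance(info, dict) or "restaurants" not in info:
        raise ValueError("unexpected response: no 'restaurants' key")
    return info
```

With the variables defined above, the call would look like `info = fetch_restaurants(requests.Session(), url, parameters, headers)`.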


# Fields to keep
key_list = ["id", "slug", "address", "estimated_delivery_time", "min_order_amount"]

# Build one row per restaurant, keeping only the fields in key_list
df_list = [
    [restaurant[key] for key in key_list]
    for restaurant in info["restaurants"]
]

# Turn the rows into a DataFrame
df = pd.DataFrame(df_list, columns=key_list)

print(df.shape)  # (280, 5)
df.info()  # info() prints the summary itself, so no print() is needed
 
# <class 'pandas.core.frame.DataFrame'>
# RangeIndex: 280 entries, 0 to 279
# Data columns (total 5 columns):
# #   Column                   Non-Null Count  Dtype 
# ---  ------                   --------------  ----- 
#  0   id                       280 non-null    int64 
#  1   slug                     280 non-null    object
#  2   address                  280 non-null    object
#  3   estimated_delivery_time  280 non-null    object
#  4   min_order_amount         280 non-null    int64 
# dtypes: int64(2), object(3)
# memory usage: 11.1+ KB

df.head(2)
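
The script above only requests `page` 0. If the `page` parameter (and the commented-out `items` parameter) paginate the results as their names suggest, which I have not verified, collecting every page could be sketched roughly as follows. `collect_all_pages`, `fetch_page`, `max_pages`, `min_delay`, and `max_delay` are hypothetical names, and the stopping rule (an empty `restaurants` list means there are no more pages) is an assumption about the API.

```python
import random
import time


def collect_all_pages(fetch_page, max_pages=50, min_delay=1.0, max_delay=3.0):
    """Call fetch_page(page) for page = 0, 1, 2, ... and merge the results.

    fetch_page is a hypothetical callable returning the decoded JSON for
    one page, e.g. a small wrapper around the requests.get call above.
    Assumption: a page past the end returns an empty "restaurants" list.
    """
    restaurants = []
    for page in range(max_pages):
        info = fetch_page(page)
        batch = info.get("restaurants", [])
        if not batch:
            break  # assumed end-of-results signal
        restaurants.extend(batch)
        # Random pause between requests to avoid hammering the server
        time.sleep(random.uniform(min_delay, max_delay))
    return restaurants
```

Here `fetch_page` could be a function that copies `parameters`, sets `"page"` to the given number, and returns the parsed response. The `max_pages` cap is only a safety net in case the empty-list assumption turns out to be wrong.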
