StackOverflow - python, selenium, selenium-webdriver, beautifulsoup, airbnb-js-styleguide

BeautifulSoup not returning the full HTML of an Airbnb search page

I am trying to use BeautifulSoup and Selenium to scrape data from Airbnb. I want to gather each listing from this search page. This is what I have so far:

    from bs4 import BeautifulSoup
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.common.by import By

    def scrape_page(page_url):
        driver_path = "C:/Users/parkj/Downloads/chromedriver_win32/chromedriver.exe"
        driver = webdriver.Chrome(service = Service(driver_path))
        driver.get(page_url)
        wait = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, 'itemprop')))
        soup = BeautifulSoup(driver.page_source, 'html.parser')
        driver.close()
        return soup

    def extract_listing(page_url):
        page_soup = scrape_page(page_url)
        listings = page_soup.find_element(By.CLASS_NAME, "itemprop")
        return listings

    page_url = "https://www.airbnb.com/s/Kyoto-Prefecture--Japan/homes?tab_id=home_tab&flexible_trip_lengths%5B%5D=one_week&refinement_paths%5B%5D=%2Fhomes&place_id=ChIJYRsf-SB0_18ROJWxOMJ7Clk&query=Kyoto%20Prefecture%2C%20Japan&date_picker_type=flexible_dates&search_type=unknown"

    #items = extract_listing(page_url)

    #process items to get all information you need, just an example
    #[{'name':items.select_one('[itemprop="name"]')['content'],
    #  'url':items.select_one('[itemprop="url"]')['content']}
    # for i in items]

    test = scrape_page(page_url)
    test

It seems like scrape_page() returns the HTML script from the search page, but does not contain the full content. It does not include the information I need, which is this part of the HTML:

[Image of HTML script]

I did some research and I saw that WebDriverWait might help, but I get a TimeoutException error:

[Image of TimeoutException error]

The end goal is to get each listing's name and URL.
The first 3 items in the resulting list should look similar to this:

    [{'name': '✿Kyoto✿/Near Station & Bus/Temple/Twin Room(^^♪✿✿',
      'url': 'www.airbnb.com/rooms/50290730?adults=1&children=0&infants=0&check_in=2022-07-20&check_out=2022-07-27&previous_page_section_name=1000'},
     {'name': 'Stay in Kyoto central island',
      'url': 'www.airbnb.com/rooms/42780789?adults=1&children=0&infants=0&check_in=2022-06-21&check_out=2022-06-28&previous_page_section_name=1000'},
     {'name': '和楽庵【Single】100 Year old Machiya Guest House (1pax)',
      'url': 'www.airbnb.com/rooms/48645312?adults=1&children=0&infants=0&check_in=2022-07-27&check_out=2022-08-03&previous_page_section_name=1000'}]

I apologize in advance if I did not include enough information in this question, as this is my first time posting here. I would appreciate any help, thank you.
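
Two things in the code above look like likely culprits. `itemprop` is an HTML attribute, not an element id, so `By.ID, 'itemprop'` has nothing to match and the wait most likely runs out, which would explain the TimeoutException. Also, `find_element(By.CLASS_NAME, ...)` is a Selenium WebDriver method, not a BeautifulSoup one, so it cannot be called on the soup. What follows is a minimal sketch of one way around both, not a verified solution. It assumes each listing exposes `<meta itemprop="name" content="...">` and `<meta itemprop="url" content="...">` tags, which is what the commented-out code in the question implies but which has not been checked against the live page. The names `ListingCollector`, `extract_listings`, and `fetch_rendered_html` are hypothetical, the sample markup is made up, and the parsing half uses the standard library's `html.parser` instead of BeautifulSoup so it can be tried without a browser or extra installs.

```python
from html.parser import HTMLParser


class ListingCollector(HTMLParser):
    """Collect name/url pairs from <meta itemprop="..." content="..."> tags.

    Assumption: each listing carries exactly one "name" and one "url" meta tag.
    """

    def __init__(self):
        super().__init__()
        self.listings = []   # finished {'name': ..., 'url': ...} dicts
        self._current = {}   # fields seen so far for the listing being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("itemprop")
        if tag == "meta" and prop in ("name", "url"):
            self._current[prop] = attrs.get("content")
            # Once both fields are present, the listing is complete.
            if len(self._current) == 2:
                self.listings.append(self._current)
                self._current = {}


def extract_listings(html):
    """Return a list of {'name', 'url'} dicts found in an HTML string."""
    collector = ListingCollector()
    collector.feed(html)
    return collector.listings


def fetch_rendered_html(page_url, driver_path, timeout=20):
    """Load the page in Chrome and return its HTML once listings are present."""
    # Imported here so the parsing half above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome(service=Service(driver_path))
    try:
        driver.get(page_url)
        # Wait on an attribute selector; itemprop is an attribute, not an id.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, '[itemprop="name"]'))
        )
        return driver.page_source
    finally:
        driver.quit()


# Made-up markup, shaped like the structure assumed above.
sample_html = """
<div itemprop="itemListElement">
  <meta itemprop="name" content="Stay in Kyoto central island">
  <meta itemprop="url" content="www.airbnb.com/rooms/42780789">
</div>
"""
print(extract_listings(sample_html))
```

With a browser available, `extract_listings(fetch_rendered_html(page_url, driver_path))` would be the full pipeline. If you would rather stay with BeautifulSoup, the same attribute selectors from the commented-out code apply, for example `soup.select_one('[itemprop="name"]')['content']`.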
