How to retrieve the list of values from a drop down list

Question

I am trying to retrieve the list of available option expiries for a given ticker on yahoo finance. For instance using SPY as ticker on https://finance.yahoo.com/quote/SPY/options

The list of expiries is in the drop-down list:

<div class="Fl(start) Pend(18px) option-contract-control drop-down-selector" data-reactid="4"> 
    <select class="Fz(s)" data-reactid="5"> 
        <option selected="" value="1576627200" data-reactid="6">December 18, 2019</option> 
        <option value="1576800000" data-reactid="7">December 20, 2019</option> 
        <option value="1577059200" data-reactid="8">December 23, 2019</option> 
        ...
    </select> 
</div>

Using the div class name (or the select class name, but there seem to be several of these on the page), I get the list of values as a single string of concatenated expiries.

My function (I pass on ticker='SPY' from the main function):

from bs4 import BeautifulSoup
from selenium import webdriver

def get_list_expiries(ticker):
    browser = webdriver.Chrome()
    options_url = "https://finance.yahoo.com/quote/" + str(ticker) + "/options"
    browser.get(options_url)
    html_source = browser.page_source
    soup = BeautifulSoup(html_source, 'html.parser')
    expiries_dt = []

    # Collect the text of every element carrying the drop-down's div class
    for exp in soup.find_all(class_="Fl(start) Pend(18px) option-contract-control drop-down-selector"):
        expiries_dt.append(exp.text)

    browser.quit()
    return expiries_dt

This produces:

['December 18, 2019December 20, 2019December 23, 2019December 24, 2019December 27, 2019December 30, 2019...']

I understand I need to use Selenium for this, but I can't figure out how. The result is always a list containing a single string. Ideally I would like to return two lists: one with the unix datestamps (option value="1576627200") and another with the 'normal' dates (i.e. 18/12/2019).

Any help will be greatly appreciated.

You don't need selenium. It is faster without selenium. I have given a requests answer below to show how. — QHarr, Dec 19 '19 at 07:09

score 1 · Accepted Answer · answered Dec 18 '19 at 23:20

To extract the unix datestamps and the expiry dates, wait for the select element with WebDriverWait, wrap it in Select, and read each option's value and text. You can use the following locator strategy:

Code Block:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import Select

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
# Selenium 4 takes the driver path through a Service object (the executable_path keyword was removed in later 4.x releases)
driver = webdriver.Chrome(service=Service(executable_path=r'C:\WebDrivers\chromedriver.exe'), options=options)

driver.get('https://finance.yahoo.com/quote/SPY/options')
# Wait up to 20 seconds for the expiry <select> to become clickable, then wrap it in Select
select = Select(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div.option-contract-control.drop-down-selector>select"))))
print("Unix datestamp: ")
print([option.get_attribute("value") for option in select.options])
print("Dates: ")
print([option.get_attribute("innerHTML") for option in select.options])
# Close the browser once the values have been read
driver.quit()

Console Output:

Unix datestamp:
['1576627200', '1576800000', '1577059200', '1577145600', '1577404800', '1577664000', '1577750400', '1578009600', '1578268800', '1578441600', '1578614400', '1578873600', '1579046400', '1579219200', '1579564800', '1579824000', '1580428800', '1582243200', '1584662400', '1585612800', '1587081600', '1589500800', '1592524800', '1593475200', '1594944000', '1600387200', '1601424000', '1602806400', '1605830400', '1606780800', '1608249600', '1610668800', '1616112000', '1623974400', '1631836800', '1639699200', '1642723200']
Dates:
['December 18, 2019', 'December 20, 2019', 'December 23, 2019', 'December 24, 2019', 'December 27, 2019', 'December 30, 2019', 'December 31, 2019', 'January 3, 2020', 'January 6, 2020', 'January 8, 2020', 'January 10, 2020', 'January 13, 2020', 'January 15, 2020', 'January 17, 2020', 'January 21, 2020', 'January 24, 2020', 'January 31, 2020', 'February 21, 2020', 'March 20, 2020', 'March 31, 2020', 'April 17, 2020', 'May 15, 2020', 'June 19, 2020', 'June 30, 2020', 'July 17, 2020', 'September 18, 2020', 'September 30, 2020', 'October 16, 2020', 'November 20, 2020', 'December 1, 2020', 'December 18, 2020', 'January 15, 2021', 'March 19, 2021', 'June 18, 2021', 'September 17, 2021', 'December 17, 2021', 'January 21, 2022']
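The question also asks for the dates in a short form such as 18/12/2019. A minimal sketch of that conversion, assuming the option values are unix timestamps at midnight UTC (the values in the console output above are exact multiples of 86400). The helper name to_day_month_year is made up for illustration:

```python
from datetime import datetime, timezone

# First three values from the console output above
unix_stamps = ['1576627200', '1576800000', '1577059200']

def to_day_month_year(stamp):
    """Convert a unix timestamp (str or int) to a dd/mm/YYYY string in UTC."""
    return datetime.fromtimestamp(int(stamp), tz=timezone.utc).strftime("%d/%m/%Y")

short_dates = [to_day_month_year(s) for s in unix_stamps]
print(short_dates)  # ['18/12/2019', '20/12/2019', '23/12/2019']
```

Passing tz=timezone.utc keeps the result independent of the local timezone of the machine running the script.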

score 0 · Answer 2 · answered Dec 19 '19 at 02:23

Try SimplifiedDoc; it's a library for HTML extraction.

from simplified_scrapy.simplified_doc import SimplifiedDoc 
html='''<div class="Fl(start) Pend(18px) option-contract-control drop-down-selector" data-reactid="4"> 
    <select class="Fz(s)" data-reactid="5"> 
        <option selected="" value="1576627200" data-reactid="6">December 18, 2019</option> 
        <option value="1576800000" data-reactid="7">December 20, 2019</option> 
        <option value="1577059200" data-reactid="8">December 23, 2019</option> 
        ...
    </select> 
</div>
'''
doc = SimplifiedDoc(html)
div = doc.getElementByClass('Fl(start) Pend(18px) option-contract-control drop-down-selector')
options = div.options # get all options
expiries_dt = [option.html for option in options]
print(expiries_dt) # ['December 18, 2019', 'December 20, 2019', 'December 23, 2019']
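The snippet above returns only the visible dates. If the value attributes are wanted as a second list, one option is a standard-library-only sketch like the following. It is not part of SimplifiedDoc; the OptionCollector class is a made-up name, and the HTML is the fragment from the question with the elided options left out:

```python
from html.parser import HTMLParser

class OptionCollector(HTMLParser):
    """Collect the value attribute and visible text of every <option> tag."""

    def __init__(self):
        super().__init__()
        self.values = []
        self.labels = []
        self._in_option = False

    def handle_starttag(self, tag, attrs):
        if tag == "option":
            self._in_option = True
            self.values.append(dict(attrs).get("value"))
            self.labels.append("")

    def handle_data(self, data):
        # Only text that sits inside an <option> belongs to a label
        if self._in_option:
            self.labels[-1] += data

    def handle_endtag(self, tag):
        if tag == "option":
            self._in_option = False
            self.labels[-1] = self.labels[-1].strip()

html = '''<div class="Fl(start) Pend(18px) option-contract-control drop-down-selector" data-reactid="4">
    <select class="Fz(s)" data-reactid="5">
        <option selected="" value="1576627200" data-reactid="6">December 18, 2019</option>
        <option value="1576800000" data-reactid="7">December 20, 2019</option>
        <option value="1577059200" data-reactid="8">December 23, 2019</option>
    </select>
</div>
'''

collector = OptionCollector()
collector.feed(html)
print(collector.values)  # ['1576627200', '1576800000', '1577059200']
print(collector.labels)  # ['December 18, 2019', 'December 20, 2019', 'December 23, 2019']
```

Iterating over the individual option tags, rather than reading the text of the wrapping div, is what keeps the dates from being concatenated into a single string.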

score 0 · Answer 3 · answered Dec 19 '19 at 07:07

You don't need Selenium for this part at least (and, to be honest, it is overkill for most Yahoo Finance info). You can extract the timestamps from the response text with a regex, convert the matched string representation of the list into an actual list with ast, and use the datetime module to convert each timestamp to the required date format.

import requests, re, ast
from datetime import datetime

r = requests.get('https://finance.yahoo.com/quote/SPY/options?guccounter=1')
p = re.compile(r'"expirationDates":(\[.*?\])')
timestamps = ast.literal_eval(p.findall(r.text)[0])
dates = [datetime.utcfromtimestamp(ts).strftime("%B %d, %Y") for ts in timestamps]

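Regex explanation: the pattern matches the literal text "expirationDates": and captures the bracketed list that follows it; the lazy .*? stops at the first closing ].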
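The regex and ast step can be tried offline on a small string. In this sketch sample_text is a made-up stand-in for the page source (the real response is much longer and its surrounding keys are an assumption here); only the "expirationDates" key and the pattern come from the answer's code:

```python
import ast
import re

# Made-up fragment standing in for r.text
sample_text = '{"underlyingSymbol":"SPY","expirationDates":[1576627200,1576800000,1577059200],"strikes":[300.0,305.0]}'

# Literal key, then a lazily matched [...] group that ends at the first ]
pattern = re.compile(r'"expirationDates":(\[.*?\])')
match = pattern.search(sample_text)

# literal_eval turns the captured text "[1576627200,...]" into a list of ints
timestamps = ast.literal_eval(match.group(1)) if match else []
print(timestamps)  # [1576627200, 1576800000, 1577059200]
```

Using search with a None check, instead of indexing findall(...)[0], avoids an IndexError when the key is missing from the response.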

Datetime conversions:

See discussion by @jfs which is where I saw utcfromtimestamp originally
strftime
