17

I'm quite new to programming in Python.

I want to make an application which will fetch stock prices from google finance. One example is CSCO (Cisco Sytems). I would then use that data to warn the user when the stock reaches a certain value. It also needs to refresh every 30 seconds.

The problem is I dont have a clue how to fetch the data!

Anyone have any ideas?

Ben Hoffstein
  • 98,117
  • 8
  • 99
  • 119
Alex
  • 1,649
  • 3
  • 13
  • 9

8 Answers8

16

This module comes courtesy of Corey Goldberg.

Program:

import urllib
import re

def get_quote(symbol):
    base_url = 'http://finance.google.com/finance?q='
    content = urllib.urlopen(base_url + symbol).read()
    m = re.search('id="ref_694653_l".*?>(.*?)<', content)
    if m:
        quote = m.group(1)
    else:
        quote = 'no quote available for: ' + symbol
    return quote

Sample Usage:

import stockquote
print stockquote.get_quote('goog')

Update: Changed the regular expression to match Google Finance's latest format (as of 23-Feb-2011). This demonstrates the main issue when relying upon screen scraping.

Ben Hoffstein
  • 98,117
  • 8
  • 99
  • 119
13

As for now (2015), the google finance api is deprecated. But you may use the pypi module googlefinance.

Install googlefinance

$pip install googlefinance

It is easy to get current stock price:

>>> from googlefinance import getQuotes
>>> import json
>>> print json.dumps(getQuotes('AAPL'), indent=2)
[
  {
    "Index": "NASDAQ", 
    "LastTradeWithCurrency": "129.09", 
    "LastTradeDateTime": "2015-03-02T16:04:29Z", 
    "LastTradePrice": "129.09", 
    "Yield": "1.46", 
    "LastTradeTime": "4:04PM EST", 
    "LastTradeDateTimeLong": "Mar 2, 4:04PM EST", 
    "Dividend": "0.47", 
    "StockSymbol": "AAPL", 
    "ID": "22144"
  }
]

Google finance is a source that provides real-time stock data. There are also other APIs from yahoo, such as yahoo-finance, but they are delayed by 15min for NYSE and NASDAQ stocks.

howard.h
  • 738
  • 1
  • 7
  • 14
hongtao
  • 396
  • 3
  • 7
2
import urllib
import re

def get_quote(symbol):
    base_url = 'http://finance.google.com/finance?q='
    content = urllib.urlopen(base_url + symbol).read()
    m = re.search('id="ref_(.*?)">(.*?)<', content)
    if m:
        quote = m.group(2)
    else:
        quote = 'no quote available for: ' + symbol
    return quote

I find that if you use ref_(.*?) and use m.group(2) you will get a better result as the reference id changes from stock to stock.

Peter Party Bus
  • 1,926
  • 12
  • 15
2

I suggest using the HTMLParser to get the value of the meta tags google places in it's html

<meta itemprop="name"
        content="Cerner Corporation" />
<meta itemprop="url"
        content="https://www.google.com/finance?cid=92421" />
<meta itemprop="imageUrl"
        content="https://www.google.com/finance/chart?cht=g&q=NASDAQ:CERN&tkr=1&p=1d&enddatetime=2014-04-09T12:47:31Z" />
<meta itemprop="tickerSymbol"
        content="CERN" />
<meta itemprop="exchange"
        content="NASDAQ" />
<meta itemprop="exchangeTimezone"
        content="America/New_York" />
<meta itemprop="price"
        content="54.66" />
<meta itemprop="priceChange"
        content="+0.36" />
<meta itemprop="priceChangePercent"
        content="0.66" />
<meta itemprop="quoteTime"
        content="2014-04-09T12:47:31Z" />
<meta itemprop="dataSource"
        content="NASDAQ real-time data" />
<meta itemprop="dataSourceDisclaimerUrl"
        content="//www.google.com/help/stock_disclaimer.html#realtime" />
<meta itemprop="priceCurrency"
        content="USD" />

With code like this:

import urllib
try:
    from html.parser import HTMLParser
except:
    from HTMLParser import HTMLParser

class QuoteData:
    pass

class GoogleFinanceParser(HTMLParser):
    def __init__(self):
        HTMLParser.__init__(self)
        self.quote = QuoteData()
        self.quote.price = -1

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            last_itemprop = ""
            for attr, value in attrs:
                if attr == "itemprop":
                    last_itemprop = value

                if attr == "content" and last_itemprop == "name":
                    self.quote.name = value
                if attr == "content" and last_itemprop == "price":
                    self.quote.price = value
                if attr == "content" and last_itemprop == "priceCurrency":
                    self.quote.priceCurrency = value
                if attr == "content" and last_itemprop == "priceChange":
                    self.quote.priceChange = value
                if attr == "content" and last_itemprop == "priceChangePercent":
                    self.quote.priceChangePercent = value
                if attr == "content" and last_itemprop == "quoteTime":
                    self.quote.quoteTime = value
                if attr == "content" and last_itemprop == "exchange":
                    self.quote.exchange = value
                if attr == "content" and last_itemprop == "exchangeTimezone":
                    self.quote.exchangeTimezone = value


def getquote(symbol):
    url = "http://finance.google.com/finance?q=%s" % symbol
    content = urllib.urlopen(url).read()

    gfp = GoogleFinanceParser()
    gfp.feed(content)
    return gfp.quote;


quote = getquote('CSCO')
print quote.name, quote.price
Chad Skeeters
  • 1,380
  • 14
  • 17
1

Just in case you want to pull data from Yahoo... Here is a simple function. This does not scrape data off a normal page. I thought I had a link to the page describing this in the comments, but I do not see it now - there is a magic string appended to the URL to request specific fields.

import urllib as u
import string
symbols = 'amd ibm gm kft'.split()

def get_data():
    data = []
    url = 'http://finance.yahoo.com/d/quotes.csv?s='
    for s in symbols:
        url += s+"+"
    url = url[0:-1]
    url += "&f=sb3b2l1l"
    f = u.urlopen(url,proxies = {})
    rows = f.readlines()
    for r in rows:
        values = [x for x in r.split(',')]
        symbol = values[0][1:-1]
        bid = string.atof(values[1])
        ask = string.atof(values[2])
        last = string.atof(values[3])
        data.append([symbol,bid,ask,last,values[4]])
    return data

Here, I found the link that describes the magic string: http://cliffngan.net/a/13

phkahler
  • 5,474
  • 1
  • 20
  • 31
  • 1
    There is also a Yahoo data fetcher built in to the Python Pandas library ([link](http://www.statalgo.com/2011/09/08/pandas-getting-financial-data-from-yahoo-fred-etc/)) (and Federal Reserve and Fama/French data libraried too). The current specifications may become deprecated in favor of a more robust data query system, but I think Pandas is the way to go for this stuff. – ely Jul 27 '12 at 17:24
0

http://docs.python.org/library/urllib.html for fetching arbitrary URLs.

Apart from that you should better look a some web service providing the data in JSON format.

Otherwise you have to implement parsing etc. on your own.

Screenscrapping yahoo.com for getting the stocks is unlikely the right road to success.

Andreas Jung
  • 1
  • 18
  • 70
  • 118
0

You can start by looking at the Google Finance APIs, although I don't see a Python API or wrapper. It looks like the only options for accessing the data directly are Java and JavaScript. You can also use cURL if you're familiar with it and it's available on your system.

Bill the Lizard
  • 369,957
  • 201
  • 546
  • 842
0

Another good place to start is Google Finance's own API: http://code.google.com/apis/finance/ You can look at their finance gadgets for some example code.

Ryan Castillo
  • 1,115
  • 10
  • 21