Questions tagged [time-series]

A time series is a sequence of data points with values measured at successive times (either in continuous time or at discrete time periods). Time series analysis exploits this natural temporal ordering to extract meaning and trends from the underlying data.

Time series data often show a pattern over time, such as a trend. Quantitative forecasting can be applied when two conditions are satisfied:

  1. numerical information about the past is available;
  2. it is reasonable to assume that some aspects of the past patterns will continue into the future.

Time series data are useful when you are forecasting something that is changing over time (e.g., stock prices, sales figures, profits, etc.). Examples of time series data include:

  • Daily IBM stock prices
  • Monthly rainfall
  • Quarterly sales results for Amazon
  • Annual Google profits

https://www.otexts.org/fpp/1/4

Time series models attempt to make use of the natural one-way ordering of time, so that values for a given period are expressed as a function of past values. Time series forecasting rests on the same idea: future values are predicted from past data.
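As a rough illustration of a value being expressed as a function of past values, here is a minimal sketch of a first-order autoregressive model, y[t] = c + phi * y[t-1] + noise, fitted by ordinary least squares. The function names `fit_ar1` and `forecast_next` and the simulated data are assumptions made for this example; they do not come from any particular library.

```python
import random


def fit_ar1(series):
    """Least-squares estimate of (c, phi) in y[t] = c + phi * y[t-1] + noise."""
    lagged = series[:-1]   # y[t-1]
    current = series[1:]   # y[t]
    n = len(lagged)
    mean_x = sum(lagged) / n
    mean_y = sum(current) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, current))
    var_x = sum((x - mean_x) ** 2 for x in lagged)
    phi = cov_xy / var_x
    c = mean_y - phi * mean_x
    return c, phi


def forecast_next(series, c, phi):
    """One-step-ahead forecast from the last observed value."""
    return c + phi * series[-1]


# Simulated data (hypothetical): an AR(1) process with c = 1.0 and phi = 0.7.
rng = random.Random(0)
series = [0.0]
for _ in range(499):
    series.append(1.0 + 0.7 * series[-1] + rng.gauss(0, 1))

c, phi = fit_ar1(series)
print("estimated c:", c, "estimated phi:", phi)
print("next-value forecast:", forecast_next(series, c, phi))
```

With enough observations the estimates should land near the values used in the simulation. Real forecasting work would normally use a dedicated package rather than a hand-rolled fit like this one.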

Typically, time series data points are spaced at uniform time intervals.

A time series model will generally reflect the fact that observations close together in time will be more closely related than observations further apart.
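One common way to quantify how strongly observations are related across a given time gap is the sample autocorrelation. The sketch below is a plain-Python version; the function name `autocorrelation` and the example series are assumptions chosen for illustration.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer `lag`."""
    n = len(series)
    mean = sum(series) / n
    # Total variation around the mean (normalising term).
    denom = sum((v - mean) ** 2 for v in series)
    # Co-variation between each value and the value `lag` steps later.
    num = sum(
        (series[t] - mean) * (series[t + lag] - mean)
        for t in range(n - lag)
    )
    return num / denom


# Hypothetical example: a steadily increasing series of 20 points.
example = [float(i) for i in range(20)]
for lag in (1, 5, 10):
    print("lag", lag, "autocorrelation:", autocorrelation(example, lag))
```

For a series like this one, the value at lag 1 is larger than the value at lag 5, in line with the idea that nearby observations are more closely related than distant ones.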

As a place to start, take a look at Wikipedia's page on time series. For further reading, refer to the StatSoft website, which has an online textbook on time series analysis.

For time series analysis in R, consider looking at the Time Series Task View and at questions tagged for the zoo package and for the xts package.


Tag usage:

Questions on this tag should be about implementation and programming problems, not about the statistical or theoretical properties of the technique. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning and data analysis, or to Data Science, the Stack Exchange site for data science topics such as time series.

12145 questions
324 votes, 5 answers

Plotting two variables as lines using ggplot2 on the same graph

A very newbish question, but say I have data like this: test_data <- data.frame( var0 = 100 + c(0, cumsum(runif(49, -20, 20))), var1 = 150 + c(0, cumsum(runif(49, -10, 10))), date = seq(as.Date("2002-01-01"), by="1 month",…
fmark
309 votes, 34 answers

Peak signal detection in realtime timeseries data

Update: The best performing algorithm so far is this one. This question explores robust algorithms for detecting sudden peaks in real-time timeseries data. Consider the following example data: Example of this data is in Matlab format (but this…
189 votes, 10 answers

Storing time-series data, relational or non?

I am creating a system which polls devices for data on varying metrics such as CPU utilisation, disk utilisation, temperature etc. at (probably) 5 minute intervals using SNMP. The ultimate goal is to provide visualisations to a user of the system in…
146 votes, 16 answers

How to calculate rolling / moving average using NumPy / SciPy?

There seems to be no function that simply calculates the moving average on numpy/scipy, leading to convoluted solutions. My question is two-fold: What's the easiest way to (correctly) implement a moving average with numpy? Since this seems…
loopbackbee
119 votes, 9 answers

Can Pandas plot a histogram of dates?

I've taken my Series and coerced it to a datetime column of dtype=datetime64[ns] (though only need day resolution...not sure how to change). import pandas as pd df = pd.read_csv('somefile.csv') column = df['date'] column = pd.to_datetime(column,…
lollercoaster
115 votes, 4 answers

How to get a vertical geom_vline to an x-axis of class date?

Even though I found Hadley's post in the google group on POSIXct and geom_vline, I could not get it done. I have a time series from and would like to draw a vertical line for years 1998, 2005 and 2010 for example. I tried with ggplot and qplot…
Matt Bannert
100 votes, 4 answers

Generating time series between two dates in PostgreSQL

I have a query like this that nicely generates a series of dates between 2 given dates: select date '2004-03-07' + j - i as AllDate from generate_series(0, extract(doy from date '2004-03-07')::int - 1) as i, generate_series(0, extract(doy from…
f.ashouri
94 votes, 9 answers

Pandas: rolling mean by time interval

I've got a bunch of polling data; I want to compute a Pandas rolling mean to get an estimate for each day based on a three-day window. According to this question, the rolling_* functions compute the window based on a specified number of values, and…
Anov
87 votes, 2 answers

How to parse milliseconds?

How do I use strptime or any other functions to parse time stamps with milliseconds in R? time[1] # [1] "2010-01-15 13:55:23.975" strptime(time[1], format="%Y-%m-%d %H:%M:%S.%f") # [1] NA strptime(time[1], format="%Y-%m-%d %H:%M:%S") # [1]…
signalseeker
75 votes, 8 answers

auto.arima() equivalent for python

I am trying to predict weekly sales using ARMA ARIMA models. I could not find a function for tuning the order(p,d,q) in statsmodels. Currently R has a function forecast::auto.arima() which will tune the (p,d,q) parameters. How do I go about…
Ajax
66 votes, 5 answers

Pattern recognition in time series

By processing a time series graph, I would like to detect patterns that look similar to this: Using a sample time series as an example, I would like to be able to detect the patterns as marked here: What kind of AI algorithm (I am assuming…
Ali
63 votes, 6 answers

Is there a powerful database system for time series data?

In multiple projects we have to store, aggregate, and evaluate simple measurement values. One row typically consists of a time stamp, a value and some attributes to the value. In some applications we would like to store 1000 values per second and more.…
Kit Fisto
61 votes, 8 answers

Resampling Within a Pandas MultiIndex

I have some hierarchical data which bottoms out into time series data which looks something like this: df = pandas.DataFrame( {'value_a': values_a, 'value_b': values_b}, index=[states, cities, dates]) df.index.names = ['State', 'City',…
55 votes, 5 answers

Pandas: resample timeseries with groupby

Given the below pandas DataFrame: In [115]: times = pd.to_datetime(pd.Series(['2014-08-25 21:00:00','2014-08-25 21:04:00', '2014-08-25 22:07:00','2014-08-25 22:09:00'])) locations = ['HK', 'LDN',…
AshB
51 votes, 9 answers

Converting a data frame to xts

I'm trying to convert a data frame to xts object using the as.xts()-method. Here is my input dataframe q: q t x 1 2006-01-01 00:00:00 1 2 2006-01-01 01:00:00 2 3 2006-01-01 02:00:00 3 str(q) 'data.frame': 10…
user442446