Questions tagged [pandas]

pandas is a Python library for manipulating and analyzing data such as dataframes, multidimensional time series, and cross-sectional datasets commonly found in statistics, experimental science results, econometrics, or finance. pandas is one of the main data science libraries in Python.

pandas is a Python library for PAN-el DA-ta manipulation and analysis, i.e. multidimensional time series and cross-sectional data sets commonly found in statistics, experimental science results, econometrics, or finance. pandas is implemented primarily using NumPy and Cython; it is designed to integrate easily with NumPy-based scientific libraries, such as statsmodels.

To create a reproducible pandas example, see: How to make good reproducible pandas examples

Main Features:

Data structures for 1- and 2-dimensional labeled datasets (Series and DataFrames, respectively). Some of their main features include:
Automatic data alignment and interpolation
Handling of missing observations in calculations
Convenient slicing and reshaping ("reindexing") functions
Categorical data types
'Group by' aggregation and transformation functionality
Tools for merging and joining data sets
Simple matplotlib integration for plotting and graphing
Multi-indexing, which gives indices a structure that allows an arbitrary number of dimensions to be represented
Date tools: objects for expressing date offsets or generating date ranges; some functionality similar to scikits.timeseries. Dates can be aligned to a specific time zone and converted or compared at will.
Statistical models: convenient ordinary least squares and panel OLS implementations for in-sample or rolling time series / cross-sectional regressions. These will hopefully be the starting point for implementing models.
Intelligent Cython offloading: complex computations are performed rapidly thanks to these optimizations.
Static and moving statistical tools: mean, standard deviation, correlation, covariance
Rich user documentation, built with Sphinx
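
A few of the features listed above (label alignment, missing-data handling, 'group by' aggregation, and merging) can be sketched roughly as follows. This is only an illustration: the data and the variable names are made up for the example and do not come from the tag wiki.

```python
import pandas as pd

# Two Series with partially overlapping labels (made-up data).
s1 = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])
s2 = pd.Series([10.0, 20.0, 30.0], index=["b", "c", "d"])

# Arithmetic aligns on labels; a label present in only one operand yields NaN.
total = s1 + s2

# Reductions skip missing observations by default.
total_sum = total.sum()

# 'Group by' aggregation on a small, made-up DataFrame.
df = pd.DataFrame({"key": ["x", "y", "x", "y"], "value": [1, 2, 3, 4]})
group_means = df.groupby("key")["value"].mean()

# Merging (joining) with a second, made-up lookup table.
labels = pd.DataFrame({"key": ["x", "y"], "label": ["first", "second"]})
merged = df.merge(labels, on="key", how="left")

print(total)
print(total_sum)
print(group_means)
print(merged)
```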

Asking Questions:

Before asking a question, make sure you have gone through the 10 Minutes to pandas introduction. It covers all the basic functionality of pandas.
See this question on asking good questions: How to make good reproducible pandas examples
Please provide the version of pandas, NumPy, and platform details (if appropriate) in your questions
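
One possible way to collect the version and platform details requested above is sketched below. The dictionary name `version_info` is just an illustration; `pd.show_versions()` prints a more detailed report if more is needed.

```python
import platform

import numpy as np
import pandas as pd

# Gather the details most often requested in pandas questions.
version_info = {
    "pandas": pd.__version__,
    "numpy": np.__version__,
    "python": platform.python_version(),
    "platform": platform.platform(),
}

for name, value in version_info.items():
    print(f"{name}: {value}")
```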

Answering Questions:

How can I effectively load data on Stack Overflow questions using pandas read_clipboard? (useful for copy pasting data from questions into your terminal as DataFrames)
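
As a rough sketch of the idea: `pd.read_clipboard()` reads text from the system clipboard and passes it on to `pd.read_csv`, splitting on whitespace by default. Because a clipboard is not always available, the example below feeds the same kind of copied text through `io.StringIO` instead; the variable name `copied_text` is an assumption for illustration.

```python
import io

import pandas as pd

# Text as it might be copied from the printed DataFrame in a question.
copied_text = """
   c1   c2
0  10  100
1  11  110
2  12  120
"""

# Whitespace-separated parsing. The header has one field fewer than the
# data rows, so the first column is used as the index.
df = pd.read_csv(io.StringIO(copied_text), sep=r"\s+")
print(df)

# With the same text on the clipboard, the interactive equivalent would be:
# df = pd.read_clipboard()
```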

202712 questions

2756

votes

27 answers

How to iterate over rows in a DataFrame in Pandas

I have a DataFrame from Pandas: import pandas as pd inp = [{'c1':10, 'c2':100}, {'c1':11,'c2':110}, {'c1':12,'c2':120}] df = pd.DataFrame(inp) print df Output: c1 c2 0 10 100 1 11 110 2 12 120 Now I want to iterate over the rows of this…

python pandas dataframe

asked May 10 '13 at 07:04

Roman

2539

votes

11 answers

How to select rows from a DataFrame based on column values

How can I select rows from a DataFrame based on values in some column in Pandas? In SQL, I would use: SELECT * FROM table WHERE column_name = some_value I tried to look at Pandas' documentation, but I did not immediately find the answer.

python pandas dataframe

asked Jun 12 '13 at 17:42

szli

2216

votes

29 answers

Renaming columns in Pandas

I have a DataFrame using Pandas and column labels that I need to edit to replace the original column labels. I'd like to change the column names in a DataFrame A where the original column names are: ['$a', '$b', '$c', '$d', '$e'] to ['a', 'b', 'c',…

python pandas replace dataframe rename

asked Jul 05 '12 at 14:21

user1504276

1647

votes

17 answers

Delete column from pandas DataFrame

When deleting a column in a DataFrame I use: del df['column_name'] And this works great. Why can't I use the following? del df.column_name Since it is possible to access the column/Series as df.column_name, I expected this to work.

python pandas dataframe

asked Nov 16 '12 at 06:26

John

1377

votes

21 answers

Selecting multiple columns in a Pandas dataframe

I have data in different columns, but I don't know how to extract it to save it in another variable. index a b c 1 2 3 4 2 3 4 5 How do I select 'a', 'b' and save it in to df1? I tried df1 = df['a':'b'] df1 = df.ix[:,…

python pandas dataframe select

asked Jul 01 '12 at 21:03

user1234440

1275

votes

15 answers

How do I get the row count of a Pandas DataFrame?

I'm trying to get the number of rows of dataframe df with Pandas, and here is my code. Method 1: total_rows = df.count print total_rows + 1 Method 2: total_rows = df['First_columnn_label'].count print total_rows + 1 Both the code snippets give me…

python pandas dataframe

asked Apr 11 '13 at 08:14

yemu

1155

votes

20 answers

Get list from pandas DataFrame column headers

I want to get a list of the column headers from a pandas DataFrame. The DataFrame will come from user input so I won't know how many columns there will be or what they will be called. For example, if I'm given a DataFrame like this: >>>…

python pandas dataframe

asked Oct 20 '13 at 21:18

natsuki_2002

1133

votes

28 answers

Adding new column to existing DataFrame in Python pandas

I have the following indexed DataFrame with named columns and non-continuous row numbers: a b c d 2 0.671399 0.101208 -0.181532 0.241273 3 0.446172 -0.243316 0.051767 1.577318 5 0.614758 0.075793 -0.451460…

python pandas dataframe chained-assignment

asked Sep 23 '12 at 19:00

tomasz74

1117

votes

38 answers

How to change the order of DataFrame columns?

I have the following DataFrame (df): import numpy as np import pandas as pd df = pd.DataFrame(np.random.rand(10, 5)) I add more column(s) by assignment: df['mean'] = df.mean(1) How can I move the column mean to the front, i.e. set it as first…

python pandas dataframe

asked Oct 30 '12 at 22:22

Timmie

1116

votes

30 answers

Create pandas Dataframe by appending one row at a time

I understand that pandas is designed to load a fully populated DataFrame, but I need to create an empty DataFrame and then add rows one by one. What is the best way to do this? I successfully created an empty DataFrame with: res =…

python pandas dataframe append

asked May 23 '12 at 08:12

PhE

1086

votes

15 answers

"Large data" workflows using pandas

I have tried to puzzle out an answer to this question for many months while learning pandas. I use SAS for my day-to-day work and it is great for its out-of-core support. However, SAS is horrible as a piece of software for numerous other…

python mongodb pandas hdf5 large-data

asked Jan 10 '13 at 16:20

Zelazny7

1037

votes

10 answers

Change column type in pandas

I want to convert a table, represented as a list of lists, into a Pandas DataFrame. As an extremely simplified example: a = [['a', '1.2', '4.2'], ['b', '70', '0.03'], ['x', '5', '0']] df = pd.DataFrame(a) What is the best way to convert the columns…

python pandas dataframe types casting

asked Apr 08 '13 at 23:53

user1642513

983

votes

12 answers

How to drop rows of Pandas DataFrame whose value in a certain column is NaN

I have this DataFrame and want only the records whose EPS column is not NaN: >>> df STK_ID EPS cash STK_ID RPT_Date 601166 20111231 601166 NaN NaN 600036 20111231 600036 NaN 12 600016 20111231 600016 …

python pandas dataframe nan

asked Nov 16 '12 at 09:17

bigbug

905

votes

3 answers

Use a list of values to select rows from a pandas dataframe

Let's say I have the following pandas dataframe: df = DataFrame({'A' : [5,6,3,4], 'B' : [1,2,3, 5]}) df A B 0 5 1 1 6 2 2 3 3 3 4 5 I can subset based on a specific value: x = df[df['A'] == 3] x A B 2 3 …

python pandas dataframe

asked Aug 23 '12 at 16:31

zach

868

votes

15 answers

How to deal with SettingWithCopyWarning in Pandas

Background: I just upgraded my Pandas from 0.11 to 0.13.0rc1. Now, the application is popping out many new warnings. One of them looks like this: E:\FinReporter\FM_EXT.py:449: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a…

python pandas dataframe chained-assignment

asked Dec 17 '13 at 03:48

bigbug
