6 ways to select columns from pandas DataFrame

Select a column by name with the "." operator or []. Get all column names with the columns attribute or the info() method. Describe column statistics with the describe() method. Select a particular value in a column.

Reviewed by Deepak Prasad

Different methods to select columns in pandas DataFrame

In this tutorial we will discuss how to select and inspect columns of a DataFrame using the following methods:

  • Select column using column name with "." operator
  • Select column using column name with []
  • Get all column names using columns attribute
  • Get all the columns information using info() method
  • Describe the column statistics using describe() method
  • Select particular value in a column

Create pandas DataFrame with example data

A DataFrame is a data structure that stores data in a two-dimensional format. It is similar to a table, with the data arranged in rows and columns. Rows represent the records (tuples) and columns represent the attributes.

We can create a DataFrame using the **pandas.DataFrame()** constructor.

Syntax:

python
pandas.DataFrame(input_data, columns=columns, index=index)

Parameters:

It mainly takes three parameters:

  1. input_data is the data to store, such as a list or a dictionary
  2. columns specifies the column names for the data
  3. index specifies the row labels
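As a minimal sketch of the three parameters working together, the snippet below builds a small DataFrame from a list of rows. The names rows and sample, and the two records, are illustrative assumptions and not part of the original example. The keyword form is used because the positional order of the constructor is data, index, columns.

```python
import pandas

# hypothetical sample rows: each inner list is one record
rows = [['foo-23', 'ground-nut oil', 567.00],
        ['foo-13', 'almonds', 562.56]]

# columns names the attributes, index labels the rows
sample = pandas.DataFrame(rows,
                          columns=['id', 'name', 'cost'],
                          index=['item-1', 'item-2'])

# display the dataframe
print(sample)
```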

We can also create a DataFrame from a dictionary. The dictionary keys become the column names, so columns can be skipped, and index is optional.

Let’s see an example.

Example:

Python program to create a DataFrame of food items from a dictionary, specifying custom row labels.

python
#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

#display the dataframe
print(dataframe)

Output:

            id            name    cost  quantity
item-1  foo-23  ground-nut oil  567.00         1
item-2  foo-13         almonds  562.56         2
item-3  foo-02           flour   67.00         3
item-4  foo-31         cereals   76.09         2

Method 1 : Select column using column name with "." operator

In this method we select a column by writing its name after the . operator on the DataFrame.

The result is a Series. Printing it shows the row labels and values, followed by the column name and data type.

Syntax:

python
dataframe.column

where,

  1. dataframe is the input dataframe
  2. column is the column name

Example 1: In this example we select the id and name columns

python
#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

#display id column from the dataframe
print(dataframe.id)

print()

#display name column from the dataframe
print(dataframe.name)

Output:

item-1    foo-23
item-2    foo-13
item-3    foo-02
item-4    foo-31
Name: id, dtype: object

item-1    ground-nut oil
item-2           almonds
item-3             flour
item-4           cereals
Name: name, dtype: object

Example 2: In this example we select the cost and quantity columns

python
#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

#display cost column from the dataframe
print(dataframe.cost)

print()

#display quantity column from the dataframe
print(dataframe.quantity)

Output:

item-1    567.00
item-2    562.56
item-3     67.00
item-4     76.09
Name: cost, dtype: float64

item-1    1
item-2    2
item-3    3
item-4    2
Name: quantity, dtype: int64
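One caveat with the "." operator is worth a quick sketch: it only works when the column name is a valid Python identifier that does not clash with an existing DataFrame attribute or method. The DataFrame tricky and its columns count and unit price are hypothetical, chosen only to illustrate the limitation.

```python
import pandas

# hypothetical columns chosen to show where the "." operator falls short
tricky = pandas.DataFrame({'count': [1, 2, 3],
                           'unit price': [5.0, 6.5, 7.0]})

# 'count' is also a DataFrame method, so the attribute resolves to the method
print(callable(tricky.count))

# bracket notation still returns the column
print(tricky['count'].tolist())

# a name containing a space cannot be written after "." at all
print(tricky['unit price'].tolist())
```

In cases like these, the [] notation is the safer choice.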

Method 2 : Select column using column name with []

In this method we select a column by placing its name, as a string, inside [] after the DataFrame.

The result is a Series. Printing it shows the row labels and values, followed by the column name and data type.

Syntax:

python
dataframe['column']

where,

  1. dataframe is the input dataframe
  2. column is the column name

Example 1: In this example we select the id and name columns

python
#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

#display id column from the dataframe
print(dataframe['id'])

print()

#display name column from the dataframe
print(dataframe['name'])

Output:

item-1    foo-23
item-2    foo-13
item-3    foo-02
item-4    foo-31
Name: id, dtype: object

item-1    ground-nut oil
item-2           almonds
item-3             flour
item-4           cereals
Name: name, dtype: object

Example 2: In this example we select the cost and quantity columns

python
#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

#display cost column from the dataframe
print(dataframe['cost'])

print()

#display quantity column from the dataframe
print(dataframe['quantity'])

Output:

item-1    567.00
item-2    562.56
item-3     67.00
item-4     76.09
Name: cost, dtype: float64

item-1    1
item-2    2
item-3    3
item-4    2
Name: quantity, dtype: int64
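The [] notation can also take a list of column names, which selects several columns at once. The sketch below reuses the food data from this tutorial; the variable name subset is an assumption for illustration. A single name returns a Series, while a list of names returns a DataFrame.

```python
import pandas

# consider the food data
food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}

dataframe = pandas.DataFrame(food_input,
                             index=['item-1', 'item-2', 'item-3', 'item-4'])

# a list of names inside [] returns a DataFrame with those columns,
# in the order given
subset = dataframe[['name', 'cost']]
print(subset)

# a single name returns a Series, a list of names returns a DataFrame
print(type(dataframe['cost']).__name__)
print(type(subset).__name__)
```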

Method 3 : Get all column names using columns attribute

In this method we get only the names of all columns using the columns attribute. Note that columns is an attribute, not a method, so it is written without parentheses.

Syntax:

python
dataframe.columns

where, dataframe is the input dataframe

Example:

python
#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

#display all columns 
print(dataframe.columns)

Output:

Index(['id', 'name', 'cost', 'quantity'], dtype='object')
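Since columns returns an Index object, it can be converted to a plain list or used to check whether a column exists. The following is a small sketch using the same food data; the variable name column_names and the missing column price are assumptions for illustration.

```python
import pandas

# consider the food data
food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}

dataframe = pandas.DataFrame(food_input,
                             index=['item-1', 'item-2', 'item-3', 'item-4'])

# convert the Index of column names to a plain Python list
column_names = list(dataframe.columns)
print(column_names)

# check whether a column exists before selecting it
print('cost' in dataframe.columns)
print('price' in dataframe.columns)
```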

Method 4 : Get all the columns information using info() method

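The info() method shows the data type and the number of non-null values of each column, along with the index range and memory usage. It prints this summary directly and returns None, which is why wrapping the call in print() adds a final None line to the output below.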

Syntax:

python
dataframe.info()

Example:

python
#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

#display all columns information
print(dataframe.info())

Output:

<class 'pandas.core.frame.DataFrame'>
Index: 4 entries, item-1 to item-4
Data columns (total 4 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   id        4 non-null      object 
 1   name      4 non-null      object 
 2   cost      4 non-null      float64
 3   quantity  4 non-null      int64  
dtypes: float64(1), int64(1), object(2)
memory usage: 160.0+ bytes
None
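If the summary is needed as a string rather than printed output, one option is the buf parameter of info(), which accepts a writable buffer. The sketch below is one possible approach; the names buffer, result and summary are assumptions for illustration. The dtypes attribute is shown as a lighter alternative when only the data types are needed.

```python
import io
import pandas

# consider the food data
food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}

dataframe = pandas.DataFrame(food_input,
                             index=['item-1', 'item-2', 'item-3', 'item-4'])

# write the summary into a text buffer instead of the console
buffer = io.StringIO()
result = dataframe.info(buf=buffer)
summary = buffer.getvalue()

# info() itself returns None; the text lives in the buffer
print(result)
print(summary)

# dtypes returns only the data type of each column, as a Series
print(dataframe.dtypes)
```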

Method 5 : Describe the column statistics using describe() method

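This method returns summary statistics such as count, mean, standard deviation, minimum, quartiles and maximum. By default only numeric columns are included, which is why id and name do not appear in the output below.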

Syntax:

python
dataframe.describe()

Example: Get the statistics for the columns of the above DataFrame

python
#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

#display all columns statistics information
print(dataframe.describe())

Output:

             cost  quantity
count    4.000000  4.000000
mean   318.162500  2.000000
std    284.799307  0.816497
min     67.000000  1.000000
25%     73.817500  1.750000
50%    319.325000  2.000000
75%    563.670000  2.250000
max    567.000000  3.000000
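To include the text columns as well, describe() accepts an include parameter, and it can also be called on a single selected column. The sketch below shows both on the same food data; the variable names all_stats and cost_stats are assumptions for illustration.

```python
import pandas

# consider the food data
food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}

dataframe = pandas.DataFrame(food_input,
                             index=['item-1', 'item-2', 'item-3', 'item-4'])

# include='all' adds the text columns; they get count, unique, top and freq,
# and NaN for the numeric-only statistics
all_stats = dataframe.describe(include='all')
print(all_stats)

# statistics for one selected column only
cost_stats = dataframe['cost'].describe()
print(cost_stats)
```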

Method 6 : Select particular value in a column

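Here, we select a particular value from a column chosen with one of the methods above, using the position of the value. Positions start from 0. Because this DataFrame uses string labels as its index, a plain integer inside [] is treated as a position only in older pandas versions; recent versions deprecate that fallback, so .iloc[position] is the explicit way to select a value by position.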

Syntax:

python
dataframe['column'].iloc[position]

where,

  1. column is the column name
  2. position refers to the value position

Example:

python
#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

#display the first value of the cost column
print(dataframe['cost'].iloc[0])

#display the second value of the quantity column
print(dataframe['quantity'].iloc[1])

#display the first value of the id column
print(dataframe['id'].iloc[0])

#display the second value of the name column
print(dataframe['name'].iloc[1])

#display the third value of the id column
print(dataframe['id'].iloc[2])

Output:

567.0
2
foo-23
almonds
foo-02

Example: We can also select a value from a column chosen with the . operator

python
#import the module
import pandas

#consider the food data
food_input={'id':['foo-23','foo-13','foo-02','foo-31'],
                  'name':['ground-nut oil','almonds','flour','cereals'],
                  'cost':[567.00,562.56,67.00,76.09],
                  'quantity':[1,2,3,2]}

#pass this food to the dataframe by specifying rows 
dataframe=pandas.DataFrame(food_input,index = ['item-1', 'item-2', 'item-3', 'item-4'])

#display the first value of the cost column
print(dataframe.cost.iloc[0])

#display the second value of the quantity column
print(dataframe.quantity.iloc[1])

#display the first value of the id column
print(dataframe.id.iloc[0])

#display the second value of the name column
print(dataframe.name.iloc[1])

#display the third value of the id column
print(dataframe.id.iloc[2])

Output:

567.0
2
foo-23
almonds
foo-02
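A value can also be selected by its row label instead of its position. The following is a minimal sketch using the loc, at and iat accessors on the same food data; it is one possible alternative and is not taken from the examples above.

```python
import pandas

# consider the food data
food_input = {'id': ['foo-23', 'foo-13', 'foo-02', 'foo-31'],
              'name': ['ground-nut oil', 'almonds', 'flour', 'cereals'],
              'cost': [567.00, 562.56, 67.00, 76.09],
              'quantity': [1, 2, 3, 2]}

dataframe = pandas.DataFrame(food_input,
                             index=['item-1', 'item-2', 'item-3', 'item-4'])

# select by row label and column name
print(dataframe.loc['item-1', 'cost'])

# at is a scalar-only accessor that also takes a row label and column name
print(dataframe.at['item-2', 'name'])

# iat takes a row position and a column position, both starting from 0
print(dataframe.iat[2, 0])
```

Label-based selection keeps working if the rows are reordered, while position-based selection depends on the current row order.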

Summary

In this tutorial we discussed how to select particular columns from a DataFrame, and how to get the column names along with information such as data types and statistics. Finally, we saw how to get a particular value from a selected column using the previous methods.

