How To Create Dataframe In Python Using Pandas



Welcome to my new tutorial, How To Create Dataframe In Python Using Pandas. In it, you will learn several ways to create a pandas dataframe in Python.

In the previous tutorials, we learned what pandas is and which data structures it offers. Let’s recap them first.

Pandas is a very popular library for data science, and DataFrame is one of its data structures. DataFrames are widely used in data science, machine learning, scientific computing, and many other data-intensive fields.

We will discuss how to create a dataframe in Python in detail later. The short answer is that you can create a pandas dataframe using the following syntax.

pandas.DataFrame( data, index, columns, dtype, copy)

For example:

import pandas as pd

data = [420, 380, 390]
  
#load data into a DataFrame object:
df = pd.DataFrame(data)

print(df)

Before learning the different ways to create a dataframe, let’s take a quick look at the basics of dataframes.

Dataframe – An Important Pandas Data Structure

What is DataFrame in pandas?

  • DataFrame is one of the most important data structures of pandas.
  • It is a 2-dimensional data structure.
  • In a dataframe, data is stored in a tabular format of rows and columns.
  • A pandas DataFrame consists of three main components: the data, the index, and the columns.

Features of dataframe

  • Easy to use
  • Fast
  • More powerful than plain tables or spreadsheets
  • Its size is mutable
  • Arithmetic operations can be performed on rows and columns
  • Columns can be of different types

Which method is used to make a DataFrame?

The pandas.DataFrame() constructor is used to make a pandas dataframe in Python.

How To Create Dataframe In Python Using Pandas?

Creating a dataframe in Python using pandas is very simple. Let’s look at the syntax first.

Syntax

We can create a dataframe using the following syntax.

pandas.DataFrame(data, index, columns, dtype, copy)

Let’s understand the parameters.

  • data – The data from which the dataframe is created. It can take many forms, such as an ndarray, series, map, list, dict, constant or another dataframe.
  • index – The row labels of the resulting frame. If no index is provided, it defaults to RangeIndex(0, 1, 2, …, n).
  • columns – The column labels of the resulting frame. If no column labels are provided, they default to RangeIndex(0, 1, 2, …, n).
  • dtype – The data type to force on the data. Its default value is None, in which case the type is inferred.
  • copy – Whether to copy the data from the inputs.
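
The parameters above can be tried together in one call. The following is only a minimal sketch: the sample values, the row labels 'r1' to 'r3' and the column name 'marks' are made up for illustration.

```python
import pandas as pd

# Illustrative values only; any list of numbers would do.
data = [420, 380, 390]

df = pd.DataFrame(
    data,
    index=["r1", "r2", "r3"],  # row labels (hypothetical names)
    columns=["marks"],         # column label (hypothetical name)
    dtype="float64",           # force the data to this type
)

print(df)
print(df.dtypes)
```

Leaving out index, columns and dtype gives the default labels and an inferred type, which is what the next examples rely on.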

How do you create a data frame In Python?

We have seen the syntax for creating a dataframe, so now let’s create a simple one. Run the following code to create a dataframe.

import pandas as pd
data = [8,9,5,4]
df = pd.DataFrame(data)
print("Dataframe Example")
print(df)
  • First of all, we imported the pandas module as pd.
  • Then we created the data from which the dataframe is built. Here the data is a list.
  • Then we created the dataframe using pd.DataFrame(data) and stored it in the variable df.
  • Finally, we printed the dataframe.

Output


In the above output, you can see that our dataframe has the default index and columns because we have not passed the index and columns arguments. Let’s create a dataframe with a custom index and columns.

Creating Dataframe With Custom Index and Columns

To create a dataframe with a custom index and columns, we have to pass the index and columns arguments to pandas.DataFrame(). So let’s write the code.

import pandas as pd 
data = [8,9,5,4]
df = pd.DataFrame(data, index=[1,2,3,4], columns=['a'])
print("Dataframe Example")
print(df)
  • Here we have passed the index and columns values.
  • Make sure the number of index labels and column labels matches the shape of your data. For example, our list has 4 values, so we have passed 4 index labels, and it forms only one column, so we have passed only one column label.

Output

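
To see why the number of labels has to match the data, here is a small sketch that deliberately passes too few index labels. The variable name mismatch_rejected is just a helper for this illustration.

```python
import pandas as pd

data = [8, 9, 5, 4]

try:
    # Only 3 index labels for 4 values, so pandas rejects the call.
    pd.DataFrame(data, index=[1, 2, 3], columns=["a"])
    mismatch_rejected = False
except ValueError as err:
    mismatch_rejected = True
    print("Could not build the dataframe:", err)
```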

So far we have learned the simplest way to create a pandas dataframe in Python. Now we will learn the different ways of creating one. Let’s go through them one by one.

Different Ways Of Creating Pandas Dataframe In Python

We can create a pandas dataframe from a dictionary, a list, an existing dataframe and more. Let’s learn them practically with examples.

How To Create Pandas Empty DataFrame?

We can also create an empty dataframe in Python. This is the most basic dataframe, and it takes just 3 lines of code. So first of all we will answer this query: how can you create an empty DataFrame in pandas?

Write the following code to create an empty dataframe.

import pandas as pd 
df = pd.DataFrame()
print(df)
  • To create an empty dataframe we don’t need to pass any data.

Output


Now let’s see an example of creating an empty pandas dataframe with column names.

import pandas as pd 
df = pd.DataFrame(columns=['a','b','c'])
print(df)
  • Here we have passed only the column labels. Let’s see its output.

Output

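
An empty dataframe can also be checked in code rather than by printing it. A minimal sketch using the same column names as above:

```python
import pandas as pd

df = pd.DataFrame(columns=["a", "b", "c"])

print(df.empty)          # True because the dataframe has no rows
print(len(df))           # number of rows
print(list(df.columns))  # the column labels are still there
```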

How To Create Pandas Dataframe From Dictionary?

Here we will create a dataframe from a simple dictionary, i.e. a dictionary whose values are simple types such as integers or strings.

import pandas as pd 

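# Avoid naming the variable "dict", which would shadow the built-in type.
fruit_prices = {
    'Mango' : 100,
    'Apple' : 500,
    'Banana' : 400
    }

# Each (key, value) pair becomes one row.
df = pd.DataFrame(list(fruit_prices.items()))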

print(df)
  • At first, we imported the pandas module.
  • Then we created a simple dictionary of keys and values.
  • Then we created a dataframe from a list of (key, value) tuples.
  • At last, we printed the dataframe.

Output

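
The dataframe built this way gets the default column labels 0 and 1. As a small sketch, the columns argument can name them instead. The labels 'Fruit' and 'Price' and the variable name fruit_prices are illustrative choices.

```python
import pandas as pd

fruit_prices = {"Mango": 100, "Apple": 500, "Banana": 400}

# Name the two columns instead of accepting the defaults 0 and 1.
df = pd.DataFrame(list(fruit_prices.items()), columns=["Fruit", "Price"])

print(df)
```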

Know More : Different Ways To Create Pandas Dataframe From Dictionary

How To Create Pandas DataFrame from Dicts of series?

A DataFrame can be created from a dict of series in the following ways.
import pandas as pd 

s1 = pd.Series([100, 200, 300],index =['a', 'b', 'c'])
s2 = pd.Series([10, 20, 30, 40],index =['a', 'b', 'c', 'd'])
s3 = pd.Series(['Apples', 'Cars', 'Cats', 'Laptops'],
                        index =['a', 'b', 'c', 'd'])

# Avoid naming the variable "dict", which would shadow the built-in type.
series_dict = {'Numbers' : s1,
               'ID' : s2,
               'Name' : s3
              }

df = pd.DataFrame(series_dict)
print(df)
  • We created 3 series with index values.
  • Then we created a dictionary of series.
  • Then we passed this dictionary of series to pd.DataFrame() to create the dataframe. The resulting index is the union of all the series indexes passed.
  • At last, we printed the dataframe.
We can also write the above code in the following way.
import pandas as pd 

# Same dataframe as before, with the series created inline.
series_dict = {'Numbers' : pd.Series([100, 200, 300],
                                     index=['a', 'b', 'c']),
               'ID' : pd.Series([10, 20, 30, 40],
                                index=['a', 'b', 'c', 'd']),
               'Name' : pd.Series(['Apples', 'Cars', 'Cats', 'Laptops'],
                                  index=['a', 'b', 'c', 'd'])}

df = pd.DataFrame(series_dict)

print(df)
Output
You can see that the first series has no label ‘d’, so in the resulting dataframe the Numbers column holds NaN for the label d.
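
The index union and the NaN fill can be verified directly. This is a minimal sketch that reuses two of the series from the example above:

```python
import pandas as pd

s1 = pd.Series([100, 200, 300], index=["a", "b", "c"])
s2 = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])

df = pd.DataFrame({"Numbers": s1, "ID": s2})

print(list(df.index))        # union of both indexes
print(df["Numbers"].isna())  # True only for label 'd'
print(df.dtypes)             # 'Numbers' becomes float because of the NaN
```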

How To Create Pandas Dataframe From List?

Creating a pandas dataframe from a list is very simple. Write and execute the code below to create one.

import pandas as pd 

# Avoid naming the variable "list", which would shadow the built-in type.
numbers = [4, 5, 7, 3, 5]

df = pd.DataFrame(numbers)

print(df)
  • Here we created a list and passed it to pd.DataFrame() as an argument to create the dataframe.
  • At last, we printed the dataframe.

Output

Know More : Different Ways To Create Pandas Dataframe From List In Python

How To Create Pandas DataFrame From List of Lists?

We can also create a pandas dataframe from a list of lists. To do this, we create a list of lists and pass it as an argument to the DataFrame constructor; each inner list becomes one row. Let’s see the code.

import pandas as pd 

list_of_list = [['Python', 100], ['Data Science', 200], ['Machine Learning', 800]]

df = pd.DataFrame(list_of_list)

print(df)

Output
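
The dataframe above gets the default column labels 0 and 1. As a small sketch, the same rows can be given named columns. The labels 'Books' and 'Price' are the ones used later in this tutorial, and the variable name rows is illustrative.

```python
import pandas as pd

rows = [["Python", 100], ["Data Science", 200], ["Machine Learning", 800]]

# Each inner list is one row; columns supplies the labels.
df = pd.DataFrame(rows, columns=["Books", "Price"])

print(df)
print(df.dtypes)  # 'Price' stays numeric because the rows are plain lists
```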

How To Create Pandas Dataframe From NumPy Array?

Now let’s learn how to create a pandas DataFrame from a NumPy array. Write the following code to do this.
import pandas as pd 
import numpy as np

numpy_array = np.array([['Python', 100], ['Data Science', 200], ['Machine Learning', 800]])

df = pd.DataFrame(numpy_array)

print(df)
  • First of all, we imported two modules, pandas and numpy.
  • Then we created a NumPy array using the array() function of numpy, passing it a nested list.
  • Then we passed this NumPy array to the DataFrame constructor as the data argument.
  • Finally, we printed the dataframe.

Output
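
One detail worth knowing: a NumPy array stores a single data type, so mixing strings and numbers in np.array() turns the numbers into strings. The sketch below checks this and converts the column back to integers. The column labels are illustrative.

```python
import numpy as np
import pandas as pd

numpy_array = np.array([["Python", 100], ["Data Science", 200], ["Machine Learning", 800]])

print(numpy_array.dtype)  # one string dtype for the whole array

df = pd.DataFrame(numpy_array, columns=["Books", "Price"])

# The prices arrived as strings, so convert them back to integers.
df["Price"] = df["Price"].astype(int)

print(df.dtypes)
```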

Creating Pandas Dataframe From NumPy Array With Custom Columns and Index

To create a dataframe from a NumPy array with custom columns and index, we have to pass the index and columns arguments to the DataFrame constructor.

import pandas as pd 
import numpy as np

numpy_array = np.array([['Python', 100], ['Data Science', 200], ['Machine Learning', 800]])

df = pd.DataFrame(numpy_array, index=[1,2,3], columns=['Books', 'Price'])

print(df)

Output

How To Create Pandas DataFrame From CSV File?

To create a pandas dataframe from a CSV file, we have to write the following code.

import pandas as pd 

df = pd.read_csv("employee.csv") 

print(df)
  • Pandas provides the read_csv() function to create a dataframe from a CSV file.
  • We have an employee.csv file, whose path we have passed to read_csv().
  • Then we printed the dataframe.

Output

The first row of the CSV file is treated as the column labels of the resulting dataframe. Here ID, Name, City and Designation were the first row in the CSV file, and they became the columns of the dataframe.
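
If you do not have an employee.csv file at hand, the header behaviour can be tried with in-memory text. This is only a sketch: the two data rows are invented for illustration, and io.StringIO stands in for the file.

```python
import io
import pandas as pd

# Stand-in for employee.csv; the rows below are made-up sample data.
csv_text = """ID,Name,City,Designation
1,Asha,Pune,Developer
2,Ravi,Delhi,Tester
"""

df = pd.read_csv(io.StringIO(csv_text))

print(list(df.columns))  # the first row became the column labels
print(df)
```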

Know More – Different Ways To Create Dataframe From CSV File In Python

How To Create New Dataframe From Existing Dataframe Pandas?

We can create a new dataframe from existing dataframes. The pandas.concat() function concatenates dataframes either vertically or horizontally, as required.

Let’s see an example of creating a new dataframe from existing dataframes.

import pandas as pd 

list_of_list = [['Python', 100], ['Data Science', 200], ['Machine Learning', 800]]

#creating first dataframe
df1 = pd.DataFrame(list_of_list, columns=['Books','Price'])


# Avoid naming the variable "list", which would shadow the built-in type.
pages = [100, 800, 900]

#creating second dataframe
df2 = pd.DataFrame(pages, columns=['Pages'])


#Concatenating two dataframes
df_joined = pd.concat([df1, df2], axis=1)


print(df_joined)

Output
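
The example above joins the dataframes side by side with axis=1. For the vertical case, a minimal sketch could look like this. The two one-row dataframes and the name df_stacked are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame([["Python", 100]], columns=["Books", "Price"])
df2 = pd.DataFrame([["Data Science", 200]], columns=["Books", "Price"])

# axis=0 (the default) stacks rows; ignore_index=True renumbers them from 0.
df_stacked = pd.concat([df1, df2], axis=0, ignore_index=True)

print(df_stacked)
```

Without ignore_index=True, both rows would keep their original label 0.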

So guys, that wraps up this tutorial on how to create a dataframe in Python using pandas. I hope you found it helpful. If you still have any doubts, feel free to ask your questions in the comment section.
