Welcome to my new tutorial, How To Create Dataframe In Python Using Pandas. In this tutorial, you will learn to create a pandas dataframe in Python.
In the previous tutorials, we learned what pandas is and looked at its different data structures. Let’s check them first.
Pandas is a very popular library for data science. DataFrame is one of its data structures. DataFrames are widely used in data science, machine learning, scientific computing, and many other data-intensive fields.
We will discuss in detail later how to create a data frame in Python. But in short, you can create a pandas dataframe using the following syntax.
pandas.DataFrame(data, index, columns, dtype, copy)
For example:
import pandas as pd
data = [420, 380, 390]
# load data into a DataFrame object:
df = pd.DataFrame(data)
print(df)
So before learning to create a dataframe in different ways, let’s first take a quick look at the basics of dataframes.
Dataframe – An Important Pandas Data Structure
What is a DataFrame in pandas?
- DataFrame is the most important data structure of pandas.
- It is a 2-dimensional data structure.
- In a dataframe, data is stored in tabular format, in rows and columns.
- A pandas DataFrame consists of three main components: the data, the index, and the columns.
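The three components listed above can be inspected directly on any dataframe. The following is a minimal sketch; the sample values, row labels and column names are made up for illustration only.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame(
    [[1, "Asha"], [2, "Ben"]],
    index=["r1", "r2"],
    columns=["ID", "Name"],
)

print(df.to_numpy())     # the data, as a NumPy array
print(list(df.index))    # the index (row labels)
print(list(df.columns))  # the columns (column labels)
```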
Features of dataframe
- Easier to use
- Faster
- More powerful than tables or spreadsheets
- Its size is mutable
- Can perform arithmetic operations on rows and columns
- Columns can potentially be of different types
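A few of these features can be seen in a short sketch. The item names, prices and quantities below are hypothetical sample data, not part of the original tutorial.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"Item": ["Pen", "Book"], "Price": [10, 250], "Qty": [3, 2]})

# Columns can hold different types (text and integers here)
print(df.dtypes)

# Arithmetic works on whole columns; assigning the result adds a new
# column, which shows that the size of the dataframe is mutable
df["Total"] = df["Price"] * df["Qty"]
print(df)
```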
Which method is used to make a DataFrame?
The pandas.DataFrame() constructor is used to make a pandas dataframe in Python.
How To Create Dataframe In Python Using Pandas?
Creating a data frame in Python using pandas is very simple. So let’s see the syntax for creating a dataframe.
Syntax
We can create a dataframe by using the following syntax.
pandas.DataFrame(data, index, columns, dtype, copy)
Let’s understand the parameters.
- data – The data from which the dataframe is created. It can take many forms, such as an ndarray, Series, list, dict, constants or another DataFrame.
- index – The row labels of the resulting frame. If no index is provided, it defaults to RangeIndex.
- columns – The column labels. If no column labels are provided, they default to RangeIndex(0, 1, 2, …, n).
- dtype – The data type to force on the data. Its default value is None, in which case the types are inferred.
- copy – Whether to copy the data from the inputs.
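To see how the parameters fit together, here is a small sketch that passes all five of them by keyword. The scores, row labels and column names are assumptions chosen for illustration.

```python
import pandas as pd

scores = [[90, 85], [70, 65]]  # hypothetical data

df = pd.DataFrame(
    data=scores,
    index=["x", "y"],             # row labels
    columns=["math", "physics"],  # column labels
    dtype="float64",              # force a single dtype on the data
    copy=True,                    # copy the input data
)
print(df)
print(df.dtypes)
```

Passing the arguments by keyword, as done here, keeps the call readable when several of them are used at once.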
How do you create a data frame In Python?
We have seen the syntax for creating a dataframe, so now let’s create a simple one. Run the following code to create a dataframe.
import pandas as pd
data = [8, 9, 5, 4]
df = pd.DataFrame(data)
print("Dataframe Example")
print(df)
- First of all, we imported the pandas module as pd.
- Then we created the data from which the dataframe is built. This data is a list.
- Then we created the dataframe using pd.DataFrame(data) and stored it in the variable df.
- Then we printed the dataframe.
Output
Creating Dataframe With Custom Index and Columns
To create a dataframe with a custom index and columns, we have to pass the index and columns arguments to pandas.DataFrame(). So let’s write the code.
import pandas as pd
data = [8, 9, 5, 4]
# pass custom row labels (index) and a column label
df = pd.DataFrame(data, index=[1, 2, 3, 4], columns=['a'])
print("Dataframe Example")
print(df)
- Here we have passed index and columns values.
- Make sure the number of index and column labels matches the shape of your data. For example, our list has 4 values, so we passed 4 index values, and it has only one column, so we passed only one column label.
Output
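As a quick check of the point about matching lengths, the sketch below deliberately passes only 3 index labels for 4 values. In this case pandas raises a ValueError; the variable name mismatch_error is just an illustrative choice.

```python
import pandas as pd

data = [8, 9, 5, 4]

try:
    # Only 3 index labels for 4 values: this does not match the data
    pd.DataFrame(data, index=[1, 2, 3], columns=["a"])
    mismatch_error = None
except ValueError as err:
    mismatch_error = err

print("Error raised:", mismatch_error)
```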
So far we have learned to create a pandas dataframe in Python. This was just one simple way to do it. Now we will learn different ways of creating a pandas dataframe, one by one.
Different Ways Of Creating Pandas Dataframe In Python
We can create a pandas dataframe from a dictionary, a list, an existing dataframe and many more sources. Let’s learn them practically with examples.
How To Create Pandas Empty DataFrame?
We can also create an empty dataframe in Python. This is a very basic dataframe which can be created in just 3 lines of code. So first of all, let’s answer the question: how can you create an empty DataFrame in pandas?
Write the following code to create an empty dataframe.
import pandas as pd
df = pd.DataFrame()
print(df)
- To create an empty dataframe we don’t need to pass any data.
Output
Now let’s see an example of creating an empty pandas dataframe with column names.
import pandas as pd
df = pd.DataFrame(columns=['a', 'b', 'c'])
print(df)
- Here we have only passed the column names. Let’s see its output.
Output
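Both kinds of empty dataframe can also be checked in code rather than by reading the printed result. This is a minimal sketch; the names empty_df and named_df are illustrative.

```python
import pandas as pd

empty_df = pd.DataFrame()
named_df = pd.DataFrame(columns=["a", "b", "c"])

# .empty is True when the dataframe has no rows or no columns
print(empty_df.empty, empty_df.shape)  # no rows, no columns
print(named_df.empty, named_df.shape)  # three columns, but still no rows
```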
How To Create Pandas Dataframe From Dictionary?
Here we will create a dataframe from a simple dictionary, i.e. a dictionary whose keys map to simple values such as integers or strings.
import pandas as pd
# named fruit_prices rather than dict, so the built-in dict is not shadowed
fruit_prices = {'Mango': 100, 'Apple': 500, 'Banana': 400}
# each (key, value) pair becomes one row
df = pd.DataFrame(list(fruit_prices.items()))
print(df)
- At first, we imported the pandas module.
- Then we created a simple dictionary, i.e. one with keys and values.
- Then we created a dataframe from the list of (key, value) tuples.
- And at last we printed the dataframe.
Output
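Because no column names are passed, the columns of this dataframe are labelled 0 and 1. A possible variation is to name them with the columns argument; the names "Fruit" and "Price" below are illustrative choices, not part of the original example.

```python
import pandas as pd

fruit_prices = {"Mango": 100, "Apple": 500, "Banana": 400}

# Column names "Fruit" and "Price" are illustrative choices
df = pd.DataFrame(list(fruit_prices.items()), columns=["Fruit", "Price"])
print(df)
```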
Know More : Different Ways To Create Pandas Dataframe From Dictionary
How To Create Pandas DataFrame from Dicts of series?
import pandas as pd
s1 = pd.Series([100, 200, 300], index=['a', 'b', 'c'])
s2 = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])
s3 = pd.Series(['Apples', 'Cars', 'Cats', 'Laptops'], index=['a', 'b', 'c', 'd'])
# dictionary of series: the keys become the column names
series_dict = {'Numbers': s1, 'ID': s2, 'Name': s3}
df = pd.DataFrame(series_dict)
print(df)
- We created 3 series with index values.
- Then we created a dictionary of series.
- Then we passed this dictionary of series to pd.DataFrame() to create the dataframe. The resulting index is the union of all the series indexes passed.
- At last we printed the dataframe.
The series can also be defined directly inside the dictionary:
import pandas as pd
series_dict = {
    'Numbers': pd.Series([100, 200, 300], index=['a', 'b', 'c']),
    'Id': pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd']),
    'Name': pd.Series(['Apples', 'Cars', 'Cats', 'Laptops'], index=['a', 'b', 'c', 'd']),
}
df = pd.DataFrame(series_dict)
print(df)
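The union of indexes mentioned above has a visible consequence: the Numbers series has no value for label 'd', so pandas fills that cell with NaN. A small sketch using two of the series from the example:

```python
import pandas as pd

s1 = pd.Series([100, 200, 300], index=["a", "b", "c"])
s2 = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])

df = pd.DataFrame({"Numbers": s1, "ID": s2})

print(list(df.index))        # union of both indexes
print(df["Numbers"].isna())  # True only for label 'd', which s1 lacks
```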
How To Create Pandas Dataframe From List?
Creating a pandas dataframe from a list is very simple. Write and execute the code below to create one.
import pandas as pd
# named numbers rather than list, so the built-in list is not shadowed
numbers = [4, 5, 7, 3, 5]
df = pd.DataFrame(numbers)
print(df)
- Here we created a list and then passed it to pd.DataFrame() as an argument to create the dataframe.
- And at last we printed the dataframe.
Output
Know More : Different Ways To Create Pandas Dataframe From List In Python
How To Create Pandas DataFrame From List of Lists?
We can also create a pandas dataframe from a list of lists. To do this, we create a list of lists and pass it as an argument to the DataFrame constructor. Each inner list becomes one row. Let’s see the code.
import pandas as pd
list_of_list = [['Python', 100], ['Data Science', 200], ['Machine Learning', 800]]
df = pd.DataFrame(list_of_list)
print(df)
Output
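Without column names, the columns are labelled 0 and 1. As a sketch, the same list of lists can be given readable labels through the columns argument; the labels 'Books' and 'Price' are the ones this tutorial uses in a later example.

```python
import pandas as pd

list_of_list = [["Python", 100], ["Data Science", 200], ["Machine Learning", 800]]

# Each inner list is one row; columns names the two fields
df = pd.DataFrame(list_of_list, columns=["Books", "Price"])
print(df)
print(df.dtypes)  # Price is numeric because the list holds real integers
```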
How To Create Pandas Dataframe From NumPy Array?
import pandas as pd
import numpy as np
numpy_array = np.array([['Python', 100], ['Data Science', 200], ['Machine Learning', 800]])
df = pd.DataFrame(numpy_array)
print(df)
- First of all, we imported two modules, pandas and numpy.
- Then we created a NumPy array using the array() function of numpy, passing it a nested list.
- Then we passed this NumPy array to the DataFrame constructor as the data argument.
- Finally we printed the dataframe.
Output
Creating Pandas Dataframe From NumPy Array With Custom Columns and Index
To create a dataframe from a NumPy array with custom columns and index, we pass the index and columns arguments to the DataFrame constructor.
import pandas as pd
import numpy as np
numpy_array = np.array([['Python', 100], ['Data Science', 200], ['Machine Learning', 800]])
df = pd.DataFrame(numpy_array, index=[1, 2, 3], columns=['Books', 'Price'])
print(df)
Output
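One detail worth knowing about this example: a NumPy array holds a single data type, so mixing text and numbers in np.array() stores the numbers as strings. The sketch below checks this and converts the Price column back to integers with astype(); the variable name first_price is illustrative.

```python
import pandas as pd
import numpy as np

numpy_array = np.array([["Python", 100], ["Data Science", 200], ["Machine Learning", 800]])
df = pd.DataFrame(numpy_array, columns=["Books", "Price"])

# The numbers were stored as text inside the NumPy array
first_price = df.loc[0, "Price"]
print(isinstance(first_price, str))

# Convert the column to integers before doing arithmetic on it
df["Price"] = df["Price"].astype(int)
print(df["Price"].sum())
```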
How To Create Pandas DataFrame From CSV File?
To create a pandas dataframe from a CSV file, we write the following code.
import pandas as pd
df = pd.read_csv("employee.csv")
print(df)
- Pandas provides the read_csv() function to create a dataframe from a CSV file.
- We have an employee.csv file, which we passed to read_csv().
- Then we printed the dataframe.
Output
The first row of the CSV file is treated as the column names of the resulting dataframe. Here, ID, Name, City and Designation were the first row of the CSV file, and in the dataframe they became the column names.
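To try this without a file on disk, the sketch below reads CSV text from memory using io.StringIO, which read_csv() accepts in place of a file path. The header row matches the one described above, but the two data rows are hypothetical and do not come from the original employee.csv.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for employee.csv
csv_text = """ID,Name,City,Designation
1,Asha,Pune,Developer
2,Ben,Delhi,Tester
"""

# read_csv accepts a file path or a file-like object such as StringIO
df = pd.read_csv(io.StringIO(csv_text))
print(list(df.columns))  # the first row became the column names
print(df)
```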
Know More – Different Ways To Create Dataframe From CSV File In Python
How To Create New Dataframe From Existing Dataframe Pandas?
We can create a new dataframe from existing dataframes. The pandas.concat() function is used to concatenate dataframes either vertically or horizontally, as required.
Let’s see an example of creating a new dataframe from existing dataframes.
import pandas as pd
list_of_list = [['Python', 100], ['Data Science', 200], ['Machine Learning', 800]]
# creating first dataframe
df1 = pd.DataFrame(list_of_list, columns=['Books', 'Price'])
pages = [100, 800, 900]
# creating second dataframe
df2 = pd.DataFrame(pages, columns=['Pages'])
# concatenating the two dataframes side by side (axis=1)
df_joined = pd.concat([df1, df2], axis=1)
print(df_joined)
Output
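The example above joins the dataframes horizontally with axis=1. For the vertical case, a minimal sketch could look like the following; the two single-row dataframes are cut-down versions of the sample data, and ignore_index=True is used so the rows are renumbered from 0.

```python
import pandas as pd

df1 = pd.DataFrame([["Python", 100]], columns=["Books", "Price"])
df2 = pd.DataFrame([["Data Science", 200]], columns=["Books", "Price"])

# axis=0 stacks the rows; ignore_index=True renumbers them 0, 1, ...
df_stacked = pd.concat([df1, df2], axis=0, ignore_index=True)
print(df_stacked)
```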
So guys, I am wrapping up this tutorial on how to create a dataframe in Python using pandas. I hope you have found it helpful. If you still have any doubts about this tutorial, feel free to ask your questions in the comment section.