How To Read And Write CSV File In Python Pandas



Hello coders, today we will learn how to read and write CSV files in Python using pandas. In this tutorial, you will learn what a CSV file is and the different ways of reading and writing CSV files with the pandas module.

CSV is one of the most common file formats for exchanging data. In data science we deal with CSV files most of the time, so pandas plays an important role in working with them. Pandas is one of the most popular Python libraries used in data science.

After going through this tutorial, you will know the following things:

  • What is CSV file
  • How to read csv file using pandas
  • How to write csv file using pandas
  • Different ways to read and write csv files

So without any delay, let’s get started. I hope you will find this tutorial very helpful.

Working With CSV File – A Quick Recap Of CSV File

Before learning to read and write CSV files in pandas, you should know some pandas basics, such as how to install it and how to use it. To learn more, visit this tutorial: What Is Pandas In Python.

CSV files are often used in data science, so let’s learn some basics of CSV files.

What Is A CSV File?

A CSV (Comma Separated Values) file is a common, simple, and popular way to store tabular data. The values are separated using a comma as the delimiter. Other characters such as semicolon (;), tab (\t), and colon (:) can also be used as the delimiter.

A CSV file is a compact way of representing tabular data. Within a line, the commas separate the values into individual columns. Each line of values is one row, and a new line starts a new row.
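As a small illustration of delimiters, the sketch below parses the same table written once with commas and once with semicolons. The sample text is made up for this example; sep is the read_csv() parameter that sets the delimiter.

```python
import io

import pandas as pd

# Hypothetical sample data, written once with commas and once with semicolons.
comma_text = "name,city\nJohn,London\nMathew,Paris\n"
semicolon_text = "name;city\nJohn;London\nMathew;Paris\n"

# The comma is the default delimiter; any other delimiter is passed through sep.
df_comma = pd.read_csv(io.StringIO(comma_text))
df_semicolon = pd.read_csv(io.StringIO(semicolon_text), sep=";")

# Both texts describe the same table, so the two dataframes compare equal.
print(df_comma.equals(df_semicolon))
```

io.StringIO is used here only so that the example runs without a file on disk; read_csv() accepts a file path in the same position.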

Representation Of A CSV File

So now you might wonder what a CSV file looks like. To find out, let’s take a sample of the CSV file format.

ph,Hardness,Solids,Chloramines,Sulfate,Conductivity
204.8904554713363,20791.318980747026,7.300211873184757,368.51644134980336,564.3086541722439,10.379
3.71608007538699,129.42292051494425,18630.057857970347,6.635245883862,592.8853591348523,15.18001
8.099124189298397,224.23625939355776,19909.541732292393,9.275883602694089,418.6062130644815,650.99983
8.316765884214679,214.37339408562252,22018.417440775294,8.05933237743854,356.88613564305666,567.3377

You can see that in the above CSV file the values are separated with the comma delimiter.

Now that you have had a quick recap of CSV files, let’s jump into our main question: how do you read and write CSV files in Python using pandas?

How To Read And Write CSV File In Python Pandas

Python provides the built-in csv module to work with CSV files, but it only reads and writes plain rows of strings, so it is not convenient when we have to analyse a large amount of CSV data. In data science we work with large datasets, and for that the third-party library pandas is much more useful.

So in this tutorial, we will learn to read and write CSV files using the pandas library in Python. Let’s first learn to read a CSV file in pandas.

Here we will use the water_potability.csv file to illustrate this tutorial. You can download it from here. I am using Jupyter notebook for coding, so if you want to learn how to use Jupyter notebook then learn from here.

How To Read CSV File In Python Pandas?

Reading a CSV file in Python using pandas is very easy. Just write three lines of code and your work is done.

Pandas provides the read_csv() function to read a CSV file. read_csv() opens and parses the given CSV file and stores the data in a dataframe.

import pandas as pd

df = pd.read_csv("water_potability.csv")

df

What We Did?

  • First of all, we imported the pandas module under the alias pd.
  • Then we called read_csv() to read the CSV file, passing it the name of the file.
  • Here we passed only the file name because our CSV file is in the same directory as the notebook. If your CSV file is in a different location, you have to pass the full path of that file.
  • At the end, we displayed the dataframe.

The coding part is done, so now it is time to check the output.

[Image: the dataframe read from water_potability.csv]

In the above output, you can see pandas recognized that the first line of the CSV contained column names, and used them automatically.
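The header detection described above can be checked with a short sketch. It uses a few made-up values under three of the column names from the sample rather than the real water_potability.csv file, so the numbers are purely illustrative.

```python
import io

import pandas as pd

# Hypothetical three-column sample; the first line holds the column names.
csv_text = "ph,Hardness,Solids\n7.1,204.9,20791.3\n3.7,129.4,18630.1\n"

df = pd.read_csv(io.StringIO(csv_text))

# The first line became the column names, so only two data rows remain.
print(df.columns.tolist())
print(df.shape)
```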

Reading CSV File With Custom Headers

read_csv() reads the first row of the CSV file as the column names, but what if we want to use our own custom column names? To do that, we pass the names parameter of read_csv().

So let’s see how to do that.

Write the following code snippet and execute it.

import pandas as pd 

df = pd.read_csv(
    "water_potability.csv",
    names=["w_ph", "w_Hardness", "w_Solids", "w_Chloramines", "w_Sulfate",
           "w_Conductivity", "w_Organic_carbon", "w_Trihalomethanes",
           "w_Turbidity", "w_Potability"],
)

df
  • We just passed the names parameter as an argument to read_csv().

Now let’s see what its output looks like.

[Image: the dataframe with the custom column names]

Now you can see that our custom header is added to the dataframe. But the first line of the CSV file, which supplied the column names in the previous dataframe, is now treated as the first data row of this dataframe.
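A minimal sketch of this effect, using a made-up three-column sample instead of the real file: when names is passed on its own, the original header line is kept as data.

```python
import io

import pandas as pd

# Hypothetical sample with a header line followed by two data rows.
csv_text = "ph,Hardness,Solids\n7.1,204.9,20791.3\n3.7,129.4,18630.1\n"

df = pd.read_csv(io.StringIO(csv_text), names=["w_ph", "w_Hardness", "w_Solids"])

# The old header line is now row 0, so the frame has three rows, not two.
print(df.iloc[0].tolist())
print(len(df))
```

Because the header text sits in the same columns as the numbers, every column is read as text, which is another reason to remove that extra row.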

So now let’s fix this.

Removing Extra Header

We can remove the extra header row using the skiprows parameter of read_csv().

Write the following code to skip the first line of the CSV file.

import pandas as pd 

df = pd.read_csv(
    "water_potability.csv",
    names=["w_ph", "w_Hardness", "w_Solids", "w_Chloramines", "w_Sulfate",
           "w_Conductivity", "w_Organic_carbon", "w_Trihalomethanes",
           "w_Turbidity", "w_Potability"],
    skiprows=[0],
)

df

Let’s check its output.

[Image: the dataframe with custom column names and no extra header row]
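As an alternative that this tutorial does not use, read_csv() can also be told that line 0 is a header to be replaced by passing header=0 together with names. The sketch below, on made-up sample data, compares the two approaches.

```python
import io

import pandas as pd

# Hypothetical sample with a header line followed by two data rows.
csv_text = "ph,Hardness,Solids\n7.1,204.9,20791.3\n3.7,129.4,18630.1\n"
names = ["w_ph", "w_Hardness", "w_Solids"]

# Approach from the tutorial: drop line 0, then apply the custom names.
df_skip = pd.read_csv(io.StringIO(csv_text), names=names, skiprows=[0])

# Alternative: declare line 0 as the header and replace it with the custom names.
df_header = pd.read_csv(io.StringIO(csv_text), names=names, header=0)

print(df_skip.equals(df_header))
```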

We can also skip multiple rows at once. Let’s understand it with an example.

Write the following code and run it.

import pandas as pd 

df = pd.read_csv("water_potability.csv", skiprows=[1,2,4]) 

df
  • To skip multiple rows, we pass a list of line numbers to the skiprows parameter.
  • The line numbers are counted from 0 and line 0 is the header, so here we are skipping the first, second, and fourth data rows of the CSV file.
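The numbering can be confirmed on a small made-up file where each row carries its own position in an id column (the id and value columns are assumptions for this sketch only).

```python
import io

import pandas as pd

# Hypothetical sample: line 0 is the header, lines 1 to 5 are data rows.
csv_text = "id,value\n1,a\n2,b\n3,c\n4,d\n5,e\n"

df = pd.read_csv(io.StringIO(csv_text), skiprows=[1, 2, 4])

# Lines 1, 2 and 4 are dropped, so only the rows with id 3 and 5 remain.
print(df["id"].tolist())
```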

Output

The output is the following.

[Image: the dataframe with the skipped rows removed]

Reading CSV File Without Headers

To read a CSV file without headers, we use the header parameter. See the code below.

import pandas as pd 

df = pd.read_csv("water_potability.csv", header=None, skiprows=[0]) 

df
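  • We passed None to the header parameter, so pandas numbers the columns 0, 1, 2, and so on instead of taking names from the file.
  • We also passed skiprows to skip the first line of the CSV file, which contains the column names.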
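A short sketch on made-up sample data shows why both arguments are needed here: header=None on its own keeps the header line as a data row, while adding skiprows=[0] drops it.

```python
import io

import pandas as pd

# Hypothetical sample with a header line followed by two data rows.
csv_text = "ph,Hardness,Solids\n7.1,204.9,20791.3\n3.7,129.4,18630.1\n"

# header=None alone: the header line is read as an ordinary row.
df_all = pd.read_csv(io.StringIO(csv_text), header=None)

# header=None plus skiprows=[0]: the header line is dropped entirely.
df = pd.read_csv(io.StringIO(csv_text), header=None, skiprows=[0])

print(len(df_all), len(df))
print(df.columns.tolist())
```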

Output

Reading Particular Rows

Sometimes we encounter a very big CSV file and need to read only the first few rows. In such a case we use the nrows parameter of read_csv().

So let’s see how to do that.

import pandas as pd 

df = pd.read_csv("water_potability.csv", nrows=5) 

df
  • We passed the nrows parameter to read_csv() to read only a limited number of rows from the start of the CSV file.
  • We set nrows to 5 because we want to read just the first 5 data rows of the CSV file.

Output

You can see in the above output that only 5 rows are printed.
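The same behaviour can be reproduced without the real file. The sketch below builds a made-up ten-row CSV in memory and reads only its first five data rows.

```python
import io

import pandas as pd

# Hypothetical sample: a header line plus ten data rows with ids 1 to 10.
rows = ["id,value"] + [f"{i},{i * 10}" for i in range(1, 11)]
csv_text = "\n".join(rows)

df = pd.read_csv(io.StringIO(csv_text), nrows=5)

# The header line is not counted, so nrows=5 gives the rows with ids 1 to 5.
print(len(df))
print(df["id"].tolist())
```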

So guys, this was all about reading a CSV file in Python pandas. I hope you found the answer to your question: how do I read a CSV file using pandas in Python?

Now let’s continue with the second half of the tutorial. In the upcoming part you will learn how to write a CSV file in Python using the pandas library. So let’s get started without any delay.

How To Write CSV File In Python Pandas?

Pandas has the to_csv() method to write a CSV file. DataFrame.to_csv() writes the dataframe to a comma-separated values (CSV) file.

Let’s understand it with an example. Write the following code and run it.

import pandas as pd

df1 = pd.DataFrame({'name': ['John', 'Mathew'],
                   'city': ['Newyork', 'London'],
                   'designation': ['HR', 'MD']})

# to_csv() returns None when it is given a file path, so the result is not assigned
df1.to_csv("employee.csv")

What We Did?

  • We created a dataframe which contains information about employees.
  • Then we called the to_csv() method to write this dataframe to a CSV file named employee.csv.

Output

Now a CSV file is created in the current directory where our program is running. If you want to save the CSV file in another directory, you have to pass the full path of the file, including that directory.

Now let’s check the output. So here is the CSV file.
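For readers who want to inspect the result from code, here is a minimal sketch that writes the same dataframe and prints the file line by line. Writing into a temporary directory is a choice made for this example so that no file is left behind; the tutorial itself writes employee.csv into the current directory.

```python
import os
import tempfile

import pandas as pd

df1 = pd.DataFrame({'name': ['John', 'Mathew'],
                    'city': ['Newyork', 'London'],
                    'designation': ['HR', 'MD']})

# Write into a temporary directory and read the raw text back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "employee.csv")
    result = df1.to_csv(path)  # returns None when a path is given
    with open(path) as f:
        lines = f.read().splitlines()

for line in lines:
    print(line)
# ,name,city,designation
# 0,John,Newyork,HR
# 1,Mathew,London,MD
```

The first column has an empty header and holds the dataframe index, which to_csv() writes by default.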

Writing CSV File Without Index

In the above example, you can see the index values 0 and 1 in the file. If we want to exclude the index values, we can do it by using the index parameter of the to_csv() method.

Let’s see it practically. Write the following code to do it.

import pandas as pd

df1 = pd.DataFrame({'name': ['John', 'Mathew'],
                   'city': ['Newyork', 'London'],
                   'designation': ['HR', 'MD']})

df1.to_csv("employee.csv", index=False)
  • We set index to False, so the index values are not written to the CSV file.
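The effect of index=False can be seen without creating a file: when to_csv() is called without a path, it returns the CSV text as a string. The sketch below relies on that behaviour to print the result.

```python
import pandas as pd

df1 = pd.DataFrame({'name': ['John', 'Mathew'],
                    'city': ['Newyork', 'London'],
                    'designation': ['HR', 'MD']})

# With no path, to_csv() returns the CSV text instead of writing a file.
lines = df1.to_csv(index=False).splitlines()

for line in lines:
    print(line)
# name,city,designation
# John,Newyork,HR
# Mathew,London,MD
```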

Output

Let’s check what its output looks like.

Skipping Columns While Writing CSV File

If we want to skip some columns while writing a CSV file, we can do that by using the columns parameter of the to_csv() method.

import pandas as pd

df1 = pd.DataFrame({'name': ['John', 'Mathew'],
                   'city': ['Newyork', 'London'],
                   'designation': ['HR', 'MD']})

df1.to_csv("employee.csv", columns=["name", "designation"])

What We Did?

  • We passed the columns parameter as an argument to the to_csv() method.
  • And we listed the names of the columns which we want to include in the CSV file.

Output

Here you can see the csv file has only name and designation columns. The city column is skipped.
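A quick way to confirm which columns are written is to let to_csv() return the text (by passing no path) and print it, as in this sketch.

```python
import pandas as pd

df1 = pd.DataFrame({'name': ['John', 'Mathew'],
                    'city': ['Newyork', 'London'],
                    'designation': ['HR', 'MD']})

# Only the listed columns are written; the index is still included by default.
lines = df1.to_csv(columns=["name", "designation"]).splitlines()

for line in lines:
    print(line)
# ,name,designation
# 0,John,HR
# 1,Mathew,MD
```

The dataframe itself is not modified; df1 still has its city column after the call.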

Skipping Headers While Writing CSV File

Sometimes we might have to write a CSV file without any header. We can skip the header while writing a CSV file by using the header parameter.

We just have to set header to False. Let’s see the code.

import pandas as pd

df1 = pd.DataFrame({'name': ['John', 'Mathew'],
                   'city': ['Newyork', 'London'],
                   'designation': ['HR', 'MD']})

df1.to_csv("employee.csv", header=False)

Output

You can see this CSV file has no header row; it contains only the data rows.
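The sketch below prints the text that header=False produces, again using to_csv() without a path so that the CSV is returned as a string.

```python
import pandas as pd

df1 = pd.DataFrame({'name': ['John', 'Mathew'],
                    'city': ['Newyork', 'London'],
                    'designation': ['HR', 'MD']})

# header=False drops the line of column names; the index column is still written.
lines = df1.to_csv(header=False).splitlines()

for line in lines:
    print(line)
# 0,John,Newyork,HR
# 1,Mathew,London,MD
```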

Customizing Headers While Writing CSV File

We can also customize the headers that are written to the CSV file. To do that, we pass a list of column names to the header parameter of the to_csv() method.

Write the following code and run it.

import pandas as pd

df1 = pd.DataFrame({'name': ['John', 'Mathew'],
                   'city': ['Newyork', 'London'],
                   'designation': ['HR', 'MD']})

df1.to_csv("employee.csv", header=['e_name', 'e_city', 'e_designation'])
  • Here we passed our own header. This writes the custom header to the CSV file instead of the column names of the dataframe.

Output

You can see the header of csv file is changed now.
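As a final check, this sketch prints the header line written with the custom names and shows that the dataframe keeps its original column names. It uses to_csv() without a path, which returns the text rather than writing a file.

```python
import pandas as pd

df1 = pd.DataFrame({'name': ['John', 'Mathew'],
                    'city': ['Newyork', 'London'],
                    'designation': ['HR', 'MD']})

lines = df1.to_csv(header=['e_name', 'e_city', 'e_designation']).splitlines()

# The custom names appear only in the written text, not in the dataframe.
print(lines[0])
print(df1.columns.tolist())
```

The list passed to header must contain one name for each column being written.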

So guys, this was all about reading and writing CSV files in Python pandas. I hope you have understood this tutorial well. If you have any questions regarding this post, feel free to ask them in the comment section.
