From the course: Python in Excel: Working with pandas DataFrames

Data cleaning

- [Instructor] Data is almost always in a messy state, so let's see how we can clean it by handling missing and duplicate data. As usual, in cell K1, we're turning the data into a DataFrame. Nothing special here, so let's jump right into the next cell below, which shows the usual first step in a data analysis: calling the info method on a DataFrame. So let's run this cell by hitting Ctrl+Enter. This gives you an idea about the index and about the number of observations and the data type for each column. Now may actually be a good time to mention that pandas shows the data type object for strings. However, because the output is printed, it appears here in the diagnostics pane, far away from the Python cell. Printing also has the side effect of making the diagnostics pane pop up again whenever the sheet is recalculated. That's why I tend to avoid printing in Python in Excel whenever possible. In this case, we can do the following. We can show the index here by turning it…
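The points above about `info` and printing can be sketched in plain pandas. This is a minimal illustration under stated assumptions, not the instructor's actual cell contents: the sample data and the names `df`, `summary`, and `cleaned` are hypothetical, and the non-printing summary is just one possible way to avoid printed output. The missing-value and duplicate checks at the end use the standard pandas methods for the two cleaning tasks named in the introduction.

```python
import io

import pandas as pd

# Hypothetical sample data; the course workbook is not part of the transcript.
df = pd.DataFrame({
    "product": ["apple", "pear", None, "apple"],
    "units": [3, 5, 2, 3],
})

# info() writes a text report and returns None. By default it prints,
# which is the output that ends up in the diagnostics pane.
# Here the report is captured in a buffer instead of being printed.
buffer = io.StringIO()
result = df.info(buf=buffer)
info_text = buffer.getvalue()

# One possible non-printing alternative (an assumption, not the course's
# method): build a small DataFrame with similar information.
summary = pd.DataFrame({
    "non_null": df.count(),          # non-missing observations per column
    "dtype": df.dtypes.astype(str),  # data type of each column as text
})

# Quick checks for missing and duplicate data.
missing_per_column = df.isna().sum()
duplicate_rows = int(df.duplicated().sum())

# Drop rows with missing values, then drop exact duplicate rows.
cleaned = df.dropna().drop_duplicates()
```

Because `summary` is an ordinary DataFrame, it can be the last expression of a Python in Excel cell, so the result appears in the grid rather than in the diagnostics pane.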
