From the course: Power BI Data Methods

Filtering or removing duplicates - Power BI Tutorial

From the course: Power BI Data Methods

Filtering or removing duplicates

- [Instructor] We can apply filtering on our data tables for conditions like a text string contains learn, a number is greater than zero and less than or equal to 1,000, and a date occurs between January 1st and June 30th. In our Daily Weather table, let's filter Column 3, which contains the data types, so that we see the temperatures. If we expand the dropdown, we see that it only loads the values for the first 1,000 rows. Now, we can see more, and there won't be more than 1,000 data types, but it will look at more than 1,000 of the top rows. We can then filter these conditions to only take the maximum temperature, TMAX, its converse minimum counterpart, and the average temperature, TAVG. Now, when we look at what the Filtered Rows step does, we see that it takes the Table.SelectRows function and applies it to the table so that each of the Column 3 values has to equal one of these three data types. We can also use conditions for filtering. Let's do so on the NOAA station's table. Let's first filter the station ID. We're going to look for 23174. Now, we can say it contains. Now, this returns three potential matches. The one that we're interested is in California. However, one problem we see here, which is why filtering data to segment it and slice it to understand what's going on is an important part of the process, is because we're picking up the Los Angeles airport instead of Santa Barbara. Now, let's get what the Santa Barbara airport is. We can remove the filter step, and this time we'll apply the filter for the text fields saying contains. And we'll say Santa Barbara. Now, this returns an empty table, and the reason is because capitalization is actually important here, so it's not going to take these lowercase letters and convert them. Instead, I have to explicitly put those into the formula. So I ran it on the station that I want to use, the Santa Barbara Municipal Airport. So once I have the station matching, I'm going to copy the station ID and update the Daily Weather query to make sure that it matches the location that I want to use. Notice how when we update the data, when we refresh it to use the other station ID, the rest of the steps repeat and we see the filtered data step run on the refreshed data table. Let's also filter by dates. So we need to make sure, in order to do this, that we have a date field. So if I change the dates in Column 2, notice we see an error, and the reason is because we already changed the dates earlier. Instead, we have to do it in an earlier step. And we'll choose to insert by replacing the current step. Now, we want to rename this field, Date, to avoid confusion. And I am going to say that I want the dates to be after the 1st of January, 2022. Now, when we open this up, we actually see more options. I'm going to choose after or equal to, so it includes January 1st. Lastly, let's apply filters for the latitude and longitude coordinates so we can see the stations that are around the Santa Barbara airport. We're going to remove the Filtered Rows step when we filter the station name and instead filter it by latitude and longitude. Our latitude is going to be greater than or equal to 34 and less than or equal to 35, and our longitude, we'll again use the Between option, is going to be greater than or equal to 119. You actually want to switch these because they go the other way. So we see our Santa Barbara airport, and we also see all the stations nearby. Now, filtering is a very useful transformation function, and we'll continue to explore it throughout this course.

Contents