When working with large datasets, you rarely need all the data at once.
That's where data filtering comes in.
Filtering allows you to:
- Extract specific rows
- Apply conditions
- Focus only on relevant data
This is one of the most frequently used operations in Pandas.
What is Data Filtering?
Data filtering means selecting a subset of data based on conditions.
Think of it like:
- Applying filters in Excel
- Writing SQL WHERE clauses
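Under the hood, a condition is just a column of True/False values, often called a boolean mask. Here is a minimal sketch of that idea. The DataFrame and its values are made up for illustration and are not from any real dataset.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "Name": ["Sagar", "Amit", "Priya", "Neha"],
    "Age": [25, 22, 24, 21],
    "City": ["Delhi", "Mumbai", "Delhi", "Pune"],
})

# The comparison produces a boolean Series: one True/False per row
mask = df["Age"] > 23
print(mask.tolist())

# Indexing with the mask keeps only the rows where it is True
subset = df[mask]
print(subset)
```

Writing `df[df["Age"] > 23]` does the same thing in one step; the mask is simply created inline.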
Basic Filtering
Filter Rows Based on Condition
df[df["Age"] > 23]
Returns only the rows where Age is greater than 23.
Filter with Equal Condition
df[df["Name"] == "Sagar"]
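One detail worth seeing in action: filtered rows keep their original index labels. The sketch below uses a small made-up DataFrame (the names and ages are assumptions) to show this, along with `reset_index(drop=True)` in case you want the result renumbered from zero.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "Name": ["Sagar", "Amit", "Priya", "Neha"],
    "Age": [25, 22, 24, 21],
})

# Rows where Age is greater than 23
older = df[df["Age"] > 23]
print(older.index.tolist())  # the original row labels are preserved

# Rows where Name equals "Sagar" (an exact, case-sensitive match)
sagar = df[df["Name"] == "Sagar"]
print(sagar)

# Optional: renumber the filtered rows starting from 0
renumbered = older.reset_index(drop=True)
print(renumbered.index.tolist())
```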
Multiple Conditions
AND Condition (&)
df[(df["Age"] > 22) & (df["City"] == "Delhi")]
OR Condition (|)
df[(df["Age"] > 24) | (df["City"] == "Mumbai")]
NOT Condition (~)
df[~(df["City"] == "Delhi")]
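The three operators above can be tried together on a small example. This is a sketch with made-up data; naming each mask first is just one way to keep combined conditions easy to read.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "Name": ["Sagar", "Amit", "Priya", "Neha"],
    "Age": [25, 22, 24, 21],
    "City": ["Delhi", "Mumbai", "Delhi", "Pune"],
})

# Name each condition, then combine the masks
over_23 = df["Age"] > 23
in_delhi = df["City"] == "Delhi"

both = df[over_23 & in_delhi]                    # AND: both must be True
either = df[over_23 | (df["City"] == "Mumbai")]  # OR: at least one is True
not_delhi = df[~in_delhi]                        # NOT: flips True and False

print(both["Name"].tolist())
print(either["Name"].tolist())
print(not_delhi["Name"].tolist())
```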
Filtering Specific Columns
df[["Name", "Age"]]
Combine with filtering:
df[df["Age"] > 23][["Name", "Age"]]
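The double brackets matter here. A short sketch, again on made-up data, showing that a single column name gives back a Series while a list of names gives back a DataFrame:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "Name": ["Sagar", "Amit", "Priya", "Neha"],
    "Age": [25, 22, 24, 21],
    "City": ["Delhi", "Mumbai", "Delhi", "Pune"],
})

name_series = df["Name"]    # single brackets: a Series
name_frame = df[["Name"]]   # double brackets: a one-column DataFrame

# Filter the rows first, then keep only two columns
result = df[df["Age"] > 23][["Name", "Age"]]
print(result)
```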
Using loc and iloc
Using loc (Label-based)
df.loc[df["Age"] > 23, ["Name", "City"]]
Using iloc (Position-based)
df.iloc[0:2]
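The difference between the two is easier to see when the index is not just 0, 1, 2. The sketch below assumes a made-up DataFrame with string labels "a" to "d". Note that a `loc` slice includes its end label, while an `iloc` slice stops before its end position.

```python
import pandas as pd

# Hypothetical sample data with string index labels, for illustration only
df = pd.DataFrame(
    {
        "Name": ["Sagar", "Amit", "Priya", "Neha"],
        "Age": [25, 22, 24, 21],
        "City": ["Delhi", "Mumbai", "Delhi", "Pune"],
    },
    index=["a", "b", "c", "d"],
)

# loc: a boolean mask for rows, column names for columns
by_label = df.loc[df["Age"] > 23, ["Name", "City"]]

# loc slices by label and includes the end label ("c")
label_slice = df.loc["a":"c"]

# iloc slices by position and excludes the end position (2)
pos_slice = df.iloc[0:2]

# iloc with row and column positions returns a single value
first_age = df.iloc[0, 1]

print(by_label)
print(len(label_slice), len(pos_slice), first_age)
```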
Advanced Filtering
Using isin()
df[df["City"].isin(["Delhi", "Mumbai"])]
Using between()
df[df["Age"].between(22, 25)]
Using str.contains()
df[df["Name"].str.contains("a")]
Useful for text filtering.
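Here is a small sketch that runs all three helpers on made-up data, including one missing name. The `case=False` and `na=False` arguments to `str.contains()` are optional extras shown here as one way to ignore letter case and treat missing values as non-matches.

```python
import pandas as pd

# Hypothetical sample data (one Name is missing), for illustration only
df = pd.DataFrame({
    "Name": ["Sagar", "Amit", "Priya", None],
    "Age": [25, 22, 24, 21],
    "City": ["Delhi", "Mumbai", "Delhi", "Pune"],
})

# isin(): keep rows whose City is one of the listed values
in_cities = df[df["City"].isin(["Delhi", "Mumbai"])]

# between(): both ends are included by default, so 22 and 25 both match
age_range = df[df["Age"].between(22, 25)]

# str.contains(): case-sensitive by default, so "Amit" does not match "a"
has_lower_a = df[df["Name"].str.contains("a", na=False)]

# case=False ignores letter case, so "Amit" now matches too
has_any_a = df[df["Name"].str.contains("a", case=False, na=False)]

print(in_cities["Name"].tolist())
print(age_range["Age"].tolist())
print(has_lower_a["Name"].tolist())
print(has_any_a["Name"].tolist())
```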
Real-World Example
import pandas as pd
df = pd.read_csv("employees.csv")
# Employees with high salary
high_salary = df[df["Salary"] > 50000]
# Employees from IT department
it_employees = df[df["Department"] == "IT"]
# Combined filter
filtered = df[(df["Salary"] > 50000) & (df["Department"] == "IT")]
print(filtered)
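If you don't have an employees.csv file at hand, the same filters can be tried on an in-memory stand-in. The rows below are invented for illustration; only the Salary and Department column names come from the example above, and the Name column is an assumption.

```python
import io

import pandas as pd

# Hypothetical stand-in for employees.csv (made-up rows)
csv_text = """Name,Department,Salary
Asha,IT,72000
Ravi,HR,48000
Meera,IT,45000
Karan,Finance,65000
"""

# read_csv accepts a file-like object as well as a file path
df = pd.read_csv(io.StringIO(csv_text))

# Employees with high salary
high_salary = df[df["Salary"] > 50000]

# Employees from IT department
it_employees = df[df["Department"] == "IT"]

# Combined filter: high salary AND in IT
filtered = df[(df["Salary"] > 50000) & (df["Department"] == "IT")]
print(filtered)
```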
Performance Tip
Use .query() for cleaner syntax:
df.query("Age > 23 and City == 'Delhi'")
This mainly makes your code more readable. Any speed gain is secondary and tends to show up only on large DataFrames.
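As a quick check, the sketch below compares `.query()` with the equivalent bracket filter on made-up data. It also shows the `@` prefix, which lets the query string refer to a Python variable (`min_age` is a name chosen just for this example).

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "Name": ["Sagar", "Amit", "Priya", "Neha"],
    "Age": [25, 22, 24, 21],
    "City": ["Delhi", "Mumbai", "Delhi", "Pune"],
})

# The same filter written two ways
by_query = df.query("Age > 23 and City == 'Delhi'")
by_mask = df[(df["Age"] > 23) & (df["City"] == "Delhi")]
print(by_query.equals(by_mask))

# Refer to a local Python variable inside the query string with @
min_age = 23
with_var = df.query("Age > @min_age")
print(with_var["Name"].tolist())
```

Inside a query string, `and` and `or` are fine; the `&` and `|` rule applies to ordinary bracket filtering.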
Common Mistakes
- Forgetting parentheses in multiple conditions
- Using and instead of &
- Using or instead of |
- Not handling case sensitivity in strings
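A short sketch of these mistakes and one way to avoid each, using made-up data where one city is written in lowercase. Lowercasing the column before comparing is just one possible approach to case-insensitive matching.

```python
import pandas as pd

# Hypothetical sample data (note the lowercase "delhi"), for illustration only
df = pd.DataFrame({
    "Name": ["Sagar", "Amit", "Priya", "Neha"],
    "Age": [25, 22, 24, 21],
    "City": ["Delhi", "Mumbai", "delhi", "Pune"],
})

# Mistake: `and` asks each Series for a single True/False, which is ambiguous
try:
    df[(df["Age"] > 22) and (df["City"] == "Delhi")]
    raised = False
except ValueError:
    raised = True
print(raised)

# Fix: use & and wrap each condition in parentheses,
# because & binds more tightly than > and ==
correct = df[(df["Age"] > 22) & (df["City"] == "Delhi")]
print(correct["Name"].tolist())  # the lowercase "delhi" row is not matched

# Case-insensitive match: lowercase the column before comparing
any_case = df[df["City"].str.lower() == "delhi"]
print(any_case["Name"].tolist())
```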
Why Filtering is Important
Filtering helps you:
- Focus on important data
- Reduce processing time
- Improve analysis accuracy
External Resources
- Pandas Docs: https://pandas.pydata.org/docs/
- Boolean Indexing: https://pandas.pydata.org/docs/user_guide/indexing.html
Conclusion
Data filtering is one of the most essential skills in Pandas.
Once mastered, you can quickly extract insights from even the largest datasets.
Practice different conditions and combinations to become confident.