Mastering Pandas for Data Analysis in Python: A Practical Guide

Pandas (Python Data Analysis Library) is an open-source Python library for data manipulation and analysis. It provides two primary data structures:

  • Series: A one-dimensional labeled array.
  • DataFrame: A two-dimensional labeled table, where each column is a Series and different columns can hold different data types.
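To make the relationship between the two structures concrete, here is a minimal sketch. The names `ages`, `cities`, and `people` are made up for illustration and are not used elsewhere in this guide.

```python
import pandas as pd

# A Series is one labeled column: values plus an index of labels.
ages = pd.Series([25, 30, 35], index=['Alice', 'Bob', 'Charlie'], name='Age')
cities = pd.Series(['New York', 'San Francisco', 'Los Angeles'],
                   index=ages.index, name='City')

# Values are looked up by label rather than by position.
print(ages['Bob'])

# A DataFrame can be built from several Series that share the same index.
people = pd.DataFrame({'Age': ages, 'City': cities})
print(people.loc['Alice', 'City'])

# Selecting one column of a DataFrame gives back a Series.
print(type(people['Age']))
```

Because both structures carry labels, most operations align on the index rather than on row order.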

Pandas is well suited to structured data (such as tables, CSV files, and database query results) and offers a wide range of tools for data cleaning, exploration, and analysis.
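As a rough illustration of loading structured data, the sketch below reads CSV text into a DataFrame. The CSV content is invented for this example, and `io.StringIO` stands in for a real file so the snippet runs without anything on disk. With an actual file you would pass its path to `pd.read_csv` instead.

```python
import io
import pandas as pd

# Hypothetical CSV content; in practice this would live in a file.
csv_text = """Name,Age,City
Alice,25,New York
Bob,30,San Francisco
"""

# read_csv accepts a path or any file-like object.
people = pd.read_csv(io.StringIO(csv_text))

# Quick exploration: column types and the first rows.
print(people.dtypes)
print(people.head())

# Writing back out: with no path given, to_csv returns the CSV as a string.
round_trip = people.to_csv(index=False)
print(round_trip)
```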

Key Features:

  • Series and DataFrames: Used for representing and manipulating data.
  • Data manipulation: Filtering, grouping, joining, and reshaping data.
  • Handling missing data: Detecting, dropping, and filling missing (NA) values.
  • Reading/writing data: Supports reading from and writing to multiple formats (CSV, Excel, SQL, etc.).
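The missing-data feature in the list above is easiest to see in a small example. This is a sketch on invented data (`scores` is a hypothetical DataFrame), showing three common steps: counting missing values, dropping incomplete rows, and filling gaps with the column mean. Which option is appropriate depends on the analysis.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing score.
scores = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Score': [88.0, np.nan, 92.0],
})

# Count missing values in each column.
missing_counts = scores.isna().sum()
print(missing_counts)

# Option 1: drop every row that contains a missing value.
dropped = scores.dropna()
print(dropped)

# Option 2: fill missing scores with the mean of the known scores.
filled = scores.fillna({'Score': scores['Score'].mean()})
print(filled)
```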

Example:

Creating a DataFrame:

import pandas as pd

# Creating a DataFrame from a dictionary
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'San Francisco', 'Los Angeles', 'Chicago']
}
df = pd.DataFrame(data)
print(df)

Basic DataFrame Operations:

# Selecting a column
print(df['Name'])

# Filtering data
filtered_df = df[df['Age'] > 30]
print(filtered_df)

# Adding a new column
df['Salary'] = [50000, 60000, 70000, 80000]
print(df)

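# Grouping data: average the numeric columns for each city
# (selected explicitly, because the text column 'Name' cannot be averaged)
grouped = df.groupby('City')[['Age', 'Salary']].mean()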
print(grouped)
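Every city in the DataFrame above appears only once, so each group holds a single row. To show grouping actually combining rows, here is a sketch on a made-up `staff` table where cities repeat. It uses named aggregation, and the output column names (`mean_age`, `max_salary`, `headcount`) are arbitrary choices for this example.

```python
import pandas as pd

# Hypothetical data in which each city appears twice.
staff = pd.DataFrame({
    'City': ['New York', 'Chicago', 'New York', 'Chicago'],
    'Age': [25, 40, 35, 30],
    'Salary': [50000, 80000, 70000, 60000],
})

# Named aggregation: output_column=(input_column, function).
summary = staff.groupby('City').agg(
    mean_age=('Age', 'mean'),
    max_salary=('Salary', 'max'),
    headcount=('Age', 'count'),
)
print(summary)
```

The result is indexed by `City`, with one row per group and one column per aggregation.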

Advanced Example:

# Merging two DataFrames
df1 = pd.DataFrame({
    'ID': [1, 2, 3],
    'Name': ['A', 'B', 'C']
})
df2 = pd.DataFrame({
    'ID': [1, 2, 4],
    'City': ['X', 'Y', 'Z']
})

# An outer join keeps every ID from both frames; values missing on one side become NaN
merged_df = pd.merge(df1, df2, on='ID', how='outer')
print(merged_df)
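The `how` argument controls which keys survive the merge. The sketch below rebuilds the same two frames so it runs on its own, then compares an inner join, a left join, and an outer join. The `indicator=True` option adds a `_merge` column recording where each row came from.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['A', 'B', 'C']})
df2 = pd.DataFrame({'ID': [1, 2, 4], 'City': ['X', 'Y', 'Z']})

# Inner join: only IDs present in both frames.
inner = pd.merge(df1, df2, on='ID', how='inner')
print(inner)

# Left join: every ID from df1; City is NaN where df2 has no match.
left = pd.merge(df1, df2, on='ID', how='left')
print(left)

# Outer join with an indicator column showing each row's origin.
outer = pd.merge(df1, df2, on='ID', how='outer', indicator=True)
print(outer)
```

Checking the `_merge` column is a quick way to spot keys that failed to match before relying on the merged result.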

In short, Pandas is a strong fit for analyzing structured data, including operations such as filtering, grouping, and merging.


About author

Amrit panta

Fullstack developer, content creator


