Lists
# Install the pandas library (comment this line out if pandas is already installed)
!pip install pandas
# pandas provides the DataFrame structure used below to load and analyze datasets
import pandas as pd
Popcorn Hack 1
import pandas as pd
# Example student data
student_data = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emma', 'Fiona', 'George'],
    'Score': [95, 88, 76, 92, 84, 67, 89],
    'Grade': ['A', 'B', 'C', 'A', 'B', 'D', 'B']
})
# print(student_data)
def find_students_in_range(df, min_score, max_score):
    # Keep rows whose Score lies in the inclusive range [min_score, max_score]
    return df[(df['Score'] >= min_score) & (df['Score'] <= max_score)]
# Test
result = find_students_in_range(student_data, 80, 90)
print(result)
Name Score Grade
1 Bob 88 B
4 Emma 84 B
6 George 89 B
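The same inclusive range filter can also be written with Series.between, which reads closer to the intent. This is a minimal sketch, assuming the same Name and Score columns as above; the function name find_students_between and the sample frame are illustrative and not part of the original hack.

```python
import pandas as pd

# Hypothetical sample mirroring the student data above
sample = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emma', 'Fiona', 'George'],
    'Score': [95, 88, 76, 92, 84, 67, 89],
})

def find_students_between(df, min_score, max_score):
    # Series.between is inclusive on both ends by default
    return df[df['Score'].between(min_score, max_score)]

in_range = find_students_between(sample, 80, 90)
print(in_range)
```

Because the filter keeps the original index labels, the matching rows still carry their positions from the full table.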
Popcorn Hack 2
def add_letter_grades(df):
    def letter(score):
        if score >= 90:
            return 'A'
        elif score >= 80:
            return 'B'
        elif score >= 70:
            return 'C'
        elif score >= 60:
            return 'D'
        else:
            return 'F'
    df['Letter'] = df['Score'].apply(letter)
    return df
# Test
graded = add_letter_grades(student_data)
print(graded)
Name Score Grade Letter
0 Alice 95 A A
1 Bob 88 B B
2 Charlie 76 C C
3 David 92 A A
4 Emma 84 B B
5 Fiona 67 D D
6 George 89 B B
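Note that add_letter_grades adds the Letter column to the DataFrame that is passed in. A possible alternative, sketched here under the assumption that the same 90/80/70/60 cutoffs apply, bins the scores with pd.cut and works on a copy. The name add_letter_grades_cut and the sample frame are hypothetical.

```python
import pandas as pd

# Hypothetical sample with the same scores as the student data above
scores = pd.DataFrame({'Score': [95, 88, 76, 92, 84, 67, 89]})

def add_letter_grades_cut(df):
    # Work on a copy so the caller's DataFrame is not modified
    out = df.copy()
    # right=False makes each bin closed on the left: [60, 70), [70, 80), ...
    bins = [float('-inf'), 60, 70, 80, 90, float('inf')]
    labels = ['F', 'D', 'C', 'B', 'A']
    out['Letter'] = pd.cut(out['Score'], bins=bins, labels=labels, right=False).astype(str)
    return out

graded_cut = add_letter_grades_cut(scores)
print(graded_cut)
```

The binned version avoids calling a Python function once per row, which may matter on larger tables; for seven rows either approach is fine.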
Popcorn Hack 3
def find_mode(series):
    # mode() returns every tied value in sorted order; take the first, or None if empty
    modes = series.mode()
    return modes.iloc[0] if not modes.empty else None
# Test
mode = find_mode(pd.Series([1, 2, 2, 3, 4, 2, 5]))
print("Mode:", mode)
Mode: 2
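Taking only the first element of mode() hides ties. The sketch below, using a made-up series, shows how ties appear and how value_counts exposes the frequency of each value.

```python
import pandas as pd

# Hypothetical series in which 1 and 2 are tied for most frequent
tied = pd.Series([1, 1, 2, 2, 3])

# mode() returns all tied values, sorted ascending
all_modes = tied.mode().tolist()
print(all_modes)

# value_counts() reports how often each distinct value occurs
counts = tied.value_counts()
print(counts)
```

With ties, iloc[0] would report only the smallest of the tied values, so returning the full list may be the safer choice when ties matter.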
General Setup
import pandas as pd
smartpark_df = pd.read_csv('/home/mihirunix/nighthawk/MihirCSP/datasets/treas_parking_meters_loc_datasd.csv')
smartpark_df.head()
 | zone | area | sub_area | pole | config_id | config_name | date_inventory | lat | lng | sapid |
---|---|---|---|---|---|---|---|---|---|---|
0 | Downtown | Core | 1000 FIRST AVE | 1-1004 | 49382 | Sunday Mode | 2021-01-04 00:00:00 | 32.715904 | -117.163929 | SS-000031 |
1 | Downtown | Core - Columbia | 1000 FIRST AVE | 1-1004 | 9000 | 2 Hour Max $1.25 HR 8am-6pm Mon-Sat (NFC) | 2018-11-11 00:00:00 | 32.715904 | -117.163929 | SS-000031 |
2 | Downtown | Core | 1000 FIRST AVE | 1-1006 | 49382 | Sunday Mode | 2021-01-04 00:00:00 | 32.716037 | -117.163930 | SS-000031 |
3 | Downtown | Core - Columbia | 1000 FIRST AVE | 1-1006 | 9000 | 2 Hour Max $1.25 HR 8am-6pm Mon-Sat (NFC) | 2018-11-11 00:00:00 | 32.716037 | -117.163930 | SS-000031 |
4 | Downtown | Core | 1000 FIRST AVE | 1-1008 | 49382 | Sunday Mode | 2021-01-04 00:00:00 | 32.716169 | -117.163931 | SS-000031 |
✅ Hack 1: Find Configurations Older than a Certain Year
Find all entries in the dataset whose date_inventory is earlier than 2019.
smartpark_df['date_inventory'] = pd.to_datetime(smartpark_df['date_inventory'])
older_configs = smartpark_df[smartpark_df['date_inventory'].dt.year < 2019]
older_configs.head()
 | zone | area | sub_area | pole | config_id | config_name | date_inventory | lat | lng | sapid |
---|---|---|---|---|---|---|---|---|---|---|
1 | Downtown | Core - Columbia | 1000 FIRST AVE | 1-1004 | 9000 | 2 Hour Max $1.25 HR 8am-6pm Mon-Sat (NFC) | 2018-11-11 | 32.715904 | -117.163929 | SS-000031 |
3 | Downtown | Core - Columbia | 1000 FIRST AVE | 1-1006 | 9000 | 2 Hour Max $1.25 HR 8am-6pm Mon-Sat (NFC) | 2018-11-11 | 32.716037 | -117.163930 | SS-000031 |
5 | Downtown | Core - Columbia | 1000 FIRST AVE | 1-1008 | 9000 | 2 Hour Max $1.25 HR 8am-6pm Mon-Sat (NFC) | 2018-11-11 | 32.716169 | -117.163931 | SS-000031 |
7 | Downtown | Core - Columbia | 1000 FIRST AVE | 1-1020 | 9115 | 15 Min Max $1.25 HR 8am-6pm Mon-Sat (NFC) | 2018-11-11 | 32.717890 | -117.161278 | SS-000031 |
9 | Downtown | Core - Columbia | 1300 FIRST AVE | 1-1310 | 16516 | 2 Hour Max $1.25 HR 8am-4pm Mon-Fri 4pm-6pm TO... | 2018-11-11 | 32.719024 | -117.163951 | SS-000028 |
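An equivalent way to express the cutoff is to compare against a Timestamp instead of extracting the year. The sketch below uses a small hypothetical frame (the pole labels and the malformed value are invented) and also shows errors='coerce', which may be useful if some date strings fail to parse.

```python
import pandas as pd

# Hypothetical rows; the third date is deliberately malformed
meters = pd.DataFrame({
    'pole': ['P-1', 'P-2', 'P-3', 'P-4'],
    'date_inventory': [
        '2018-11-11 00:00:00',
        '2021-01-04 00:00:00',
        'not a date',
        '2019-01-01 00:00:00',
    ],
})

# errors='coerce' turns unparseable values into NaT instead of raising
meters['date_inventory'] = pd.to_datetime(meters['date_inventory'], errors='coerce')

# Rows strictly before 1 January 2019; NaT never satisfies the comparison
cutoff = pd.Timestamp('2019-01-01')
before_cutoff = meters[meters['date_inventory'] < cutoff]
print(before_cutoff)
```

A Timestamp cutoff also generalizes to boundaries that do not fall on a year, such as a specific month or day.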
✅ Hack 2: Count Configs by Area
configs_by_area = smartpark_df.groupby('area')['config_id'].count().reset_index(name='Config Count')
configs_by_area.head()
 | area | Config Count |
---|---|---|
0 | Bankers Hill | 1787 |
1 | Barrio Logan | 69 |
2 | CBD - Trial | 19 |
3 | College | 10 |
4 | Columbia | 163 |
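The count above tallies non-null config_id rows per area, so a configuration that appears on many poles is counted many times. If the number of distinct configurations is wanted instead, nunique is one option. The sketch below uses invented rows to contrast the two; the output column names are hypothetical.

```python
import pandas as pd

# Hypothetical rows: 'Core' repeats config 9000, 'Marina' has one missing id
rows = pd.DataFrame({
    'area': ['Core', 'Core', 'Core', 'Marina', 'Marina'],
    'config_id': [9000, 9000, 49382, 9000, None],
})

# count = non-null rows per group, nunique = distinct non-null values per group
summary = (
    rows.groupby('area')['config_id']
    .agg(rows_with_config='count', distinct_configs='nunique')
    .reset_index()
)
print(summary)
```

Which of the two numbers answers "configs by area" depends on whether each row or each distinct configuration should count once.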
✅ Hack 3: Find the Most Common Parking Configuration
most_common_config = smartpark_df['config_name'].mode().iloc[0]
print("Most common config:", most_common_config)
Most common config: Sunday Mode
✅ Hack 4: Compare Configs per Zone Over Time
smartpark_df['year'] = smartpark_df['date_inventory'].dt.year
configs_by_zone_year = smartpark_df.groupby(['zone', 'year']).size().reset_index(name='Config Count')
configs_by_zone_year.head()
 | zone | year | Config Count |
---|---|---|---|
0 | City | 2018 | 18 |
1 | City | 2019 | 62 |
2 | City | 2021 | 75 |
3 | Downtown | 2018 | 1814 |
4 | Downtown | 2019 | 1486 |
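For comparing zones across years, a wide layout with one row per zone and one column per year can be easier to scan than the long table. A minimal sketch with invented records, assuming the same zone and year columns:

```python
import pandas as pd

# Hypothetical records, one row per configuration entry
records = pd.DataFrame({
    'zone': ['City', 'City', 'Downtown', 'Downtown', 'Downtown'],
    'year': [2018, 2019, 2018, 2018, 2021],
})

# size() counts rows per (zone, year); unstack moves year into columns,
# and fill_value=0 marks zone/year pairs that have no rows
wide = records.groupby(['zone', 'year']).size().unstack(fill_value=0)
print(wide)
```

Years missing for a zone show up as 0 rather than being silently absent, which makes gaps such as a year with no inventory visible.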
✅ Hack 5: Count Unique GPS Locations
unique_locations = smartpark_df[['lat', 'lng']].drop_duplicates().shape[0]
print("Unique parking locations (lat/lng pairs):", unique_locations)
Unique parking locations (lat/lng pairs): 5308
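The unique count can be smaller than the row count because several rows may share one coordinate pair. The sketch below, on invented coordinates, counts rows per location to show how many locations are shared.

```python
import pandas as pd

# Hypothetical coordinates; the first two rows share one location
points = pd.DataFrame({
    'lat': [32.715904, 32.715904, 32.716037, 32.716169],
    'lng': [-117.163929, -117.163929, -117.163930, -117.163931],
})

# Number of rows at each (lat, lng) pair; groupby skips rows with missing keys
rows_per_location = points.groupby(['lat', 'lng']).size()

unique_count = len(rows_per_location)
shared = int((rows_per_location > 1).sum())
print("Unique locations:", unique_count)
print("Locations with more than one row:", shared)
```

Note that this treats coordinates as exact values; two meters a fraction of a meter apart count as separate locations.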
🧠 Summary
- Example of a config inventoried before 2019:
"2 Hour Max $1.25 HR 8am-6pm Mon-Sat (NFC)"
- Most common configuration:
"Sunday Mode"
- Total unique GPS locations:
5308
- Grouping, filtering, and traversal were performed using Pandas!
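To identify the oldest configuration directly, rather than reading it off a filtered preview, one could take the minimum of date_inventory and select the rows that match it. This is a sketch on invented rows, assuming date_inventory has already been converted to datetimes.

```python
import pandas as pd

# Hypothetical inventory with shortened config names
inventory = pd.DataFrame({
    'config_name': ['Sunday Mode', '2 Hour Max', '15 Min Max'],
    'date_inventory': pd.to_datetime(['2021-01-04', '2018-11-11', '2019-03-02']),
})

# Earliest inventory date, then every config recorded on that date
oldest_date = inventory['date_inventory'].min()
oldest_rows = inventory[inventory['date_inventory'] == oldest_date]
oldest_names = oldest_rows['config_name'].unique().tolist()
print(oldest_date.date(), oldest_names)
```

Several configurations can share the earliest date, so the result is a list rather than a single name.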