I’ve been working with a lot of residential electric smart meter data for a deep learning load forecasting project, and I’ve come up with a visualization that helps me uncover useful insights into the data, including quality and completeness.
The Y axis lists the residential smart meters in the dataset by ID. The X axis is time, in this case roughly two years. The pink/peach lines indicate usage in kWh per half-hour, with lighter colors meaning higher usage.
The large white rectangles show missing meter reads where meters came on/offline during the reporting period.
The thin white lines show intermittent missing meter reads that can be filled in by interpolation or some other method.
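As a rough sketch of the interpolation idea, the snippet below fills only short gaps in a single meter's half-hourly series and leaves long outages as missing. The readings and the `MAX_GAP` threshold are made up for illustration; they are assumptions, not values from my dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical half-hourly readings for one meter: one short gap, one long gap.
idx = pd.date_range("2013-01-01", periods=12, freq="30min")
kwh = pd.Series(
    [0.2, 0.4, np.nan, 0.8, 1.0, np.nan, np.nan, np.nan, np.nan, np.nan, 0.6, 0.5],
    index=idx,
)

# Label each run of consecutive present/missing reads, then get each gap's length.
is_gap = kwh.isna()
run_id = (is_gap != is_gap.shift()).cumsum()
gap_len = is_gap.groupby(run_id).transform("sum")

# Assumed threshold: only gaps of at most 2 consecutive reads get filled.
MAX_GAP = 2

# Interpolate linearly in time, then discard the fills that fall in long gaps.
filled = kwh.interpolate(method="time", limit_area="inside")
filled = filled.where(~is_gap | (gap_len <= MAX_GAP))
```

Masking by gap length, rather than relying on `limit` alone, avoids partially filling the start of a long outage.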
It's easy to identify which meters have the highest regular load, and the highest peak loads.
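To cross-check what the picture suggests, a per-meter summary can rank meters numerically. This is a minimal sketch on made-up data that reuses the `LCLid` and `KWHperHH` column names; using the median as the "regular" load is my assumption, and other statistics would work too.

```python
import pandas as pd

# Hypothetical long-format readings for two meters (IDs and values are invented).
d = pd.DataFrame({
    "LCLid": ["meter_A"] * 4 + ["meter_B"] * 4,
    "DateTime": list(pd.date_range("2013-01-01", periods=4, freq="30min")) * 2,
    "KWHperHH": [0.7, 0.8, 0.7, 0.8, 0.1, 0.1, 0.2, 2.0],
})

# Median as a stand-in for the regular load, max for the peak load.
summary = d.groupby("LCLid")["KWHperHH"].agg(typical="median", peak="max")

top_typical = summary["typical"].idxmax()  # meter with the highest regular load
top_peak = summary["peak"].idxmax()        # meter with the highest single reading
```

In this toy data the two answers differ: one meter runs consistently high, while the other is mostly low with a single spike.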
How did I do it?
I loaded the smart meter data into this Pandas dataframe…
d.info()
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 LCLid 999287 non-null object
1 stdorToU 999287 non-null object
2 DateTime 999287 non-null datetime64[ns]
3 KWHperHH 999287 non-null float64
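For anyone wondering how to end up with those dtypes, here is a hedged sketch of one way to load such data. The CSV text is a made-up stand-in (only the column names come from the output above), and the real source file may well need different parsing options.

```python
import io

import pandas as pd

# Hypothetical CSV content; the rows and the "Std"/"ToU" values are invented.
csv_text = """LCLid,stdorToU,DateTime,KWHperHH
meter_A,Std,2013-01-01 00:00:00,0.12
meter_A,Std,2013-01-01 00:30:00,0.08
meter_B,ToU,2013-01-01 00:00:00,0.31
"""

# parse_dates turns the DateTime column into a datetime64 column on load.
d = pd.read_csv(io.StringIO(csv_text), parse_dates=["DateTime"])

# Coerce readings to float; unparseable entries become NaN instead of raising.
d["KWHperHH"] = pd.to_numeric(d["KWHperHH"], errors="coerce")
```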
Then I used a Seaborn heatmap for the visualization…
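import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
pivot_table = pd.pivot_table(d, columns='DateTime', index='LCLid', values='KWHperHH')  # one row per meter, one column per timestamp
plt.subplots(figsize=(20,15))  # large figure so individual meters stay readable
sns.heatmap(pivot_table, xticklabels=2000)  # label only every 2000th timestamp on the X axis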
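The same pivoted layout can also put a number on completeness. The sketch below uses made-up data with the same column names and computes the fraction of half-hour slots each meter actually reported. Reindexing to a full half-hourly range is my own addition, so that timestamps missing from every meter still count as gaps.

```python
import pandas as pd

# Hypothetical readings: the 01:00 slot is missing for both meters.
times = pd.date_range("2013-01-01", periods=4, freq="30min")
d = pd.DataFrame({
    "LCLid": ["meter_A"] * 3 + ["meter_B"] * 2,
    "DateTime": [times[0], times[1], times[3], times[0], times[3]],
    "KWHperHH": [0.1, 0.2, 0.3, 0.4, 0.5],
})

pivot = pd.pivot_table(d, columns="DateTime", index="LCLid", values="KWHperHH")

# Add columns for timestamps that no meter reported, so they count as missing.
full_range = pd.date_range(pivot.columns.min(), pivot.columns.max(), freq="30min")
pivot = pivot.reindex(columns=full_range)

# Fraction of half-hour slots with a reading, per meter.
completeness = pivot.notna().mean(axis=1)
```

Sorting `completeness` gives a quick shortlist of meters to drop or repair before training a forecasting model.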
Data credit: