PVAnalytics¶
PVAnalytics is a Python library that supports analytics for PV systems. It provides functions for quality control, filtering, and feature labeling, along with other tools supporting the analysis of PV system-level data.
The source code for PVAnalytics is hosted on GitHub.
Library Overview¶
The functions provided by PVAnalytics are organized in submodules based on their anticipated use. The list below provides a general overview; however, not all modules have functions at this time. See the API reference for current library status.
quality
contains submodules for different kinds of data quality checks.
quality.irradiance
contains quality checks for irradiance measurements.
quality.weather
contains quality checks for weather data (e.g. tests for physically plausible values of temperature, wind speed, humidity).
quality.outliers
contains functions for identifying outliers.
quality.gaps
contains functions for identifying gaps in the data (i.e. missing values, stuck values, and interpolation).
quality.time
quality checks related to time (e.g. timestamp spacing, time shifts).
quality.util
general purpose quality functions (e.g. simple range checks).
filtering
as the name implies, contains functions for data filtering.
features
contains submodules with different methods for identifying and labeling salient features.
features.clipping
functions for labeling inverter clipping.
features.clearsky
functions for identifying periods of clear sky conditions.
features.daytime
functions for identifying periods of day and night.
features.orientation
functions for identifying orientation-related features in the data (e.g. days where the data looks like there is a functioning tracker). These functions are distinct from the functions in the system module in that we are identifying features of data rather than properties of the system that produced the data.
features.shading
functions for identifying shadows.
system
identification of PV system characteristics from data (e.g. nameplate power, tilt, azimuth)
translate
contains functions for translating data to other conditions (e.g. IV curve translators, temperature adjustment, irradiance adjustment)
metrics
contains functions for computing PV system-level metrics (e.g. performance ratio)
fitting
contains submodules for different types of models that can be fit to data (e.g. temperature models)
dataclasses
contains classes for normalizing data (e.g. an IVCurve class)
Dependencies¶
This project follows the guidelines laid out in NEP-29. It supports:
All minor versions of Python released in the 42 months prior to the project, and at minimum the two latest minor versions.
All minor versions of numpy released in the 24 months prior to the project, and at minimum the last three minor versions.
The latest release of pvlib.
PVAnalytics depends on the following packages:
numpy>=1.15.0
pandas>=0.24.0,!=1.1.*
pvlib>=0.8.0
scipy>=1.2.0
statsmodels>=0.9.0
scikit-image>=0.16.0
Contents¶
API Reference¶
Quality¶
Irradiance¶
The check_*_limits_qcrad functions use the QCRad algorithm [1] to identify irradiance measurements that are beyond physical limits.
Test for physical limits on GHI using the QCRad criteria.
Test for physical limits on DHI using the QCRad criteria.
Test for physical limits on DNI using the QCRad criteria.
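To give a feel for what a physical-limits test does, here is a minimal sketch in plain Python for GHI. It is not the library's implementation: the function name ghi_within_physical_limits is hypothetical, the default extraterrestrial irradiance is a placeholder, and the coefficients (1.5, 1.2, 100 W/m^2, and a lower bound of -4 W/m^2) are quoted from memory of the physically-possible-limits test in [1], so check them against the reference before relying on them.

```python
import math

def ghi_within_physical_limits(ghi, solar_zenith_deg, dni_extra=1367.0):
    """Return True where GHI lies inside the assumed physical limits.

    ghi              -- GHI measurements [W/m^2]
    solar_zenith_deg -- solar zenith angles [degrees]
    dni_extra        -- extraterrestrial normal irradiance [W/m^2] (assumed)
    """
    flags = []
    for g, z in zip(ghi, solar_zenith_deg):
        # Clamp cos(zenith) at zero so that night-time samples only get
        # the constant part of the upper bound.
        mu0 = max(math.cos(math.radians(z)), 0.0)
        upper = 1.5 * dni_extra * mu0 ** 1.2 + 100.0  # assumed coefficients
        lower = -4.0                                  # assumed lower bound
        flags.append(lower <= g <= upper)
    return flags

flags = ghi_within_physical_limits(
    ghi=[500.0, 2500.0, -10.0, 50.0],
    solar_zenith_deg=[30.0, 30.0, 95.0, 95.0],
)
```

The second sample is far above the bound for a 30 degree zenith and the third is below the lower bound, so only the first and last samples pass.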
All three checks can be combined into a single function call.
Test for physical limits on GHI, DHI or DNI using the QCRad criteria.
Irradiance measurements can also be checked for consistency.
Check consistency of GHI, DHI and DNI using QCRad criteria.
GHI and POA irradiance can be validated against clearsky values to eliminate data that is unrealistically high.
Identify irradiance values which do not exceed clearsky values.
You may want to identify entire days that have unrealistically high or low insolation. The following function examines daily insolation, validating that it is within a reasonable range of the expected clearsky insolation for the same day.
Check that daily insolation lies between minimum and maximum values.
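The daily insolation check described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's code: the function name daily_insolation_ok and the bounds 0.4 and 1.25 are placeholders, and the samples are assumed to be regularly spaced so that the ratio of daily sums equals the ratio of daily insolation.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def daily_insolation_ok(times, irrad, clearsky, daily_min=0.4, daily_max=1.25):
    """Map each date to True if measured/clearsky insolation is in bounds."""
    measured = defaultdict(float)
    expected = defaultdict(float)
    for t, g, cs in zip(times, irrad, clearsky):
        measured[t.date()] += g
        expected[t.date()] += cs
    return {
        day: expected[day] > 0
        and daily_min <= measured[day] / expected[day] <= daily_max
        for day in measured
    }

# Two days of hourly data with a triangular clear-sky profile.
start = datetime(2022, 1, 20)
times = [start + timedelta(hours=h) for h in range(48)]
clearsky = [600.0 - 50.0 * abs(t.hour - 12) for t in times]
# Day 1 is slightly below clear sky; day 2 is implausibly high.
irrad = [0.9 * cs if t.day == 20 else 2.0 * cs for t, cs in zip(times, clearsky)]
result = daily_insolation_ok(times, irrad, clearsky)
```

The result maps the first day to True and the second day to False, since twice the clear-sky insolation exceeds the upper bound.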
Gaps¶
Identify gaps in the data.
Identify sequences which appear to be linear.
Data sometimes contains sequences of values that are “stale” or “stuck.” These are contiguous spans of data where the value does not change within the precision given. The functions below can be used to detect stale values.
Note
If the data has been altered in some way (e.g. temperature that has been rounded to an integer value) before being passed to these functions, you may see unexpectedly large amounts of stale data.
Identify stale values in the data.
Identify stale values by rounding.
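As a rough illustration of stale-value detection, the sketch below flags every value that belongs to a run of at least window near-identical consecutive values. The function name stale_mask and the defaults are hypothetical, and the library functions may mark runs differently (for example, only part of each run).

```python
def stale_mask(values, window=3, atol=1e-8):
    """Flag every value inside a run of >= `window` near-identical values."""
    n = len(values)
    mask = [False] * n
    run_start = 0
    for i in range(1, n + 1):
        # A run ends at the end of the data or when the value changes
        # by more than the tolerance.
        if i == n or abs(values[i] - values[i - 1]) > atol:
            if i - run_start >= window:
                for j in range(run_start, i):
                    mask[j] = True
            run_start = i
    return mask

mask = stale_mask([1.0, 2.0, 2.0, 2.0, 3.0, 4.0, 4.0])

# Rounding before the check makes distinct readings look stuck.
rounded = stale_mask([round(v) for v in [20.1, 20.2, 20.3, 20.4]])
```

Only the three consecutive 2.0 values are flagged in the first call. In the second call every value is flagged, which is the effect the note above warns about.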
The following functions identify days with incomplete data.
Calculate a data completeness score for each day.
Select data points that are part of days with complete data.
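One simple way to think about a daily completeness score is the fraction of expected samples that are actually present on each day. The sketch below assumes that definition; the function name, the use of None for missing values, and the 0.75 threshold are all illustrative choices rather than the library's behavior.

```python
from collections import Counter
from datetime import datetime, timedelta

def daily_completeness(times, values, freq=timedelta(hours=1)):
    """Return {date: fraction of expected samples that are not missing}."""
    expected_per_day = timedelta(days=1) / freq
    present = Counter(
        t.date() for t, v in zip(times, values) if v is not None
    )
    days = sorted({t.date() for t in times})
    return {day: present[day] / expected_per_day for day in days}

# Two days of hourly data; the second half of day 2 is missing.
start = datetime(2022, 1, 20)
times = [start + timedelta(hours=h) for h in range(48)]
values = [1.0] * 36 + [None] * 12
scores = daily_completeness(times, values)

# Keep only points that fall on sufficiently complete days.
minimum_completeness = 0.75  # hypothetical threshold
complete_mask = [scores[t.date()] >= minimum_completeness for t in times]
```

Here the first day scores 1.0 and the second 0.5, so only the first day's points are selected.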
Many data sets may have leading and trailing periods of days with sporadic or no data. The following functions can be used to remove those periods.
Get the start and end of data excluding leading and trailing gaps.
Mask the beginning and end of the data if not all True.
Trim the series based on the completeness score.
Outliers¶
Functions for detecting outliers.
Identify outliers based on the interquartile range.
Identify outliers using the z-score.
Identify outliers by the Hampel identifier.
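The first two approaches can be sketched with the standard library alone. The function names, the multiplier k=1.5, the limit zmax=1.5, and the choice of quantile method and population standard deviation are assumptions made for this illustration; the library functions operate on pandas Series and may differ in detail.

```python
import statistics

def tukey_outliers(values, k=1.5):
    """Flag values more than k * IQR outside the first/third quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]

def zscore_outliers(values, zmax=1.5):
    """Flag values whose absolute z-score exceeds zmax."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    return [abs(v - mean) > zmax * std for v in values]

data = [10, 11, 12, 11, 10, 12, 11, 50]
by_iqr = tukey_outliers(data)
by_zscore = zscore_outliers(data)
```

Both checks flag only the final value (50) in this small data set. Note that the z-score is itself sensitive to the outlier it is trying to detect, which is one reason interquartile-range or median-based methods such as the Hampel identifier are often preferred.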
Time¶
Quality control related to time. This includes things like time-stamp spacing, time-shifts, and time zone validation.
Check that the spacing between times conforms to freq.
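A spacing check of this kind can be sketched as a comparison of each timestamp with its predecessor. The function name spacing_ok is hypothetical, and treating the first timestamp as valid is a convention chosen for this sketch.

```python
from datetime import datetime, timedelta

def spacing_ok(times, freq):
    """Flag timestamps whose gap to the previous timestamp equals freq.

    The first timestamp has no predecessor and is flagged True.
    """
    mask = [True]
    for prev, cur in zip(times, times[1:]):
        mask.append(cur - prev == freq)
    return mask[: len(times)]

# 15-minute data with one missing sample between 12:30 and 13:00.
start = datetime(2022, 1, 20, 12, 0)
times = [start + timedelta(minutes=m) for m in (0, 15, 30, 60, 75)]
mask = spacing_ok(times, timedelta(minutes=15))
```

The timestamp at 13:00 follows a 30-minute gap, so it is the only one flagged False.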
Timestamp shifts, such as those caused by daylight saving time, can be identified with the following functions.
Identify time shifts using the ruptures library.
Return True if events appears to have daylight savings shifts at the dates on which tz transitions to or from daylight savings time.
Utilities¶
The quality.util module contains general-purpose utility functions for building your own quality checks.
Check whether a value falls within the given limits.
Return True for data on days when the day's minimum exceeds minimum.
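As an example of the kind of building block this module is meant to provide, here is a minimal range check in plain Python. The function name within_limits and its parameters are hypothetical and chosen for illustration; consult the API reference for the library's actual signatures.

```python
def within_limits(values, lower=None, upper=None,
                  inclusive_lower=False, inclusive_upper=False):
    """Return True for each value that falls inside the given limits.

    A bound of None means that side is unbounded.
    """
    def ok(v):
        if lower is not None:
            if v < lower or (v == lower and not inclusive_lower):
                return False
        if upper is not None:
            if v > upper or (v == upper and not inclusive_upper):
                return False
        return True
    return [ok(v) for v in values]

# A custom plausibility check for air temperature in degrees Celsius.
temps = [-40.0, -35.0, 12.5, 50.0, 61.0]
mask = within_limits(temps, lower=-35.0, upper=50.0, inclusive_upper=True)
```

With an exclusive lower bound and an inclusive upper bound, -35.0 fails and 50.0 passes.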
Weather¶
Quality checks for weather data.
Identify relative humidity values that are within limits.
Identify temperature values that are within limits.
Identify wind speed values that are within limits.
Module temperature can also be validated against irradiance, since the two should be positively correlated. Poor correlation could indicate that the sensor has become detached from the module, for example. Unlike other functions in the quality module, which return Boolean masks over the input series, this function returns a single Boolean value indicating whether the entire series has passed (True) or failed (False) the quality check.
Test whether the module temperature is correlated with irradiance.
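The idea behind this check can be sketched as a Pearson correlation compared against a threshold. The function names and the threshold of 0.5 are assumptions for this illustration, and the library may compute the correlation differently.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def module_temperature_ok(module_temp, irradiance, correlation_min=0.5):
    """Return a single bool: is temperature correlated with irradiance?"""
    return pearson(module_temp, irradiance) >= correlation_min

irradiance = [0.0, 200.0, 400.0, 600.0, 800.0, 1000.0]
attached = [20.0 + 0.03 * g for g in irradiance]   # tracks irradiance
detached = [25.0, 24.0, 26.0, 25.0, 24.0, 26.0]    # barely responds

attached_ok = module_temperature_ok(attached, irradiance)
detached_ok = module_temperature_ok(detached, irradiance)
```

The first series passes and the second fails. Each call yields one Boolean for the whole series rather than a mask.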
References
[1] C. N. Long and Y. Shi, An Automated Quality Assessment and Control Algorithm for Surface Radiation Measurements, The Open Atmospheric Science Journal 2, pp. 23-37, 2008.
Features¶
Functions for detecting features in the data.
Clipping¶
Functions for identifying inverter clipping.
Label clipping in AC power data based on levels in the data.
Detect clipping based on a maximum power threshold.
Identify clipping based on the shape of the ac_power curve on each day.
Clearsky¶
Identify times when GHI is consistent with clearsky conditions.
Orientation¶
System orientation refers to mounting type (fixed or tracker) and the azimuth and tilt of the mounting. A system’s orientation can be determined by examining power or POA irradiance on days that are relatively sunny.
This module provides functions that operate on power or POA irradiance to identify system orientation on a daily basis. These functions can tell you whether a day’s profile matches that of a fixed system or system with a single-axis tracker.
Care should be taken when interpreting function output since other factors such as malfunctioning trackers can interfere with identification.
Flag days that match the profile of a fixed PV system on a sunny day.
Flag days that match the profile of a single-axis tracking PV system on a sunny day.
Daytime¶
Functions that return a Boolean mask indicating day and night.
Return True for values that are during the day.
Shading¶
Functions for labeling shadows.
Detects shadows from fixed structures such as wires and poles.
System¶
This module contains functions and classes relating to PV system parameters such as nameplate power, tilt, azimuth, or whether the system is equipped with a tracker.
Tracking¶
Enum describing the orientation of a PV System.
Infer whether the system is equipped with a tracker.
Orientation¶
The following function can be used to infer system orientation from power or plane of array irradiance measurements.
Determine system azimuth and tilt from power or POA using solar azimuth at the daily peak.
Get the tilt and azimuth that give PVWatts output that most closely fits the data in power_ac.
Metrics¶
Performance Ratio¶
The following functions can be used to calculate system performance metrics.
Calculate NREL Performance Ratio.
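For orientation, the sketch below computes a temperature-corrected performance ratio in the general form of NREL's weather-corrected PR: measured AC energy divided by nameplate power scaled by irradiance and a linear temperature correction. It is not the library's implementation. The function name, the default temperature coefficient, and the choice to pass cell temperature and a reference cell temperature directly (rather than modeling them from air temperature and wind speed) are simplifying assumptions.

```python
def temperature_corrected_pr(pac, poa_global, cell_temp, pdc0, t_ref,
                             gamma_pdc=-0.004, g_ref=1000.0):
    """Ratio of measured AC output to temperature-corrected expected output.

    pac        -- AC power samples [W]
    poa_global -- plane-of-array irradiance samples [W/m^2]
    cell_temp  -- cell temperature samples [C]
    pdc0       -- nameplate DC power at reference conditions [W]
    t_ref      -- reference cell temperature [C], e.g. an irradiance-weighted
                  average over a typical year (assumed to be known)
    gamma_pdc  -- power temperature coefficient [1/C] (assumed value)
    """
    expected = sum(
        pdc0 * (g / g_ref) * (1 + gamma_pdc * (t - t_ref))
        for g, t in zip(poa_global, cell_temp)
    )
    return sum(pac) / expected

pr = temperature_corrected_pr(
    pac=[750.0, 2400.0, 3800.0],
    poa_global=[200.0, 600.0, 1000.0],
    cell_temp=[20.0, 35.0, 50.0],
    pdc0=5000.0,
    t_ref=45.0,
)
```

One design point worth noting: if t_ref is computed as the irradiance-weighted average cell temperature of the same samples being evaluated, the correction terms sum to zero and the result equals the uncorrected performance ratio, so the reference temperature is passed in explicitly here.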
Variability¶
Functions to calculate variability statistics.
Calculate the variability index.
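A common definition of the variability index is the ratio of the "length" of the measured irradiance curve to the length of the clear-sky curve over the same period. The sketch below assumes that definition with a fixed time step in minutes; the function signature is illustrative and does not mirror the library's.

```python
import math

def variability_index(measured, clearsky, dt_minutes=1.0):
    """Ratio of measured to clear-sky irradiance curve length."""
    def curve_length(series):
        # Sum of segment lengths, treating each step as a right triangle
        # with sides (change in irradiance, time step).
        return sum(
            math.hypot(b - a, dt_minutes) for a, b in zip(series, series[1:])
        )
    return curve_length(measured) / curve_length(clearsky)

clearsky = [500.0, 503.0, 506.0, 509.0, 512.0]
cloudy = [500.0, 200.0, 520.0, 150.0, 512.0]

vi_clear = variability_index(clearsky, clearsky)
vi_cloudy = variability_index(cloudy, clearsky)
```

A perfectly clear period gives a value of 1, while rapid swings caused by passing clouds push the index well above 1.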
Example Gallery¶
This gallery shows examples of pvanalytics functionality. Community contributions are welcome!
Clear-Sky Detection¶
Identifying periods of clear-sky conditions using measured irradiance.
Identifying and filtering for clear-sky conditions is a useful way to reduce noise when analyzing measured data. This example shows how to use pvanalytics.features.clearsky.reno() to identify clear-sky conditions using measured GHI data. For this example we’ll use GHI measurements from NREL in Golden, CO.
import pvanalytics
from pvanalytics.features.clearsky import reno
import pvlib
import matplotlib.pyplot as plt
import pandas as pd
import pathlib
First, read in the GHI measurements. For this example we’ll use an example file included in pvanalytics covering a single day, but the same process applies to data of any length.
pvanalytics_dir = pathlib.Path(pvanalytics.__file__).parent
ghi_file = pvanalytics_dir / 'data' / 'midc_bms_ghi_20220120.csv'
data = pd.read_csv(ghi_file, index_col=0, parse_dates=True)
# or you can fetch the data straight from the source using pvlib:
# date = pd.to_datetime('2022-01-20')
# data = pvlib.iotools.read_midc_raw_data_from_nrel('BMS', date, date)
measured_ghi = data['Global CMP22 (vent/cor) [W/m^2]']
Now model clear-sky irradiance for the location and times of the measured data:
location = pvlib.location.Location(39.742, -105.18)
clearsky = location.get_clearsky(data.index)
clearsky_ghi = clearsky['ghi']
Finally, use pvanalytics.features.clearsky.reno() to identify measurements during clear-sky conditions:
is_clearsky = reno(measured_ghi, clearsky_ghi)
# clear-sky times indicated in black
measured_ghi.plot()
measured_ghi[is_clearsky].plot(ls='', marker='o', ms=2, c='k')
plt.ylabel('Global Horizontal Irradiance [W/m2]')
plt.show()

Release Notes¶
These are the bug-fixes, new features, and improvements for each release.
0.1.1 (February 18, 2022)¶
Enhancements¶
Quantification of irradiance variability with pvanalytics.metrics.variability_index(). (GH60, GH106)
Internal refactor of pvanalytics.metrics.performance_ratio_nrel() to support other performance ratio formulas. (GH109)
Detect shadows from fixed objects in GHI data using pvanalytics.features.shading.fixed(). (GH24, GH101)
Bug Fixes¶
Added nan_policy parameter to zscore calculation in pvanalytics.quality.outliers.zscore(). (GH102, GH108)
Prohibit pandas versions in the 1.1.x series to avoid an issue in .groupby().rolling(). Newer versions starting in 1.2.0 and older versions going back to 0.24.0 are still allowed. (GH82, GH118)
Fixed an issue with pvanalytics.features.clearsky.reno() in recent pandas versions. (GH125, GH128)
Improved convergence in pvanalytics.features.orientation.fixed_nrel(). (GH119, GH120)
Requirements¶
Drop support for Python 3.6, which reached end of life in December 2021. (GH129)
Documentation¶
Started an example gallery and added an example for pvanalytics.features.clearsky.reno(). (GH125, GH127)
Contributors¶
Kevin Anderson (@kanderso-nrel)
Cliff Hansen (@cwhanse)
Will Vining (@wfvining)
Kirsten Perry (@kperrynrel)
Michael Hopwood (@MichaelHopwood)
Carlos Silva (@camsilva)
Ben Taylor (@bt-)
0.1.0 (November 20, 2020)¶
This is the first release of PVAnalytics. As such, the list of “changes” below is not specific. Future releases will describe specific changes here along with references to the relevant GitHub issues and pull requests.
API Changes¶
Enhancements¶
Quality control functions for irradiance, weather and time series data. See pvanalytics.quality for content.
Feature labeling functions for clipping, clearsky, daytime, and orientation. See pvanalytics.features for content.
System parameter inference for tilt, azimuth, and whether the system is tracking or fixed. See pvanalytics.system for content.
NREL performance ratio metric (pvanalytics.metrics.performance_ratio_nrel()).
Bug Fixes¶
Contributors¶
Special thanks to Matt Muller and Kirsten Perry of NREL for their assistance in adapting components from the PVFleets QA project to PVAnalytics.