Our data is measured from approximately 5000 US ground stations and
another 3500 international stations. With few exceptions, our data has
half the error of satellite and modeled data. This difference is enough to
make a solar plant appear profitable on paper when in fact it is not!
The US daily data is collected every week from dozens of sources such as the USDA, National
Forest Service, and universities. This process takes between 40 and 50
hours. After collecting the data, we convert all of the disparate formats
into a single database with consistent units and perform multiple levels of
quality control - including custom solar envelopes for each NOAA
climate region and nearby station comparisons. Finally, we generate the
contour maps for our monthly atlas.
We have learned after two years that it is impossible to automate this process
completely, so weekly maintenance is required to preserve the integrity of the
data.
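The unit conversion and quality-control steps described above can be sketched in a few lines of Python. This is a minimal illustration only, not the actual processing pipeline: the function names, the envelope bounds and the 30% neighbor tolerance are hypothetical values chosen for the example.

```python
from statistics import median

SECONDS_PER_DAY = 86400.0


def to_mean_w_per_m2(value, unit):
    """Convert a daily solar total to a daily mean irradiance in W/m2."""
    if unit == "W/m2":
        return value
    if unit == "MJ/m2/day":
        return value * 1e6 / SECONDS_PER_DAY
    if unit == "kWh/m2/day":
        return value * 1000.0 / 24.0
    if unit == "langley/day":
        # 1 langley = 41,840 J/m2
        return value * 41840.0 / SECONDS_PER_DAY
    raise ValueError(f"unknown unit: {unit}")


def qc_flags(ghi, envelope, neighbor_values, neighbor_tolerance=0.30):
    """Return a list of quality flags for one daily observation.

    envelope is a hypothetical (low, high) pair for the station's climate
    region; neighbor_values are same-day readings from nearby stations.
    """
    low, high = envelope
    flags = []
    if not (low <= ghi <= high):
        flags.append("outside_envelope")
    if neighbor_values:
        ref = median(neighbor_values)
        if ref > 0 and abs(ghi - ref) / ref > neighbor_tolerance:
            flags.append("disagrees_with_neighbors")
    return flags


# Example: a reading of 450 W/m2 against an assumed 0-400 W/m2 envelope
print(qc_flags(450.0, (0.0, 400.0), [240.0, 260.0, 255.0]))
```

A flagged value would still need a person to look at it, which is the point of the weekly maintenance mentioned above.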
Data Accuracy
We recently consulted with a large, multi-national company that relied heavily on
accurate solar radiation data for their agricultural research. They proudly
described their real-time data feed with solar radiation data for every 5 km
block of the US, updated every 15 minutes. We asked how the measurements were
made. They didn’t know, but gave us the name of their supplier. We contacted the
supplier. The supplier didn’t know how the measurements were made – they simply
provided the surface modeling and gridding. We obtained the name of their
supplier, and so on, only to find that there was not a single measurement of
solar radiation anywhere in the chain. All of this mission-critical information
on solar radiation was being estimated from cloud cover and humidity at
airports. Desktop software makes it easy to fit data with colorful
contours and interpolate the surface down to very fine grids. It can give the
user a false sense of precision and makes it easy to mask the key question:
How accurate is the data?
US Solar Radiation Datasets
Historical datasets of solar radiation are a key element in designing solar
power systems and energy-efficient buildings; however, finding accurate
multi-year data near the design site has always proved challenging. There are
only 100-200 sites in the US providing research-quality observations of solar
radiation, so this data is generally not available for engineering or
architectural purposes.
Typical Meteorological Year
Perhaps the solar radiation dataset most widely used by US engineers and
architects is the Typical Meteorological Year version 3 (TMY3) from the National
Renewable Energy Laboratory (NREL). Each month in a TMY3 dataset contains
historical observations. Twelve specific months from a 10-30 year history were
selected as representative of the location and concatenated into a typical
meteorological year. TMY3 data is intended only for relative comparisons of
designs at one location or estimates of long-term solar radiation at a site, not
for detailed engineering design or simulations (see user manual).
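To make the idea of concatenating representative months concrete, here is a deliberately simplified Python sketch. It is not NREL's selection procedure, which is documented in the user manual; as a stand-in criterion, it picks for each calendar month the year whose mean GHI is closest to the long-term mean for that month. The function name and the sample numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def pick_typical_months(monthly_means):
    """Pick one representative year for each calendar month.

    monthly_means maps (year, month) -> mean GHI for that month.
    Returns {month: year}. Ties go to the earliest year.
    """
    by_month = defaultdict(dict)
    for (year, month), value in monthly_means.items():
        by_month[month][year] = value

    typical = {}
    for month, by_year in sorted(by_month.items()):
        long_term = mean(by_year.values())
        typical[month] = min(
            sorted(by_year),
            key=lambda y: abs(by_year[y] - long_term),
        )
    return typical


# Made-up monthly mean GHI values (W/m2) for two calendar months
history = {
    (2001, 1): 100.0, (2002, 1): 120.0, (2003, 1): 113.0,
    (2001, 7): 300.0, (2002, 7): 280.0, (2003, 7): 310.0,
}
print(pick_typical_months(history))
```

The hourly records from each chosen year-month would then be joined end to end to form the twelve-month file, which is why consecutive months in a typical year can come from different calendar years.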
Satellite-Based Observations
The best known alternative to the TMY3 dataset is
satellite-based observations of solar radiation. In fact, a significant portion
of NREL’s TMY3 incorporates SUNY (State University of New York) gridded
satellite data. All satellite observations of solar radiation are modeled, since
they must estimate ground radiation based on clouds and atmospheric conditions.
The most widely used models in the US were developed by
Perez, et al. Satellite
datasets have the advantage of complete coverage, but the models have known
inaccuracies due to persistent clouds, snow cover, and microclimates that can
occur near mountains or large bodies of water.
The National Solar Radiation Database provides free access to gridded
satellite observations for 2000-2005, and recent SUNY satellite data is
available from several commercial suppliers. NASA also provides free access
to its satellite-based observations on its POWER (Prediction of Worldwide
Energy Resources) website.
Medium-Quality Ground Stations
One significant but relatively untapped solar
resource is data from ground-based stations equipped with medium-quality solar
sensors. In the US there are approximately 5000 of these sites from many different
networks with daily, hourly, and sub-hourly observations for the past 5-25
years. All are professionally run and maintained by universities and government
agencies for specific purposes such as agriculture, water management and
environmental monitoring. Representative examples are the AgWeatherNet from
Washington State University and the Oklahoma Mesonet. These networks overlap, so
typically there are stations from three or four networks operating
simultaneously in the same area. Most of these observations are available to the
public via the internet for free or for a modest access fee. Wider use of this
resource has been limited by a general lack of knowledge about the networks and
how to access data. There have also been concerns about accuracy, quality
control and difficulties in converting the data to a format usable for solar
project simulations. One commercial supplier, the Solar Data Warehouse, has
aggregated most of this data into a single database including quality control
measures. The next section is a summary of a study comparing the observations from TMY3, NASA and SUNY datasets to this network of
medium-quality stations. It will show that they provide a significant and
surprisingly accurate resource for solar radiation data.
Calculating the Accuracy of Solar Radiation Observations
The accuracy of a dataset is determined by comparing the observations to a
highly accurate reference. Even small differences in location can affect the
amount of solar radiation on the ground, especially for short time intervals, so
comparisons should be done at exactly the same location. This can prove
challenging for solar radiation observations where test sites are near, but not
co-located with, the reference. In addition, extremely accurate reference
measurements of solar radiation are not available. In 1989 the World Climate
Research Program estimated that routine-operational ground solar radiation sites
had end-to-end inaccuracies of 6-12%, with the highest quality research sites in
the range of 3-6% inaccuracy (reference).
These constraints make absolute comparisons between solar radiation datasets
difficult, but it is still possible to estimate the relative accuracy if the
same reference observations are used. For this study, research-quality
observations from USCRN (US Climate Reference Network) were used as a reference.
Most of these stations began operation between 2003 and 2005. The relative mean
absolute error (rMAE) statistic was used to estimate the total error (bias plus
precision) in the observations. None of the observations were co-located, so the
total error includes effects due to physical separation, or in the case of
satellite data, the grid size. The rMAE provides an easy way for practitioners
to estimate how much error to expect in the data. The calculations only included
comparisons where the observations of GHI (global horizontal irradiance) at the
reference station were greater than 10 W/m2 so the statistics would not be
skewed by low-light conditions.
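The two statistics can be sketched in Python as follows. The study does not spell out its normalization, so this example assumes both errors are divided by the mean of the reference observations; the function name and sample values are hypothetical.

```python
def relative_errors(test, reference, min_reference=10.0):
    """Return (rMAE, rME) for paired GHI observations in W/m2.

    Pairs are kept only when the reference value exceeds min_reference,
    mirroring the 10 W/m2 low-light cutoff described in the text.
    Assumption: both statistics are normalized by the mean reference value.
    """
    pairs = [(t, r) for t, r in zip(test, reference) if r > min_reference]
    if not pairs:
        raise ValueError("no observations above the reference threshold")

    ref_total = sum(r for _, r in pairs)
    rmae = sum(abs(t - r) for t, r in pairs) / ref_total  # total error
    rme = sum(t - r for t, r in pairs) / ref_total        # bias error
    return rmae, rme


# Made-up hourly values; the third pair is dropped by the low-light cutoff
test_ghi = [110.0, 190.0, 5.0, 330.0]
ref_ghi = [100.0, 200.0, 8.0, 300.0]
rmae, rme = relative_errors(test_ghi, ref_ghi)
print(f"rMAE = {rmae:.1%}, rME = {rme:.1%}")
```

Because positive and negative differences cancel in the bias but not in the total error, the rME can never exceed the rMAE in magnitude.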
Relative Accuracy of Various Solar Radiation Datasets
Seven of the earliest USCRN stations were operating in the southern half of the
US during 2002-2005. Nearly 13,000 hourly measurements from 19 TMY3 months could
be paired with those from USCRN sites at the same time and location. This
overlap allowed direct comparison between the historical observations in various
TMY3 months to high-quality ground observations. SUNY data, NASA data and
observations from nearby medium-quality ground stations were also included in
the comparison. Each of the medium-quality ground stations was within 20 miles
and 500 feet of elevation of the USCRN reference stations. The total errors (rMAE)
in the observations of GHI are compared in Figure 5.
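The 20-mile and 500-foot pairing rule can be expressed as a simple filter. The Python sketch below is an assumption about how such a rule might be coded, not the study's own tooling; it uses the haversine great-circle distance, and the station records are hypothetical dictionaries with made-up coordinates.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def is_comparable(station, reference, max_miles=20.0, max_elev_diff_ft=500.0):
    """True if a station is close enough to the reference to be paired."""
    distance = haversine_miles(station["lat"], station["lon"],
                               reference["lat"], reference["lon"])
    elev_diff = abs(station["elev_ft"] - reference["elev_ft"])
    return distance <= max_miles and elev_diff <= max_elev_diff_ft


# Hypothetical reference site and candidate ground station
reference = {"lat": 36.0, "lon": -97.0, "elev_ft": 900.0}
candidate = {"lat": 36.1, "lon": -97.0, "elev_ft": 1000.0}
print(is_comparable(candidate, reference))
```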
Figure 5 - Total errors in observations of global horizontal irradiance from
various sources.
Figure 6 - Bias errors in observations of global horizontal irradiance from
various sources.
The NASA observations had the highest daily total error (27%), TMY3 and SUNY
had similar errors (19%) and
the medium-quality ground measurements showed significantly lower errors (9%).
The monthly total errors were similar for all datasets. The bias errors (rME) in
the observations of GHI are shown in Figure 6. NASA observations had the highest
daily bias error (16.7%), TMY3 and SUNY had similar daily bias (9.2% and 8.1%)
and the medium-quality ground measurements showed significantly lower daily bias
error (3.1%). The monthly bias errors were similar for all datasets. The similarity of
errors between the TMY3 and SUNY datasets should not be surprising, since a
significant portion of the TMY3 data comes from the SUNY gridded data in the
National Solar Radiation Database. Leaving out TMY3, a more comprehensive
comparison can be made between the NASA, SUNY and ground-based data. The next
comparison used all of the data from 2002-2005 where there was overlap between
the SUNY gridded data in the National Solar Radiation Database and seven USCRN
stations in the southern half of the US (259 location-months of data). NASA
satellite data for the locations was obtained from their POWER website. The
Solar Data Warehouse provided corresponding ground-based measurements from one
or more stations at each area. All ground stations were within 20 miles and 500
feet elevation of the USCRN reference stations. The same procedure was used to
calculate the total and bias errors in the various datasets. Figures 7 and 8
show that daily observations from medium-quality ground stations had less than
half the errors of the NASA and SUNY observations.
Figure 7 - Total errors in observations of global horizontal irradiance from
various sources.
Figure 8 - Bias errors in observations of global horizontal irradiance from
various sources.
These results are similar to other published comparisons. NASA estimates that
their measurements of daily solar radiation have an RMS error of 35 W/m2
(roughly 20% total error; see reference). Other researchers comparing NASA
solar radiation data found 19% total error in the daily observations
(reference).
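As a back-of-the-envelope check on how an absolute error relates to a percentage, the short Python sketch below divides one by the other. The mean it prints is only the value implied by the two figures quoted above, not a number published by NASA, and an RMS error is not strictly the same statistic as the rMAE used in this study. The function names are hypothetical.

```python
def relative_error(abs_error_w_m2, mean_ghi_w_m2):
    """Express an absolute error as a fraction of the mean irradiance."""
    return abs_error_w_m2 / mean_ghi_w_m2


def implied_mean_ghi(abs_error_w_m2, relative):
    """Mean irradiance implied by an absolute error and its relative size."""
    return abs_error_w_m2 / relative


# 35 W/m2 at roughly 20% implies a mean daily GHI of about 175 W/m2
print(round(implied_mean_ghi(35.0, 0.20), 1))
```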
Accuracy of the Reference Stations
Within the USCRN network there are several pairs of stations in close proximity.
This provides an opportunity to see if the observation errors between two
high-quality stations are due mainly to sensor accuracy or separation distance.
This comparison used 2003-2005 data from the paired USCRN stations in Lincoln,
NE; Newton, GA; Stillwater, OK; and Asheville, NC. The separation distances
between these pairs were 18 miles, 6 miles, 1.5 miles and 6 miles, respectively.
The solar radiation sensors used by the USCRN stations are rated at less than 1%
non-linearity and ±2% stability per year; however, Figure 9 shows much higher
total and bias errors in the observations. The total and bias errors between two
nearby high-quality stations were remarkably close to the total and bias errors
between a medium-quality and a nearby high-quality station (Figure 9). This
suggests that the errors in the ground-based observations may be more influenced
by physical separation than by the difference between high-quality and
medium-quality radiation sensors.
Figure 9 - Total and bias errors in global horizontal irradiance observations
from paired high-quality stations
Summary
Vignola, et al. suggest that there is a place for both satellite and
ground-based measurements in forming a comprehensive solar radiation database
for the entire US, with satellite data providing general coverage augmented by
ground-based data for detailed engineering and scientific purposes. The findings
of this study support this conclusion.
Our Technical Papers
The following papers were presented at the 2011 ASES conference
in Raleigh:
Quality Analysis of Global Horizontal Irradiance Data from 3500 U.S.
Ground-Based Weather Stations by James Hall
Links: paper | presentation
Note: We're up to 5000 stations now!
Forecasting Solar Radiation for the Los Angeles Basin - Phase II Report
by James Hall
Links: paper | presentation
Note: We have the most accurate published forecasting technology for the U.S.
proven by out-of-sample tests.
Coverage
Free Data
Do you think our data should be free to the general public?
Talk to NREL about it. Our data is better than theirs and costs a small fraction of what they're spending.
Anyway, here are some free data options:
NASA Agroclimate Data - It has high error compared to others, but is
provided free as a public service.
NREL's TMY3 Data - This data can be very good for monthly averages but is
terrible for hourly and daily data. NREL says "The TMY should not be
used to predict weather for a particular period of time, nor is it an
appropriate basis for evaluating real-time energy production or efficiencies for
building design applications or solar conversion systems." (TMY3 User Manual).
NREL's SUNY Gridded Satellite Data (2000-2005) - This data is better
than the TMY3 data but still not nearly as accurate as ground-based
observations. In fact, some of the TMY3 dataset comes from the SUNY
satellite data.
We have a paper that explains the strengths and weaknesses of these data sources
in detail. It was presented at the 2011 ASES conference in Raleigh.
Check it out.
Ground Station Data Sources
Community Environmental Monitoring Program (funded by DOE)
Federal Aviation Administration (FAA)
National Oceanic and Atmospheric Administration (NOAA)
National Renewable Energy Laboratory (NREL)
National Weather Service (NWS)
Oklahoma Mesonet
Special projects funded by various state climate centers and universities
US Bureau of Land Management (BLM)
US Bureau of Reclamation (USBR)
US Department of Agriculture (USDA)
US Department of Commerce (DOC)
US Department of Defense (DOD)
US Department of Energy (DOE)
US Department of the Interior (DOI)
US Forest Service
Other state and federal agencies
Sample Data Files
Sample Hourly Product - This is a 2 day sample of hourly data from several
stations with a custom analysis of the sensor accuracy. We believe it's a
mistake to automate everything, so we like to get our hands in the data and
make sure things look right.
3 Station Comparison - This is an unformatted file with sample raw data from
3 stations near Palmdale, CA.
Single Station Daily Summary - This file contains raw daily data with averages
for a station near Susanville, CA.
We're experienced database programmers, so let us know what kind of product we
can generate for you!
Contour Maps
The first three maps were generated from 3600 sites. We're now up to 5000.
For the latest data, check out the monthly atlas.
June 12, 2008
June 14, 2008
June 16, 2008
Monthly Atlas
Data Pricing
Here's our latest price sheet for data.
For solar forecasting information or custom quotes, please email:
