Drive Real-Time Decisions with Clean, Confident Data
Contributed By: Michael Schmitz, Software Product Manager, Schneider Electric Energy & Sustainability Services
The business case for energy efficiency is becoming more attractive every day: energy is now a boardroom-level topic, metering devices are becoming cheaper, and the expertise needed to take advantage of opportunities is becoming more accessible. As a result, interval energy data flows like water, and this river of data is the foundation upon which good energy decisions can be made.
Just like a river, though, energy data can quickly become polluted, diverted, dammed or dried up. Data can be polluted by erroneous readings, hardware-driven faulty values, meter swap-outs or rollovers and any number of other sources of error. Data streams can be diverted by misconfiguration (usually human error) or dammed by communication outages or IT security changes. Or, data can dry up when a meter simply dies.
Interval data quality is routinely a significant issue, especially with larger enterprise systems that have disparate data collection hardware and software architectures. When there are thousands of meters in a system, it's safe to assume that at least a few are always affected by issues.
Interval data quality is a fundamental, ubiquitous challenge in energy management. Failing to solve this challenge hamstrings all downstream activities such as equipment efficiency analysis, opportunity identification for energy efficiency improvements and big picture reporting.
Data quality is often regarded as the least exciting challenge in energy management, but identifying and solving these issues removes barriers for future energy and sustainability initiatives. Clean data sets do not always lead directly to major energy savings, or guarantee huge savings. In fact, data quality concerns may be practically invisible—at least at first.
Making matters worse, until a company experiences the pain of poor data quality, it can seem like a minor issue, one not worth paying to address. Many organizations launch energy metering initiatives thinking “we have good quality meters” or “our network has always been reliable, and we have good IT staff”. Nobody expects to suffer from poor data quality.
What’s another thing that nobody has ever done? Complained that their data was too clean and too reliable.
So, what does a good solution to these diverse data quality problems look like?
A good data management system must enable analysis and reporting while staying out of the way of core business priorities, operating in the background to minimize distractions and avoid eating up valuable time. Most importantly, it must increase data confidence. A system that masks problems or, conversely, constantly shines a light on every little issue, can reduce end-user confidence, limiting a company's ability to make data-driven decisions about its energy use. The right solution needs to walk a fine line between identifying issues and fixing issues.
- Identifying issues is the obvious starting point. It’s imperative to know where data quality issues stem from. Are certain meter types not reliable? Is there a software aggregator system somewhere that’s not performing? Perhaps a certain geography has issues that relate back to local IT policies? Identification of issues can go too far though. No one wants their annual site-level consumption reports to be littered with asterisks on every value warning that one of the 25 contributing meters had a 36-hour communications outage somewhere during the year. Over-identification of issues, or more specifically, the identification of issues in the wrong context or that are not material to decision-making can be harmful. It erodes confidence in the data.
- Fixing issues is more difficult and needs to be done judiciously. Fortunately, the causes of most data quality problems fall into a few specific categories, each of which has a detectable signature in the underlying cumulative data streams that typically flow from metering devices. Many data problems can therefore be “patched up” by applying techniques specific to each of the root causes. These usually involve inserting estimates and removing obvious outliers or bad values.
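To make the idea of "patching up" a data stream concrete, here is a minimal sketch of one common technique: filling a communications-outage gap in a cumulative meter register with linear interpolation. The readings, function name, and hourly cadence are illustrative assumptions, not a description of any specific product's implementation.

```python
# Hypothetical hourly cumulative meter readings (kWh register values).
# None marks intervals lost to a communications outage.
readings = [1000.0, 1012.5, 1024.0, None, None, None, 1060.0, 1071.5]

def patch_gaps(values):
    """Fill gaps in a cumulative register with linear interpolation,
    a common estimation technique for communication outages.

    Gaps at the very start or end of the series (no bounding good
    reading on one side) are left untouched.
    """
    patched = list(values)
    i = 0
    while i < len(patched):
        if patched[i] is None:
            start = i - 1          # last good reading before the gap
            end = i
            while end < len(patched) and patched[end] is None:
                end += 1           # first good reading after the gap
            if start >= 0 and end < len(patched):
                step = (patched[end] - patched[start]) / (end - start)
                for j in range(start + 1, end):
                    patched[j] = patched[start] + step * (j - start)
            i = end
        else:
            i += 1
    return patched

patched = patch_gaps(readings)
# The three missing register values are estimated along a straight
# line between the bounding good readings: 1033.0, 1042.0, 1051.0.
```

In practice, the right estimation technique depends on the root cause's signature (outage vs. rollover vs. meter swap), but the linear fill above is a reasonable default for a short outage on a monotonically increasing register.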
Good data quality assurance systems will patch and identify issues. The best systems patch problems, identify problems when relevant, and then fade away into the background when the context demands.
For example, consider a scenario where a good data management system does, and does not, flag data issues. When looking at a chart of hourly or daily values, a 36-hour meter communications outage needs to be clearly identified, even if estimates are put in place. But the same outage does not need to be mentioned when looking at the year’s total consumption. Flagging the yearly total as “estimated” or “containing estimates” would be incorrect (due to the nature of cumulative counting meters) and would only lead to a loss of confidence in the data.
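The arithmetic behind that distinction is worth spelling out. Because a cumulative counting meter's register only ever increases, the annual total depends solely on the first and last readings; a mid-year outage affects only the resolution of data *inside* the gap. The register values below are illustrative:

```python
# Simplified sketch: a cumulative (counting) meter register sampled
# at the start and end of the year. Values are illustrative.
jan_1_reading = 50_000.0    # kWh register on Jan 1
dec_31_reading = 230_000.0  # kWh register on Dec 31

# The annual total is just the difference of the endpoint readings,
# so a mid-year communications outage does not make it an estimate.
annual_consumption = dec_31_reading - jan_1_reading  # 180,000.0 kWh

# By contrast, hourly or daily deltas inside the outage window can
# only be estimated (e.g., by interpolating the register) and should
# be flagged as estimates when viewed at that resolution.
```

This is why flagging the yearly total as "containing estimates" would be misleading: no estimation actually enters that number.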
Data confidence is key in driving data-informed energy decisions. Having confidence and trust in your systems to deliver quality data and flag and correct errors when appropriate is even more pertinent. In a world with more and more data availability, knowing that your data management system is driving real-time decisions is imperative. Learn more about how EcoStruxure™ Resource Advisor, a global energy and sustainability management software, enables data confidence and transparency for organizations like yours.