By 2020, the European Union (EU) aims to replace 80% of household electricity meters with smart meters. This is because smart meters are widely regarded as a key to reducing both energy consumption and emission levels. Along with smart meter technology, however, come some challenges in protecting consumers’ data privacy. This blog post describes these privacy concerns as well as a potential solution to address them.
The risk of leaking smart meter readings
With the help of machine learning and advanced data analytics, several load-monitoring techniques can deduce a customer’s activity patterns just from their electricity metering data. Imagine a situation where a third party gets hold of your metering data and, using data mining algorithms, can detect your presence or absence and other household activities. Wouldn’t that be devastating?
The simplest way to avoid this type of risk is to not share your metering data with anyone, including your utility provider, but that is easier said than done. The utility provider requires your power consumption data for billing, for data aggregation, and for estimating other physical elements critical to grid operation that may not be measurable using sensors.
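To make the risk concrete, here is a deliberately naive sketch of how activity could be read off hourly meter data. The readings, the function name infer_active_hours and the 0.5 kWh threshold are all made up for illustration; real load-monitoring techniques are far more sophisticated than a single threshold.

```python
def infer_active_hours(hourly_kwh, threshold_kwh=0.5):
    """Return the hours (0-23) whose consumption exceeds a fixed threshold.

    Assumption: anything above the threshold is treated as a sign that
    someone is home and using appliances. This is a toy heuristic.
    """
    return [hour for hour, kwh in enumerate(hourly_kwh) if kwh > threshold_kwh]


# Hypothetical hourly readings (kWh) for one household over one day.
readings = [0.2, 0.2, 0.2, 0.2, 0.2, 0.3, 0.9, 1.1, 0.3, 0.2, 0.2, 0.2,
            0.2, 0.2, 0.2, 0.2, 0.2, 0.8, 1.4, 1.6, 1.2, 0.9, 0.4, 0.2]

active_hours = infer_active_hours(readings)
print("Likely activity during hours:", active_hours)
```

Even this crude rule suggests a morning routine, an empty house during the day, and an evening at home, which is exactly the kind of pattern a consumer would not want exposed.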
This motivates us to solve the following business challenge:
How can metering data be shared securely for estimation purposes,
while preserving a customer’s privacy?
Cryptography to the rescue
In my master’s thesis work, done in collaboration with the Delft Center for Systems and Control (DCSC) and the cybersecurity community at Delft University of Technology, I developed the open source algorithm Obfuscate(.) as a potential solution to this challenge. Obfuscate(.) is based on cryptographic masking, where a raw plaintext dataset is transformed into a masked dataset through randomization. Computations are then performed on the masked data, and the results are later unmasked through de-randomization using a private key.
To get a basic understanding of how Obfuscate(.) works, imagine you hold two secret numbers (say 2 and 5) and have to share them with a third party to calculate their sum. Rather than sharing the two numbers as plaintext, an algorithm generates a random private key and uses it to mask them as randomized data (say 198.8 and 501.2). The third party then calculates the sum of the masked values to be 700, and this answer is later de-randomized using the private key (100 in this case) to recover the true sum of 7. It is practically impossible for an adversary to determine the actual numbers, since there are infinitely many pairs that sum to 700. Then why not let users do the computation themselves and share only the results? While this might work for the simple example above, the general consumer would need very large computational capabilities to estimate parameters in a complex power grid.
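The arithmetic of the two-number example can be sketched in a few lines of Python. This only illustrates the idea and is not the actual Obfuscate(.) algorithm: the masking rule (multiply each value by the key, then add noise terms that cancel out in the sum), the function names mask and unmask_sum, and the noise range are all assumptions chosen so that the numbers behave like the ones in the example.

```python
import random


def mask(values, key, noise_scale=5.0):
    """Mask values as key * value + noise, where the noise terms sum to zero.

    Assumed toy scheme for illustration; not the published algorithm.
    """
    noise = [random.uniform(-noise_scale, noise_scale) for _ in values]
    mean = sum(noise) / len(noise)
    zero_sum_noise = [n - mean for n in noise]  # shift so the noise cancels in the sum
    return [key * v + n for v, n in zip(values, zero_sum_noise)]


def unmask_sum(masked_sum, key):
    """De-randomize a sum computed on masked data using the private key."""
    return masked_sum / key


secrets = [2, 5]
private_key = 100                 # kept by the data owner, never shared

masked = mask(secrets, private_key)   # e.g. something like [198.8, 501.2]
masked_sum = sum(masked)              # the third party only sees masked values
recovered = unmask_sum(masked_sum, private_key)

print("masked values:", masked)
print("recovered sum:", round(recovered, 6))
```

Each run produces different masked values, but their sum stays at 700 up to floating-point rounding, so dividing by the key returns the true sum. A toy scheme like this should not be mistaken for a secure one; it only shows why a computation on masked data can still give the right answer.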
The essence of Obfuscate(.) lies in randomizing the data to prevent leakage of users’ metering data while still allowing the necessary parameters to be estimated accurately from the randomized dataset. What makes Obfuscate(.) stand out from a business perspective is that it is proven to be computationally efficient and to scale in practice as the number of smart meters grows.
Currently, there are close to 200 million smart meters in use in Europe, accounting for 72% of consumers. This number is expected to increase steadily in the coming years, which makes Obfuscate(.) increasingly viable, since consumers would collectively avoid the higher total cost of performing the computation themselves.
For more information on how Obfuscate(.) works, I invite you to read my journal paper, which was prepared for the proceedings of Energy Informatics and presented at the event, held September 26-27 this year in Salzburg, Austria. Of course, you can also contact me if you are interested in knowing more about the potential business case for Obfuscate(.) in the utilities sector.
Note: This blog post is based on my master’s thesis work, done in collaboration with the Delft Center for Systems and Control (DCSC) and the cybersecurity community at Delft University of Technology, The Netherlands.