The irregular time series data comes in a formatted csv file with the following columns and data:
"Datum";"Zeit";"Stand HT";"Stand NT";"Gesamt kWh";"ø Woche kWh";"ø Tag kWh";"Level"
"11.10.14";"16:52";"15165,00";"15603,00";"57,0";"64,2";"9,2";"B"
"05.10.14";"11:40";"15140,00";"15571,00";"40,0";"60,5";"8,6";"B"
"30.09.14";"20:41";"15124,00";"15547,00";"93,0";"59,3";"8,5";"B"
"19.09.14";"21:22";"15085,00";"15493,00";"44,0";"76,3";"10,9";"C"
The information in the columns contains the following data:
Datum: Date
Zeit: Time
Stand HT: High tariff electricity meter reading
Stand NT: Low tariff electricity meter reading
Gesamt kWh: electricity consumption since last reading
ø Woche kWh: Average weekly electricity consumption
ø Tag kWh: Average daily electricity consumption
Level: Efficiency label
R's library zoo provides the infrastructure for my irregular time series. Lubridate makes it easier to work with date and time. Read.zoo gives me a couple of options to parse the csv file and to use the first (date) and the second column (time) as one single index:
library(zoo)
library(lubridate)
f <- function(d, t) parse_date_time(paste(d, t), "%d.%m.%Y %H:%M")
energy <- read.zoo("https://dl.dropboxusercontent.com/u/23293644/export-12-10-2014-20-45.csv",
header = TRUE, sep=";", dec=",",
index = 1:2,
FUN = function(d, t) parse_date_time(paste(d, t, sep = " "), "dmy hM")
)
The following command gives the subsets of the data we are interested in for further processing (High and low tariff electricity meter reading):
energy_subset <- subset(energy, select = c(Stand.HT, Stand.NT))
As can be seen here:
1: > energy
2: Stand.HT Stand.NT Gesamt.kWh ø.Woche.kWh ø.Tag.kWh Level
3: 2012-07-11 21:47:00 11084 11202 - - - -
4: 2013-01-27 16:17:00 12213 12343 2270,0 79,5 11,4 C
5: 2013-02-02 14:58:00 12266 12380 90,0 106,0 15,1 D
6: 2013-02-17 14:04:00 12355 12466 175,0 81,9 11,7 B
7: 2013-02-24 12:11:00 12421 12504 104,0 105,2 15,0 D
8: 2013-03-01 08:18:00 12460 12532 67,0 96,9 13,8 C
9:
10:
11: > energy_subset
12: Stand.HT Stand.NT
13: 2012-07-11 21:47:00 11084 11202
14: 2013-01-27 16:17:00 12213 12343
15: 2013-02-02 14:58:00 12266 12380
16: 2013-02-17 14:04:00 12355 12466
17: 2013-02-24 12:11:00 12421 12504
18: 2013-03-01 08:18:00 12460 12532
Now a monthly grid has to be created which starts at least a month before the first entry and ends at least 30 days after the last entry with the following commands:
monthly_grid_start <- as.Date(start(energy_subset) - days(16) - days(day(start(energy_subset) - 1))) + hours (1)
monthly_grid_end <- as.Date(end(energy_subset) + days(30)) + hours (1)
monthly_grid <- seq(monthly_grid_start, monthly_grid_end, by="1 month")
The monthly_grid is in the of the month (to have the average over the month) - at night when nobody ready a sample (I had some issue when the grid and a sample is at the same time):
[1] "2012-06-14 01:00:00 UTC" "2012-07-14 01:00:00 UTC" "2012-08-14 01:00:00 UTC"
[4] "2012-09-14 01:00:00 UTC" "2012-10-14 01:00:00 UTC" "2012-11-14 01:00:00 UTC"
[7] "2012-12-14 01:00:00 UTC" "2013-01-14 01:00:00 UTC" "2013-02-14 01:00:00 UTC"
[10] "2013-03-14 01:00:00 UTC" "2013-04-14 01:00:00 UTC" "2013-05-14 01:00:00 UTC"
[13] "2013-06-14 01:00:00 UTC" "2013-07-14 01:00:00 UTC" "2013-08-14 01:00:00 UTC"
[16] "2013-09-14 01:00:00 UTC" "2013-10-14 01:00:00 UTC" "2013-11-14 01:00:00 UTC"
[19] "2013-12-14 01:00:00 UTC" "2014-01-14 01:00:00 UTC" "2014-02-14 01:00:00 UTC"
[22] "2014-03-14 01:00:00 UTC" "2014-04-14 01:00:00 UTC" "2014-05-14 01:00:00 UTC"
[25] "2014-06-14 01:00:00 UTC" "2014-07-14 01:00:00 UTC" "2014-08-14 01:00:00 UTC"
[28] "2014-09-14 01:00:00 UTC" "2014-10-14 01:00:00 UTC"
The grid is then used to approximate the energy_subset
> monthly_energy <- na.approx(merge(xts(,monthly_grid),energy_subset))[monthly_grid]
> monthly_energy
Stand.HT Stand.NT
2012-07-14 01:00:00 11096.06 11214.19
2012-08-14 01:00:00 11271.26 11391.25
2012-09-14 01:00:00 11446.45 11568.30
2012-10-14 01:00:00 11616.00 11739.65
2012-11-14 01:00:00 11791.19 11916.71
2012-12-14 01:00:00 11960.74 12088.05
2013-01-14 01:00:00 12135.93 12265.11
2013-02-14 01:00:00 12333.92 12445.63
2013-03-14 01:00:00 12549.50 12602.58
2013-04-14 01:00:00 12757.15 12784.74
2013-05-14 01:00:00 12917.62 12958.23
2013-06-14 01:00:00 13075.25 13123.25
Then a new column "Total" is added which sums up the Stand.HT and Stand.NT
> monthly_energy <- transform(monthly_energy, Total = Stand.HT + Stand.NT)
> monthly_energy
Stand.HT Stand.NT Total
2012-07-14 01:00:00 11096.06 11214.19 22310.25
2012-08-14 01:00:00 11271.26 11391.25 22662.50
2012-09-14 01:00:00 11446.45 11568.30 23014.76
2012-10-14 01:00:00 11616.00 11739.65 23355.65
2012-11-14 01:00:00 11791.19 11916.71 23707.90
me computes the difference between two subsequent readings - so we get the usage.
me_lag is the same as me - but lagged by 12 month (1 year) - so can compare against the previous year.
me_save computes the relative savings against the previous year
m is the merge of all this data
me <- diff(monthly_energy)
me_lag <- lag(me, -12)
me_save <- ((me - me_lag) / me) * 100
m <- zoo(merge(me, me_lag, me_save))
the following command writes it to a *.csv file (in case somebody needs to analyze the data with Excel
write.zoo(m, file = "c:/temp/test.csv", index.name = "Index", col.names = TRUE, na="", quot = TRUE, sep=";", dec=",")
# ?plot.zoo
my.panel <- function(x, y, ..., pf = parent.frame()) {
fmt <- "%b-%y" # format for axis labels
lines(x, y, ...)
# if bottom panel
if (with(pf, length(panel.number) == 0 ||
panel.number %% nr == 0 || panel.number == nser)) {
# create ticks at x values and then label every second tick
axis(side = 1, at = x, labels = FALSE)
ix <- seq(1, length(x), 2)
labs <- format(x, fmt)
axis(side = 1, at = x[ix], labels = labs[ix], tcl = -0.7, cex.axis = 0.7)
}
}
plot(m, panel = my.panel, xaxt = "n", screens=c(1,2,3,1,2,3), lty=c(1,1,1,3,3,3), lwd= c(2,2,2, 0.5, 0.5, 0.5), xlab= "Date", main = "[kWh/month]", ylab=colnames(m))
creates the following plot which compars the total and the two tarrifs against the values from the previous years:
The following code
my.panel2 <- function(x, y, ..., pf = parent.frame()) {
fmt <- "%b-%y" # format for axis labels
lines(x, y, ...)
# if bottom panel
# create ticks at x values and then label every second tick
axis(side = 1, at = x, labels = FALSE)
ix <- seq(1, length(x), 2)
labs <- format(x, fmt)
axis(side = 1, at = x[ix], labels = labs[ix], tcl = -0.7, cex.axis = 0.7)
}
plot (me_save$Total, panel = my.panel2, xaxt = "n", xlab= "Date", main = "Energy saving [%] compared with previous year", ylab="Total")
shows the energy savings compared to the previous year in a simpler diagram:

No comments:
Post a Comment