Datasets: electricity_load_diagrams
Tasks:
Multilinguality: monolingual
Size: 1K<n<10K
Language creators: found
Annotations creators: no-annotation
Source datasets: original
License:
This dataset contains hourly kW electricity consumption time series of 370 Portuguese clients from 2011 to 2014.
The dataset has the following configuration parameters:
For example, you can specify your own configuration different from those used in the papers as follows:
from datasets import load_dataset
load_dataset("electricity_load_diagrams", "uci", rolling_evaluations=10)
Notes:
The data set has no missing values. The raw values are in kW for each 15-minute interval and are resampled to hourly frequency. Each time series represents one client. Some clients were created after 2011; in these cases, consumption was considered zero. All time labels refer to Portuguese time; however, all days contain 96 measurements (24*4). On the March time-change day each year (which has only 23 hours), the values between 1:00 am and 2:00 am are zero for all points. On the October time-change day each year (which has 25 hours), the values between 1:00 am and 2:00 am aggregate the consumption of two hours.
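The resampling step mentioned in the notes can be sketched with pandas as follows. This is only an illustration: the helper name `to_hourly` is hypothetical, and the choice of aggregation (sum versus mean of the four 15-minute readings) is an assumption, since the notes do not state which one is used.

```python
import pandas as pd


def to_hourly(series_15min: pd.Series, how: str = "sum") -> pd.Series:
    """Aggregate a 15-minute series into hourly bins.

    `how` is an assumption: "sum" adds the four readings of each hour,
    "mean" averages them. The source does not say which one applies.
    """
    return series_15min.resample(pd.Timedelta(hours=1)).agg(how)


# Two hours of made-up 15-minute readings (8 values, 4 per hour).
index = pd.date_range("2012-01-01", periods=8, freq=pd.Timedelta(minutes=15))
quarter_hourly = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], index=index)

hourly = to_hourly(quarter_hourly)
print(hourly.tolist())  # [10.0, 26.0]
```

Passing `how="mean"` instead keeps the values on the original kW scale, which may be preferable if average power per hour is wanted.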
A sample from the training set is provided below:
{
'start': datetime.datetime(2012, 1, 1, 0, 0),
'target': [14.0, 18.0, 21.0, 20.0, 22.0, 20.0, 20.0, 20.0, 13.0, 11.0], # <= this target array is a concatenated sample
'feat_static_cat': [0],
'item_id': '0'
}
We have two configurations, uci and lstnet, which are specified as follows.
The time series are resampled to hourly frequency. We test on 7 rolling windows, each with a prediction length of 24.
The uci validation therefore ends 24*7 time steps before the end of each time series. The training split ends 24 time steps before the end of the validation split.
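The uci split arithmetic can be written out as a small sketch. The function name `uci_split_ends` is hypothetical, and the layout of the rolling test windows (each one ending 24 steps after the previous, with the last one ending at the end of the series) is an assumption that is consistent with the description but not spelled out in it.

```python
def uci_split_ends(n_steps, prediction_length=24, rolling_evaluations=7):
    """Return the end indices (exclusive) of the train, validation and test series."""
    # Validation ends 24*7 steps before the end of the full series.
    val_end = n_steps - prediction_length * rolling_evaluations
    # Training ends one prediction window before the validation end.
    train_end = val_end - prediction_length
    # Assumed: each rolling test series extends the previous one by 24 steps.
    test_ends = [
        val_end + prediction_length * (k + 1) for k in range(rolling_evaluations)
    ]
    return train_end, val_end, test_ends


train_end, val_end, test_ends = uci_split_ends(n_steps=1000)
print(train_end, val_end, test_ends[0], test_ends[-1])  # 808 832 856 1000
```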
For the lstnet configuration we split the data so that the training window covers the first 60% of the full time series and the validation window covers the first 80%. The last 20% of each time series is used as the test set of 7 rolling windows of 24 time steps each. Finally, as in the LSTNet paper, we only consider time series that are active in the years 2012–2014, which leaves us with 320 time series.
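As a rough sketch, the lstnet train and validation boundaries can be computed from the series length. The helper name `lstnet_split_ends` is hypothetical, and rounding down to a whole time step is an assumption; the placement of the 7 rolling test windows inside the final fifth of the series is not specified here, so it is left out.

```python
def lstnet_split_ends(n_steps):
    """Return the end indices (exclusive) of the train and validation series."""
    # Integer arithmetic avoids floating-point surprises; flooring is an assumption.
    train_end = n_steps * 6 // 10  # first 60% of the series
    val_end = n_steps * 8 // 10    # first 80% of the series
    return train_end, val_end


print(lstnet_split_ends(1000))  # (600, 800)
```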
For this univariate regular time series we have:
Given the freq and the start datetime, we can assign a datetime to each entry in the target array.
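A minimal sketch of that mapping, assuming pandas and an hourly frequency. The helper name `index_target` is hypothetical; the sample values are the ones from the training-set example.

```python
import datetime

import pandas as pd

sample = {
    "start": datetime.datetime(2012, 1, 1, 0, 0),
    "target": [14.0, 18.0, 21.0, 20.0, 22.0, 20.0, 20.0, 20.0, 13.0, 11.0],
}


def index_target(start, target, freq=pd.Timedelta(hours=1)):
    """Attach a timestamp to each target value, starting at `start`."""
    index = pd.date_range(start=start, periods=len(target), freq=freq)
    return pd.Series(target, index=index)


series = index_target(sample["start"], sample["target"])
print(series.index[0], series.index[-1])
# 2012-01-01 00:00:00 2012-01-01 09:00:00
```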
| name | train | test | validation |
|---|---|---|---|
| uci | 370 | 2590 | 370 |
| lstnet | 320 | 2240 | 320 |
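The larger counts in the middle column are consistent with one entry per series per rolling window, which can be checked with simple arithmetic. The variable names below are illustrative only.

```python
rolling_evaluations = 7  # number of rolling windows stated in the configuration text
series_per_config = {"uci": 370, "lstnet": 320}

# One rolling-window series per original series and window.
rolling_counts = {
    name: n_series * rolling_evaluations
    for name, n_series in series_per_config.items()
}
print(rolling_counts)  # {'uci': 2590, 'lstnet': 2240}
```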
The Electricity Load Diagrams 2011–2014 Dataset was developed by Artur Trindade and shared in the UCI Machine Learning Repository. This dataset covers the electricity load of 370 substations in Portugal from the start of 2011 to the end of 2014 with a sampling period of 15 min. We resample this to an hourly time series.
Research and development of load forecasting methods, in particular short-term electricity forecasting.
This dataset covers the electricity load of 370 sub-stations in Portugal from the start of 2011 to the end of 2014 with a sampling period of 15 min.
Initial Data Collection and Normalization: [More Information Needed]
Who are the source language producers? [More Information Needed]
Who are the annotators? [More Information Needed]
@inproceedings{10.1145/3209978.3210006,
author = {Lai, Guokun and Chang, Wei-Cheng and Yang, Yiming and Liu, Hanxiao},
title = {Modeling Long- and Short-Term Temporal Patterns with Deep Neural Networks},
year = {2018},
isbn = {9781450356572},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3209978.3210006},
doi = {10.1145/3209978.3210006},
booktitle = {The 41st International ACM SIGIR Conference on Research \& Development in Information Retrieval},
pages = {95--104},
numpages = {10},
location = {Ann Arbor, MI, USA},
series = {SIGIR '18}
}
Thanks to @kashif for adding this dataset.