Short-term load forecasting is an important basis for the stable and economical operation of electric power systems. To improve forecasting accuracy, this paper proposes a model based on an attention gated recurrent unit (Attention-GRU) network. The gated recurrent unit (GRU) network captures the temporal and nonlinear characteristics of load data simultaneously, which prediction methods based on statistical analysis and traditional machine learning cannot do, and thereby achieves higher forecasting accuracy. The attention mechanism highlights the critical input features to further improve accuracy. Simulation experiments on actual load and electricity price data from a region of Australia show that the proposed model achieves higher forecasting accuracy, with satisfactory efficiency, than models based on the GRU, long short-term memory (LSTM), and back-propagation (BP) neural networks.
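To make the general idea concrete, the following is a minimal, untrained sketch of one plausible way to combine feature-level attention with a GRU. It is an illustration under stated assumptions, not the implementation evaluated in this paper: the softmax feature-attention form, the parameter names (`W_a`, `W_z`, `U_z`, `w_out`, and so on), the random initialization, the hidden size, and the toy input values are all hypothetical. Each time step's input vector (for example, normalized load and price) is reweighted by attention weights before it enters a standard GRU update.

```python
import math
import random

def matvec(W, v):
    # Multiply matrix W (list of rows) by vector v.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def softmax(scores):
    # Numerically stable softmax; the outputs sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def init_params(n_features, hidden_size, seed=0):
    # Hypothetical random (untrained) parameters, for illustration only.
    rng = random.Random(seed)
    p = {"W_a": rand_matrix(n_features, n_features, rng)}
    for gate in "zrh":
        p["W_" + gate] = rand_matrix(hidden_size, n_features, rng)
        p["U_" + gate] = rand_matrix(hidden_size, hidden_size, rng)
    p["w_out"] = [rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
    return p

def feature_attention(x, W_a):
    # Assumed attention form: one softmax weight per input feature,
    # used to rescale that feature before the recurrent update.
    alpha = softmax(matvec(W_a, x))
    return [a * xi for a, xi in zip(alpha, x)], alpha

def gru_step(x, h, p):
    # Standard GRU update: update gate z, reset gate r, candidate state.
    z = [sigmoid(a + b) for a, b in zip(matvec(p["W_z"], x), matvec(p["U_z"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["W_r"], x), matvec(p["U_r"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b)
            for a, b in zip(matvec(p["W_h"], x), matvec(p["U_h"], rh))]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def forecast(sequence, p, hidden_size):
    # Run attention + GRU over the sequence, then apply a linear readout.
    h = [0.0] * hidden_size
    alphas = []
    for x in sequence:
        x_att, alpha = feature_attention(x, p["W_a"])
        alphas.append(alpha)
        h = gru_step(x_att, h, p)
    y = sum(w * hi for w, hi in zip(p["w_out"], h))
    return y, alphas

if __name__ == "__main__":
    # Toy input: three time steps of [normalized load, normalized price].
    params = init_params(n_features=2, hidden_size=4)
    y, alphas = forecast([[0.6, 0.2], [0.7, 0.3], [0.8, 0.25]], params, 4)
    print("prediction:", y)
    print("attention weights per step:", alphas)
```

Because the parameters are random, the printed prediction carries no meaning. The sketch only shows the data flow: the attention weights indicate which input feature is emphasized at each step, and the GRU carries temporal context through its hidden state. In practice the parameters would be learned by gradient descent in a deep learning framework.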