Test Case 17

Fault tolerance/recovery of a multi-aggregator dispatch mechanism

Identification

ID

17

Project

ERIGrid 2.0

Date

5/2/2021

Test Case Definition

Name of the Test Case

Fault tolerance/recovery of a multi-aggregator dispatch mechanism

Narrative

In a section of an electrical distribution grid, multiple (>=2) aggregators operate independently of each other. Each of the aggregators maintains a portfolio of flexible DER units under contract. The contract provides for on-demand delivery of a grid service (e.g. load reduction/production increase).

It is assumed that the primary customer for the grid service in question is the local DSO operating the grid section. It is also assumed that a fair mechanism exists for matching supply (aggregators) and demand (DSO). This mechanism could be, for example, a flexibility market, but its exact workings are not relevant to the test case. For the sake of simplicity, it is assumed that this unspecified mechanism has produced a merit-order list of the services offered by the different aggregators before the start of the test. A dispatch unit is tasked with continuously matching the service demand indicated by the DSO against the merit-order list, and with submitting service activation requests to the appropriate aggregators.
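The dispatch step described above can be illustrated with a minimal sketch. It assumes that the merit-order list is already sorted and that each entry carries a single capacity value; the names `Offer`, `dispatch`, `capacity_kw` and the aggregator labels are hypothetical and are not prescribed by the test case.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    aggregator: str     # hypothetical aggregator identifier
    capacity_kw: float  # assumed maximum service volume offered


def dispatch(demand_kw, merit_order, unavailable=frozenset()):
    """Allocate demand along an already sorted merit-order list.

    Aggregators listed in `unavailable` are skipped.
    Returns (allocations per aggregator, unmet demand in kW).
    """
    allocations = {}
    remaining = demand_kw
    for offer in merit_order:
        if remaining <= 0:
            break
        if offer.aggregator in unavailable:
            continue
        share = min(offer.capacity_kw, remaining)
        allocations[offer.aggregator] = share
        remaining -= share
    return allocations, max(remaining, 0.0)


# Illustrative values only; not taken from the test case.
merit_order = [Offer("A", 50.0), Offer("B", 40.0), Offer("C", 60.0)]
```

Calling `dispatch(70.0, merit_order)` allocates along the list from the top, while passing `unavailable={"A"}` shows how the same demand could be shifted to the remaining aggregators.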

At some point in time after the start of the test, a communication fault disrupts communication between the dispatch unit and one of the aggregators. To ensure that the service demand is met, the dispatch unit must decide whether the fault is temporary or permanent, and arrange for meeting the service demand with the remaining aggregator(s).
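One possible way to separate temporary from permanent faults is a pair of silence thresholds on the last acknowledged message. The sketch below is an assumption for illustration only: the test case does not prescribe a detection method, and the function name `classify_link` and both threshold values are hypothetical.

```python
def classify_link(now_s, last_ack_s, suspect_after_s=5.0, permanent_after_s=60.0):
    """Classify a dispatch-unit-to-aggregator link by how long it has been silent.

    Both thresholds are assumed values, chosen only for illustration.
    """
    silence_s = now_s - last_ack_s
    if silence_s < suspect_after_s:
        return "ok"
    if silence_s < permanent_after_s:
        return "temporary"   # keep the aggregator in the merit order, retry
    return "permanent"       # treat as nonresponsive, re-dispatch to the others
```

In such a scheme the upper threshold directly bounds the time to detection, so it would be a natural parameter to vary alongside the duration of the communication failure.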

It is assumed that an aggregator affected by a communication fault will continue to function, i.e. service definitions agreed with the dispatch unit will continue to be delivered. However, due to the lack of updates from the dispatch unit, these services will eventually diverge from the evolving needs of the grid.

Function(s) under Investigation (FuI)

Communication fault detection and impact mitigation.

A communication fault in the context of this test case is defined as the inability of the dispatch unit to communicate with one or more aggregators, caused by a disruption of the communication link. Implementations of the two functions (fault detection and impact mitigation) will typically require functionality on both the dispatch unit and the aggregator side.

Object under Investigation (OuI)

Aggregator dispatch unit

Domain under Investigation (DuI)
  • Electrical
  • ICT
Purpose of Investigation (PoI)

Characterization of the performance of system recovery from a permanent communication failure.

System under Test (SuT)

The test system consists of an electrical grid section, energy resources, aggregators, a dispatch unit, a service requester (DSO) and an ICT command and control infrastructure. The electrical grid section is part of an LV and/or MV distribution grid (e.g. 0.4 kV and 10 kV) with a number of flexible DER units distributed across one or multiple feeders. The ICT infrastructure consists of three separate types of IT systems (dispatch unit, aggregator and DER controller) as well as the communication links between these entities.

Functions under Test (FuT)
  • Communication fault detection and impact mitigation (FuI)
  • Coordinated congestion management, consisting of an algorithm for congestion management as well as the communication between units needed to implement the algorithm (i.e. between unit controllers and aggregators and between aggregators and dispatch unit)
  • Aggregator internal dispatch mechanism, consisting of portfolio management and unit dispatch
  • Unit controller functionality, consisting of local flexibility calculation and management
  • A dispatch mechanism for aggregators, executing on the dispatch unit
  • A congestion detection method, emulating the determination of service requirements by a DSO
Test criteria (TCR)
  • Time to detection of a nonresponsive aggregator
  • Time to full service restoration
Target Metrics (TM)
  • Time elapsed from the occurrence of a communication fault to the classification of an aggregator as nonresponsive by the dispatch unit [s]
  • Time elapsed from the occurrence of a communication fault to the restoration of a stable steady state in which the entire service demand as requested by the DSO is met by the aggregators [s]
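As a rough illustration, both target metrics could be derived from logged timestamps and a time series of demanded versus delivered service. The sketch below makes several assumptions that are not part of the test case: the log layout `(time_s, demand_kw, delivered_kw)`, the tolerance band, and the hold time used to decide that a steady state is "stable" are all hypothetical.

```python
def target_metrics(t_fault_s, t_classified_s, samples, tolerance_kw=1.0, hold_s=30.0):
    """Return (time to detection, time to full service restoration) in seconds.

    samples: list of (time_s, demand_kw, delivered_kw), sorted by time.
    Restoration is taken as the start of the final run of samples in which
    delivery stays within tolerance_kw of demand, provided that run covers
    at least hold_s up to the end of the log; otherwise None is returned.
    """
    detection_s = t_classified_s - t_fault_s
    run_start = None
    for t, demand_kw, delivered_kw in samples:
        if t < t_fault_s:
            continue
        if abs(demand_kw - delivered_kw) <= tolerance_kw:
            if run_start is None:
                run_start = t
        else:
            run_start = None  # demand not met: any earlier run does not count
    last_t = samples[-1][0] if samples else t_fault_s
    if run_start is None or last_t - run_start < hold_s:
        return detection_s, None
    return detection_s, run_start - t_fault_s


# Illustrative log: demand rises after the fault, delivery catches up at t = 130 s.
samples = [
    (90.0, 80.0, 80.0), (100.0, 80.0, 80.0), (110.0, 100.0, 80.0),
    (120.0, 100.0, 80.0), (130.0, 100.0, 100.0), (140.0, 100.0, 100.0),
    (150.0, 100.0, 100.0), (160.0, 100.0, 100.0),
]
```

Using the start of the final in-tolerance run avoids counting the period right after the fault, when the affected aggregator still delivers the previously agreed service and demand has not yet diverged.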
Variability Attributes (VA)
  • Controllable
    • Frequency of service change requests from dispatch unit to aggregators [one request per 10 s to one request per 10 min]
    • Duration of communication failure [100 ms to permanent]
    • Variability of baseload profile [0 to 100% of mean baseload]
Quality Attributes (QA)
  • Confidence in the accuracy of the time measurements (measurement error within +/- 10% of the measured value)

Qualification Strategy

The system will be characterized in two different domains:

  • Isolated in the ICT domain: How well does the multi-aggregator system recover from a communication fault?
  • The overall performance of the system in the electrical power domain: What is the impact of the fault recovery in the ICT domain on the ability of the multi-aggregator system to deliver a power system service?

This will be achieved by a sequence of two tests:

  • One test to determine the performance of a multi-aggregator system without any communication fault recovery mechanism. This will establish a baseline for overall system performance in the electrical power system domain.
  • A second test where a recovery mechanism is in place. This will allow both characterization objectives to be achieved by (a) measuring the performance of the fault recovery mechanism in the ICT domain, and (b) comparing the performance of the entire multi-aggregator system to the baseline results.

Test Specification TC17.TS1

Baseline test

Test Specification TC17.TS2

Characterization of recovery from failure