Overview

The proposed solution is to introduce a dedicated configuration value, instanceId, that identifies the instance.

Implementation

Setup

The instanceId could be added to gerrit.config under the gerrit section. If it is not present at service startup, it could be generated automatically, similarly to how the serverId is generated today.

The id could be a UUIDv4, which avoids the need for a central authority to generate a unique ID.
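The startup behaviour described above can be sketched as follows. This is a minimal illustration, not Gerrit code: the Map stands in for the gerrit section of gerrit.config, and the class name InstanceIdInitializer and the key string are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch: the Map is a stand-in for gerrit.config,
// the real Gerrit configuration API is not used here.
class InstanceIdInitializer {
  // Assumed key: "instanceId" under the "gerrit" section.
  static final String KEY = "gerrit.instanceId";

  // Returns the configured instanceId, generating and storing a UUIDv4 if absent.
  static String getOrCreate(Map<String, String> config) {
    String existing = config.get(KEY);
    if (existing != null && !existing.isEmpty()) {
      return existing;
    }
    // UUID.randomUUID() returns a random (version 4) UUID.
    String generated = UUID.randomUUID().toString();
    config.put(KEY, generated);
    return generated;
  }

  public static void main(String[] args) {
    Map<String, String> config = new HashMap<>();
    String first = getOrCreate(config);  // generated on first startup
    String second = getOrCreate(config); // reused on later startups
    System.out.println(first + " stable=" + first.equals(second));
  }
}
```

Because the value is persisted once generated, the id stays stable across restarts, which is what makes it usable as an origin marker.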

Propagation

Each event will need to be labelled with the instanceId of the instance that generated it. The com.google.gerrit.server.events.Event class will need to be modified to carry the new field.
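A rough sketch of what carrying the new field could look like. SketchEvent is a simplified, hypothetical stand-in for com.google.gerrit.server.events.Event, and SketchEventStamper is an invented helper. The rule that an already-labelled event keeps its original instanceId is an assumption made here so that forwarded events preserve their origin.

```java
// Simplified, hypothetical stand-in for com.google.gerrit.server.events.Event;
// the real class has more members and may differ.
abstract class SketchEvent {
  final String type;
  // New field: id of the instance that generated the event.
  String instanceId;

  SketchEvent(String type) {
    this.type = type;
  }
}

class SketchRefUpdatedEvent extends SketchEvent {
  SketchRefUpdatedEvent() {
    super("ref-updated");
  }
}

// Hypothetical helper that labels events before they are dispatched.
class SketchEventStamper {
  private final String localInstanceId;

  SketchEventStamper(String localInstanceId) {
    this.localInstanceId = localInstanceId;
  }

  // Labels the event with the local id only when no origin is set yet
  // (assumption: events forwarded from another instance keep their origin).
  <T extends SketchEvent> T stamp(T event) {
    if (event.instanceId == null) {
      event.instanceId = localInstanceId;
    }
    return event;
  }
}
```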

Exposing the identifier

Gerrit will have to expose the local instanceId so plugins can use it. This can be done with a new @InstanceId annotation.
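As an illustration, the annotation and a consuming plugin class might look like the sketch below. In Gerrit this would presumably be a Guice binding annotation bound to the configured value; Guice is left out here to keep the example self-contained, and SketchPluginListener is a hypothetical plugin class.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Marker for injection points that expect the local instanceId.
// Assumption: the real annotation would also be a Guice binding annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD, ElementType.PARAMETER, ElementType.METHOD})
@interface InstanceId {}

// Hypothetical plugin class that receives the local instanceId.
class SketchPluginListener {
  private final String instanceId;

  // With dependency injection in place, the container would supply the value;
  // here it is passed in by hand.
  SketchPluginListener(@InstanceId String instanceId) {
    this.instanceId = instanceId;
  }

  String localInstanceId() {
    return instanceId;
  }
}
```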

Current consumers compatibility

Gerrit Jenkins Trigger Plugin

Adding an instanceId to the Events won’t affect the Gerrit Jenkins Trigger Plugin, since the DTO it uses is lenient and decoupled from Gerrit. Additional fields added to the Event will simply be ignored.

Other plugins

I used this query and this other one to assess the plugins that might be impacted by the change of the com.google.gerrit.server.events.Event.

The plugins that would need to be adapted are multi-site and events-broker, since they will have to start using the core instanceId instead of the multi-site definition of instanceId.

The events-log plugin, used for example by the Gerrit Jenkins Trigger Plugin, might also need extra work to store the extra field in the database. A database schema migration will also be needed.

The high-availability plugin won’t need to be modified, but there will be an opportunity to simplify it and make it more resilient, since it won’t have to store the forwarded events in thread-local storage anymore.

Use case fulfilment

This solution would satisfy the use cases presented in the description of the problem.

Having the instanceId makes it possible to inspect the event payload and determine its origin.

Let’s analyse the by-products:

Avoid loops: events with the same instanceId as the consuming instance can be filtered out, which avoids creating loops. Plugins like high-availability can be simplified, since they no longer need to keep the forwarded events in memory to track their origin.
Priority routing: logic can be built around the instanceId of an event to decide which events to consume first, depending on their origin. Discoverability of the other instances in the cluster is not part of this work.
Troubleshooting: the instanceId can be written to the logs if needed. Logging events is already possible; the only difference is that the extra field will be present as well.
Preventing events duplication: taking the Slack plugin as an example, notifications can be sent only from the instance producing the event (i.e.: instanceId of the event == instanceId of the Gerrit instance). This will prevent multiple notifications. This scenario is the dual of the first one, Avoid loops. Here the instance that produced an event wants to perform an operation when consuming it, whereas in the loop case it should do nothing (a no-op).
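The two dual cases, Avoid loops and Preventing events duplication, reduce to opposite checks on the same field. The sketch below shows one possible shape. OriginAwareEvent and InstanceIdFilters are hypothetical names, not existing Gerrit or plugin classes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical minimal event carrying only what the checks need.
class OriginAwareEvent {
  final String type;
  final String instanceId;

  OriginAwareEvent(String type, String instanceId) {
    this.type = type;
    this.instanceId = instanceId;
  }
}

class InstanceIdFilters {
  // Preventing duplication: true only on the instance that produced the event.
  static boolean isLocal(OriginAwareEvent event, String localInstanceId) {
    return localInstanceId.equals(event.instanceId);
  }

  // Avoid loops: keep only events that originated on another instance.
  static List<OriginAwareEvent> dropOwnEvents(
      List<OriginAwareEvent> incoming, String localInstanceId) {
    return incoming.stream()
        .filter(e -> !isLocal(e, localInstanceId))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    String local = "instance-a";
    List<OriginAwareEvent> incoming = Arrays.asList(
        new OriginAwareEvent("ref-updated", "instance-a"),
        new OriginAwareEvent("ref-updated", "instance-b"));

    // A consumer of forwarded events ignores its own events.
    System.out.println(dropOwnEvents(incoming, local).size());

    // A notifier (for example a Slack-style plugin) acts only on local events.
    for (OriginAwareEvent e : incoming) {
      if (isLocal(e, local)) {
        System.out.println("notify for " + e.type + " from " + e.instanceId);
      }
    }
  }
}
```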