Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -54,6 +54,7 @@
import com.linbit.linstor.storage.StorageException;
import com.linbit.linstor.storage.data.adapter.drbd.DrbdRscData;
import com.linbit.linstor.storage.interfaces.categories.resource.AbsRscLayerObject;
import com.linbit.linstor.core.types.TcpPortNumber;
import com.linbit.linstor.storage.kinds.DeviceLayerKind;
import com.linbit.linstor.storage.utils.LayerUtils;
import com.linbit.linstor.tasks.AutoDiskfulTask;
Expand Down Expand Up @@ -81,6 +82,7 @@
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;
import java.util.TreeSet;

import org.reactivestreams.Publisher;
import reactor.core.publisher.Flux;
Expand Down Expand Up @@ -855,10 +857,16 @@ private Flux<ApiCallRc> prepareDiskAddition(
rsc.getNode().getName().displayValue,
rscDfn.getName().displayValue
);

/*
* We also have to remove the currently diskless DrbdRscData and free up the node-id as now we must
* use the shared resource's node-id. We still need to preserve TCP ports though.
*/
copyDrbdTcpPortsIfExists(rsc, payload);
}
else
{
copyDrbdNodeIdIfExists(rsc, payload);
copyDrbdSettings(rsc, payload);
}

removeLayerData(rsc);
Expand Down Expand Up @@ -1038,10 +1046,20 @@ private void ensureAllPeersHavePeerSlotLeft(ResourceDefinition rscDfnRef)
}

/**
* Although we need to rebuild the layerData as the layerList might have changed, if we do not
* deactivate (i.e. down) the current resource, we need to make sure that deleting DrbdRscData
* and recreating a new DrbdRscData ends up with the same node-id as before.
* Copies DRBD settings (node-id and TCP ports) from the existing DrbdRscData into the payload
* before removeLayerData() deletes them. This ensures that recreated DrbdRscData ends up with
* the same node-id and TCP ports as before.
*
* TCP ports must be preserved because if the satellite misses the update (e.g. due to controller
* restart or connectivity issues), it will keep the old ports while peers receive the new ones,
* causing DRBD connections to fail with StandAlone state.
*/
private void copyDrbdSettings(Resource rsc, LayerPayload payload) throws ImplementationError
{
copyDrbdNodeIdIfExists(rsc, payload);
copyDrbdTcpPortsIfExists(rsc, payload);
}

private void copyDrbdNodeIdIfExists(Resource rsc, LayerPayload payload) throws ImplementationError
{
Set<AbsRscLayerObject<Resource>> drbdRscDataSet = LayerRscUtils.getRscDataByLayer(
Expand All @@ -1060,6 +1078,28 @@ private void copyDrbdNodeIdIfExists(Resource rsc, LayerPayload payload) throws I
}
}

private void copyDrbdTcpPortsIfExists(Resource rsc, LayerPayload payload) throws ImplementationError
{
Set<AbsRscLayerObject<Resource>> drbdRscDataSet = LayerRscUtils.getRscDataByLayer(
getLayerData(apiCtx, rsc),
DeviceLayerKind.DRBD
);
if (!drbdRscDataSet.isEmpty())
{
DrbdRscData<Resource> drbdRscData = (DrbdRscData<Resource>) drbdRscDataSet.iterator().next();
Collection<TcpPortNumber> tcpPorts = drbdRscData.getTcpPortList();
if (tcpPorts != null && !tcpPorts.isEmpty())
{
Set<Integer> portInts = new TreeSet<>();
for (TcpPortNumber port : tcpPorts)
{
portInts.add(port.value);
}
payload.drbdRsc.tcpPorts = portInts;
}
}
}

private List<DeviceLayerKind> removeLayerData(Resource rscRef)
{
List<DeviceLayerKind> layerList;
Expand Down Expand Up @@ -1314,15 +1354,17 @@ private Flux<ApiCallRc> finishOperationInTransaction(
/*
* We also have to remove the possible meta-children of previous StorageRscData.
* LayerData will be recreated with ensureStackDataExists.
* However, we still need to remember our node-id if we had / have DRBD in the list
* However, we still need to remember our DRBD settings if we had / have DRBD in the list
*/
copyDrbdNodeIdIfExists(rsc, payload);
copyDrbdSettings(rsc, payload);
layerList = removeLayerData(rsc);
}
else
{
markDiskAdded(rsc);
ctrlLayerStackHelper.resetStoragePools(rsc);
// Pass false to skip redundant ensureStackDataExists call within resetStoragePools,
// since we call it explicitly below with the correct payload
ctrlLayerStackHelper.resetStoragePools(rsc, false);
}
ctrlLayerStackHelper.ensureStackDataExists(rsc, layerList, payload);

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -264,6 +264,11 @@ rscRef, payload, layerList, new ChildResourceData(""), null
}

public void resetStoragePools(Resource rscRef)
{
resetStoragePools(rscRef, true);
}

public void resetStoragePools(Resource rscRef, boolean callEnsureStackDataExistsRef)
{
try
{
Expand All @@ -278,7 +283,10 @@ public void resetStoragePools(Resource rscRef)
rscDataToProcess.addAll(rscData.getChildren());
}

ensureStackDataExists(rscRef, null, new LayerPayload());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see how this is redundant with the call in CtrlRscToggleDiskApiCallHandler line 1327 . However here are my thoughts to this:

If anything, it is the call in CtrlRscToggleDiskApiCallHandler line 1327 that is the redundant one, since the resetStoragePools method could technically be called from other places in the future. Removing it as you wanted would add the risk that this future new code/feature forgets to call ensureStackDataExists which will leave a corrupt layer-tree within LINSTOR.

That would mean that the line 1327 of CtrlRscToggleDiskApiCallHandler could be moved into the end of the then-case of if (removeDisk) (i.e. currently between lines 1320 and 1321). However, with the same argument as before, adding some future new branch in the finishOperationInTransaction could also lead to forgetting to call ensureStackDataExists.

So here is what I'd rather suggest:

  • Either keep the line 280 in CtrlRscLayerDataFactory to ensure that calling this method cannot be forgotten and instead just add a comment before CtrlRscToggleDiskApiCallHandler line 1327 that we simply accept this redundancy
  • Make calling ensureStackDataExists in CtrlRscLayerDataFactory line 280 optional with something like this:
public void resetStoragePools(Resource rscRef)
{
    resetStoragePools(rscRef, true);
}

public void resetStoragePools(Resource rscRef, boolean callEnsureStackDataExistsRef)
{
    ... 
    if (callEnsureStackDataExistsRef)
    {
        ensureStackDataExists(...);
    }
    ...
}

And change the call in CtrlRscToggleDiskApiCallHandler line 1325 to ctrlLayerStackHelper.resetStoragePools(rsc, false);

The latter approach should prevent accidentally forgetting to call this in the future (worst case we add again a redundancy but never miss calling ensureStackDataExists entirely) but also gives the caller enough control to prevent redundancy.

if (callEnsureStackDataExistsRef)
{
ensureStackDataExists(rscRef, null, new LayerPayload());
}
}
catch (AccessDeniedException exc)
{
Expand Down