multiple remote addresses in stats

**The problem:** on the new dev server instance at app.practable.io/dev, which runs on GCE behind a load balancer, the `stats` topic reports remote addresses containing multiple IPs (this also affects relay), e.g. `pend18` returns
```
    "remote_address": "129.215.182.72, 34.117.155.39, 35.191.16.24",
```

**Why is this happening?**

In [crossbar.go](https://github.com/practable/jump/blob/main/internal/crossbar/crossbar.go#L623) we extract the remote address from a header

```
remoteAddr:     r.Header.Get("X-Forwarded-For"),
```

The three IP addresses in the example above are:
- 129.215.182.72 is w7545.see.ed.ac.uk, the address we want
- 34.117.155.39 belongs to Google (domain: 39.155.117.34.bc.googleusercontent.com)
- 35.191.16.24 belongs to Google (domain: 24-16-191-35.1e100.net)

The two Google addresses are added to the header by the load balancer setup.

This is consistent with the [expected behaviour](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-Forwarded-For) where the order of the IP addresses is specified as client first, then each successive proxy.

```
X-Forwarded-For: <client>, <proxy1>, <proxy2>
```

The combination of proxy addresses differs between streams sent from the same experiment:

relay stats
```
   "topic": "test00-st-data",
<snip>
    "remoteAddr": "92.239.205.252, 34.117.155.39, 35.191.12.183",
<snip>
    "topic": "test00-st-video",
<snip>
    "remoteAddr": "92.239.205.252, 34.117.155.39, 35.191.19.106",
```
jump stats
```
    "remote_address": "92.239.205.252, 34.117.155.39, 35.191.19.101",
<snip>
    "topic": "test00",
```

If we select only the left-most IP address, without regard for the number of proxies included, the client address could be spoofed by additional untrusted proxies.
If we instead select the right-most trusted IP address (the third from the right), we could be selecting a proxy that is shared by multiple experiments and multiple clients, e.g. an institutional proxy. We would then be unable to tell an experiment at that institution from a user there, leading to incorrect status information (e.g. mistaking a viable client stream for a viable experiment stream when the experiment is down, giving a false positive, or vice versa).
We'd also need to configure the service with the number of load balancers, which is doable but ideally avoided.
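
To make the trade-off concrete, here is a minimal sketch of the two selection strategies. The function names (`splitForwardedFor`, `leftMost`, `nthFromRight`) are hypothetical and not part of jump or relay, and the spoofed address `1.2.3.4` is made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// splitForwardedFor splits an X-Forwarded-For value into trimmed addresses,
// ordered client first, then each successive proxy.
// (Hypothetical helper, not taken from the jump codebase.)
func splitForwardedFor(header string) []string {
	var addrs []string
	for _, part := range strings.Split(header, ",") {
		if a := strings.TrimSpace(part); a != "" {
			addrs = append(addrs, a)
		}
	}
	return addrs
}

// leftMost returns the first address in the list. Anything upstream of the
// trusted proxies can prepend values, so this entry can be spoofed.
func leftMost(addrs []string) string {
	if len(addrs) == 0 {
		return ""
	}
	return addrs[0]
}

// nthFromRight returns the address n places from the right (n = 1 is the
// last entry). It returns "" if the list is shorter than n.
func nthFromRight(addrs []string, n int) string {
	if n < 1 || n > len(addrs) {
		return ""
	}
	return addrs[len(addrs)-n]
}

func main() {
	// Header value from the pend18 example above.
	seen := splitForwardedFor("129.215.182.72, 34.117.155.39, 35.191.16.24")
	fmt.Println(leftMost(seen), nthFromRight(seen, 3))

	// Same request with an extra, made-up address prepended upstream.
	spoofed := splitForwardedFor("1.2.3.4, 129.215.182.72, 34.117.155.39, 35.191.16.24")
	fmt.Println(leftMost(spoofed), nthFromRight(spoofed, 3))
}
```

In the second case the left-most entry is the made-up address, while the third from the right is still 129.215.182.72. The third-from-right choice only holds while the load balancer setup adds exactly two entries, which is the configuration burden mentioned above.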

Either way, it seems we cannot use the circumstantial approach currently proposed in [status](https://github.com/practable/status) to distinguish experiment stream connections from user stream connections using the IP address of the connection to the host's jump topic (the one without the slash).

Experiment vs user cannot be inferred from the read/write privileges because these can be true for both experiments and users. 

Instead, it seems that we should include information on whether the connection is from an experiment or a user in the trusted JWT for the connection.

The jump JWT already includes client/host scopes, but these are not reported in stats (only the read/write privileges are).

**Proposed solution:** report client/host scope in stats. 
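
As a rough sketch of what this could look like, the stats entry could carry the scopes from the validated token alongside the existing read/write flags. Everything here is an assumption for illustration: the `ClientReport` struct, its JSON field names, the `roleFromScopes` helper, and the literal scope strings `"host"` and `"client"` are not taken from the jump codebase.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ClientReport is a hypothetical stats entry; the field names are
// illustrative and do not mirror the existing jump stats format.
type ClientReport struct {
	Topic    string   `json:"topic"`
	CanRead  bool     `json:"can_read"`
	CanWrite bool     `json:"can_write"`
	Scopes   []string `json:"scopes"` // copied from the validated JWT
}

// roleFromScopes returns the first "host" or "client" scope it finds,
// or "unknown" if neither is present. The scope strings are assumed.
func roleFromScopes(scopes []string) string {
	for _, s := range scopes {
		if s == "host" || s == "client" {
			return s
		}
	}
	return "unknown"
}

func main() {
	report := ClientReport{
		Topic:    "test00",
		CanRead:  true,
		CanWrite: true,
		Scopes:   []string{"host"},
	}
	out, err := json.MarshalIndent(report, "", "    ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
	fmt.Println("role:", roleFromScopes(report.Scopes))
}
```

A consumer such as status could then read the role directly from the stats entry rather than inferring it from the remote address.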

[edit: fix bold formatting]


