Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
176 changes: 176 additions & 0 deletions dev/modules/net_telnet.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,176 @@
# Net::Telnet Support for PerlOnJava

## Status: Phase 1 Complete — 3/3 CPAN tests pass, regex octal range bug fixed

**Branch**: `feature/net-telnet-support`
**Date started**: 2026-04-06

## Background

Net::Telnet is a CPAN module for automating Telnet sessions and TCP connections.
It's pure Perl (no XS dependencies) and relies on socket I/O, 4-arg `select()`,
`alarm()`/`$SIG{ALRM}` for timeouts, and `sysread`/`syswrite` for non-buffered I/O.

Running `./jcpan -j 4 -t Net::Telnet` installs the module and passes all 3 CPAN
tests (select.t), but using the module at runtime crashes on a regex octal escape
bug when Net::Telnet's internal telnet-option stripping code is invoked.

## Test Command

```bash
./jcpan -j 4 -t Net::Telnet
```

## Current State (before fixes)

### Dependency Test Summary

| Module | Test Results | Blocker |
|--------|-------------|---------|
| **Net::Telnet** | 3/3 CPAN tests pass | Runtime crash on `[\177-\237]` regex |
| **Socket** | OK | Already supported |
| **IO::Socket::INET** | OK | Already supported |

### Runtime Failure

Net::Telnet uses this regex pattern at line 2994 of Telnet.pm:

```perl
$s =~ s/[\000-\037,\177-\237]//g;
```

This crashes with:
```
Invalid [] range "7-2" in regex; marked by <-- HERE in m/[\000-\037,\177-\237]/
```

### Root Cause Analysis

#### Bug 1: 3-digit octal escapes in character class ranges produce wrong output (CRITICAL)

- **File**: `RegexPreprocessorHelper.java`, lines 766-770 (character class handler)
and lines 423-426 (outside-class handler)
- **Symptom**: `[\177-\237]` errors with `Invalid [] range "7-2"`
- **Root cause**: The 3-digit octal handler (`octalValue <= 255 && octalLength == 3`)
only appends `\0` + the first digit (e.g., `\01` for `\177`) and does NOT advance
`offset` past the remaining two digits. The remaining digits (`77`) are left for
the main loop, which processes them as literal characters. So `\177` becomes
`\01` + literal `7` + literal `7` instead of `\x{7F}` (char 127).

The range validation code then sees literal `7` (char 55) as the range start and
`\2` (char 50, from similarly broken `\237`) as the range end. Since 55 > 50,
it reports `Invalid [] range "7-2"`.

- **Impact**: Any regex with `\1nn`-`\3nn` octal escapes in character class ranges
fails. This affects Net::Telnet's telnet-option stripping, and likely other modules
that use control character ranges.

#### Bug 2: Range endpoint validation doesn't parse bare octal escapes (HIGH)

- **File**: `RegexPreprocessorHelper.java`, lines 556-589 (range `-` handler)
- **Symptom**: Even after fixing Bug 1's output, the range validator only handles
`\x{...}` and `\o{...}` as range endpoints, not bare octal escapes like `\237`.
It reads only `\2` (2 chars) instead of `\237` (4 chars), getting the wrong
code point for comparison.
- **Root cause**: The range endpoint parser at line 557-558 sets `rangeEndCharCount = 2`
for any `\X` escape, without special-casing multi-digit octals.
- **Impact**: Range validation would give false "invalid range" errors even if the
octal output were correct.

#### Bug 3: Dead code in octal handlers (LOW)

- **File**: `RegexPreprocessorHelper.java`, lines 431-434 and 776-780
- **Symptom**: The branches `c2 >= '1' && c2 <= '3' && octalLength == 3` can never
be reached because the preceding branch `octalValue <= 255 && octalLength == 3`
catches all the same cases.
- **Impact**: No runtime impact, just dead code.

## Implementation Plan

### Phase 1: Fix regex octal escape handling

**Files to modify:**

1. `src/main/java/org/perlonjava/runtime/regex/RegexPreprocessorHelper.java`
- **Lines 766-770** (character class 3-digit octal): Convert to `\x{hex}` format
and advance `offset += octalLength - 1` (same as the `> 255` branch)
- **Lines 423-426** (outside-class 3-digit octal): Same fix — convert to hex and
advance offset
- **Lines 556-589** (range `-` validation): Add bare octal escape parsing so the
range validator correctly computes the code point for `\NNN` endpoints
- **Lines 431-434 and 776-780**: Remove dead code branches

**Expected impact:** `[\177-\237]` and similar patterns work correctly. Net::Telnet
runtime operations succeed.

### Phase 2: (Future) Verify comprehensive Net::Telnet functionality

All core PerlOnJava infrastructure is already in place:

| Feature | Status |
|---------|--------|
| `socket()` / `connect()` | Fully implemented (NIO-based) |
| `sysread()` / `syswrite()` | Fully implemented |
| `send()` / `recv()` | Partial (flags ignored, adequate for TCP) |
| `getpeername()` / `getsockname()` | Fully implemented |
| 4-arg `select()` | Fully implemented (NIO Selector) |
| `alarm()` / `$SIG{ALRM}` | Fully implemented |
| `IO::Socket::INET` | Present and working |

**Known limitation:** `DESTROY` is not implemented in PerlOnJava. Net::Telnet
objects won't auto-close sockets when they go out of scope. Users should call
`$t->close()` explicitly.

## Test Verification

```bash
# Build
make

# CPAN test
./jcpan -j 4 -t Net::Telnet

# Direct regex test
./jperl -e '"x" =~ /[\177-\237]/ and print "ok\n"'

# Comprehensive runtime test
./jperl -e '
use Net::Telnet;
use IO::Socket::INET;
my $srv = IO::Socket::INET->new(LocalAddr=>"127.0.0.1",LocalPort=>0,Proto=>"tcp",Listen=>1);
my $port = $srv->sockport;
my $t = Net::Telnet->new(Timeout=>3, Errmode=>"return");
$t->open(Host=>"127.0.0.1", Port=>$port);
my $c = $srv->accept;
print $c "login: ";
$c->flush;
my ($pre,$match) = $t->waitfor(String=>"login: ", Timeout=>2);
print "waitfor: ", defined($match) ? "OK" : "FAIL", "\n";
$t->print("user");
my $buf; $c->sysread($buf, 1024);
print "roundtrip: ", $buf =~ /user/ ? "OK" : "FAIL", "\n";
$c->close; $t->close; $srv->close;
'
```

## Progress Tracking

### Current Status: Phase 1 Complete

### Completed Phases
- [x] Phase 1: Fix regex octal escape handling (2026-04-06)
- Fixed 3-digit octal in character class handler (lines 766-770): convert to `\x{hex}`
- Fixed 3-digit octal in outside-class handler (lines 423-426): convert to `\x{hex}`
- Added bare octal parsing to range endpoint validator (lines 556-589)
- Removed dead code branches (lines 431-434, 776-780)
- Files changed: `RegexPreprocessorHelper.java`
- Results: 3/3 CPAN tests pass, 14/14 runtime tests pass, 15/15 regex tests pass

### Next Steps
- Phase 2 (future): Run broader CPAN module tests that use octal ranges
- No further fixes needed for Net::Telnet support

## Related Documents
- `dev/modules/poe.md` — POE uses 4-arg select() extensively
- `dev/modules/lwp_useragent.md` — LWP uses socket I/O
- AGENTS.md — Pre-existing regex octal escape issue documented
Original file line number Diff line number Diff line change
Expand Up @@ -421,17 +421,16 @@ static int handleEscapeSequences(String s, StringBuilder sb, int c, int offset,
sb.append(String.format("\\x{%X}", octalValue));
offset += octalLength - 1; // -1 because caller will increment
} else if (octalValue <= 255 && octalLength == 3) {
// Standard 3-digit octal, prepend 0 for Java
sb.append('0');
sb.append(Character.toChars(c2));
// Standard 3-digit octal - convert to hex for Java regex
// Using \x{hex} avoids issues with \0mnn parsing and ensures
// all 3 digits are consumed (e.g., \177 → \x{7F})
sb.setLength(sb.length() - 1); // Remove the backslash
sb.append(String.format("\\x{%X}", octalValue));
offset += octalLength - 1; // -1 because caller will increment
} else if (c2 == '0' && octalLength == 1) {
// Single \0 becomes \00
sb.append('0');
sb.append('0');
} else if (c2 >= '1' && c2 <= '3' && octalLength == 3) {
// 3-digit octal starting with 1-3, prepend 0
sb.append('0');
sb.append(Character.toChars(c2));
} else {
// Short octal or single digit, pass through
sb.append(Character.toChars(c2));
Expand Down Expand Up @@ -585,6 +584,26 @@ static int handleRegexCharacterClassEscape(int offset, String s, StringBuilder s
nextChar = -1;
}
}
} else if (nextChar >= '0' && nextChar <= '7') {
// Parse bare octal escape (\NNN) as range endpoint
// e.g., \237 → octal 237 = 159
int octalVal = nextChar - '0';
int digits = 1;
for (int k = 2; k <= 3 && nextPos + k < length; k++) {
int d = s.codePointAt(nextPos + k);
if (d >= '0' && d <= '7') {
octalVal = octalVal * 8 + (d - '0');
digits++;
} else {
break;
}
}
if (digits >= 2) {
// Multi-digit octal: use computed value
nextChar = octalVal;
rangeEndCharCount = 1 + digits; // backslash + digits
}
// Single digit \N stays as-is (rangeEndCharCount = 2)
}
}

Expand Down Expand Up @@ -764,20 +783,17 @@ static int handleRegexCharacterClassEscape(int offset, String s, StringBuilder s
offset += octalLength - 1; // -1 because outer loop will increment
lastChar = octalValue;
} else if (octalValue <= 255 && octalLength == 3) {
// Standard 3-digit octal, prepend 0 for Java
sb.append('0');
sb.append(Character.toChars(c2));
// Standard 3-digit octal - convert to hex for Java regex
// Using \x{hex} avoids issues with \0mnn parsing and ensures
// correct range validation (e.g., [\177-\237])
sb.append(String.format("x{%X}", octalValue));
offset += octalLength - 1; // -1 because outer loop will increment
lastChar = octalValue;
} else if (c2 == '0' && octalLength == 1) {
// Single \0 becomes \00
sb.append('0');
sb.append('0');
lastChar = 0;
} else if (c2 >= '1' && c2 <= '3' && octalLength == 3) {
// 3-digit octal starting with 1-3, prepend 0
sb.append('0');
sb.append(Character.toChars(c2));
lastChar = octalValue;
} else {
// Short octal (1-2 digits) — prepend 0 for Java
// In Perl, \1-\7 inside [] are octal; in Java, \N is a backreference
Expand Down
Loading