| 1 | +# CPAN Client Support for PerlOnJava |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +This document analyzes what's needed to run CPAN.pm (or alternatives) on PerlOnJava. |
| 6 | + |
| 7 | +## Current Status |
| 8 | + |
| 9 | +CPAN.pm has deep dependencies that make it challenging to port. The main blocker is `Safe`/`Opcode`, which requires access to Perl's internal opcode system. |
| 10 | + |
| 11 | +--- |
| 12 | + |
| 13 | +## CPAN.pm Dependency Analysis |
| 14 | + |
| 15 | +### Available (Already Working) |
| 16 | + |
| 17 | +| Module | Status | |
| 18 | +|--------|--------| |
| 19 | +| File::Spec, File::Basename, File::Copy, File::Find, File::Path, File::Temp | ✅ | |
| 20 | +| Text::ParseWords, Text::Wrap | ✅ | |
| 21 | +| Config, Carp, Cwd, Exporter, Fcntl | ✅ | |
| 22 | +| FileHandle, IO::File, IO::Handle | ✅ | |
| 23 | +| HTTP::Tiny, Compress::Zlib | ✅ | |
| 24 | +| Digest::MD5, Digest::SHA, MIME::Base64 | ✅ | |
| 25 | +| YAML, JSON, Term::ReadLine | ✅ | |
| 26 | + |
| 27 | +### Critical Modules (Originally Missing) |
| 28 | + |
| 29 | +| Module | Status | Complexity | Notes | |
| 30 | +|--------|--------|------------|-------| |
| 31 | +| **Safe** | ❌ Missing | High | Sandbox/compartment module - requires Opcode | |
| 32 | +| **Opcode** | ❌ Missing | Very High | Core opcodes restriction - deeply tied to Perl internals | |
| 33 | +| **DirHandle** | ✅ Done | Low | OO interface to opendir/readdir - imported via sync.pl | |
| 34 | +| **Sys::Hostname** | ✅ Done | Low | `hostname()` function - SysHostname.java XS module | |
| 35 | +| **ExtUtils::MakeMaker** | ❌ Missing | Very High | Build system - huge module with many dependencies | |
| 36 | +| **LWP::UserAgent** | ❌ Missing | Medium | Web client (HTTP::Tiny exists as alternative) | |
| 37 | +| **Archive::Tar** | ✅ Done | Medium | Imported via sync.pl | |
| 38 | +| **Archive::Zip** | ❌ Missing | Medium | Zip handling - Java has built-in support | |
| 39 | +| **Net::FTP** | ✅ Done | Medium | Imported via sync.pl | |
| 40 | +| **IPC::Open3** | ❌ Missing | Medium | Process I/O - needs Java ProcessBuilder | |
| 41 | +| **IO::Socket** | ✅ Done | Medium | Imported via sync.pl | |
| 42 | +| **Dumpvalue** | ✅ Done | Low | Imported via sync.pl | |
| 43 | + |
| 44 | +### Built-in Functions (Originally Missing) |
| 45 | + |
| 46 | +| Function | Status | Notes | |
| 47 | +|----------|--------|-------| |
| 48 | +| `flock()` | ✅ Implemented | File locking - using java.nio.channels.FileLock | |
| 49 | + |
| 50 | +--- |
| 51 | + |
| 52 | +## Import Strategy via sync.pl |
| 53 | + |
| 54 | +The `dev/import-perl5/sync.pl` script can import pure Perl modules from the perl5 source tree. |
| 55 | + |
| 56 | +### Quick Wins - Add to config.yaml |
| 57 | + |
| 58 | +These modules can be imported directly: |
| 59 | + |
| 60 | +```yaml |
| 61 | +# DirHandle - OO directory handle interface |
| 62 | +- source: perl5/lib/DirHandle.pm |
| 63 | + target: src/main/perl/lib/DirHandle.pm |
| 64 | + |
| 65 | +# Dumpvalue - Debug dump utility |
| 66 | +- source: perl5/dist/Dumpvalue/lib/Dumpvalue.pm |
| 67 | + target: src/main/perl/lib/Dumpvalue.pm |
| 68 | + |
| 69 | +# Sys::Hostname - Get system hostname |
| 70 | +- source: perl5/ext/Sys-Hostname/Hostname.pm |
| 71 | + target: src/main/perl/lib/Sys/Hostname.pm |
| 72 | + |
| 73 | +# IPC::Open3 - Open process with 3 filehandles |
| 74 | +- source: perl5/ext/IPC-Open3/lib/IPC/Open3.pm |
| 75 | + target: src/main/perl/lib/IPC/Open3.pm |
| 76 | + |
| 77 | +# Archive::Tar (if IO::Zlib is available) |
| 78 | +- source: perl5/cpan/Archive-Tar/lib/Archive/Tar.pm |
| 79 | + target: src/main/perl/lib/Archive/Tar.pm |
| 80 | +- source: perl5/cpan/Archive-Tar/lib/Archive/Tar |
| 81 | + target: src/main/perl/lib/Archive/Tar |
| 82 | + type: directory |
| 83 | + |
| 84 | +# Net::FTP and libnet modules |
| 85 | +- source: perl5/cpan/libnet/lib/Net |
| 86 | + target: src/main/perl/lib/Net |
| 87 | + type: directory |
| 88 | +``` |
| 89 | + |
| 90 | +### Modules Requiring Java Implementation |
| 91 | + |
| 92 | +| Module / Function | Java Implementation Needed | |
| 93 | +|--------|---------------------------| |
| 94 | +| **flock()** | `java.nio.channels.FileLock` in RuntimeIO.java | |
| 95 | +| **IO::Socket** | Wrap `java.net.Socket` / `java.net.ServerSocket` | |
| 96 | +| **Sys::Hostname** (XS part) | `java.net.InetAddress.getLocalHost().getHostName()` | |
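
The `flock()` row above can be sketched as follows. This is a minimal illustration of mapping Perl's `LOCK_SH` / `LOCK_EX` / `LOCK_NB` / `LOCK_UN` bits onto `java.nio.channels.FileLock`, not the code in PerlOnJava itself; the class name `FlockSketch` and its methods are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only; not PerlOnJava's actual class.
final class FlockSketch implements AutoCloseable {
    // Operation bits as exported by Perl's Fcntl qw(:flock).
    static final int LOCK_SH = 1, LOCK_EX = 2, LOCK_NB = 4, LOCK_UN = 8;

    private final FileChannel channel;
    private FileLock held; // lock currently held through this handle, if any

    FlockSketch(Path file) {
        try {
            // Read+write access so both shared and exclusive locks are allowed.
            channel = FileChannel.open(file, StandardOpenOption.CREATE,
                    StandardOpenOption.READ, StandardOpenOption.WRITE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true on success, false if a LOCK_NB request would have blocked. */
    boolean flock(int operation) {
        try {
            if (held != null) {          // a new request replaces the lock already held
                held.release();
                held = null;
            }
            if ((operation & LOCK_UN) != 0) return true;
            boolean shared = (operation & LOCK_SH) != 0;
            held = (operation & LOCK_NB) != 0
                    ? channel.tryLock(0L, Long.MAX_VALUE, shared)  // null if unavailable
                    : channel.lock(0L, Long.MAX_VALUE, shared);    // blocks until granted
            return held != null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() {
        try {
            channel.close(); // closing the channel also releases any lock
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

One caveat of this sketch: `FileLock` is held on behalf of the whole JVM, so it coordinates with other processes but not between threads of the same JVM.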
| 97 | + |
| 98 | +--- |
| 99 | + |
| 100 | +## The Safe/Opcode Blocker |
| 101 | + |
| 102 | +**Safe.pm** is used by CPAN.pm to evaluate untrusted Perl code in a restricted compartment (for example, the `CHECKSUMS` files it downloads). It depends on **Opcode.pm**, which: |
| 103 | + |
| 104 | +1. Uses XSLoader (has C code) |
| 105 | +2. Manipulates Perl's internal opcode tree |
| 106 | +3. Restricts which operations can run in a compartment |
| 107 | + |
| 108 | +### Why This Is Hard |
| 109 | + |
| 110 | +Opcode works by: |
| 111 | +- Enumerating all Perl opcodes (300+) |
| 112 | +- Creating bitmasks to allow/deny specific operations |
| 113 | +- Hooking into Perl's internal compilation |
| 114 | + |
| 115 | +PerlOnJava compiles to JVM bytecode, not Perl opcodes. Implementing Opcode would require: |
| 116 | +- Mapping Perl opcodes to JVM operations |
| 117 | +- Implementing compartmentalization at the JVM level |
| 118 | +- Possibly using Java's SecurityManager (deprecated for removal since Java 17) |
| 119 | + |
| 120 | +**Verdict**: Opcode/Safe would require significant architectural work. |
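
To make the bitmask idea concrete, here is a rough sketch of an allow/deny mask checked at compile time. It assumes the compiler can report each operation it is about to emit by name; the class `OpMaskSketch` and its tiny operation list are hypothetical, and a real port would need the full opcode set plus hooks in the PerlOnJava code generator.

```java
import java.util.BitSet;
import java.util.List;

// Hypothetical illustration of an Opcode-style operation mask.
final class OpMaskSketch {
    // A tiny subset of Perl op names; real Perl defines several hundred.
    static final List<String> OPS =
            List.of("const", "add", "print", "open", "system", "entereval");

    // One bit per operation; a set bit means the operation is denied.
    private final BitSet denied = new BitSet(OPS.size());

    void deny(String... names) {
        for (String name : names) denied.set(indexOf(name));
    }

    void permit(String... names) {
        for (String name : names) denied.clear(indexOf(name));
    }

    boolean isPermitted(String name) {
        return !denied.get(indexOf(name));
    }

    /** Rejects a compilation unit if it uses any masked operation. */
    void check(List<String> programOps) {
        for (String op : programOps) {
            if (!isPermitted(op)) {
                throw new SecurityException("'" + op + "' trapped by operation mask");
            }
        }
    }

    private static int indexOf(String name) {
        int index = OPS.indexOf(name);
        if (index < 0) throw new IllegalArgumentException("unknown op: " + name);
        return index;
    }
}
```

The mask itself is the easy part; the architectural work lies in making every code path in the compiler consult it.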
| 121 | + |
| 122 | +--- |
| 123 | + |
| 124 | +## Alternative Approaches |
| 125 | + |
| 126 | +### Option 1: Use cpanm (App::cpanminus) |
| 127 | + |
| 128 | +cpanm is a lighter CPAN client. Its dependencies still need to be analyzed. |
| 129 | + |
| 130 | +```bash |
| 131 | +# Check cpanm dependencies |
| 132 | +curl -sL https://cpanmin.us | head -200 |
| 133 | +``` |
| 134 | + |
| 135 | +### Option 2: Minimal CPAN Client |
| 136 | + |
| 137 | +Create a simple CPAN client using modules that already work: |
| 138 | + |
| 139 | +```perl |
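| 140 | +# Sketch of a minimal CPAN client (build step not implemented) |
| 141 | +use HTTP::Tiny; |
| 142 | +use JSON;          # provides decode_json |
| 143 | +use Archive::Tar; |
| 144 | + |
| 145 | +sub install_module { |
| 146 | +    my ($module) = @_; |
| 147 | +    my $http = HTTP::Tiny->new; |
| 148 | +    # 1. Query the MetaCPAN API for the tarball URL |
| 149 | +    my $resp = $http->get("https://fastapi.metacpan.org/v1/download_url/$module"); |
| 150 | +    die "Lookup failed: $resp->{status} $resp->{reason}\n" unless $resp->{success}; |
| 151 | +    my $url = decode_json($resp->{content})->{download_url}; |
| 152 | +    # 2. Download the tarball into the current directory |
| 153 | +    my ($tarball) = $url =~ m{([^/]+)\z}; |
| 154 | +    my $dl = $http->mirror($url, $tarball); |
| 155 | +    die "Download failed: $dl->{status} $dl->{reason}\n" unless $dl->{success}; |
| 156 | +    # 3. Extract into the current directory |
| 157 | +    Archive::Tar->extract_archive($tarball) or die Archive::Tar->error; |
| 158 | +    # 4. Run Makefile.PL or Build.PL (this is the hard part) |
| 159 | +} |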
| 160 | +``` |
| 161 | + |
| 162 | +### Option 3: Pre-bundle Modules |
| 163 | + |
| 164 | +Instead of a CPAN client, import pure-Perl modules directly: |
| 165 | + |
| 166 | +1. Identify commonly needed CPAN modules |
| 167 | +2. Add them to `dev/import-perl5/config.yaml` |
| 168 | +3. Run `perl dev/import-perl5/sync.pl` |
| 169 | + |
| 170 | +This is already working for many modules (`Pod::*`, `Test::*`, `Getopt::Long`, etc.). |
| 171 | + |
| 172 | +--- |
| 173 | + |
| 174 | +## Implementation Priority |
| 175 | + |
| 176 | +### Phase 1: Low-hanging fruit (Easy) |
| 177 | + |
| 178 | +1. **DirHandle** - Add to config.yaml, pure Perl |
| 179 | +2. **Dumpvalue** - Add to config.yaml, pure Perl |
| 180 | +3. **Sys::Hostname** - Import + Java fallback |
| 181 | +4. **flock()** - Implement in Java using FileLock |
| 182 | + |
| 183 | +### Phase 2: Archive/Network (Medium) |
| 184 | + |
| 185 | +5. **Archive::Tar** - Import from perl5 tree (needs IO::Zlib check) |
| 186 | +6. **Archive::Zip** - Java implementation using `java.util.zip` |
| 187 | +7. **IO::Socket** - Java implementation wrapping sockets |
| 188 | +8. **Net::FTP** - Import if IO::Socket works |
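
For the Archive::Zip item, the following sketch shows the `java.util.zip` calls a Java-backed implementation would likely build on. It works purely in memory; the class `ZipSketch` and its method names are hypothetical and do not mirror the Archive::Zip API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration of zip read/write with the JDK's built-in classes.
final class ZipSketch {
    /** Builds a zip archive from member name -> content. */
    static byte[] write(Map<String, byte[]> members) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Map.Entry<String, byte[]> member : members.entrySet()) {
                zip.putNextEntry(new ZipEntry(member.getKey()));
                zip.write(member.getValue());
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Read the bytes only after the stream is closed, so the
        // central directory has been written.
        return out.toByteArray();
    }

    /** Reads every member of a zip archive, preserving archive order. */
    static Map<String, byte[]> read(byte[] archive) {
        Map<String, byte[]> members = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            for (ZipEntry entry; (entry = zip.getNextEntry()) != null; ) {
                // readAllBytes() stops at the end of the current entry.
                members.put(entry.getName(), zip.readAllBytes());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return members;
    }
}
```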
| 189 | + |
| 190 | +### Phase 3: Process Control (Medium) |
| 191 | + |
| 192 | +9. **IPC::Open3** - Import + verify pipe support |
| 193 | +10. **IPC::Cmd** - Import if Open3 works |
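
The pipe support that IPC::Open3 depends on could be backed by `ProcessBuilder`, roughly as sketched below. The class `Open3Sketch` and its `run` method are hypothetical; the point is only that the child's stdin, stdout and stderr are three separate streams, and that stdout and stderr must be drained concurrently to avoid blocking on a full pipe.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration of open3-style process I/O on the JVM.
final class Open3Sketch {
    static final class Result {
        final int exitCode;
        final String stdout;
        final String stderr;

        Result(int exitCode, String stdout, String stderr) {
            this.exitCode = exitCode;
            this.stdout = stdout;
            this.stderr = stderr;
        }
    }

    /** Runs a command, feeds it input on stdin, and captures stdout and stderr separately. */
    static Result run(List<String> command, String input) {
        try {
            Process process = new ProcessBuilder(command).start();
            // Drain both output pipes in background tasks.
            CompletableFuture<String> out =
                    CompletableFuture.supplyAsync(() -> slurp(process.getInputStream()));
            CompletableFuture<String> err =
                    CompletableFuture.supplyAsync(() -> slurp(process.getErrorStream()));
            // Write the input, then close stdin so the child sees end-of-file.
            try (OutputStream stdin = process.getOutputStream()) {
                stdin.write(input.getBytes(StandardCharsets.UTF_8));
            }
            int exitCode = process.waitFor();
            return new Result(exitCode, out.join(), err.join());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static String slurp(InputStream in) {
        try {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Unlike Perl's `open3`, this sketch returns only after the child exits. A real implementation would have to expose the three streams as Perl filehandles while the process is still running.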
| 194 | + |
| 195 | +### Phase 4: Consider Alternatives |
| 196 | + |
| 197 | +11. Evaluate cpanm dependencies |
| 198 | +12. Consider minimal custom CPAN client |
| 199 | +13. Document "how to add a CPAN module" for users |
| 200 | + |
| 201 | +--- |
| 202 | + |
| 203 | +## Testing Commands |
| 204 | + |
| 205 | +```bash |
| 206 | +# Test module availability |
| 207 | +./jperl -e 'use DirHandle; print "OK\n"' |
| 208 | +./jperl -e 'use Sys::Hostname; print hostname(), "\n"' |
| 209 | +./jperl -e 'use Archive::Tar; print "OK\n"' |
| 210 | + |
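| 211 | +# Test flock (with -e, $0 is "-e" and not a real file, so lock a scratch file instead) |
| 212 | +./jperl -e 'use Fcntl qw(:flock); open my $fh, ">", "flock_test.tmp" or die "open: $!"; flock($fh, LOCK_EX) or die "flock: $!"; print "OK\n"' |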
| 213 | +``` |
| 214 | + |
| 215 | +--- |
| 216 | + |
| 217 | +## Related Documents |
| 218 | + |
| 219 | +- `dev/design/xsloader.md` - How XSLoader/Java integration works |
| 220 | +- `dev/design/http_server.md` - HTTP capabilities |
| 221 | +- `.cognition/skills/port-cpan-module/` - Skill for porting CPAN modules |
| 222 | + |
| 223 | +--- |
| 224 | + |
| 225 | +## Progress Tracking |
| 226 | + |
| 227 | +### Current Status: Phase 2 complete |
| 228 | + |
| 229 | +### Completed |
| 230 | +- [x] Analyze CPAN.pm dependencies (2024-03-13) |
| 231 | +- [x] Identify modules available in perl5 tree |
| 232 | +- [x] Document sync.pl import strategy |
| 233 | +- [x] Identify Safe/Opcode blocker |
| 234 | +- [x] **Phase 1: Low-hanging fruit** (2024-03-13) |
| 235 | + - DirHandle - imported via sync.pl, fixed Symbol::gensym() to return GLOB reference |
| 236 | + - Dumpvalue - imported via sync.pl, fixed parser bug with `%package:: and` syntax |
| 237 | + - Sys::Hostname - imported via sync.pl, implemented syscall() operator |
| 238 | + - flock() - implemented in CustomFileChannel.java using java.nio.channels.FileLock |
| 239 | + - syscall() - implemented in SyscallOperator.java with gethostname support |
| 240 | +- [x] **Phase 2: Archive/Network modules** (2024-03-13) |
| 241 | + - IO::Socket, IO::Socket::INET, IO::Socket::UNIX - imported via sync.pl |
| 242 | + - IO::Zlib - imported via sync.pl |
| 243 | + - Archive::Tar - imported via sync.pl, patched GZIP_MAGIC_NUM regex (octal to hex) |
| 244 | + - Net::FTP, Net::Cmd, Net::* - imported via sync.pl |
| 245 | + - Tie::StdHandle - added for IO::Zlib dependency |
| 246 | + - File::Spec platform modules - added for Archive::Tar dependency |
| 247 | + - Socket.pm - added $VERSION and additional constants (`INADDR_*`, `IPPROTO_*`, `SHUT_*`, etc.) |
| 248 | + - Parser fix: `@{${...}}` nested dereference now works in push/unshift |
| 249 | + - SysHostname.java XS module - provides ghname() via InetAddress.getLocalHost() |
| 250 | + - XSLoader caller() support - load() now uses caller() when no argument provided |
| 251 | + |
| 252 | +### Files Changed (Phase 2) |
| 253 | +- `dev/import-perl5/config.yaml` - Added IO::Socket, IO::Zlib, Archive::Tar, Net::*, Tie::StdHandle, File::Spec imports |
| 254 | +- `src/main/java/org/perlonjava/runtime/perlmodule/Socket.java` - Added 20+ socket constants |
| 255 | +- `src/main/perl/lib/Socket.pm` - Added $VERSION and expanded exports |
| 256 | +- `src/main/java/org/perlonjava/frontend/parser/IdentifierParser.java` - Fixed `$` followed by `{` in braced variable parsing |
| 257 | +- `src/main/java/org/perlonjava/runtime/perlmodule/SysHostname.java` - New XS module for Sys::Hostname |
| 258 | +- `src/main/java/org/perlonjava/runtime/perlmodule/XSLoader.java` - Added caller() support for no-argument load() |
| 259 | + |
| 260 | +### Next Steps |
| 261 | +1. Phase 3: Process control (IPC::Open3) |
| 262 | +2. Evaluate cpanm as alternative to CPAN.pm |
| 263 | + |
| 264 | +### Open Questions |
| 265 | +- Is cpanm lighter on dependencies than CPAN.pm? |
| 266 | +- Should we create a PerlOnJava-specific minimal CPAN client? |
| 267 | +- How important is Safe compartmentalization for users? |