
Commit 3208cd2

Merge pull request #312 from fglock/feature/cpan-phase1-modules
Add CPAN Phases 1 & 2: Socket, Archive::Tar, Net::FTP, and more
2 parents 64b7ac7 + 21ab334 commit 3208cd2

59 files changed

Lines changed: 19174 additions & 186 deletions


dev/design/cpan_client.md

Lines changed: 267 additions & 0 deletions
@@ -0,0 +1,267 @@
# CPAN Client Support for PerlOnJava

## Overview

This document analyzes what's needed to run CPAN.pm (or alternatives) on PerlOnJava.

## Current Status

CPAN.pm has deep dependencies that make it challenging to port. The main blocker is `Safe`/`Opcode`, which requires access to Perl's internal opcode system.

---

## CPAN.pm Dependency Analysis

### Available (Already Working)

| Module | Status |
|--------|--------|
| File::Spec, File::Basename, File::Copy, File::Find, File::Path, File::Temp | ✅ |
| Text::ParseWords, Text::Wrap | ✅ |
| Config, Carp, Cwd, Exporter, Fcntl | ✅ |
| FileHandle, IO::File, IO::Handle | ✅ |
| HTTP::Tiny, Compress::Zlib | ✅ |
| Digest::MD5, Digest::SHA, MIME::Base64 | ✅ |
| YAML, JSON, Term::ReadLine | ✅ |

### Critical Missing Modules

| Module | Status | Complexity | Notes |
|--------|--------|------------|-------|
| **Safe** | ❌ Missing | High | Sandbox/compartment module - requires Opcode |
| **Opcode** | ❌ Missing | Very High | Core opcodes restriction - deeply tied to Perl internals |
| **DirHandle** | ✅ Done | Low | OO interface to opendir/readdir - imported via sync.pl |
| **Sys::Hostname** | ✅ Done | Low | `hostname()` function - SysHostname.java XS module |
| **ExtUtils::MakeMaker** | ❌ Missing | Very High | Build system - huge module with many dependencies |
| **LWP::UserAgent** | ❌ Missing | Medium | Web client (HTTP::Tiny exists as alternative) |
| **Archive::Tar** | ✅ Done | Medium | Imported via sync.pl |
| **Archive::Zip** | ❌ Missing | Medium | Zip handling - Java has built-in support |
| **Net::FTP** | ✅ Done | Medium | Imported via sync.pl |
| **IPC::Open3** | ❌ Missing | Medium | Process I/O - needs Java ProcessBuilder |
| **IO::Socket** | ✅ Done | Medium | Imported via sync.pl |
| **Dumpvalue** | ✅ Done | Low | Imported via sync.pl |
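
The Archive::Zip row notes that Java has built-in zip support. As a rough illustration of what a Java-backed implementation could build on, the sketch below round-trips a small archive through `java.util.zip`. The class `ZipSketch` and its methods are hypothetical names used only for this example; they are not part of PerlOnJava, and a real port would still need to map Archive::Zip's Perl API onto calls like these.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper for illustration only; not PerlOnJava code. */
class ZipSketch {

    /** Builds a zip archive in memory from (member name -> text content) pairs. */
    public static byte[] write(Map<String, String> members) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (Map.Entry<String, String> member : members.entrySet()) {
                zip.putNextEntry(new ZipEntry(member.getKey()));
                zip.write(member.getValue().getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        }
        return bytes.toByteArray();
    }

    /** Reads every file member back as (member name -> text content). */
    public static Map<String, String> read(byte[] archive) throws IOException {
        Map<String, String> members = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.isDirectory()) {
                    // readAllBytes() stops at the end of the current entry (Java 9+).
                    members.put(entry.getName(),
                            new String(zip.readAllBytes(), StandardCharsets.UTF_8));
                }
            }
        }
        return members;
    }
}
```

Working on byte arrays keeps the sketch free of temporary files; a file-based variant would wrap `Files.newInputStream` / `Files.newOutputStream` in the same stream classes.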

### Built-in Functions Missing

| Function | Status | Notes |
|----------|--------|-------|
| `flock()` | ✅ Implemented | File locking - using java.nio.channels.FileLock |

---

## Import Strategy via sync.pl

The `dev/import-perl5/sync.pl` script can import pure Perl modules from the perl5 source tree.

### Quick Wins - Add to config.yaml

These modules can be imported directly:

```yaml
# DirHandle - OO directory handle interface
- source: perl5/lib/DirHandle.pm
  target: src/main/perl/lib/DirHandle.pm

# Dumpvalue - Debug dump utility
- source: perl5/dist/Dumpvalue/lib/Dumpvalue.pm
  target: src/main/perl/lib/Dumpvalue.pm

# Sys::Hostname - Get system hostname
- source: perl5/ext/Sys-Hostname/Hostname.pm
  target: src/main/perl/lib/Sys/Hostname.pm

# IPC::Open3 - Open process with 3 filehandles
- source: perl5/ext/IPC-Open3/lib/IPC/Open3.pm
  target: src/main/perl/lib/IPC/Open3.pm

# Archive::Tar (if IO::Zlib is available)
- source: perl5/cpan/Archive-Tar/lib/Archive/Tar.pm
  target: src/main/perl/lib/Archive/Tar.pm
- source: perl5/cpan/Archive-Tar/lib/Archive/Tar
  target: src/main/perl/lib/Archive/Tar
  type: directory

# Net::FTP and libnet modules
- source: perl5/cpan/libnet/lib/Net
  target: src/main/perl/lib/Net
  type: directory
```

### Modules Requiring Java Implementation

| Module | Java Implementation Needed |
|--------|---------------------------|
| **flock()** | `java.nio.channels.FileLock` in RuntimeIO.java |
| **IO::Socket** | Wrap `java.net.Socket` / `java.net.ServerSocket` |
| **Sys::Hostname** (XS part) | `java.net.InetAddress.getLocalHost().getHostName()` |
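
To make the `flock()` row more concrete, here is a minimal sketch of how Perl's `flock` operation flags could map onto `java.nio.channels.FileLock`. It is an assumption-laden illustration, not the code in this commit: the class name `FlockSketch` is hypothetical, and the numeric flag values are the common Fcntl values (`LOCK_SH=1`, `LOCK_EX=2`, `LOCK_NB=4`, `LOCK_UN=8`), which should be checked against the platform's Fcntl constants.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;

/** Hypothetical flock() emulation for illustration only; not PerlOnJava code. */
class FlockSketch {
    // Assumed to match Perl's Fcntl :flock constants on common platforms.
    public static final int LOCK_SH = 1, LOCK_EX = 2, LOCK_NB = 4, LOCK_UN = 8;

    private final FileChannel channel;
    private FileLock held; // the lock currently held through this handle, if any

    public FlockSketch(FileChannel channel) {
        this.channel = channel;
    }

    /** Returns true on success, false if the lock could not be acquired. */
    public synchronized boolean flock(int operation) throws IOException {
        if ((operation & LOCK_UN) != 0) {
            release();
            return true;
        }
        boolean shared = (operation & LOCK_SH) != 0;
        boolean nonBlocking = (operation & LOCK_NB) != 0;
        release(); // replace any lock already held instead of stacking a second one
        try {
            // Lock the whole file; a shared lock needs a readable channel,
            // an exclusive lock needs a writable one.
            held = nonBlocking
                    ? channel.tryLock(0L, Long.MAX_VALUE, shared)
                    : channel.lock(0L, Long.MAX_VALUE, shared);
        } catch (OverlappingFileLockException e) {
            return false; // the region is already locked elsewhere in this JVM
        }
        return held != null; // tryLock returns null when another process holds the lock
    }

    private void release() throws IOException {
        if (held != null) {
            held.release();
            held = null;
        }
    }
}
```

One caveat worth keeping in mind: `FileLock` is held on behalf of the whole JVM, so two Perl filehandles in the same process do not exclude each other the way separate processes do.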

---

## The Safe/Opcode Blocker

**Safe.pm** is used by CPAN.pm to evaluate Perl code fetched from CPAN (for example, the `CHECKSUMS` files) inside a restricted compartment. It depends on **Opcode.pm**, which:

1. Uses XSLoader (has C code)
2. Manipulates Perl's internal opcode tree
3. Restricts which operations can run in a compartment

### Why This Is Hard

Opcode works by:
- Enumerating all Perl opcodes (300+)
- Creating bitmasks to allow/deny specific operations
- Hooking into Perl's internal compilation
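
The bitmask idea in the list above can be sketched in a few lines. This is a toy model only: `OpsetSketch`, its six-entry opcode table, and the `firstDenied` check are hypothetical and stand in for Perl's several hundred opcodes and its compile-time hook. It shows the data structure, not how such a check would be wired into PerlOnJava's compiler.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Toy model of an Opcode-style deny mask; illustration only. */
class OpsetSketch {
    // A tiny stand-in table; each op's position is its bit index in a mask.
    public static final List<String> OPS =
            Arrays.asList("const", "add", "print", "open", "system", "entereval");

    /** Builds a bitmask with one bit set per named op. */
    public static BitSet opset(String... names) {
        BitSet set = new BitSet(OPS.size());
        for (String name : names) {
            int index = OPS.indexOf(name);
            if (index < 0) {
                throw new IllegalArgumentException("unknown op: " + name);
            }
            set.set(index);
        }
        return set;
    }

    /**
     * Returns the first op in the program that the mask denies, or null if
     * every op is allowed. Ops missing from the table are treated as denied.
     */
    public static String firstDenied(BitSet denyMask, List<String> program) {
        for (String op : program) {
            int index = OPS.indexOf(op);
            if (index < 0 || denyMask.get(index)) {
                return op;
            }
        }
        return null;
    }
}
```

For example, with `opset("open", "system")` as the deny mask, a program consisting of `const`, `add`, `print` passes, while one containing `system` is rejected at that op.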

PerlOnJava compiles to JVM bytecode, not Perl opcodes. Implementing Opcode would require:
- Mapping Perl opcodes to JVM operations
- Implementing compartmentalization at the JVM level
- Possibly using the Java SecurityManager (deprecated for removal since Java 17)

**Verdict**: Opcode/Safe would require significant architectural work.

---

## Alternative Approaches

### Option 1: Use cpanm (App::cpanminus)

cpanm is a lighter CPAN client; its dependencies still need to be analyzed.

```bash
# Inspect the start of the cpanm script to check its dependencies
curl -sL https://cpanmin.us | head -200
```

### Option 2: Minimal CPAN Client

Create a simple CPAN client using modules that already work:

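```perl
# Sketch of a minimal CPAN client (the build/install step is not implemented)
use strict;
use warnings;
use HTTP::Tiny;
use JSON;                      # decode_json for the MetaCPAN response
use Archive::Tar;
use File::Temp qw(tempdir);

sub install_module {
    my ($module) = @_;

    # 1. Query MetaCPAN API; the JSON body carries the download_url field
    my $http = HTTP::Tiny->new;
    my $resp = $http->get("https://fastapi.metacpan.org/v1/download_url/$module");
    die "Lookup failed for $module: $resp->{status} $resp->{reason}\n"
        unless $resp->{success};
    my $url = decode_json($resp->{content})->{download_url};

    # 2. Download tarball into a temporary directory
    my $dir     = tempdir(CLEANUP => 1);
    my $tarball = "$dir/dist.tar.gz";
    my $dl      = $http->mirror($url, $tarball);
    die "Download failed: $dl->{status} $dl->{reason}\n" unless $dl->{success};

    # 3. Extract (into the current directory)
    Archive::Tar->extract_archive($tarball)
        or die "Extract failed: " . Archive::Tar->error . "\n";

    # 4. Run Makefile.PL or Build.PL (this is the hard part)
}
```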

### Option 3: Pre-bundle Modules

Instead of a CPAN client, import pure-Perl modules directly:

1. Identify commonly needed CPAN modules
2. Add them to `dev/import-perl5/config.yaml`
3. Run `perl dev/import-perl5/sync.pl`

This is already working for many modules (Pod::*, Test::*, Getopt::Long, etc.)

---

## Implementation Priority

### Phase 1: Low-hanging fruit (Easy)

1. **DirHandle** - Add to config.yaml, pure Perl
2. **Dumpvalue** - Add to config.yaml, pure Perl
3. **Sys::Hostname** - Import + Java fallback
4. **flock()** - Implement in Java using FileLock
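
For the Sys::Hostname item, the "Java fallback" could look roughly like the sketch below. The `InetAddress.getLocalHost()` lookup is the approach this document names; the environment-variable fallbacks and the final `"localhost"` default are assumptions added for illustration, and `HostnameSketch` is a hypothetical class name.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical hostname lookup with fallbacks; illustration only. */
class HostnameSketch {

    /** Returns a best-effort, never-empty host name. */
    public static String hostname() {
        try {
            String name = InetAddress.getLocalHost().getHostName();
            if (name != null && !name.isEmpty()) {
                return name;
            }
        } catch (UnknownHostException e) {
            // The local host name could not be resolved; try the fallbacks below.
        }
        // Assumed fallbacks: HOSTNAME is often set on Unix shells, COMPUTERNAME on Windows.
        for (String variable : new String[] {"HOSTNAME", "COMPUTERNAME"}) {
            String value = System.getenv(variable);
            if (value != null && !value.isEmpty()) {
                return value;
            }
        }
        return "localhost";
    }
}
```

The layered fallback mirrors the spirit of Sys::Hostname, which tries several strategies in turn before giving up.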

### Phase 2: Archive/Network (Medium)

5. **Archive::Tar** - Import from perl5 tree (needs IO::Zlib check)
6. **Archive::Zip** - Java implementation using `java.util.zip`
7. **IO::Socket** - Java implementation wrapping sockets
8. **Net::FTP** - Import if IO::Socket works
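
As a rough picture of what "wrapping sockets" means for IO::Socket, the sketch below runs a one-line echo over the loopback interface using `java.net.ServerSocket` (the listening side) and `java.net.Socket` (the connecting side). `SocketSketch` and its methods are hypothetical; how PerlOnJava actually binds Perl socket handles to these classes is not shown here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical loopback echo; illustration only. */
class SocketSketch {

    /** Sends one line to a loopback server and returns the line echoed back. */
    public static String echoOnce(String line) throws IOException, InterruptedException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the OS for any free port; backlog 1 is enough for one client.
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Thread acceptor = new Thread(() -> {
                try (Socket peer = server.accept();
                     BufferedReader in = reader(peer);
                     PrintWriter out = writer(peer)) {
                    out.println(in.readLine()); // echo a single line back
                } catch (IOException e) {
                    // On failure the client simply sees end-of-stream.
                }
            });
            acceptor.start();
            try (Socket client = new Socket(loopback, server.getLocalPort());
                 PrintWriter out = writer(client);
                 BufferedReader in = reader(client)) {
                out.println(line);
                String reply = in.readLine();
                acceptor.join();
                return reply;
            }
        }
    }

    private static BufferedReader reader(Socket socket) throws IOException {
        return new BufferedReader(
                new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
    }

    private static PrintWriter writer(Socket socket) throws IOException {
        // autoFlush=true so println() pushes the line onto the wire immediately.
        return new PrintWriter(
                new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8), true);
    }
}
```

In IO::Socket::INET terms, the `ServerSocket` plays the role of a handle created with `Listen`, and the `Socket` that of one created with `PeerAddr`/`PeerPort`.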

### Phase 3: Process Control (Medium)

9. **IPC::Open3** - Import + verify pipe support
10. **IPC::Cmd** - Import if Open3 works
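
The "pipe support" that IPC::Open3 needs comes down to a child process with separate stdin, stdout, and stderr pipes. A minimal sketch with `ProcessBuilder` follows; `Open3Sketch` and `Result` are hypothetical names, and the sketch collects output as strings rather than exposing live filehandles the way `open3` does.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical three-pipe process runner; illustration only. */
class Open3Sketch {

    /** Exit code plus everything the child wrote to stdout and stderr. */
    public static final class Result {
        public final int exitCode;
        public final String out;
        public final String err;

        Result(int exitCode, String out, String err) {
            this.exitCode = exitCode;
            this.out = out;
            this.err = err;
        }
    }

    public static Result run(List<String> command, String stdin)
            throws IOException, InterruptedException {
        // By default ProcessBuilder connects all three standard streams to pipes.
        Process process = new ProcessBuilder(command).start();

        // Drain stdout and stderr concurrently so neither pipe fills up and stalls the child.
        CompletableFuture<String> out =
                CompletableFuture.supplyAsync(() -> slurp(process.getInputStream()));
        CompletableFuture<String> err =
                CompletableFuture.supplyAsync(() -> slurp(process.getErrorStream()));

        // Feed the child's stdin, then close it so the child sees end-of-file.
        try (OutputStream childStdin = process.getOutputStream()) {
            childStdin.write(stdin.getBytes(StandardCharsets.UTF_8));
        }

        int exitCode = process.waitFor();
        return new Result(exitCode, out.join(), err.join());
    }

    private static String slurp(InputStream stream) {
        try (InputStream in = stream) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Keeping stderr on its own pipe, rather than calling `redirectErrorStream(true)`, is what distinguishes the `open3` model from `open2`.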

### Phase 4: Consider Alternatives

11. Evaluate cpanm dependencies
12. Consider minimal custom CPAN client
13. Document "how to add a CPAN module" for users

---

## Testing Commands

```bash
# Test module availability
./jperl -e 'use DirHandle; print "OK\n"'
./jperl -e 'use Sys::Hostname; print hostname(), "\n"'
./jperl -e 'use Archive::Tar; print "OK\n"'

# Test flock (lock an existing file and fail loudly on error)
./jperl -e 'use Fcntl qw(:flock); open my $fh, "<", "./jperl" or die "open: $!"; flock($fh, LOCK_SH) or die "flock: $!"; print "OK\n"'
```

---

## Related Documents

- `dev/design/xsloader.md` - How XSLoader/Java integration works
- `dev/design/http_server.md` - HTTP capabilities
- `.cognition/skills/port-cpan-module/` - Skill for porting CPAN modules

---

## Progress Tracking

### Current Status: Phase 2 complete

### Completed
- [x] Analyze CPAN.pm dependencies (2024-03-13)
- [x] Identify modules available in perl5 tree
- [x] Document sync.pl import strategy
- [x] Identify Safe/Opcode blocker
- [x] **Phase 1: Low-hanging fruit** (2024-03-13)
  - DirHandle - imported via sync.pl, fixed Symbol::gensym() to return GLOB reference
  - Dumpvalue - imported via sync.pl, fixed parser bug with `%package:: and` syntax
  - Sys::Hostname - imported via sync.pl, implemented syscall() operator
  - flock() - implemented in CustomFileChannel.java using java.nio.channels.FileLock
  - syscall() - implemented in SyscallOperator.java with gethostname support
- [x] **Phase 2: Archive/Network modules** (2024-03-13)
  - IO::Socket, IO::Socket::INET, IO::Socket::UNIX - imported via sync.pl
  - IO::Zlib - imported via sync.pl
  - Archive::Tar - imported via sync.pl, patched GZIP_MAGIC_NUM regex (octal to hex)
  - Net::FTP, Net::Cmd, Net::* - imported via sync.pl
  - Tie::StdHandle - added for IO::Zlib dependency
  - File::Spec platform modules - added for Archive::Tar dependency
  - Socket.pm - added $VERSION and additional constants (INADDR_*, IPPROTO_*, SHUT_*, etc.)
  - Parser fix: `@{${...}}` nested dereference now works in push/unshift
  - SysHostname.java XS module - provides ghname() via InetAddress.getLocalHost()
  - XSLoader caller() support - load() now uses caller() when no argument provided

### Files Changed (Phase 2)
- `dev/import-perl5/config.yaml` - Added IO::Socket, IO::Zlib, Archive::Tar, Net::*, Tie::StdHandle, File::Spec imports
- `src/main/java/org/perlonjava/runtime/perlmodule/Socket.java` - Added 20+ socket constants
- `src/main/perl/lib/Socket.pm` - Added $VERSION and expanded exports
- `src/main/java/org/perlonjava/frontend/parser/IdentifierParser.java` - Fixed `$` followed by `{` in braced variable parsing
- `src/main/java/org/perlonjava/runtime/perlmodule/SysHostname.java` - New XS module for Sys::Hostname
- `src/main/java/org/perlonjava/runtime/perlmodule/XSLoader.java` - Added caller() support for no-argument load()

### Next Steps
1. Phase 3: Process control (IPC::Open3)
2. Evaluate cpanm as alternative to CPAN.pm

### Open Questions
- Is cpanm lighter on dependencies than CPAN.pm?
- Should we create a PerlOnJava-specific minimal CPAN client?
- How important is Safe compartmentalization for users?

dev/import-perl5/config.yaml

Lines changed: 51 additions & 0 deletions
@@ -400,6 +400,57 @@ imports:
    target: perl5_t/Term-Table
    type: directory

  # DirHandle - OO interface to directory handles (pure Perl)
  - source: perl5/lib/DirHandle.pm
    target: src/main/perl/lib/DirHandle.pm

  # Dumpvalue - Debug dump utility (pure Perl)
  - source: perl5/dist/Dumpvalue/lib/Dumpvalue.pm
    target: src/main/perl/lib/Dumpvalue.pm

  # Sys::Hostname - Get system hostname
  - source: perl5/ext/Sys-Hostname/Hostname.pm
    target: src/main/perl/lib/Sys/Hostname.pm

  # Phase 2: IO::Socket - OO socket interface
  - source: perl5/dist/IO/lib/IO/Socket.pm
    target: src/main/perl/lib/IO/Socket.pm

  - source: perl5/dist/IO/lib/IO/Socket
    target: src/main/perl/lib/IO/Socket
    type: directory

  # Phase 2: IO::Zlib - Compressed I/O (for Archive::Tar)
  - source: perl5/cpan/IO-Zlib/Zlib.pm
    target: src/main/perl/lib/IO/Zlib.pm

  # Tie::StdHandle - Required by IO::Zlib
  - source: perl5/lib/Tie/StdHandle.pm
    target: src/main/perl/lib/Tie/StdHandle.pm

  # File::Spec platform modules - Required by Archive::Tar
  - source: perl5/dist/PathTools/lib/File/Spec
    target: src/main/perl/lib/File/Spec
    type: directory

  # Phase 2: Archive::Tar - Tar archive handling
  - source: perl5/cpan/Archive-Tar/lib/Archive/Tar.pm
    target: src/main/perl/lib/Archive/Tar.pm

  - source: perl5/cpan/Archive-Tar/lib/Archive/Tar
    target: src/main/perl/lib/Archive/Tar
    type: directory

  # Phase 2: Net::FTP and libnet modules
  - source: perl5/cpan/libnet/lib/Net
    target: src/main/perl/lib/Net
    type: directory

  # Symbol - manipulate Perl symbols and their names (pure Perl)
  # Required by constant.pm which is used by File::Spec::Unix
  - source: perl5/lib/Symbol.pm
    target: src/main/perl/lib/Symbol.pm

  # Add more imports below as needed
  # Example with minimal fields:
  # - source: perl5/lib/SomeModule.pm
