31 changes: 26 additions & 5 deletions README.md
@@ -1,6 +1,6 @@
# Dataforge

Tool for generating **relational** synthetic datasets with guaranteed referential integrity. Available through a visual interface in the browser and through the command line (CLI). Ideal for testing data pipelines, populating development databases, and creating fixtures for dbt models, without using sensitive data.
Tool for generating **relational** synthetic datasets with guaranteed referential integrity. Available through a **visual interface in the browser** and through the **command line (CLI)**. Supports multiple output formats (CSV, JSON, Parquet, Avro), direct loading into SQL databases (PostgreSQL, MySQL, SQLite), and automatic upload to the cloud (GCS, S3, Azure). Ideal for testing data pipelines, populating development databases, creating fixtures for dbt models, and generating demo data, without using sensitive data.

---

@@ -71,10 +71,12 @@ The visual interface runs at `http://localhost:5173` and is the primary mode of use
- **Save as Default**: saves the schema on the server (`src/dataforge/schemas/`) for future reuse
- **Run Generator**: runs the CLI directly from the interface with full visual configuration:
  - Output formats (CSV, JSON, Parquet, Avro) and JSON mode (flat/nested)
  - Destination: **local**, **cloud** (GCS, S3, Azure), or **database** (PostgreSQL, MySQL, SQLite) with connection testing and saved connections
  - Destination: **local** (with a native folder picker on Windows), **cloud** (GCS, S3, Azure), or **database** (PostgreSQL, MySQL, SQLite) with connection testing and saved connections
  - Cloud credentials entered directly in the UI (GCS JSON, S3 Access Key/Secret, Azure Connection String) with support for **saved profiles**: save and load credentials by name without needing external files
  - Hive-style partitioning per table
  - Recurring mode, seed, and column increments
  - Real-time execution logs with a stop button
  - **? Help** button with a reference for all fields

The diagram updates in real time and shows the relationships between tables, with arrows representing FKs.

@@ -361,12 +363,30 @@ docker compose run --rm cli generate -d ecommerce -f parquet --partition-by "ord

## Cloud upload

Place the credentials file in the project's `credentials/` folder (it is mounted in the container at `/app/credentials/`).
There are two ways to provide cloud credentials to Dataforge:

### Option 1: Visual interface (recommended)

In the **Destination → Cloud** section of Run Generator, enter the credentials directly in the UI:

| Provider | Fields |
|----------|--------|
| Google Cloud Storage | Full Service Account JSON |
| Amazon S3 | Access Key ID, Secret Access Key, and Region |
| Azure Blob Storage | Connection String |

Click **Save credentials** to save a named profile locally (`credentials/profiles.json`). Saved profiles appear at the top of the Cloud section and can be loaded with one click.
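
As a rough illustration of what a named profile carries, the sketch below mirrors the request body that `handleSaveCredProfile` sends in `App.tsx` (`name`, `provider`, `creds`). The type names and the `upsertProfile` helper are hypothetical, and the overwrite-by-name behaviour is an assumption; the actual layout of `credentials/profiles.json` is not shown in this diff.

```typescript
// Hypothetical types mirroring the POST body built in handleSaveCredProfile.
type Provider = 'gcs' | 's3' | 'azure';

interface CloudCreds {
  gcsJson: string;
  s3AccessKey: string;
  s3SecretKey: string;
  s3Region: string;
  azureConnStr: string;
}

interface CredentialProfile {
  name: string;
  provider: Provider;
  creds: CloudCreds;
}

// Assumption: profiles are keyed by name, so saving under an existing name
// replaces the earlier entry instead of creating a duplicate.
function upsertProfile(
  profiles: CredentialProfile[],
  next: CredentialProfile,
): CredentialProfile[] {
  const others = profiles.filter(p => p.name !== next.name);
  return [...others, next];
}
```

Returning a new array rather than mutating the input keeps the helper usable with React state setters.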

> **Note:** the folder picker (📁) for the Local destination only works when Dataforge runs locally on Windows. In Docker, type the path manually (e.g. `/app/output/dados`).

### Option 2: File in the `credentials/` folder

Place the credentials file in the project's `credentials/` folder (it is mounted in the container at `/app/credentials/`). UI credentials take priority; the folder serves as a fallback.
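
The priority rule above (UI first, folder as fallback) can be sketched as a small pure function. This is only an illustration of the rule as the README states it: `resolveCredentialSource`, the `CredentialSource` type, and the example file path are hypothetical and are not taken from the Dataforge backend.

```typescript
// Hypothetical result type describing where credentials would come from.
type CredentialSource =
  | { kind: 'ui'; value: string }
  | { kind: 'file'; path: string }
  | { kind: 'none' };

// UI-supplied credentials win when non-empty; otherwise fall back to a file
// found in the credentials/ folder; otherwise report that nothing is available.
function resolveCredentialSource(
  uiValue: string | undefined,
  filePath: string | undefined,
): CredentialSource {
  const trimmed = uiValue?.trim();
  if (trimmed) return { kind: 'ui', value: trimmed };
  if (filePath) return { kind: 'file', path: filePath };
  return { kind: 'none' };
}
```

Treating a whitespace-only UI value as empty means a blank form does not shadow a valid file in the folder.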

### Google Cloud Storage

```bash
# Using a service account file
# Via service account file (fallback)
docker compose run --rm cli generate -d ecommerce -f parquet \
--upload gcs \
--bucket meu-bucket \
@@ -377,10 +397,11 @@ docker compose run --rm cli generate -d ecommerce -f parquet \
### Amazon S3

```bash
# Authentication via environment variables in docker-compose or inline
# Via environment variables
docker compose run --rm \
-e AWS_ACCESS_KEY_ID=... \
-e AWS_SECRET_ACCESS_KEY=... \
-e AWS_DEFAULT_REGION=us-east-1 \
cli generate -d hr -f csv \
--upload s3 \
--bucket meu-bucket \
175 changes: 168 additions & 7 deletions src/dataforge/frontend/src/App.tsx
@@ -615,6 +615,43 @@ export default function App() {

const [showRunPanel, setShowRunPanel] = useState(false);
const [showRunHelp, setShowRunHelp] = useState(false);
const [canBrowseFolder, setCanBrowseFolder] = useState(false);
const [credProfiles, setCredProfiles] = useState<{ name: string; provider: string }[]>([]);
const [saveCredName, setSaveCredName] = useState('');
const [showSaveCredInput, setShowSaveCredInput] = useState(false);

React.useEffect(() => {
fetch('/api/capabilities').then(r => r.json()).then(d => setCanBrowseFolder(!!d.browseFolder)).catch(() => {});
fetchCredProfiles();
}, []);

const fetchCredProfiles = () => {
fetch('/api/credential-profiles').then(r => r.json()).then(setCredProfiles).catch(() => {});
};

const handleSaveCredProfile = async () => {
const name = saveCredName.trim();
if (!name) return;
await fetch('/api/credential-profiles', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ name, provider: runConfig.uploadTarget, creds: runConfig.cloudCreds }),
});
setSaveCredName('');
setShowSaveCredInput(false);
fetchCredProfiles();
};

const handleLoadCredProfile = async (name: string) => {
const res = await fetch(`/api/credential-profiles/${encodeURIComponent(name)}`);
const profile = await res.json();
setRunConfig(r => ({ ...r, uploadTarget: profile.provider, cloudCreds: profile.creds }));
};

const handleDeleteCredProfile = async (name: string) => {
await fetch(`/api/credential-profiles/${encodeURIComponent(name)}`, { method: 'DELETE' });
fetchCredProfiles();
};
const [runConfig, setRunConfig] = useState<{
formats: string[],
destination: 'local' | 'cloud' | 'database',
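
One point worth noting about `handleLoadCredProfile` above: it passes `profile.creds` straight into `runConfig`, and the credential inputs later read fields such as `runConfig.cloudCreds.gcsJson`. If the response were an error body or a partial object, those reads could fail. A minimal sketch of a defensive normalizer follows; `normalizeCreds` and `EMPTY_CREDS` are hypothetical names, and the defaults simply copy the initial state declared in this diff.

```typescript
interface CloudCreds {
  gcsJson: string;
  s3AccessKey: string;
  s3SecretKey: string;
  s3Region: string;
  azureConnStr: string;
}

// Defaults copied from the initial runConfig.cloudCreds state in App.tsx.
const EMPTY_CREDS: CloudCreds = {
  gcsJson: '',
  s3AccessKey: '',
  s3SecretKey: '',
  s3Region: 'us-east-1',
  azureConnStr: '',
};

// Hypothetical helper: coerce an untrusted response value into a complete
// CloudCreds object, keeping only string fields and defaulting the rest.
function normalizeCreds(raw: unknown): CloudCreds {
  const src = (typeof raw === 'object' && raw !== null ? raw : {}) as Record<string, unknown>;
  const out: CloudCreds = { ...EMPTY_CREDS };
  for (const key of Object.keys(EMPTY_CREDS) as (keyof CloudCreds)[]) {
    const value = src[key];
    if (typeof value === 'string') out[key] = value;
  }
  return out;
}
```

Under this sketch the handler could call `normalizeCreds(profile.creds)` before updating state, ideally after also checking `res.ok`.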
@@ -634,6 +671,13 @@ export default function App() {
tablesToInclude: string[],
columnsFilter: string,
increments: Array<{ table: string; column: string; step: string; unit: string }>,
cloudCreds: {
gcsJson: string,
s3AccessKey: string,
s3SecretKey: string,
s3Region: string,
azureConnStr: string,
},
}>({
formats: ['csv'],
destination: 'local' as 'local' | 'cloud' | 'database',
@@ -653,6 +697,13 @@ export default function App() {
tablesToInclude: [],
columnsFilter: '',
increments: [],
cloudCreds: {
gcsJson: '',
s3AccessKey: '',
s3SecretKey: '',
s3Region: 'us-east-1',
azureConnStr: '',
},
});
const [runLogs, setRunLogs] = useState('');
const [isRunning, setIsRunning] = useState(false);
@@ -688,6 +739,7 @@ export default function App() {
tables: runConfig.tablesToInclude.length > 0 ? runConfig.tablesToInclude : undefined,
columns: runConfig.columnsFilter.trim() ? runConfig.columnsFilter.trim().split('\n').filter(Boolean) : undefined,
increments: runConfig.increments.filter(i => i.table && i.column && i.step !== ''),
cloudCreds: runConfig.destination === 'cloud' ? runConfig.cloudCreds : undefined,
})
});
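
The request body above sends the whole `cloudCreds` object whenever the destination is `cloud`, which includes fields for providers that are not selected. A possible refinement, shown only as a sketch, is to forward just the fields for the active provider. `credsForProvider` is a hypothetical helper, and whether the backend accepts a partial `cloudCreds` object is an assumption this diff does not confirm.

```typescript
type Provider = 'gcs' | 's3' | 'azure';

interface CloudCreds {
  gcsJson: string;
  s3AccessKey: string;
  s3SecretKey: string;
  s3Region: string;
  azureConnStr: string;
}

// Hypothetical helper: keep only the credential fields that belong to the
// selected provider, so unrelated secrets are not included in the request.
function credsForProvider(provider: Provider, creds: CloudCreds): Partial<CloudCreds> {
  switch (provider) {
    case 'gcs':
      return { gcsJson: creds.gcsJson };
    case 's3':
      return {
        s3AccessKey: creds.s3AccessKey,
        s3SecretKey: creds.s3SecretKey,
        s3Region: creds.s3Region,
      };
    case 'azure':
      return { azureConnStr: creds.azureConnStr };
  }
}
```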

@@ -1297,7 +1349,7 @@ export default function App() {
<label style={{ display: 'block', marginBottom: '0.5rem', color: '#cbd5e1', fontSize: '0.85rem' }}>Output Directory</label>
<div style={{ display: 'flex', gap: '0.5rem' }}>
<input type="text" value={runConfig.outputDir} onChange={e => setRunConfig(r => ({...r, outputDir: e.target.value}))} style={{ flex: 1, padding: '0.5rem' }} placeholder="e.g. output" />
<button
{canBrowseFolder && <button
onClick={async (e) => {
const btn = e.currentTarget;
if (btn.disabled) return;
@@ -1316,14 +1368,37 @@ export default function App() {
style={{ padding: '0.5rem 0.75rem', borderRadius: '6px', border: '1px solid rgba(255,255,255,0.15)', background: 'rgba(255,255,255,0.07)', color: '#94a3b8', cursor: 'pointer', fontSize: '1rem', whiteSpace: 'nowrap' }}
>
📁
</button>
</button>}
</div>
</div>
)}

{/* Cloud */}
{runConfig.destination === 'cloud' && (
<div style={{ display: 'flex', flexDirection: 'column', gap: '0.75rem' }}>

{/* Saved credential profiles */}
{credProfiles.filter(p => p.provider === runConfig.uploadTarget).length > 0 && (
<div>
<label style={{ display: 'block', marginBottom: '0.4rem', color: '#cbd5e1', fontSize: '0.85rem' }}>Saved Credentials</label>
<div style={{ display: 'flex', flexDirection: 'column', gap: '0.3rem' }}>
{credProfiles.filter(p => p.provider === runConfig.uploadTarget).map(p => (
<div key={p.name} style={{ display: 'flex', alignItems: 'center', gap: '0.5rem', background: 'rgba(255,255,255,0.04)', borderRadius: '6px', padding: '0.4rem 0.6rem', border: '1px solid rgba(255,255,255,0.08)' }}>
<button type="button" onClick={() => handleLoadCredProfile(p.name)}
style={{ flex: 1, background: 'none', border: 'none', color: '#e2e8f0', fontSize: '0.82rem', cursor: 'pointer', textAlign: 'left', padding: 0 }}>
{p.name}
<span style={{ marginLeft: '0.5rem', color: '#475569', fontSize: '0.72rem' }}>{p.provider.toUpperCase()}</span>
</button>
<button type="button" onClick={() => handleDeleteCredProfile(p.name)}
style={{ background: 'none', border: 'none', color: '#475569', cursor: 'pointer', fontSize: '0.85rem', padding: '0 0.2rem' }}
title="Remove">✕</button>
</div>
))}
</div>
</div>
)}

{/* Provider */}
<div>
<label style={{ display: 'block', marginBottom: '0.5rem', color: '#cbd5e1', fontSize: '0.85rem' }}>Provider</label>
<select value={runConfig.uploadTarget} onChange={e => setRunConfig(r => ({...r, uploadTarget: e.target.value}))} style={{ width: '100%', padding: '0.5rem', background: 'rgba(255,255,255,0.05)', color: 'white', border: '1px solid rgba(255,255,255,0.1)', borderRadius: '6px' }}>
@@ -1332,6 +1407,8 @@ export default function App() {
<option value="azure" style={{color: 'black'}}>Azure Blob Storage</option>
</select>
</div>

{/* Bucket + Prefix */}
<div style={{ display: 'flex', gap: '1rem' }}>
<div style={{ flex: 1 }}>
<label style={{ display: 'block', marginBottom: '0.5rem', color: '#cbd5e1', fontSize: '0.85rem' }}>Bucket / Container</label>
@@ -1342,9 +1419,76 @@ export default function App() {
<input type="text" value={runConfig.prefix} onChange={e => setRunConfig(r => ({...r, prefix: e.target.value}))} style={{ width: '100%', padding: '0.5rem' }} placeholder="e.g. datasets/" />
</div>
</div>
<p style={{ margin: 0, fontSize: '0.75rem', color: '#475569', display: 'flex', alignItems: 'center', gap: '0.4rem' }}>
<span style={{ color: '#10b981' }}>✓</span> Credentials auto-loaded from <code style={{ color: '#94a3b8' }}>credentials/</code>
</p>

{/* Credentials — per provider */}
<div style={{ borderTop: '1px solid rgba(255,255,255,0.06)', paddingTop: '0.75rem' }}>
<div style={{ display: 'flex', justifyContent: 'space-between', alignItems: 'center', marginBottom: '0.6rem' }}>
<p style={{ margin: 0, fontSize: '0.72rem', textTransform: 'uppercase', letterSpacing: '0.06em', color: '#475569' }}>Credentials</p>
{showSaveCredInput ? (
<div style={{ display: 'flex', gap: '0.4rem', alignItems: 'center' }}>
<input
type="text"
value={saveCredName}
onChange={e => setSaveCredName(e.target.value)}
onKeyDown={e => { if (e.key === 'Enter') handleSaveCredProfile(); if (e.key === 'Escape') setShowSaveCredInput(false); }}
placeholder="Profile name..."
autoFocus
style={{ padding: '0.25rem 0.5rem', fontSize: '0.78rem', background: 'rgba(255,255,255,0.07)', border: '1px solid rgba(255,255,255,0.15)', borderRadius: '5px', color: 'white', width: '130px' }}
/>
<button type="button" onClick={handleSaveCredProfile}
style={{ padding: '0.25rem 0.5rem', fontSize: '0.75rem', background: 'rgba(16,185,129,0.15)', border: '1px solid rgba(16,185,129,0.3)', borderRadius: '5px', color: '#10b981', cursor: 'pointer' }}>
Save
</button>
<button type="button" onClick={() => setShowSaveCredInput(false)}
style={{ background: 'none', border: 'none', color: '#475569', cursor: 'pointer', fontSize: '0.85rem' }}>✕</button>
</div>
) : (
<button type="button" onClick={() => setShowSaveCredInput(true)}
style={{ background: 'none', border: 'none', color: '#64748b', fontSize: '0.75rem', cursor: 'pointer', textDecoration: 'underline', padding: 0 }}>
Save credentials
</button>
)}
</div>

{runConfig.uploadTarget === 'gcs' && (
<div>
<label style={{ display: 'block', marginBottom: '0.5rem', color: '#cbd5e1', fontSize: '0.85rem' }}>Service Account JSON</label>
<textarea
value={runConfig.cloudCreds.gcsJson}
onChange={e => setRunConfig(r => ({...r, cloudCreds: {...r.cloudCreds, gcsJson: e.target.value}}))}
rows={5}
placeholder={'{\n "type": "service_account",\n "project_id": "...",\n ...\n}'}
style={{ width: '100%', padding: '0.5rem', background: 'rgba(255,255,255,0.04)', border: '1px solid rgba(255,255,255,0.1)', borderRadius: '6px', color: 'white', resize: 'vertical', fontSize: '0.78rem', fontFamily: 'monospace', boxSizing: 'border-box' }}
/>
</div>
)}

{runConfig.uploadTarget === 's3' && (
<div style={{ display: 'flex', flexDirection: 'column', gap: '0.6rem' }}>
<div style={{ display: 'flex', gap: '0.75rem' }}>
<div style={{ flex: 1 }}>
<label style={{ display: 'block', marginBottom: '0.4rem', color: '#cbd5e1', fontSize: '0.85rem' }}>Access Key ID</label>
<input type="text" value={runConfig.cloudCreds.s3AccessKey} onChange={e => setRunConfig(r => ({...r, cloudCreds: {...r.cloudCreds, s3AccessKey: e.target.value}}))} style={{ width: '100%', padding: '0.5rem', boxSizing: 'border-box' }} placeholder="AKIA..." />
</div>
<div style={{ flex: 1 }}>
<label style={{ display: 'block', marginBottom: '0.4rem', color: '#cbd5e1', fontSize: '0.85rem' }}>Secret Access Key</label>
<input type="password" value={runConfig.cloudCreds.s3SecretKey} onChange={e => setRunConfig(r => ({...r, cloudCreds: {...r.cloudCreds, s3SecretKey: e.target.value}}))} style={{ width: '100%', padding: '0.5rem', boxSizing: 'border-box' }} placeholder="••••••••" />
</div>
</div>
<div style={{ flex: 1 }}>
<label style={{ display: 'block', marginBottom: '0.4rem', color: '#cbd5e1', fontSize: '0.85rem' }}>Region</label>
<input type="text" value={runConfig.cloudCreds.s3Region} onChange={e => setRunConfig(r => ({...r, cloudCreds: {...r.cloudCreds, s3Region: e.target.value}}))} style={{ width: '100%', padding: '0.5rem', boxSizing: 'border-box' }} placeholder="us-east-1" />
</div>
</div>
)}

{runConfig.uploadTarget === 'azure' && (
<div>
<label style={{ display: 'block', marginBottom: '0.5rem', color: '#cbd5e1', fontSize: '0.85rem' }}>Connection String</label>
<input type="password" value={runConfig.cloudCreds.azureConnStr} onChange={e => setRunConfig(r => ({...r, cloudCreds: {...r.cloudCreds, azureConnStr: e.target.value}}))} style={{ width: '100%', padding: '0.5rem', boxSizing: 'border-box' }} placeholder="DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...;EndpointSuffix=core.windows.net" />
</div>
)}
</div>
</div>
)}

@@ -1690,9 +1834,24 @@ export default function App() {
}
</div>

{(() => {
const validationError =
runConfig.destination === 'cloud' && !runConfig.bucket.trim()
? 'Bucket / Container is required for cloud upload.'
: null;
return validationError ? (
<p style={{ margin: '0 0 0.5rem', fontSize: '0.78rem', color: '#f87171', display: 'flex', alignItems: 'center', gap: '0.4rem' }}>
⚠ {validationError}
</p>
) : null;
})()}

{(() => {
const disabled = isRunning || (runConfig.destination === 'cloud' && !runConfig.bucket.trim());
return (
<div style={{ display: 'flex', gap: '0.6rem' }}>
<button className="btn-primary" onClick={handleRunCli} disabled={isRunning}
style={{ flex: 1, padding: '0.75rem', background: '#10b981', borderColor: '#10b981', fontSize: '1rem', opacity: isRunning ? 0.5 : 1, cursor: isRunning ? 'not-allowed' : 'pointer' }}>
<button className="btn-primary" onClick={handleRunCli} disabled={disabled}
style={{ flex: 1, padding: '0.75rem', background: '#10b981', borderColor: '#10b981', fontSize: '1rem', opacity: disabled ? 0.5 : 1, cursor: disabled ? 'not-allowed' : 'pointer' }}>
{isRunning ? 'Running…' : 'Execute Dataforge CLI'}
</button>
{isRunning && (
@@ -1702,6 +1861,8 @@ export default function App() {
</button>
)}
</div>
);
})()}
</div>
</div>
</div>
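
The hunk above evaluates the same condition (`destination === 'cloud'` with an empty bucket) in two separate inline functions, once for the warning text and once for the button's `disabled` state. One way to keep the two in sync, sketched here with hypothetical helper names, is to derive both from a single pure function; the error message is the one used in the diff.

```typescript
// Minimal slice of runConfig needed for validation (field names from App.tsx).
interface RunTarget {
  destination: 'local' | 'cloud' | 'database';
  bucket: string;
}

// Hypothetical helper: returns the message to display, or null when valid.
function getValidationError(cfg: RunTarget): string | null {
  if (cfg.destination === 'cloud' && !cfg.bucket.trim()) {
    return 'Bucket / Container is required for cloud upload.';
  }
  return null;
}

// Hypothetical helper: the run button is disabled while a run is in progress
// or while the configuration is invalid.
function isRunDisabled(cfg: RunTarget, isRunning: boolean): boolean {
  return isRunning || getValidationError(cfg) !== null;
}
```

With this shape, a later rule (for example, requiring credentials) would only need to be added in one place.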