WikiGen
Enterprise Onboarding
— Confidential Research Infrastructure · v1.2

Collaborate on data you can't see.

WikiGen is a hardware-confidential interface layer over sensitive corpora — research data, clinical cohorts, proprietary pipelines. Sources stay where they live. Compute runs inside NVIDIA Confidential Compute enclaves where nobody, including our own operators, can read the raw values. Collaboration without disclosure.

Fig. 01  ·  Generative wrapper rendering in real time.
— The Principles

Three guarantees that shape every tenant.

Privacy-first research collaboration demands more than NDAs. WikiGen is built around three hardware- and economics-level non-negotiables — the constraints we deploy against, audit against, and contract against.

01

Confidential by hardware.

Hardware-level encryption keeps sensitive research data protected during processing, not just at rest. Workloads execute on NVIDIA Confidential Compute GPUs, with remote attestation on every run. Even WikiGen's own operators cannot read the values flowing through your tenant.

02

Fair to contributors.

Data contributors are rewarded when their data is used. Every query is traceable to the corpus it drew from; every payout is auditable; every derivative model carries royalty-bearing lineage. No zero-sum grabs on someone else's dataset.

03

Accessible to science.

WikiGen breaks down the institutional walls between researchers and the datasets they need to accelerate discovery. Infrastructure aligned with SOC 2 Type II, HIPAA, and GDPR removes the legal friction that typically keeps clinical cohorts and proprietary corpora locked inside a single lab.

— The Product

Three surfaces, one enclave.

Your users never touch the enclave directly. They touch one of three purpose-built surfaces on top of it — each scoped, each attested, each only visible to the roles you assign.

Fig. 02  ·  For your team
┌─ RESEARCH PORTAL ──────────┐
│                            │
│ Corpora                    │
│ ├─ UK Biobank              │
│ ├─ MIMIC-IV                │
│ └─ + 3 more                │
│                            │
│ Workspace                  │
│ $ query · attested · ok    │
│                            │
└────────────────────────────┘

Research Portal

Where your scientists query, annotate, and cite. Wrap your existing datasets without migration; sources stay where they live. Every query returns with attestation metadata — who ran it, where, against which corpus. Collaboration without disclosure, inside your institution.
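
As a rough illustration of the attestation metadata described above, the sketch below models a per-query receipt that records who ran the query, where it ran, and against which corpus. The class and field names (`QueryReceipt`, `principal`, `enclave_id`, `corpus`, `attested`) are assumptions made for this example and are not WikiGen's actual schema.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class QueryReceipt:
    """Hypothetical attestation metadata attached to one query result."""
    principal: str   # who ran the query
    enclave_id: str  # where it ran (assumed enclave identifier)
    corpus: str      # which corpus it ran against
    attested: bool   # whether remote attestation succeeded for this run
    issued_at: str   # ISO 8601 timestamp in UTC


def make_receipt(principal: str, enclave_id: str, corpus: str, attested: bool) -> QueryReceipt:
    """Build an immutable receipt stamped with the current UTC time."""
    issued_at = datetime.now(timezone.utc).isoformat()
    return QueryReceipt(principal, enclave_id, corpus, attested, issued_at)


receipt = make_receipt("alice@lab.example", "enclave-01", "MIMIC-IV", True)
print(asdict(receipt))
```

Freezing the dataclass is a deliberate choice in this sketch: a receipt that cannot be mutated after creation is easier to treat as an audit record.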

Fig. 03  ·  For contributors
┌─ DATASET MARKETPLACE ──────┐
│                            │
│ [corpus] [corpus] [corpus] │
│                            │
│ + 1,248 queries · 30d      │
│ + royalty · 0.04¢ / query  │
│                            │
│ [list a corpus]          > │
│                            │
└────────────────────────────┘

Dataset Marketplace

Where data stewards list corpora for royalty-bearing query access. Contributors earn a royalty on queries against their corpus. Every downstream run is traceable to the corpus it drew from; every payout is auditable.
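
To make the royalty arithmetic concrete, here is a small worked example using the illustrative numbers from the Fig. 03 mock-up (1,248 queries over 30 days at 0.04¢ per query). It assumes a flat per-query rate, which the page does not confirm; the function and constant names are hypothetical.

```python
from decimal import Decimal

# Illustrative values from the Fig. 03 mock-up, not a published rate card.
QUERIES_30D = 1248
ROYALTY_CENTS_PER_QUERY = Decimal("0.04")


def royalty_cents(queries: int, rate_cents: Decimal) -> Decimal:
    """Total royalty in cents, assuming a flat per-query rate."""
    return queries * rate_cents


total_cents = royalty_cents(QUERIES_30D, ROYALTY_CENTS_PER_QUERY)
total_dollars = total_cents / Decimal(100)
print(f"{total_cents} cents = ${total_dollars}")  # 49.92 cents = $0.4992
```

`Decimal` is used instead of floats so that fractional-cent rates add up exactly, which matters if payouts are meant to be auditable.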

Fig. 04  ·  For quants
┌─ ANALYSIS ENVIRONMENT ─────┐
│                            │
│ # jupyter · attested       │
│ > from wikigen import run  │
│ > run(corpus='biobank')    │
│                            │
│ -> enclave · verified      │
│    no plaintext egress     │
│                            │
└────────────────────────────┘

Analysis Environment

Where quants and research engineers write their pipelines. Jupyter-native, persistent compute inside NVIDIA Confidential Compute containers. Write once, run against any authorized dataset, whether your own corpora or the Marketplace's, without the data ever leaving the enclave in plaintext.

— Where We Are

The build, in public.

A platform that asks institutions to hand over access to sensitive data earns trust by showing its work. Here's the sequence.

  1. Q4 2025  ·  shipped

    Technology evaluation complete.

    Privacy technologies evaluated: NVIDIA Confidential Compute, attestation chains, and cryptographic kill-switches. Hardware reference architecture locked.

  2. Q1 2026  ·  shipped

    Early partnerships signed.

    Inaugural cohort of dataset partners onboarded under LOI: three biotechs, one health system, two university labs. Royalty mechanics and access-review workflows piloted against live data.

  3. Q2 2026  ·  current phase

    Alpha tenants & onboarding portal.

    The portal you're reading this page on. Tenant provisioning inside confidential-compute pools, first-wave admin seats, scope-aware OAuth handoff, signed access agreements. Closed alpha.

  4. Q3 – Q4 2026

    Beta launch.

    Open onboarding, public Research Portal release, Marketplace economics activated for the first cohort of dataset contributors, and an initial set of attested analysis pipelines available to all tenants.

— Call for Dataset Partners

Data that could unlock science, if it could be queried without being exposed.

We're talking to institutions sitting on datasets valuable enough that the institutional walls around them are blocking meaningful downstream work. If this is you, the onboarding below is also for you — the contribution path runs through the same tenant.

What we're looking for

  • Sensitive or regulated clinical cohorts and population-level data
  • Proprietary research corpora held under institutional NDAs
  • Longitudinal datasets with existing access friction
  • Specialized scientific corpora with active downstream demand

Benefits for early partners

  • Early access to the platform & shaping input on Marketplace economics
  • Preferred royalty terms on downstream derivative models
  • Dedicated engineering contact & white-glove tenant provisioning
  • Fee-free tenancy through the beta window

Begin onboarding → Same form, whether you're bringing workloads or bringing data.

— The Onboarding

Four steps.
Seventy-two hours.

The longest step is reading the access agreement — everything else is a form field, a checkbox, or a signature. Once submitted, our team handles OAuth handoff, confidential-compute tenant provisioning, and wrapper generation on our side.

  1. 01

    Create an enterprise account

    One admin credential per organization. You'll invite teammates later from the dashboard, with granular role-based scopes per seat.

  2. 02

    Authorize data sources

    Point us at repositories, document stores, or inventory systems. Read-only by default, scoped per source, revocable at any moment.

  3. 03

    Sign the access agreement

    Legally binding T&Cs covering data handling, retention windows, deletion rights, and our SOC 2 Type II security posture.

  4. 04

    Receive your wrapper

    Within 72 hours, a scoped wrapper generated for your stack and workflows, with a white-glove handoff from a named engineer.

A note on credentials. This portal captures an admin identity for your WikiGen tenant. For source access (GitHub repos, Drive folders, inventory APIs), we use OAuth delegation rather than stored passwords — you'll be redirected to each provider after signup. No long-lived credentials are ever at rest on our infrastructure.
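
For readers unfamiliar with OAuth delegation, the sketch below builds the kind of authorization-code request a user would be redirected to. The endpoint, client ID, redirect URI, and the `read` scope name are all hypothetical placeholders; each real provider publishes its own endpoint and scope names, and this is not WikiGen's actual integration code.

```python
import secrets
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoint: a real provider documents its own authorize URL.
AUTHORIZE_ENDPOINT = "https://provider.example/oauth/authorize"


def build_authorize_url(client_id: str, redirect_uri: str, scopes: list[str]) -> tuple[str, str]:
    """Return (url, state) for an OAuth 2.0 authorization-code request."""
    state = secrets.token_urlsafe(16)  # anti-CSRF value to verify on the callback
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),  # scopes are space-delimited in OAuth 2.0
        "state": state,
    }
    return f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}", state


url, state = build_authorize_url("wikigen-demo", "https://tenant.example/callback", ["read"])
query = parse_qs(urlparse(url).query)
print(query["scope"])  # ['read']
```

The point of the flow is that the user's password is only ever entered at the provider; the requesting service receives a scoped, revocable grant instead.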

Create your admin account

This is the owner seat for your organization's WikiGen tenant. You can transfer it later.

Password must be at least 10 characters with a number and symbol.
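
The password rule above can be sketched as a small check. This is an illustrative reading of the rule, not the portal's actual validator: it assumes "symbol" means ASCII punctuation, and the function name `password_problems` is hypothetical.

```python
import string


def password_problems(password: str, confirmation: str) -> list[str]:
    """Return a list of rule violations; an empty list means the password passes."""
    problems = []
    if len(password) < 10:
        problems.append("too short")
    if not any(c.isdigit() for c in password):
        problems.append("no number")
    # Assumption: a "symbol" is any ASCII punctuation character.
    if not any(c in string.punctuation for c in password):
        problems.append("no symbol")
    if password != confirmation:
        problems.append("mismatch")
    return problems


print(password_problems("correct-horse-9", "correct-horse-9"))  # []
print(password_problems("short1!", "short1!"))                  # ['too short']
```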

Authorize your data sources

Select which systems WikiGen may read from. Each source will prompt for OAuth consent after signup — we never store long-lived credentials.

We'll pre-configure your tenant to watch this repo. More can be added later.

Access & service agreement

Please review. This agreement governs WikiGen's access to the data sources you authorized in step 2.

1. Grant of Access

Customer grants WikiGen (the "Service Provider") a limited, non-exclusive, revocable right to read from the data sources explicitly authorized in the preceding step, solely for the purpose of generating the contracted wrapper interface. No write access is granted unless separately authorized.

2. Data Handling & Processing

All data accessed remains the property of Customer. WikiGen processes data in accordance with its published Data Processing Addendum, including encryption in transit (TLS 1.3) and at rest (AES-256). Data is processed in regions consistent with Customer's residency requirements where applicable.

3. Retention & Deletion

Customer may revoke access at any time via the admin dashboard or by written notice. Upon revocation, WikiGen will purge derived indexes within thirty (30) calendar days and provide written confirmation. Raw source data is never persisted beyond the active processing window except for embeddings used in the wrapper itself.
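
As a worked example of the thirty-day purge window, the sketch below computes the latest purge date from a revocation date. It assumes the window is counted in calendar days starting the day after revocation; the agreement text does not spell out how the first day is counted, and the function name is hypothetical.

```python
from datetime import date, timedelta

PURGE_WINDOW_DAYS = 30  # thirty calendar days, per the clause above


def purge_deadline(revoked_on: date) -> date:
    """Latest date by which derived indexes should be purged (assumed counting rule)."""
    return revoked_on + timedelta(days=PURGE_WINDOW_DAYS)


print(purge_deadline(date(2026, 6, 15)))  # 2026-07-15
```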

4. Confidentiality

Each party shall treat the other's Confidential Information with the same degree of care it uses for its own, and not less than reasonable care. WikiGen personnel are bound by individual NDAs and access-tier controls.

5. Security Posture

WikiGen maintains a current SOC 2 Type II attestation report (audited annually), conducts quarterly penetration tests by an independent third party, and maintains a documented incident response program with 24-hour initial disclosure to affected Customers.

6. Limitation of Liability

Each party's aggregate liability arising out of or related to this Agreement shall not exceed the fees paid by Customer in the twelve (12) months preceding the claim. Neither party shall be liable for indirect, consequential, or punitive damages.

7. Term & Termination

This Agreement commences on the date of electronic signature below and continues until terminated by either party with thirty (30) days' written notice, or immediately for material breach uncured within fifteen (15) days of notice.

8. Governing Law

This Agreement shall be governed by the laws of the State of Delaware, USA, without regard to conflicts of law principles. Any disputes shall be resolved in the state or federal courts located in Wilmington, Delaware.

9. Entire Agreement

This document, together with the Data Processing Addendum and any Order Form, constitutes the entire agreement between the parties and supersedes all prior proposals or communications.

By typing your name, you electronically sign this agreement with the same legal effect as a handwritten signature (E-SIGN Act, 2000).

You're in.

We've received your authorization. Our team will reach out within 24 hours with OAuth handoff links for each source and a calendar for your wrapper generation.

Organization
Admin email
Sources
Signed by
Agreement ID
Timestamp