Per-Tenant Cryptographic Isolation
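Most multi-tenant graph databases isolate tenants with namespace filtering or separate databases that share one encryption key. xrayGraphDB gives each tenant its own encryption key, derived from the storage epoch, and encrypts that tenant's segments with AES-256-GCM. This is cryptographic isolation, not access-control theater.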
          Tenant A                     Tenant B
              |                            |
           [Key A]                      [Key B]
        derived from                 derived from
        storage epoch                storage epoch
              |                            |
              v                            v
    +------------------+         +------------------+
    | Encrypted        |         | Encrypted        |
    | Segment A        |         | Segment B        |
    | AES-256-GCM      |         | AES-256-GCM      |
    +------------------+         +------------------+
              |                            |
              +--- Volume Epoch Binding ---+
           prevents cloning across deployments
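As a rough illustration of how a per-tenant key could be bound to a storage epoch, the sketch below derives keys with HKDF (RFC 5869) over HMAC-SHA256 from Python's standard library. This page does not describe the actual xrayGraphDB derivation scheme. The function names, the `master_secret` input, the epoch values, and the `tenant-key:` label are all assumptions made for illustration.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA256: extract, then expand."""
    # Extract: condense the input keying material into a pseudorandom key.
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output bytes are available.
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_tenant_key(master_secret: bytes, storage_epoch: bytes, tenant_id: str) -> bytes:
    """Hypothetical derivation of a 32-byte (AES-256 sized) key per tenant."""
    # Assumption: the epoch is used as the HKDF salt, the tenant id as context.
    info = b"tenant-key:" + tenant_id.encode("utf-8")
    return hkdf_sha256(master_secret, salt=storage_epoch, info=info)


master = b"\x01" * 32                    # placeholder secret, illustration only
epoch_original = b"epoch-original"       # hypothetical epoch of the source volume
epoch_clone = b"epoch-clone"             # hypothetical epoch of a cloned volume

key_a = derive_tenant_key(master, epoch_original, "tenant-a")
key_b = derive_tenant_key(master, epoch_original, "tenant-b")
key_a_clone = derive_tenant_key(master, epoch_clone, "tenant-a")
```

In this sketch, two tenants on the same volume get unrelated keys, and the same tenant under a different epoch also gets a different key, which is the property that would make a cloned volume's ciphertext unreadable.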
Military-Certifiable 5-Layer Defense
Not a single layer of security bolted on after the fact. Five interlocking defenses designed together, each one closing a gap the others cannot.
What This Means in Practice
- A database admin with root access to Tenant A cannot decrypt Tenant B's data — even with a disk image
- Cloning a volume to another machine yields unreadable ciphertext because the epoch changes
- Key rotation per tenant without downtime or re-encryption of the entire store
- Meets FIPS 140-2, SOC 2 Type II, and HIPAA encryption requirements at the storage layer
How Competitors Handle Multi-Tenancy
- Neo4j: Separate database per tenant. No per-tenant encryption. Shared keyspace.
- Memgraph: No multi-tenant support at all. Single-database only.
- KuzuDB: Embedded only. No encryption. No multi-tenancy.
- GalaxyDB: Namespace filtering. Shared encryption key. One key = one breach.
Ground-Up Rewrite
Not a fork with patches. A complete rewrite of the execution engine, planner, storage layer, and wire protocol. Every line is original eMTAi code.
    Query Text
        |
    AST Fingerprint
        |
    Plan Cache ------> HIT? 0.2ms return
        |
        v (MISS)
    Planner
        |
    Vectorized Executor
    +----------------------------+
    | DataChunk [1024 tuples]    |
    | Column 0: node_ids int64   |
    | Column 1: names string     |
    | Column 2: scores f64       |
    +----------------------------+
        |
    xrayProtocol (columnar, LZ4)
        |
    Client (24x faster than Bolt)
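The fingerprint, cache lookup, and planner fallback in the diagram above can be sketched in a few lines. This is a simplified stand-in, not the engine's implementation: instead of walking a real AST it normalizes literals in the query text with a regular expression, and `fingerprint`, `PlanCache`, and `toy_planner` are hypothetical names.

```python
import hashlib
import re

# Replace string and numeric literals with "?" so queries that differ only in
# constants share one fingerprint. A real engine would hash the parsed AST.
_LITERAL = re.compile(r"'[^']*'|\"[^\"]*\"|\b\d+(?:\.\d+)?\b")


def fingerprint(query: str) -> str:
    normalized = _LITERAL.sub("?", query)
    normalized = " ".join(normalized.split())  # collapse whitespace
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class PlanCache:
    def __init__(self):
        self._plans = {}
        self.hits = 0
        self.misses = 0

    def get_plan(self, query, planner):
        key = fingerprint(query)
        plan = self._plans.get(key)
        if plan is None:
            self.misses += 1          # MISS: run the planner and remember the plan
            plan = planner(query)
            self._plans[key] = plan
        else:
            self.hits += 1            # HIT: skip planning entirely
        return plan

    def invalidate(self):
        """Drop every cached plan, for example after a schema change."""
        self._plans.clear()


def toy_planner(query):
    # Stand-in for a real planner; returns a fresh object on each call.
    return {"plan_for": fingerprint(query)}


cache = PlanCache()
p1 = cache.get_plan('MATCH (f:Function {name: "main"}) RETURN f', toy_planner)
p2 = cache.get_plan('MATCH (f:Function {name: "render"})  RETURN f', toy_planner)
```

Both queries normalize to the same text, so the second call reuses the plan produced by the first.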
Six Pillars of Performance
- 01 DataChunk Pipeline: Operators process batches of 1,024 tuples at a time, maximizing CPU cache utilization and SIMD opportunities. Column-oriented, not row-at-a-time.
- 02 xrayProtocol (Columnar Wire Format): Results stream column-by-column with LZ4 compression. 24x throughput versus row-based Bolt. Zero serialization overhead on the hot path.
- 03 Sorted Flat-Vector PropertyStore: Properties stored in contiguous, sorted flat vectors. Binary search lookup with O(log n) access. Zero pointer chasing, zero heap fragmentation.
- 04 Lock-Free Adjacency Lists: Segmented atomic fetch_add design for concurrent writes without reader locks. Writers never block readers. MVCC-compatible.
- 05 PMR Arena Allocators: Per-query memory arenas with deterministic cleanup. No garbage collector pauses, no fragmentation, no leaks. Allocation cost approaches zero.
- 06 Plan Cache with AST Fingerprinting: 425x speedup on repeated queries. Automatic invalidation on schema changes. Parameterized queries hit cache on first execution.
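The sorted flat-vector layout from pillar 03 can be illustrated with two parallel lists kept sorted by property id and searched with `bisect`. This is only a sketch of the lookup logic: Python lists are not contiguous typed memory the way a native vector is, and `FlatPropertyStore` and the integer property ids are hypothetical.

```python
from bisect import bisect_left


class FlatPropertyStore:
    """Properties of one node kept as two parallel lists sorted by property id."""

    def __init__(self):
        self._keys = []    # sorted property ids
        self._values = []  # value stored at the same index as its key

    def set(self, prop_id: int, value) -> None:
        i = bisect_left(self._keys, prop_id)
        if i < len(self._keys) and self._keys[i] == prop_id:
            self._values[i] = value            # overwrite an existing property
        else:
            self._keys.insert(i, prop_id)      # insert while keeping keys sorted
            self._values.insert(i, value)

    def get(self, prop_id: int, default=None):
        i = bisect_left(self._keys, prop_id)   # O(log n) binary search
        if i < len(self._keys) and self._keys[i] == prop_id:
            return self._values[i]
        return default


store = FlatPropertyStore()
store.set(7, "parse")
store.set(2, 3.5)
store.set(5, True)
store.set(2, 4.0)   # updates property 2 in place
```

Lookups cost a binary search over the key list, and inserts pay a shift to keep the list sorted, which suits read-heavy property access.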
Cypher + GFQL + Neo4j Compatibility
Write Cypher as you know it. Use GFQL when dataframe-native syntax fits better. Neo4j-specific queries work automatically with zero changes.
Full Cypher with Neo4j Syntax Rewrites
xrayGraphDB automatically detects and rewrites Neo4j-specific syntax to standard Cypher, so applications migrating from Neo4j work without code changes.
- CREATE INDEX: Neo4j's CREATE INDEX FOR syntax auto-detected
- SHOW PROCEDURES: returns xrayGraphDB procedures in Neo4j-compatible format
- shortestPath(): native BFS with bitset-based visited tracking
- Bolt v5: full protocol compatibility with Neo4j 5.x drivers
    // Works identically to Neo4j
    CREATE INDEX function_name_idx FOR (n:Function) ON (n.name);

    // Neo4j-compatible procedure listing
    SHOW PROCEDURES;

    // Native BFS shortest path
    MATCH p = shortestPath(
      (a:Function {name: "main"})
      -[:CALLS*..10]->
      (b:Function {name: "render"})
    )
    RETURN p;
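To make "BFS with bitset-based visited tracking" concrete, here is a minimal sketch of a hop-limited breadth-first search that keeps one visited bit per node in a `bytearray`. It assumes nodes are numbered 0..n-1 and edges are given as adjacency lists. The function name and the `max_hops` parameter are illustrative and do not reflect the engine's internals.

```python
from collections import deque


def shortest_path(adjacency, source, target, max_hops=10):
    """Unweighted shortest path by BFS; returns a list of node ids or None."""
    n = len(adjacency)
    visited = bytearray((n + 7) // 8)   # bitset: one bit per node
    parent = [-1] * n                   # predecessor on the BFS tree

    def mark(v):
        visited[v >> 3] |= 1 << (v & 7)

    def seen(v):
        return visited[v >> 3] & (1 << (v & 7))

    mark(source)
    frontier = deque([(source, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if node == target:
            # Walk the parent links back to the source, then reverse.
            path = []
            while node != -1:
                path.append(node)
                node = parent[node]
            return path[::-1]
        if depth == max_hops:
            continue                    # do not expand beyond the hop limit
        for nxt in adjacency[node]:
            if not seen(nxt):
                mark(nxt)
                parent[nxt] = node
                frontier.append((nxt, depth + 1))
    return None


# Toy call graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3, 3 -> 4
calls = [[1, 2], [3], [3], [4], []]
path = shortest_path(calls, 0, 4)
```

The bitset costs one bit per node instead of a hash-set entry, which keeps the visited structure small and cache-friendly on large graphs.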
    // GFQL: Graph Frame Query Language
    // Dataframe-native graph queries
    SET GFQL_CONTEXT tenant='acme-corp';

    FROM nodes(label='Function')
      .filter(complexity > 10)
      .hop(edge_type='CALLS', depth=3)
      .groupby('module')
      .agg(count=count(), avg_cx=avg('complexity'))
      .sort('avg_cx', desc=true)
      .limit(20);
GFQL as a First-Class Citizen
GFQL is not transpiled to Cypher. It has its own parser, planner, and executor pipeline that operates directly on DataChunks. Perfect for data scientists who think in dataframes.
- SET GFQL_CONTEXT — tenant-scoped sessions with automatic isolation
- hop() — multi-hop traversal with depth control and edge filtering
- groupby/agg — vectorized aggregation on graph result sets
- Composable — chain any number of operations in a single expression
9 Native Code Intelligence Procedures
Built-in graph procedures designed for codebase analysis. Not plugins — native C++ implementations that run inside the vectorized engine.
- xray.dead_code(): find unreachable functions
- xray.complexity(): cyclomatic complexity per function
- xray.blast_radius(): impact analysis for changes
- xray.taint_trace(): track data flow through the graph
- xray.coverage_gaps(): untested code paths
- xray.dependency_cycles(): circular dependency detection
EMBED() for Vector Operations
Native vector embedding support directly in query expressions. Store, index, and query high-dimensional vectors without external plugins or separate systems.
    // Store embedding on a node
    MATCH (f:Function {name: "parse"})
    SET f.embedding = EMBED("function that parses input tokens");

    // Find semantically similar functions
    MATCH (f:Function)
    WHERE cosine_similarity(
      f.embedding,
      EMBED("parsing logic")
    ) > 0.85
    RETURN f.name, f.module;
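For readers unfamiliar with the metric used in the WHERE clause above, the sketch below computes cosine similarity in plain Python and applies the same 0.85 threshold. The three-dimensional vectors are made-up toy values, not real EMBED() output, and returning 0.0 for a zero vector is a convention chosen for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention used here for zero vectors
    return dot / (norm_a * norm_b)


# Toy embeddings keyed by function name (illustrative values only).
embeddings = {
    "parse": [0.9, 0.1, 0.0],
    "tokenize": [0.8, 0.3, 0.1],
    "render": [0.0, 0.2, 0.9],
}
query_vector = [1.0, 0.2, 0.0]

matches = [
    name
    for name, vector in embeddings.items()
    if cosine_similarity(vector, query_vector) > 0.85
]
```

Because cosine similarity ignores vector length and compares direction only, a fixed threshold such as 0.85 behaves the same regardless of embedding magnitude.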