# xrayGraphDB: Comprehensive AI Agent Guide

> Single-file reference for AI agents to read end-to-end and operate xrayGraphDB without scraping HTML.
> Patent pending. Property of eMTAi LLC.

---

## 1. Identity

xrayGraphDB is an AI-native graph database. While it speaks Bolt v5 for compatibility with existing tooling, its native transport is **xrayProtocol** (port 7689), a binary, columnar wire format. The execution engine, planner, storage layer, and protocol are all proprietary, original eMTAi engineering. Do not assume internal compatibility with any other graph database.

The native library is large: **346 builtin functions** and **90+ native procedures** (graph analytics, code intelligence, reachability, vector search, ML embeddings) all run inside the engine, with no external plugins and no external compute hops.

Key facts you must keep in mind:

- Product name is **xrayGraphDB**, never anything else
- Vendor is **eMTAi LLC**
- Major architectural components are **patent pending**
- **There is no default database.** Every connection MUST specify a database name in HELLO; the server rejects connections that omit it. The install ships with one bootstrap database named `xraygraphdb` so the first connection works, but you should create your own per-application databases (`CREATE DATABASE my_app`) and not treat `xraygraphdb` as a shared default
- Single-node is free; horizontal scale is licensed

---

## 2. Connection essentials

### Network

- **xrayProtocol**: TCP, port `7689`, binary
- **Bolt v5**: TCP, port `7687`, Neo4j driver compatible
- **Metrics / monitoring**: HTTP, port `7444`, Prometheus exposition format

### Authentication

- Username + password
- Token format inside HELLO is the literal byte sequence `user:password`
- TLS recommended for production. The community edition does not require it.

### HELLO message: mandatory database

Every connection MUST send a database name in HELLO. The server responds with ERROR (0x07) if you omit it.
There is **no implicit default** and no fallback. This is a deliberate guard rail to prevent misconfigured programs from polluting an unrelated database. The install creates one bootstrap database named `xraygraphdb` so first-time connections work, but you should create your own database per workload (`CREATE DATABASE app_xyz`) rather than treat the bootstrap as a shared default. Tenant isolation, encryption, and access control are scoped per database.

---

## 3. Recommended Python client (xgdb-connect)

Install:

```bash
pip install xgdb-connect --extra-index-url https://xraygraphdb.emtailabs.com/pypi/
```

Connect and execute:

```python
from xgdb_connect import XrayProtocolClient

client = XrayProtocolClient(
    host="localhost",
    port=7689,                # xrayProtocol port
    auth_token="admin:",      # "user:password"; the password is empty here
    database="xraygraphdb",   # mandatory: there is no default database
)
cols, rows = client.execute("MATCH (n) RETURN count(n) AS total")
print(rows[0][0])
client.close()
```

Parameterized:

```python
cols, rows = client.execute(
    "MATCH (p:Person) WHERE p.name = $name RETURN p LIMIT $k",
    params={"name": "Alice", "k": 10},
)
```

Bulk insert (use this for any ingest above ~1k rows):

```python
client.bulk_insert_nodes(
    label="Document",
    rows=[
        {"id": 1, "title": "Hello", "tags": ["a", "b"]},
        {"id": 2, "title": "World", "tags": ["c"]},
    ],
)
```

In a bulk-upsert call, the first property is the upsert key.

---

## 4. xrayProtocol wire format (only if you are writing a client)

Every message is **8 bytes of header followed by a payload**:

```
[u32 payload_length LE][u8 msg_type][u8 flags][u16 query_id LE]
```

### Client → Server

#### HELLO (0x01): first message after TCP connect

```
[u16 version=1][u16 capabilities]
[u32 token_len]["user:password" bytes]
[u32 db_len][db bytes]                   ← MANDATORY
```

Server reply: `HELLO_OK` (0x02) on success, `ERROR` (0x07) on failure.
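The header and HELLO layouts above can be sketched in a few lines of Python. This is a minimal sketch, not the reference client: the function names `frame` and `encode_hello` are hypothetical, and it assumes that the integers in the HELLO body are little-endian like the header fields, that `payload_length` counts only the payload (not the 8-byte header), and that `flags` and `query_id` are zero for HELLO.

```python
import struct

MSG_HELLO = 0x01  # message type code from the HELLO section


def frame(msg_type: int, payload: bytes, flags: int = 0, query_id: int = 0) -> bytes:
    """Prefix a payload with the 8-byte header:
    [u32 payload_length LE][u8 msg_type][u8 flags][u16 query_id LE].
    Assumption: payload_length excludes the header itself."""
    return struct.pack("<IBBH", len(payload), msg_type, flags, query_id) + payload


def encode_hello(user: str, password: str, database: str,
                 capabilities: int = 0, version: int = 1) -> bytes:
    """Build a HELLO message. Little-endian body fields are an assumption."""
    if not database:
        # The server rejects HELLO without a database, so fail early.
        raise ValueError("a database name is mandatory in HELLO")
    token = f"{user}:{password}".encode("utf-8")   # literal "user:password"
    db = database.encode("utf-8")
    payload = struct.pack("<HH", version, capabilities)
    payload += struct.pack("<I", len(token)) + token
    payload += struct.pack("<I", len(db)) + db
    return frame(MSG_HELLO, payload)


hello = encode_hello("admin", "", "xraygraphdb")
print(len(hello), hello[:8].hex())
```

For `admin` with an empty password and the bootstrap database, the payload is 4 + (4 + 6) + (4 + 11) = 29 bytes, so the full message is 37 bytes. Rejecting an empty database name on the client mirrors the server-side rule and gives a clearer failure than a round trip to an ERROR reply.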
#### EXECUTE (0x03)

```
[u8 language]                    0=Cypher, 1=GFQL
[u32 query_len][query bytes]     UTF-8 query text
[u32 param_count]                0 if no params
  for each: [u32 name_len][name][u8 type_tag][value]
[u32 options]                    bitmask: 1=PROFILE, 2=EXPLAIN, 4=READ_ONLY
```

### Server → Client

| Code | Name | Body |
|------|------|------|
| 0x02 | HELLO_OK | (empty) in the base layout; servers that support them also carry the `server_capabilities` echo and `HEARTBEAT_CONFIG` (see Capability negotiation and Liveness) |
| 0x04 | SCHEMA | `[u16 col_count]` then per column `[u8 type][u32 name_len][name]` |
| 0x05 | BATCH | `[u32 row_count]` then for each row, for each col: `[u8 type_tag][value]` |
| 0x06 | COMPLETE | `[u32 total_rows][u32 exec_time_us][u32 compile_time_us]` |
| 0x07 | ERROR | `[u32 error_code][u8 severity][u8 retryable][u32 msg_len][msg]` |

### Type tags (column / cell)

| Tag | Type | Encoding |
|----:|------|----------|
| 0x01 | NULL | (no body) |
| 0x02 | BOOL | u8 |
| 0x03 | INT64 | i64 little-endian |
| 0x04 | DOUBLE | f64 little-endian |
| 0x05 | STRING | u32 length + UTF-8 bytes |
| 0x06 | LIST | encoded element-by-element (typed list when negotiated) |
| 0x07 | MAP | key/value pairs |
| 0x08 | NODE | id + labels + properties |
| 0x09 | RELATIONSHIP | id + type + start/end + properties |
| 0x0A | PATH | alternating node/relationship sequence |

### Bulk operations (recommended for large ingest)

- `BULK_INSERT_BEGIN` (0x20)
- `BULK_INSERT_NODES` (0x21)
- `BULK_UPSERT_NODES` (0x27): first property is the upsert key
- `BULK_INSERT_EDGES_KEYED` (0x30): generalized bulk edge insert keyed by ANY property pair (added v5.0.0-alpha, audit #7759). Wire format: `String leftLabel + leftKeyProp + rightLabel + rightKeyProp + edgeType, u32 edge_count, u32 prop_count, [String] prop_names, then per-edge [String leftKeyValue, rightKeyValue, prop_values]`. Server-side semantics: `MATCH (l:leftLabel {leftKeyProp:lkv, tenantId:session, repoId:session}), (r:rightLabel {rightKeyProp:rkv, ...}) CREATE (l)-[:edgeType]->(r)`. Requires a property index on both endpoint sides; the server refuses with `BULK_INSERT_ERROR` if either is missing.
**~150× faster than Cypher `UNWIND $rows AS r MATCH (a:L1 {k:r.from}), (b:L2 {k:r.to}) CREATE (a)-[:R]->(b)` batches**: measured 66,000 edges/sec on a single node, vs. ~407 edges/sec via Cypher.
- `BULK_INSERT_COMMIT` (0x24)
- `BULK_INSERT_ACK` (0x25): server returns `[u32 nodes][u32 edges][u32 time_ms]`
- `BULK_INSERT_ERROR` (0x26): server returns `[u32 msg_len][msg bytes]` on any per-batch validation failure (missing property index, empty tenant binding, malformed wire payload). Throw on the client; surface the message verbatim to the operator.

### Capability negotiation

HELLO carries `u16 capabilities` (Capability bitmask). Clients SHOULD set:

- `CAP_TYPED_NESTED` (bit 10, 0x400): server emits typed-list / typed-map column shapes
- `CAP_BULK_EDGES_KEYED` (bit 9, 0x200): required to use `BULK_INSERT_EDGES_KEYED` (0x30). The server masks unsupported bits; check the `HELLO_OK.server_capabilities` echo before using 0x30. Pre-v5.0.0-alpha-2026-05-28 daemons advertise this only when patched (see runbook §1.4 / §8.13).

### Liveness: PING / PONG / HEARTBEAT_CONFIG (v5.0.0+)

xrayProtocol now ships a first-class liveness channel so a long-lived connection that goes idle does NOT get reaped by an intermediate NAT/firewall, and so a half-open peer is detected within seconds rather than at the next query attempt. **Implementing this is mandatory for any client that holds idle connections. Otherwise the server-side idle sweeper closes the socket and your next request fails with broken-pipe.**

#### PING (0x0A): Client → Server

```
[u64 LE timestamp_us]   client-monotonic; echoed back in PONG
```

- Payload is **exactly 8 bytes**. A truncated payload (1-7 bytes) is rejected with `HELLO_INVALID`-class error 4004.
- Runs on a dedicated cheap path on the server: no query parsing, no engine touch, sub-millisecond response under load.
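A PING frame under the layout above can be built as follows. This is a minimal sketch with hypothetical helper names (`encode_ping`, `decode_ping_timestamp`); it assumes `flags` and `query_id` are zero for PING and takes the timestamp from Python's monotonic clock, converted to microseconds.

```python
import struct
import time
from typing import Optional

MSG_PING = 0x0A  # message type code from the PING section


def encode_ping(timestamp_us: Optional[int] = None) -> bytes:
    """Build a complete PING message: 8-byte header plus 8-byte payload."""
    if timestamp_us is None:
        # Monotonic clock, so the value never jumps backwards.
        timestamp_us = time.monotonic_ns() // 1000
    payload = struct.pack("<Q", timestamp_us)             # [u64 LE timestamp_us]
    # Assumption: flags=0 and query_id=0 for liveness traffic.
    header = struct.pack("<IBBH", len(payload), MSG_PING, 0, 0)
    return header + payload


def decode_ping_timestamp(payload: bytes) -> int:
    """Read the timestamp back out; the payload must be exactly 8 bytes."""
    if len(payload) != 8:
        raise ValueError(f"PING payload must be 8 bytes, got {len(payload)}")
    return struct.unpack("<Q", payload)[0]


ping = encode_ping(123456)
print(len(ping), decode_ping_timestamp(ping[8:]))
```

Validating the length on the client side before sending avoids the truncated-payload rejection described in the first bullet.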
#### PONG (0x0B): Server → Client

```
[u64 LE timestamp_us]   the timestamp the client sent in PING (v2)
                        OR empty payload (v1 server)
```

- In v1, the server replied with an empty payload; in v2 (default for v5.0.0+), it echoes the PING timestamp so the client can compute round-trip latency directly.

#### HEARTBEAT_CONFIG: Server → Client, embedded in HELLO_OK

On every successful `HELLO_OK` the server now includes the heartbeat policy it enforces for THIS session:

```
[u32 idle_timeout_sec]    server closes the socket if no traffic for this long
[u32 ping_interval_sec]   suggested client cadence (typically idle_timeout_sec / 2)
[u16 grace_window_sec]    time the server waits after a PING send before considering peer ZOMBIE
```

A client that ignores these fields and never sends PING will lose its connection at `idle_timeout_sec`. A client that sends PING at `ping_interval_sec` will keep the connection up indefinitely.

#### Client-side ZOMBIE state machine

The reference client (the bundled Python `xgdb-connect` and the Node `xray-client.js`) implements:

1. **HEALTHY**: last PONG within `idle_timeout_sec`. Normal request path.
2. **SUSPECT**: no PONG within `ping_interval_sec` after the last PING. Send another PING immediately, but do NOT reset the connection yet.
3. **ZOMBIE**: no PONG within `grace_window_sec` of `SUSPECT` entry. Close the socket, send `HELLO` over a fresh TCP connection, and replay any in-flight idempotent request.

Half-open detection latency in the v5.0.0 default config: **~5 seconds** (`ping_interval_sec=2`, `grace_window_sec=3`).

#### Operator dials

Server side:

- `--xray-idle-timeout-sec=N` (default 60): kills sockets idle longer than N seconds.
- `--xray-heartbeat-disabled`: sends `idle_timeout_sec=0` in `HELLO_OK`. Backwards-compat for v1 clients that don't speak PING.

Client side:

- Honor the server's `HEARTBEAT_CONFIG`. Don't second-guess it.
- Set `SO_KEEPALIVE` + `SO_RCVTIMEO` at socket open (the bundled Python + Node clients do this for you).
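The HEALTHY / SUSPECT / ZOMBIE transitions described above can be sketched as a small clock-driven tracker. This is one possible reading of those rules, not the code in the bundled clients: `HeartbeatTracker` and its method names are hypothetical, timestamps are plain seconds supplied by the caller, and socket handling (sending PINGs, closing, reconnecting) is left to the surrounding client.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class LinkState(Enum):
    HEALTHY = "HEALTHY"
    SUSPECT = "SUSPECT"
    ZOMBIE = "ZOMBIE"


@dataclass
class HeartbeatTracker:
    """Hypothetical helper; both timings come from the server's HEARTBEAT_CONFIG."""
    ping_interval_sec: float
    grace_window_sec: float
    state: LinkState = LinkState.HEALTHY
    awaiting_since: Optional[float] = None  # send time of the oldest unanswered PING
    suspect_since: Optional[float] = None   # moment the link became SUSPECT

    def on_ping_sent(self, now: float) -> None:
        # Only the oldest unanswered PING starts the clock, so the extra PING
        # sent on SUSPECT entry does not push the deadline back.
        if self.awaiting_since is None:
            self.awaiting_since = now

    def on_pong(self) -> None:
        # A PONG proves the peer is alive, unless we already gave up on it.
        if self.state is not LinkState.ZOMBIE:
            self.state = LinkState.HEALTHY
            self.awaiting_since = None
            self.suspect_since = None

    def tick(self, now: float) -> LinkState:
        """Advance the state machine; call this from the client's timer loop."""
        if self.state is LinkState.HEALTHY and self.awaiting_since is not None:
            deadline = self.awaiting_since + self.ping_interval_sec
            if now >= deadline:
                self.state = LinkState.SUSPECT   # caller should send another PING
                self.suspect_since = deadline
        if self.state is LinkState.SUSPECT and self.suspect_since is not None:
            if now >= self.suspect_since + self.grace_window_sec:
                self.state = LinkState.ZOMBIE    # caller should close and reconnect
        return self.state


tracker = HeartbeatTracker(ping_interval_sec=2, grace_window_sec=3)
tracker.on_ping_sent(0.0)
print(tracker.tick(2.0), tracker.tick(5.0))
```

With `ping_interval_sec=2` and `grace_window_sec=3`, a PING sent at t=0 that never receives a PONG moves the tracker to SUSPECT at t=2 and to ZOMBIE at t=5, which lines up with the ~5 second detection latency quoted for the default config. Passing the clock in as an argument keeps the logic testable without real sockets or sleeps.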

Canonical wire-format reference: **`src/communication/xray/protocol.hpp`** (MsgType enum at line 108, HELLO at line 510, PING at line 761, capability flags at line 240). Operator runbook: `docs/operations/xrayprotocol-liveness-runbook.md`.

### Drop-and-reload benchmark (v5, 2026-05-28)

End-to-end measurement on a Supermicro 1 TB RAM box, fresh container, ICIJ Offshore Leaks dataset:

- Daemon ready (from `docker run`): 2.4 sec
- Nodes via `BULK_UPSERT_NODES`: 2,013,534 in 162 sec (sequential across 4 labels; parallel runs ~4× this at 50K/sec)
- Edges via `BULK_INSERT_EDGES_KEYED`: 3,304,966 in 60.8 sec (66K edges/sec average, 114K/sec peak on single-bucket runs)
- **Total: ~6 minutes from empty container to fully-loaded 5.3M-element graph.**

The same workload via Cypher `MATCH+CREATE` batches takes ~12 hours on the same hardware.

---

## 5. Query languages

xrayGraphDB supports two query languages. You can use either; pick based on what fits the task.

### 5.1 Cypher

OpenCypher with extensions. Compatible with most Cypher you find in the wild.

Examples:

```cypher
// Create a node
CREATE (a:Person {id: 1, name: "Alice", age: 30}) RETURN a;

// Match by label + property predicate (uses a label-property index when present)
MATCH (p:Person {name: "Alice"}) RETURN p;

// Multi-hop pattern
MATCH (a:Person)-[:KNOWS*1..3]->(b:Person)
WHERE a.id = 1
RETURN b.name;

// Aggregation
MATCH (p:Person)
RETURN p.country, count(*) AS n
ORDER BY n DESC;

// Native shortest path
MATCH p = shortestPath(
  (a:Function {name: "main"})-[:CALLS*..10]->(b:Function {name: "free"})
)
RETURN p;

// Conditional upsert
MERGE (u:User {email: "a@b.com"})
ON CREATE SET u.created_at = timestamp()
ON MATCH SET u.last_seen = timestamp();

// Vector search via EMBED()
MATCH (d:Doc)
WHERE cosine_similarity(d.embedding, EMBED("graph databases")) > 0.85
RETURN d.title;
```

### 5.2 GFQL

GFQL is a dataframe-oriented graph query language native to xrayGraphDB.
Use it when you want pandas-like pipeline semantics on a graph. Cypher and GFQL share the same engine.

```python
# GFQL via xgdb-connect
client.execute_gfql(
    "n('Person', age=lambda a: a >= 18) "
    ".e('KNOWS', hop=2) "
    ".n('Person')"
)
```

If you do not know GFQL, use Cypher. The engine optimizes both equivalently.

---

## 6. Schema

### Labels and types

- Nodes can carry any number of string labels. Labels are case-sensitive
- Relationships have exactly one type (uppercase by convention)
- Properties are typed at the cell level; there is no fixed schema

### Indexes

xrayGraphDB auto-creates label-property indexes on first point-lookup miss. You can also create them explicitly:

```cypher
CREATE INDEX ON :Person(email);
CREATE FULLTEXT INDEX doc_text ON :Document(title, body);
SHOW INDEXES;
```

### Constraints

```cypher
CREATE CONSTRAINT ON (p:Person) ASSERT p.id IS UNIQUE;
SHOW CONSTRAINTS;
```

---

## 7. Procedures and analytics

Procedures are invoked with `CALL`. Discover the catalog at runtime:

```cypher
SHOW PROCEDURES;  // everything
SHOW PROCEDURES YIELD name, signature WHERE name STARTS WITH 'graph.';
```

Common analytics (all native, optional GPU acceleration on supported hardware):

```cypher
CALL graph.pagerank(0.85, 20) YIELD node, rank
RETURN node, rank ORDER BY rank DESC LIMIT 10;

CALL graph.louvain() YIELD node, community;
CALL graph.betweenness_centrality() YIELD node, score;
CALL graph.triangle_count() YIELD node, count;
CALL graph.bfs("from_id", "to_id") YIELD path;
```

Code-intelligence procedures (Licensed):

```cypher
CALL xray.dead_code() YIELD function_name, line;
CALL xray.taint_trace($source) YIELD path;
CALL xray.coverage_gaps() YIELD function_name, gap_type;
```

---

## 8. Error handling

Errors come back as message type 0x07 (ERROR) with the following body:

```
[u32 error_code][u8 severity][u8 retryable][u32 msg_len][msg]
```

Severity:

- 0 = INFO
- 1 = WARNING
- 2 = ERROR
- 3 = FATAL

Retryable flag:

- 1 = transient (network, lock contention): retry with backoff
- 0 = permanent (syntax error, constraint violation): do not retry

When `retryable=1`, retry up to 2 times with exponential backoff. Never retry more than twice; this cap is a deliberate guideline, not a default.

Common error codes that AI agents should handle gracefully:

- `1001`: Authentication failed
- `1002`: No database specified in HELLO
- `1003`: Database does not exist
- `2001`: Query syntax error
- `2002`: Constraint violation
- `2003`: Type mismatch
- `3001`: Lock contention (retryable)
- `3002`: Read-only transaction violation

---

## 9. Operational concerns AI agents should know

### Health and readiness

- HTTP `/health` on port 7444: process is alive
- HTTP `/ready` on port 7444: accepting traffic (mmap warm-up complete, etc.)
- Always wait for `/ready` before issuing the first query after a restart

### Graceful shutdown

- `systemctl restart xraygraphdb` is the only correct restart command
- Never SIGKILL (never `kill -9`); that can corrupt durable state

### Recovery toolchain (read-only diagnostic, then operator-driven repair)

- `xraygraphdb-doctor`: diagnostic, exit codes 0-4, JSON mode for monitoring
- `xraygraphdb-recover`: operator-driven recovery, all destructive operations require explicit confirmation
- `xraygraphdb-watchdog`: opt-in systemd ExecStartPre that auto-recovers after 3 consecutive crashes with a hint file

If you are an autonomous agent and the daemon is in a crash loop, do not invoke recovery commands without operator approval, because those commands are destructive.

---

## 10. Common gotchas

| Symptom | Likely cause | Fix |
|---------|--------------|-----|
| HELLO returns ERROR immediately | No database in HELLO | Add `database="xraygraphdb"` |
| Slow first query after restart | mmap warm-up still running | Wait for `/ready` |
| Stuck connection on Bolt | Old driver speaking pre-v5 | Use Bolt v5 driver, or switch to xrayProtocol |
| Bulk insert silently slow | Per-row upsert via Cypher MERGE | Use `BULK_UPSERT_NODES` (0x27) |
| GFQL "function not found" | Old client | Upgrade `xgdb-connect`; the GFQL parser ships with the server |
| Cypher result column missing | Returned a node/relationship and client doesn't decode complex columns | Upgrade `xgdb-connect` to ≥1.2.0 |

---

## 11. Limits

These are deliberate and documented. They are not arbitrary.

- **Vertex IDs**: 64-bit signed integer (about 9.2 quintillion)
- **Property value size**: strings up to 4 GiB; lists/maps unbounded but constrained by RAM
- **Query length**: 16 MiB
- **Bulk-insert batch**: 100k rows per `BULK_INSERT_NODES` recommended; one COMMIT per batch
- **Concurrent connections**: limited by file descriptors (default 65k)

---

## 12. When in doubt

- Read this file from top to bottom; it is designed to be read end-to-end by an AI agent
- Visit the live HTML docs only if you need a specific procedure's signature
- Use xrayProtocol unless told otherwise
- Always specify a database in HELLO
- Treat patent-pending architecture as opaque: query results are stable, internal mechanics may evolve