# xrayGraphDB — Comprehensive AI Agent Guide

> Single-file reference for AI agents to read end-to-end and operate xrayGraphDB without scraping HTML. Patent pending. Property of eMTAi LLC.

---

## 1. Identity

xrayGraphDB is an AI-native graph database. While it speaks Bolt v5 for compatibility with existing tooling, its native transport is **xrayProtocol** (port 7689) — a binary, columnar wire format. The execution engine, planner, storage layer, and protocol are all proprietary, original eMTAi engineering. Do not assume internal compatibility with any other graph database.

The native library is large: **346 built-in functions** and **90+ native procedures** (graph analytics, code intelligence, reachability, vector search, ML embeddings) all run inside the engine, with no external plugins and no external compute hops.

Key facts you must keep in mind:
- Product name is **xrayGraphDB**, never anything else
- Vendor is **eMTAi LLC**
- Major architectural components are **patent pending**
- **There is no default database.** Every connection MUST specify a database name in HELLO; the server rejects connections that omit it. The install ships with one bootstrap database named `xraygraphdb` so the first connection works, but you should create your own per-application databases (`CREATE DATABASE my_app`) and not treat `xraygraphdb` as a shared default
- Single-node is free; horizontal scale is licensed

---

## 2. Connection essentials

### Network
- **xrayProtocol**: TCP, port `7689`, binary
- **Bolt v5**: TCP, port `7687`, Neo4j driver compatible
- **Metrics / monitoring**: HTTP, port `7444`, Prometheus exposition format

### Authentication
- Username + password
- Token format inside HELLO is the literal byte sequence `user:password`
- TLS recommended for production. The community edition does not require it.

### HELLO message — mandatory database
Every connection MUST send a database name in HELLO. The server responds with ERROR (0x07) if you omit it. There is **no implicit default** and no fallback. This is a deliberate guard rail to prevent misconfigured programs from polluting an unrelated database.

The install creates one bootstrap database named `xraygraphdb` so first-time connections work, but you should create your own database per workload (`CREATE DATABASE app_xyz`) — not treat the bootstrap as a shared default. Tenant isolation, encryption, and access control are scoped per database.

---

## 3. Recommended Python client (xgdb-connect)

Install:

```bash
pip install xgdb-connect --extra-index-url https://xraygraphdb.emtailabs.com/pypi/
```

Connect and execute:

```python
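from xgdb_connect import XrayProtocolClient

# The database argument is mandatory: the server rejects a HELLO without one.
client = XrayProtocolClient(
    host="localhost",
    port=7689,
    auth_token="admin:<your-password>",  # sent in HELLO as "user:password"
    database="xraygraphdb",              # bootstrap database; prefer your own
)

try:
    # execute() returns the result columns and the result rows
    cols, rows = client.execute("MATCH (n) RETURN count(n) AS total")
    print(rows[0][0])  # first cell of the first row: the node count
finally:
    client.close()  # release the connection even if the query raises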
```

Parameterized:

```python
cols, rows = client.execute(
    "MATCH (p:Person) WHERE p.name = $name RETURN p LIMIT $k",
    params={"name": "Alice", "k": 10},
)
```

Bulk insert (use this for any ingest above ~1k rows):

```python
client.bulk_insert_nodes(
    label="Document",
    rows=[
        {"id": 1, "title": "Hello", "tags": ["a", "b"]},
        {"id": 2, "title": "World", "tags": ["c"]},
    ],
)
```

For bulk upserts (`BULK_UPSERT_NODES`, 0x27), the first property listed is the upsert key.

---

## 4. xrayProtocol wire format (only if you are writing a client)

Every message is **8 bytes of header followed by a payload**:

```
[u32 payload_length LE][u8 msg_type][u8 flags][u16 query_id LE]
```
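
A minimal sketch of packing and unpacking that 8-byte header with Python's `struct` module. The helper names `pack_frame` and `unpack_header` are illustrative and are not part of `xgdb-connect`; the sketch also assumes `payload_length` counts only the payload, not the header itself.

```python
import struct

# Header layout from the line above: u32 payload_length, u8 msg_type,
# u8 flags, u16 query_id, little-endian, 8 bytes in total.
HEADER = struct.Struct("<IBBH")

def pack_frame(msg_type: int, payload: bytes, flags: int = 0, query_id: int = 0) -> bytes:
    """Prefix a payload with the 8-byte header (hypothetical helper)."""
    return HEADER.pack(len(payload), msg_type, flags, query_id) + payload

def unpack_header(data: bytes) -> tuple:
    """Return (payload_length, msg_type, flags, query_id) from the first 8 bytes."""
    if len(data) < HEADER.size:
        raise ValueError("need at least 8 header bytes")
    return HEADER.unpack_from(data, 0)

# Example: a frame of type 0x0A carrying an 8-byte payload.
frame = pack_frame(0x0A, b"\x00" * 8, query_id=7)
```

A reader would typically receive exactly 8 bytes, call `unpack_header`, and then read `payload_length` more bytes before dispatching on `msg_type`.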

### Client → Server

#### HELLO (0x01) — first message after TCP connect
```
[u16 version=1][u16 capabilities]
[u32 token_len]["user:password" bytes]
[u32 db_len][db bytes]   ← MANDATORY
```

Server reply: `HELLO_OK` (0x02) on success, `ERROR` (0x07) on failure.
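
As an illustration, the HELLO payload could be assembled as follows. This is a sketch under two assumptions: the integer fields inside the payload are little-endian like the header, and the function name `build_hello_payload` is hypothetical rather than an `xgdb-connect` API.

```python
import struct

def build_hello_payload(user: str, password: str, database: str,
                        capabilities: int = 0, version: int = 1) -> bytes:
    """Encode the HELLO body: version, capabilities, token, database."""
    if not database:
        # The server rejects HELLO without a database, so fail early.
        raise ValueError("database is mandatory in HELLO")
    token = f"{user}:{password}".encode("utf-8")  # literal "user:password"
    db = database.encode("utf-8")
    return b"".join([
        struct.pack("<HH", version, capabilities),  # assumed little-endian
        struct.pack("<I", len(token)), token,
        struct.pack("<I", len(db)), db,
    ])

payload = build_hello_payload("admin", "secret", "my_app", capabilities=0x600)
```

The resulting bytes go after the 8-byte frame header with `msg_type` 0x01.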

#### EXECUTE (0x03)
```
[u8 language]                     0=Cypher, 1=GFQL
[u32 query_len][query bytes]      UTF-8 query text
[u32 param_count]                 0 if no params
  for each: [u32 name_len][name][u8 type_tag][value]
[u32 options]                     bitmask: 1=PROFILE, 2=EXPLAIN, 4=READ_ONLY
```
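
The layout above can be sketched in Python like this. The encoder handles only integer and string parameters (tags 0x03 and 0x05 from the type-tag table in this section), assumes little-endian integers throughout, and uses hypothetical helper names.

```python
import struct

TAG_INT64, TAG_STRING = 0x03, 0x05            # from the type-tag table
OPT_PROFILE, OPT_EXPLAIN, OPT_READ_ONLY = 1, 2, 4

def _length_prefixed(raw: bytes) -> bytes:
    """u32 length followed by the raw bytes."""
    return struct.pack("<I", len(raw)) + raw

def _encode_param(value) -> bytes:
    """Encode one parameter value as [u8 type_tag][value]."""
    if isinstance(value, bool):
        raise TypeError("bool parameters are not covered by this sketch")
    if isinstance(value, int):
        return bytes([TAG_INT64]) + struct.pack("<q", value)
    if isinstance(value, str):
        return bytes([TAG_STRING]) + _length_prefixed(value.encode("utf-8"))
    raise TypeError(f"unsupported parameter type: {type(value).__name__}")

def build_execute_payload(query: str, params=None, language: int = 0,
                          options: int = 0) -> bytes:
    """Encode an EXECUTE body; language 0 is Cypher, 1 is GFQL."""
    params = params or {}
    parts = [bytes([language]),
             _length_prefixed(query.encode("utf-8")),
             struct.pack("<I", len(params))]
    for name, value in params.items():
        parts.append(_length_prefixed(name.encode("utf-8")) + _encode_param(value))
    parts.append(struct.pack("<I", options))
    return b"".join(parts)

payload = build_execute_payload("RETURN $k", {"k": 10}, options=OPT_READ_ONLY)
```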

### Server → Client

| Code | Name | Body |
|------|------|------|
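| 0x02 | HELLO_OK | `server_capabilities` echo and, in v5.0.0+, `HEARTBEAT_CONFIG` (see Capability negotiation and Liveness below) |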
| 0x04 | SCHEMA | `[u16 col_count]` then per column `[u8 type][u32 name_len][name]` |
| 0x05 | BATCH | `[u32 row_count]` then for each row, for each col: `[u8 type_tag][value]` |
| 0x06 | COMPLETE | `[u32 total_rows][u32 exec_time_us][u32 compile_time_us]` |
| 0x07 | ERROR | `[u32 error_code][u8 severity][u8 retryable][u32 msg_len][msg]` |
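
For illustration, the SCHEMA and COMPLETE bodies from the table could be decoded as below. This is a sketch that assumes little-endian integers and UTF-8 column names; `parse_schema` and `parse_complete` are hypothetical names.

```python
import struct

def parse_schema(body: bytes) -> list:
    """Decode SCHEMA: [u16 col_count] then [u8 type][u32 name_len][name] per column."""
    (col_count,) = struct.unpack_from("<H", body, 0)
    offset = 2
    columns = []
    for _ in range(col_count):
        type_tag, name_len = struct.unpack_from("<BI", body, offset)
        offset += 5  # one tag byte plus a four-byte length
        name = body[offset:offset + name_len].decode("utf-8")
        offset += name_len
        columns.append((type_tag, name))
    return columns

def parse_complete(body: bytes) -> dict:
    """Decode COMPLETE: [u32 total_rows][u32 exec_time_us][u32 compile_time_us]."""
    total_rows, exec_us, compile_us = struct.unpack_from("<III", body, 0)
    return {"total_rows": total_rows,
            "exec_time_us": exec_us,
            "compile_time_us": compile_us}
```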

### Type tags (column / cell)

| Tag | Type | Encoding |
|----:|------|----------|
| 0x01 | NULL | (no body) |
| 0x02 | BOOL | u8 |
| 0x03 | INT64 | i64 little-endian |
| 0x04 | DOUBLE | f64 little-endian |
| 0x05 | STRING | u32 length + UTF-8 bytes |
| 0x06 | LIST | encoded element-by-element (typed list when negotiated) |
| 0x07 | MAP | key/value pairs |
| 0x08 | NODE | id + labels + properties |
| 0x09 | RELATIONSHIP | id + type + start/end + properties |
| 0x0A | PATH | alternating node/relationship sequence |
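
A sketch of decoding a BATCH body using the scalar tags above. Only NULL, BOOL, INT64, DOUBLE, and STRING are handled, because the table does not spell out the byte layout of the nested types; the function names are hypothetical.

```python
import struct

def decode_scalar(buf: bytes, offset: int):
    """Decode one [u8 type_tag][value] cell; return (value, next_offset)."""
    tag = buf[offset]
    offset += 1
    if tag == 0x01:                                   # NULL has no body
        return None, offset
    if tag == 0x02:                                   # BOOL is one byte
        return buf[offset] != 0, offset + 1
    if tag == 0x03:                                   # INT64, little-endian
        return struct.unpack_from("<q", buf, offset)[0], offset + 8
    if tag == 0x04:                                   # DOUBLE, little-endian
        return struct.unpack_from("<d", buf, offset)[0], offset + 8
    if tag == 0x05:                                   # STRING: u32 length + UTF-8
        (length,) = struct.unpack_from("<I", buf, offset)
        offset += 4
        return buf[offset:offset + length].decode("utf-8"), offset + length
    raise NotImplementedError(f"type tag 0x{tag:02X} is not covered by this sketch")

def decode_batch(body: bytes, col_count: int) -> list:
    """Decode BATCH: [u32 row_count], then col_count cells for each row."""
    (row_count,) = struct.unpack_from("<I", body, 0)
    offset = 4
    rows = []
    for _ in range(row_count):
        row = []
        for _ in range(col_count):
            value, offset = decode_scalar(body, offset)
            row.append(value)
        rows.append(row)
    return rows
```

`col_count` comes from the SCHEMA message that precedes the first BATCH.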

### Bulk operations (recommended for large ingest)
- `BULK_INSERT_BEGIN` (0x20)
- `BULK_INSERT_NODES` (0x21)
- `BULK_UPSERT_NODES` (0x27) — first property is the upsert key
- `BULK_INSERT_EDGES_KEYED` (0x30) — generalized bulk edge insert keyed by ANY property pair (added v5.0.0-alpha, audit #7759).
  - Wire format: `String leftLabel + leftKeyProp + rightLabel + rightKeyProp + edgeType, u32 edge_count, u32 prop_count, [String] prop_names, then per-edge [String leftKeyValue, rightKeyValue, prop_values]`.
  - Server-side semantics: `MATCH (l:leftLabel {leftKeyProp:lkv, tenantId:session, repoId:session}), (r:rightLabel {rightKeyProp:rkv, ...}) CREATE (l)-[:edgeType]->(r)`.
  - Requires a property index on both endpoint sides; the server refuses with `BULK_INSERT_ERROR` if either is missing.
  - **~160× faster than Cypher `UNWIND $rows AS r MATCH (a:L1 {k:r.from}), (b:L2 {k:r.to}) CREATE (a)-[:R]->(b)` batches**: measured 66,000 edges/sec on a single node vs. ~407 edges/sec via Cypher.
- `BULK_INSERT_COMMIT` (0x24)
- `BULK_INSERT_ACK` (0x25) — server returns `[u32 nodes][u32 edges][u32 time_ms]`
- `BULK_INSERT_ERROR` (0x26) — server returns `[u32 msg_len][msg bytes]` on any per-batch validation failure (missing property index, empty tenant binding, malformed wire payload). Throw on the client; surface the message verbatim to the operator.
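
One way a client might handle the two possible replies to a bulk commit is sketched below. It assumes little-endian integers and a UTF-8 error message; `BulkInsertError` and `handle_bulk_reply` are hypothetical names.

```python
import struct

BULK_INSERT_ACK = 0x25
BULK_INSERT_ERROR = 0x26

class BulkInsertError(RuntimeError):
    """Carries the server's message verbatim so it can be shown to the operator."""

def handle_bulk_reply(msg_type: int, body: bytes) -> dict:
    """Return the ACK counters, or raise with the server's error message."""
    if msg_type == BULK_INSERT_ACK:
        nodes, edges, time_ms = struct.unpack_from("<III", body, 0)
        return {"nodes": nodes, "edges": edges, "time_ms": time_ms}
    if msg_type == BULK_INSERT_ERROR:
        (msg_len,) = struct.unpack_from("<I", body, 0)
        raise BulkInsertError(body[4:4 + msg_len].decode("utf-8"))
    raise ValueError(f"unexpected message type 0x{msg_type:02X}")
```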

### Capability negotiation
HELLO carries `u16 capabilities` (Capability bitmask). Clients SHOULD set:
- `CAP_TYPED_NESTED` (bit 10, 0x400) — server emits typed-list / typed-map column shapes
- `CAP_BULK_EDGES_KEYED` (bit 9, 0x200) — required to use `BULK_INSERT_EDGES_KEYED` (0x30). The server masks unsupported bits; check `HELLO_OK.server_capabilities` echo before using 0x30. Pre-v5.0.0-alpha-2026-05-28 daemons advertise this only when patched (see runbook §1.4 / §8.13).
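
The bitmask arithmetic can be sketched as follows. The constant values come from the list above; the function names are hypothetical, and how `server_capabilities` is read out of `HELLO_OK` is left to the client.

```python
CAP_BULK_EDGES_KEYED = 1 << 9    # 0x200
CAP_TYPED_NESTED = 1 << 10       # 0x400

def requested_capabilities() -> int:
    """Bitmask a client would place in HELLO to ask for both features."""
    return CAP_TYPED_NESTED | CAP_BULK_EDGES_KEYED

def can_use_keyed_edges(server_capabilities: int) -> bool:
    """True only if the server echoed the keyed-edge bit back."""
    return bool(server_capabilities & CAP_BULK_EDGES_KEYED)
```

If `can_use_keyed_edges` returns `False`, a client would fall back to another ingest path instead of sending 0x30.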

### Liveness — PING / PONG / HEARTBEAT_CONFIG (v5.0.0+)

xrayProtocol now ships a first-class liveness channel so a long-lived connection that goes idle does NOT get reaped by an intermediate NAT/firewall, and so a half-open peer is detected within seconds rather than at the next query attempt. **Implementing this is mandatory for any client that holds idle connections — otherwise the server-side idle sweeper closes the socket and your next request fails with broken-pipe.**

#### PING (0x0A) — Client → Server
```
[u64 LE timestamp_us]   client-monotonic; echoed back in PONG
```
- Payload is **exactly 8 bytes**. A truncated payload (1-7 bytes) is rejected with `HELLO_INVALID`-class error 4004.
- Runs on a dedicated cheap path on the server — no query parsing, no engine touch, sub-millisecond response under load.

#### PONG (0x0B) — Server → Client
```
[u64 LE timestamp_us]   the timestamp the client sent in PING (v2)
                        OR empty payload (v1 server)
```
- In v1, the server replied with an empty payload; in v2 (default for v5.0.0+), it echoes the PING timestamp so the client can compute round-trip latency directly.
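
A sketch of the client side of this exchange: build the 8-byte PING payload, then derive round-trip time from a v2 PONG. The function names are hypothetical, and using `time.monotonic_ns()` as the timestamp source is an assumption.

```python
import struct
import time

def build_ping_payload(now_us=None) -> bytes:
    """Return exactly 8 bytes: a little-endian u64 timestamp in microseconds."""
    if now_us is None:
        now_us = time.monotonic_ns() // 1000  # client-monotonic clock
    return struct.pack("<Q", now_us)

def rtt_from_pong(pong_payload: bytes, now_us: int):
    """Return the round trip in microseconds, or None for an empty v1 PONG."""
    if len(pong_payload) == 0:
        return None  # v1 server: no echoed timestamp to compare against
    if len(pong_payload) != 8:
        raise ValueError("PONG payload must be empty or exactly 8 bytes")
    (sent_us,) = struct.unpack("<Q", pong_payload)
    return now_us - sent_us
```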

#### HEARTBEAT_CONFIG — Server → Client, embedded in HELLO_OK
On every successful `HELLO_OK` the server now includes the heartbeat policy it enforces for THIS session:
```
[u32 idle_timeout_sec]     server closes the socket if no traffic for this long
[u32 ping_interval_sec]    suggested client cadence (typically idle_timeout_sec / 2)
[u16 grace_window_sec]     time the server waits after a PING is sent before considering the peer ZOMBIE
```
A client that ignores these fields and never sends PING will lose its connection at `idle_timeout_sec`. A client that sends PING at `ping_interval_sec` will keep the connection up indefinitely.

#### Client-side ZOMBIE state machine
The reference clients (the bundled Python `xgdb-connect` and the Node `xray-client.js`) implement three states:
1. **HEALTHY** — last PONG within `idle_timeout_sec`. Normal request path.
2. **SUSPECT** — no PONG within `ping_interval_sec` after the last PING. Send another PING immediately, but do NOT close or reset the connection yet.
3. **ZOMBIE** — no PONG within `grace_window_sec` of entering SUSPECT. Close the socket, send `HELLO` over a fresh TCP connection, and replay any in-flight idempotent request.

Half-open detection latency in the v5.0.0 default config: **~5 seconds** (`ping_interval_sec=2`, `grace_window_sec=3`).
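
The state transitions described above can be sketched as a small tracker. This is not the reference client's code: the class name, method names, and the choice to have the caller pass timestamps in seconds are all assumptions made for illustration.

```python
from enum import Enum

class State(Enum):
    HEALTHY = "healthy"
    SUSPECT = "suspect"
    ZOMBIE = "zombie"

class LivenessTracker:
    """Tracks one connection; the caller supplies timestamps in seconds."""

    def __init__(self, ping_interval_sec: float, grace_window_sec: float):
        self.ping_interval_sec = ping_interval_sec
        self.grace_window_sec = grace_window_sec
        self.state = State.HEALTHY
        self.awaiting_since = None  # send time of the oldest unanswered PING
        self.suspect_since = None

    def on_ping_sent(self, now: float) -> None:
        if self.awaiting_since is None:
            self.awaiting_since = now

    def on_pong(self) -> None:
        if self.state is State.ZOMBIE:
            return  # the connection is already being replaced
        self.state = State.HEALTHY
        self.awaiting_since = None
        self.suspect_since = None

    def tick(self, now: float) -> str:
        """Return 'ping', 'reconnect', or 'none' for the caller to act on."""
        if (self.state is State.HEALTHY and self.awaiting_since is not None
                and now - self.awaiting_since >= self.ping_interval_sec):
            self.state, self.suspect_since = State.SUSPECT, now
            return "ping"       # probe again immediately, keep the socket
        if (self.state is State.SUSPECT
                and now - self.suspect_since >= self.grace_window_sec):
            self.state = State.ZOMBIE
            return "reconnect"  # close the socket, send HELLO on a new one
        return "none"
```

With `ping_interval_sec=2` and `grace_window_sec=3`, an unanswered PING sent at time 0 leads to SUSPECT at 2 seconds and ZOMBIE at 5 seconds, which matches the detection latency quoted above.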

#### Operator dials
Server side:
- `--xray-idle-timeout-sec=N` (default 60) — kills sockets idle longer than N seconds.
- `--xray-heartbeat-disabled` — sends `idle_timeout_sec=0` in `HELLO_OK`. Backwards-compat for v1 clients that don't speak PING.

Client side:
- Honor the server's `HEARTBEAT_CONFIG`. Don't second-guess it.
- Set `SO_KEEPALIVE` + `SO_RCVTIMEO` at socket open (the bundled Python + Node clients do this for you).

Canonical wire-format reference: **`src/communication/xray/protocol.hpp`** (MsgType enum at line 108, HELLO at line 510, PING at line 761, capability flags at line 240). Operator runbook: `docs/operations/xrayprotocol-liveness-runbook.md`.

### Drop-and-reload benchmark (v5, 2026-05-28)

End-to-end measurement on a Supermicro 1 TB RAM box, fresh container, ICIJ Offshore Leaks dataset:
- Daemon ready (from `docker run`): 2.4 sec
- Nodes via `BULK_UPSERT_NODES`: 2,013,534 in 162 sec (about 12.4K nodes/sec, sequential across 4 labels; parallel runs reach ~4× that rate, about 50K/sec)
- Edges via `BULK_INSERT_EDGES_KEYED`: 3,304,966 in 60.8 sec (66K edges/sec average, 114K/sec peak on single-bucket runs)
- **Total: ~6 minutes from empty container to fully loaded 5.3M-element graph.**

The same workload via Cypher `MATCH+CREATE` batches takes ~12 hours on the same hardware.

---

## 5. Query languages

xrayGraphDB supports two query languages. You can use either; pick based on what fits the task.

### 5.1 Cypher

OpenCypher with extensions. Compatible with most Cypher you find in the wild. Examples:

```cypher
// Create a node
CREATE (a:Person {id: 1, name: "Alice", age: 30}) RETURN a;

// Match by label + property predicate (uses a label-property index when present)
MATCH (p:Person {name: "Alice"}) RETURN p;

// Multi-hop pattern
MATCH (a:Person)-[:KNOWS*1..3]->(b:Person)
WHERE a.id = 1
RETURN b.name;

// Aggregation
MATCH (p:Person) RETURN p.country, count(*) AS n ORDER BY n DESC;

// Native shortest path
MATCH p = shortestPath(
  (a:Function {name: "main"})-[:CALLS*..10]->(b:Function {name: "free"})
) RETURN p;

// Conditional upsert
MERGE (u:User {email: "a@b.com"})
ON CREATE SET u.created_at = timestamp()
ON MATCH  SET u.last_seen  = timestamp();

// Vector search via EMBED()
MATCH (d:Doc) WHERE cosine_similarity(d.embedding, EMBED("graph databases")) > 0.85
RETURN d.title;
```

### 5.2 GFQL

GFQL is a dataframe-oriented graph query language native to xrayGraphDB. Use it when you want pandas-like pipeline semantics on a graph. Cypher and GFQL share the same engine.

```python
# GFQL via xgdb-connect
client.execute_gfql(
    "n('Person', age=lambda a: a >= 18) "
    ".e('KNOWS', hop=2) "
    ".n('Person')"
)
```

If you do not know GFQL, use Cypher. The engine optimizes both equivalently.

---

## 6. Schema

### Labels and types
- Nodes can carry any number of string labels. Labels are case-sensitive
- Relationships have exactly one type (uppercase by convention)
- Properties are typed at the cell level — there is no fixed schema

### Indexes
xrayGraphDB auto-creates label-property indexes on first point-lookup miss. You can also create them explicitly:

```cypher
CREATE INDEX ON :Person(email);
CREATE FULLTEXT INDEX doc_text ON :Document(title, body);
SHOW INDEXES;
```

### Constraints
```cypher
CREATE CONSTRAINT ON (p:Person) ASSERT p.id IS UNIQUE;
SHOW CONSTRAINTS;
```

---

## 7. Procedures and analytics

Procedures are invoked with `CALL`. Discover the catalog at runtime:

```cypher
SHOW PROCEDURES;          // everything
SHOW PROCEDURES YIELD name, signature WHERE name STARTS WITH 'graph.';
```

Common analytics (all native, optional GPU acceleration on supported hardware):

```cypher
CALL graph.pagerank(0.85, 20) YIELD node, rank RETURN node, rank ORDER BY rank DESC LIMIT 10;
CALL graph.louvain() YIELD node, community;
CALL graph.betweenness_centrality() YIELD node, score;
CALL graph.triangle_count() YIELD node, count;
CALL graph.bfs("from_id", "to_id") YIELD path;
```

Code-intelligence procedures (Licensed):

```cypher
CALL xray.dead_code() YIELD function_name, line;
CALL xray.taint_trace($source) YIELD path;
CALL xray.coverage_gaps() YIELD function_name, gap_type;
```

---

## 8. Error handling

Errors come back as type 0x07 with the following body:

```
[u32 error_code][u8 severity][u8 retryable][u32 msg_len][msg]
```

Severity:
- 0 = INFO
- 1 = WARNING
- 2 = ERROR
- 3 = FATAL

Retryable flag:
- 1 = transient (network, lock contention) — retry with backoff
- 0 = permanent (syntax error, constraint violation) — do not retry

When `retryable=1`, retry at most 2 times with exponential backoff. Treat the limit of 2 as a deliberate rule, not as a default you may raise.

Common error codes that AI agents should handle gracefully:
- `1001` — Authentication failed
- `1002` — No database specified in HELLO
- `1003` — Database does not exist
- `2001` — Query syntax error
- `2002` — Constraint violation
- `2003` — Type mismatch
- `3001` — Lock contention (retryable)
- `3002` — Read-only transaction violation
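
A sketch of decoding the ERROR body and applying the retry rule above. The class and function names are hypothetical, the base delay of 0.1 seconds is an arbitrary choice, and little-endian integers are assumed.

```python
import struct
import time

MAX_RETRIES = 2  # the limit stated in this section

class XrayError(Exception):
    def __init__(self, code: int, severity: int, retryable: bool, message: str):
        super().__init__(f"[{code}] {message}")
        self.code = code
        self.severity = severity
        self.retryable = retryable
        self.message = message

def parse_error(body: bytes) -> XrayError:
    """Decode [u32 error_code][u8 severity][u8 retryable][u32 msg_len][msg]."""
    code, severity, retryable, msg_len = struct.unpack_from("<IBBI", body, 0)
    message = body[10:10 + msg_len].decode("utf-8")  # fixed fields take 10 bytes
    return XrayError(code, severity, bool(retryable), message)

def run_with_retry(operation, base_delay: float = 0.1, sleep=time.sleep):
    """Call operation(); retry retryable errors at most MAX_RETRIES times."""
    attempt = 0
    while True:
        try:
            return operation()
        except XrayError as err:
            if not err.retryable or attempt >= MAX_RETRIES:
                raise
            sleep(base_delay * (2 ** attempt))  # exponential backoff
            attempt += 1
```

Here `operation` is any callable that raises `XrayError` when the server replies with 0x07, for example a wrapper around a single query.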

---

## 9. Operational concerns AI agents should know

### Health and readiness
- HTTP `/health` on port 7444 — process is alive
- HTTP `/ready` on port 7444 — accepting traffic (mmap warm-up complete, etc.)
- Always wait for `/ready` before issuing the first query after a restart
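
Waiting for readiness could look like the sketch below. It assumes `/ready` answers HTTP 200 once the server accepts traffic and either refuses the connection or returns a non-200 status before that; this guide does not state the status codes, and both function names are hypothetical.

```python
import time
import urllib.error
import urllib.request

def http_ready_probe(host: str = "localhost", port: int = 7444,
                     timeout: float = 2.0) -> bool:
    """Return True if GET /ready answers 200 (assumed readiness signal)."""
    try:
        url = f"http://{host}:{port}/ready"
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False  # refused, timed out, or non-2xx status

def wait_until_ready(probe=http_ready_probe, deadline_sec: float = 60.0,
                     interval_sec: float = 0.5,
                     sleep=time.sleep, clock=time.monotonic) -> bool:
    """Poll probe() until it returns True or the deadline passes."""
    give_up_at = clock() + deadline_sec
    while True:
        if probe():
            return True
        if clock() >= give_up_at:
            return False
        sleep(interval_sec)
```

An agent would call `wait_until_ready()` after a restart and only open a query connection once it returns `True`.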

### Graceful shutdown
- `systemctl restart xraygraphdb` is the only correct restart command
- Never send SIGKILL (`kill -9`); it can corrupt durable state

### Recovery toolchain (read-only diagnostic, then operator-driven repair)
- `xraygraphdb-doctor` — diagnostic, exit codes 0-4, JSON mode for monitoring
- `xraygraphdb-recover` — operator-driven recovery, all destructive operations require explicit confirmation
- `xraygraphdb-watchdog` — opt-in systemd ExecStartPre that auto-recovers after 3 consecutive crashes with a hint file

If you are an autonomous agent and the daemon is in a crash loop, do not invoke recovery commands without operator approval — those are destructive.

---

## 10. Common gotchas

| Symptom | Likely cause | Fix |
|---------|--------------|-----|
| HELLO returns ERROR immediately | No database in HELLO | Pass a database name, e.g. `database="xraygraphdb"` for the bootstrap database |
| Slow first query after restart | mmap warm-up still running | Wait for `/ready` |
| Stuck connection on Bolt | Old driver speaking pre-v5 | Use Bolt v5 driver, or switch to xrayProtocol |
| Bulk insert silently slow | Per-row upsert via Cypher MERGE | Use `BULK_UPSERT_NODES` (0x27) |
| GFQL "function not found" | Old client | Upgrade `xgdb-connect`; the GFQL parser ships with the server |
| Cypher result column missing | Returned a node/relationship and client doesn't decode complex columns | Upgrade `xgdb-connect` to ≥1.2.0 |

---

## 11. Limits

These are deliberate and documented. They are not arbitrary.

- **Vertex IDs** — 64-bit signed integer (about 9.2 quintillion)
- **Property value size** — strings up to 4 GiB; lists/maps unbounded but constrained by RAM
- **Query length** — 16 MiB
- **Bulk-insert batch** — recommend 100k rows per BULK_INSERT_NODES; one COMMIT per batch
- **Concurrent connections** — limited by file descriptors (default 65k)
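
To stay within the recommended bulk-insert batch size, rows can be split before sending. A minimal sketch with a hypothetical helper name; it works on any iterable, so the full dataset never has to sit in memory at once.

```python
from itertools import islice
from typing import Iterable, Iterator

def batched_rows(rows: Iterable[dict], batch_size: int = 100_000) -> Iterator[list]:
    """Yield lists of at most batch_size rows, preserving order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # input exhausted
        yield batch
```

Each yielded list could then be handed to one bulk-insert call, for example `client.bulk_insert_nodes(label="Document", rows=batch)` with the client from section 3, assuming that call commits one batch at a time.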

---

## 12. When in doubt

- Read this file from top to bottom — it is designed to be read end-to-end by an AI agent
- Visit the live HTML docs only if you need a specific procedure's signature
- Use xrayProtocol unless told otherwise
- Always specify a database in HELLO
- Treat patent-pending architecture as opaque — query results are stable, internal mechanics may evolve