Seventeen Ways to Ask a Graph
A single dispatcher, a uniform three-method contract, and the algorithm that won the BEAM benchmark.
What does "graph completion" even mean when a graph is the wrong shape for the question?
Cognee ships with seventeen SearchType values. Seventeen. That sounds like redundancy. It is not. Sixteen of the values fall into five paradigms (graph-traversal, vector-only, hybrid, code/time-specialized, and agentic), the seventeenth is a router that picks among them, and all of them sit behind a single three-method contract on BaseRetriever. The default, GRAPH_COMPLETION, is not a graph traversal at all. It is a *brute-force triplet search*: parallel vector search across five collections, an in-memory graph projection, distance-mapped re-ranking, and a top-k cutoff. That algorithm is what beat the BEAM benchmark. The other sixteen search types are variations on the theme of "use the right retrieval for the right question."
I expected seventeen search types to be redundancy. After mapping them to retrieval paradigms, the redundancy is the design.
Key Takeaways
- All seventeen SearchType values conform to a 3-method contract: get_retrieved_objects → get_context_from_objects → get_completion_from_context
- The default GRAPH_COMPLETION is a brute-force triplet search, not a graph traversal: it embeds the query, runs parallel vector search across 5 collections, projects a graph fragment, then re-ranks by triplet_distance_penalty=6.5
- FEELING_LUCKY lets the LLM pick a search type at runtime; this is only possible because of the uniform contract
- HYBRID_COMPLETION adds a "truth subspace" axis beyond vector similarity: a separate scoring signal that biases toward context-aligned answers
- AGENTIC_COMPLETION is a ReAct-style agent loop with per-tool ACL, not just a retriever
- Permission denials return empty lists, not errors: a deliberate information-leak prevention
The seventeen, grouped by paradigm
# From cognee/modules/search/types/SearchType.py
from enum import Enum


class SearchType(str, Enum):
    SUMMARIES = "SUMMARIES"
    CHUNKS = "CHUNKS"
    CHUNKS_LEXICAL = "CHUNKS_LEXICAL"
    RAG_COMPLETION = "RAG_COMPLETION"
    HYBRID_COMPLETION = "HYBRID_COMPLETION"
    TRIPLET_COMPLETION = "TRIPLET_COMPLETION"
    GRAPH_COMPLETION = "GRAPH_COMPLETION"
    GRAPH_COMPLETION_DECOMPOSITION = "GRAPH_COMPLETION_DECOMPOSITION"
    GRAPH_SUMMARY_COMPLETION = "GRAPH_SUMMARY_COMPLETION"
    GRAPH_COMPLETION_COT = "GRAPH_COMPLETION_COT"
    GRAPH_COMPLETION_CONTEXT_EXTENSION = "GRAPH_COMPLETION_CONTEXT_EXTENSION"
    CYPHER = "CYPHER"
    NATURAL_LANGUAGE = "NATURAL_LANGUAGE"
    TEMPORAL = "TEMPORAL"
    FEELING_LUCKY = "FEELING_LUCKY"
    CODING_RULES = "CODING_RULES"
    AGENTIC_COMPLETION = "AGENTIC_COMPLETION"
| Paradigm | Search types | What it is |
|----------|--------------|------------|
| Vector-only | CHUNKS, CHUNKS_LEXICAL, SUMMARIES, RAG_COMPLETION | Traditional RAG: embed the query, retrieve top-k, optionally LLM-complete. No graph. |
| Graph-traversal | TRIPLET_COMPLETION, GRAPH_COMPLETION, GRAPH_SUMMARY_COMPLETION, GRAPH_COMPLETION_COT, GRAPH_COMPLETION_DECOMPOSITION, GRAPH_COMPLETION_CONTEXT_EXTENSION | The default paradigm. Use the graph; pick a flavor. |
| Hybrid | HYBRID_COMPLETION | Multi-channel (chunks + entities + facts + global context) with a "truth subspace" scoring axis. |
| Specialized | CYPHER, NATURAL_LANGUAGE, TEMPORAL, CODING_RULES | Domain-specific: Cypher queries, NL→Cypher translation, time-aware, code-rules. |
| Agentic | AGENTIC_COMPLETION | ReAct-style agent loop with skills + tools, gated by per-tool ACL. |
| Router | FEELING_LUCKY | Lets the LLM pick a search type at runtime. |
Imagine you call cognee.recall("What did the user say about billing?"). The router picks GRAPH_COMPLETION (or, if you set auto_route=True and skip query_type, the LLM picks via FEELING_LUCKY). What happens next is the same regardless of which paradigm you chose.
The three-method contract
Every retriever in cognee subclasses BaseRetriever (cognee/modules/retrieval/base_retriever.py:5-118) and implements three methods:
from abc import ABC


class BaseRetriever(ABC):
    async def get_retrieved_objects(self, query) -> list: ...
    async def get_context_from_objects(self, query, retrieved_objects) -> str | list: ...
    async def get_completion_from_context(self, query, retrieved_objects, context) -> str: ...
This contract is the single most important abstraction in the retrieval layer. The dispatcher (cognee/modules/search/methods/get_search_type_retriever_instance.py:38-389) holds a registry — search_core_registry — that maps each SearchType to a (RetrieverClass, init_kwargs_dict) pair. The retriever factory instantiates the right class and the dispatcher calls the three methods in order.
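To make the registry-plus-contract shape concrete, here is a minimal sketch. It is not cognee's code: ToyChunksRetriever, toy_registry, dispatch, and the in-memory corpus are hypothetical names invented for illustration, and the "completion" step just echoes the context where a real retriever would call an LLM.

```python
import asyncio
from abc import ABC, abstractmethod
from enum import Enum


class SearchType(str, Enum):
    CHUNKS = "CHUNKS"


class BaseRetriever(ABC):
    @abstractmethod
    async def get_retrieved_objects(self, query: str) -> list: ...

    @abstractmethod
    async def get_context_from_objects(self, query: str, retrieved_objects: list) -> str: ...

    @abstractmethod
    async def get_completion_from_context(
        self, query: str, retrieved_objects: list, context: str
    ) -> str: ...


class ToyChunksRetriever(BaseRetriever):
    """Hypothetical retriever: word match over an in-memory corpus."""

    def __init__(self, corpus: list, top_k: int = 2):
        self.corpus = corpus
        self.top_k = top_k

    async def get_retrieved_objects(self, query):
        words = query.lower().split()
        hits = [c for c in self.corpus if any(w in c.lower() for w in words)]
        return hits[: self.top_k]

    async def get_context_from_objects(self, query, retrieved_objects):
        return "\n".join(retrieved_objects)

    async def get_completion_from_context(self, query, retrieved_objects, context):
        # A real retriever would call an LLM here; the toy echoes the context.
        return f"Q: {query}\nContext:\n{context}"


CORPUS = [
    "Billing runs monthly.",
    "Support is 24/7.",
    "Billing disputes go to finance.",
]

# Same shape as described above: SearchType -> (RetrieverClass, init_kwargs).
toy_registry = {
    SearchType.CHUNKS: (ToyChunksRetriever, {"corpus": CORPUS, "top_k": 2}),
}


async def dispatch(search_type: SearchType, query: str) -> str:
    retriever_cls, init_kwargs = toy_registry[search_type]
    retriever = retriever_cls(**init_kwargs)
    # The dispatcher knows nothing about the retriever beyond these three calls.
    objects = await retriever.get_retrieved_objects(query)
    context = await retriever.get_context_from_objects(query, objects)
    return await retriever.get_completion_from_context(query, objects, context)


answer = asyncio.run(dispatch(SearchType.CHUNKS, "billing"))
```

Adding a new search type in this sketch means adding one registry entry; dispatch itself never changes, which is the property the contract buys.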
The contract's power is that it makes FEELING_LUCKY possible. When the dispatcher receives a FEELING_LUCKY query, it calls select_search_type(query_text) (in cognee/modules/search/operations/select_search_type.py:9-42) which asks the LLM to pick a SearchType from the enum. Then it falls through to the same dispatcher logic as if the user had named the type explicitly. The dynamic router is only possible because the contract is uniform — otherwise every retriever would need its own dispatch logic and the LLM's pick would be an un-routable string.
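A rough sketch of that fall-through, under stated assumptions: fake_llm_pick stands in for the LLM call, resolve_search_type is a hypothetical helper, and the fallback to GRAPH_COMPLETION on an unparseable pick is my assumption, not something the source confirms.

```python
from enum import Enum


class SearchType(str, Enum):
    CHUNKS = "CHUNKS"
    GRAPH_COMPLETION = "GRAPH_COMPLETION"
    FEELING_LUCKY = "FEELING_LUCKY"


def fake_llm_pick(query: str) -> str:
    """Hypothetical stand-in for the LLM call; returns a raw string."""
    return "CHUNKS" if '"' in query else "GRAPH_COMPLETION"


def resolve_search_type(requested: SearchType, query: str) -> SearchType:
    """Turn FEELING_LUCKY into a concrete type; pass other types through."""
    if requested is not SearchType.FEELING_LUCKY:
        return requested
    raw = fake_llm_pick(query).strip().upper()
    try:
        picked = SearchType(raw)
    except ValueError:
        # Assumption: an unparseable pick falls back to the default type.
        return SearchType.GRAPH_COMPLETION
    # Guard against the router picking itself and looping.
    if picked is SearchType.FEELING_LUCKY:
        return SearchType.GRAPH_COMPLETION
    return picked
```

Once the string is resolved to an enum member, the rest of the pipeline cannot tell whether a human or the model chose it.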
The default algorithm: brute-force triplet search
GRAPH_COMPLETION is the default and the algorithm is worth a close read. The function is brute_force_triplet_search in cognee/modules/retrieval/utils/brute_force_triplet_search.py:217-355. The flow:
1. Embed the query and run vector search across five collections in parallel via NodeEdgeVectorSearch (node_edge_vector_search.py:13-213):
   - Entity_name: entity names
   - TextSummary_text: pre-computed document summaries
   - EntityType_name: entity types
   - DocumentChunk_text: the original chunk text
   - EdgeType_relationship_name: relationship labels (always appended if missing)
2. Project a graph fragment with get_memory_fragment (lines 50-117). Two modes:
   - Wide projection (default): filters the in-memory graph to the top wide_search_top_k=100 node IDs from the vector results
   - Neighborhood projection: when neighborhood_depth is set, expands k-hop from seed IDs via project_neighborhood_from_db. Expansion nodes get re-scored via an ID-filtered vector search so they participate in triplet ranking
3. Map distances to graph nodes and edges via map_vector_distances_to_graph_nodes and map_vector_distances_to_graph_edges
4. Rank triplets via calculate_top_triplet_importances, which returns the top-k triplets. The triplet_distance_penalty=6.5 (the default) penalizes triplets where endpoints have no vector match. The feedback_influence parameter weights per-node feedback_weight scores.
flowchart TD
Q["Query: 'What did the user say about billing?'"] --> E["Embed query"]
E --> S1["Parallel vector search:<br/>Entity_name<br/>TextSummary_text<br/>EntityType_name<br/>DocumentChunk_text<br/>EdgeType_relationship_name"]
S1 --> P["get_memory_fragment<br/>(wide or neighborhood projection)"]
P --> M["Map distances to nodes/edges"]
M --> R["Rank triplets<br/>(triplet_distance_penalty=6.5)"]
R --> T["Top-k triplets"]
T --> C["Context: 'Nodes: ... Connections: ...'"]
C --> LLM["LLM completion<br/>via LLMGateway"]
LLM --> A["Final answer"]
The crucial detail is the projection step. The graph database (Ladybug, Neo4j, Postgres, Neptune — chapter three) is queried to load a *subgraph* into memory, and the vector distances from step 1 are mapped onto the subgraph's nodes and edges. This is the trick that makes the algorithm "graph completion" rather than "vector search": the retrieved vector hits are not the answer, they are the *seeds* for a graph re-ranking. A node that has no direct vector match but is connected to a high-scoring node can still surface if its endpoints score well together.
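The re-ranking idea can be sketched in a few lines. This is an illustration, not the body of calculate_top_triplet_importances: the additive score (source + edge + target distance) is an assumption on my part, rank_triplets is a hypothetical name, and the distances are made up. Only the 6.5 penalty comes from the text above.

```python
import heapq

# Default cited above; everything else in this block is illustrative.
TRIPLET_DISTANCE_PENALTY = 6.5


def rank_triplets(triplets, node_dist, edge_dist, top_k=2,
                  penalty=TRIPLET_DISTANCE_PENALTY):
    """Return the top_k triplets with the lowest summed distance.

    Assumption: a triplet's score is source + edge + target distance, and any
    element without a vector hit is charged the penalty instead.
    """
    def score(triplet):
        source, relation, target = triplet
        return (node_dist.get(source, penalty)
                + edge_dist.get(relation, penalty)
                + node_dist.get(target, penalty))

    return heapq.nsmallest(top_k, triplets, key=score)


# Hypothetical vector distances (smaller means closer to the query).
node_dist = {"user": 0.10, "billing": 0.05}
edge_dist = {"complained_about": 0.20, "works_at": 0.90}

triplets = [
    ("user", "complained_about", "billing"),     # all three elements matched
    ("user", "works_at", "acme"),                # target unmatched
    ("acme", "sells", "widgets"),                # nothing matched
    ("invoice", "complained_about", "billing"),  # source unmatched
]

top = rank_triplets(triplets, node_dist, edge_dist)
```

In this toy data the "invoice" node has no vector hit of its own, yet its triplet takes second place because the edge and the other endpoint are close to the query. That is the seed-then-re-rank behavior in miniature.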
When brute-force is wrong: the other graph flavors
GRAPH_COMPLETION is the default but it is not always the right choice. The codebase ships five other graph flavors:
- TRIPLET_COMPLETION: searches the pre-computed Triplet_text vector collection directly, no projection. Faster, less precise. Triplets are materialized in add_data_points.py:202-283 with a Unicode arrow delimiter: "{source_text} -› {edge_text}-›{target_text}". The arrow character is load-bearing: retrievers split on it.
- GRAPH_SUMMARY_COMPLETION: same as GRAPH_COMPLETION but the retrieved edge text is summarized before the LLM step. Useful when the graph is dense.
- GRAPH_COMPLETION_COT: chain-of-thought. Validates the answer, generates follow-up questions, re-retrieves, iterates up to max_iter.
- GRAPH_COMPLETION_DECOMPOSITION: decomposes the query into subqueries, runs graph completion per subquery, merges.
- GRAPH_COMPLETION_CONTEXT_EXTENSION: iteratively extends context: each round's completion is the next round's query, until no new triplets or context_extension_rounds is reached.
I find GRAPH_COMPLETION_CONTEXT_EXTENSION particularly interesting. The intuition is that a single retrieval round is a hypothesis; the LLM's first-pass answer is the next hypothesis; iterate until the graph stops yielding new information. This is closer to how a human researcher works — the question changes as the answer unfolds. The fact that the codebase ships it as a first-class search type, not as a hack on top of GRAPH_COMPLETION, is the architectural commitment to "retrieval is not a single round-trip."
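The loop is simple enough to sketch. Everything here is hypothetical except the name context_extension_rounds: extend_context, toy_retrieve, toy_complete, and the FACTS table are stand-ins I made up, with a keyword lookup playing the role of retrieval and string joining playing the role of the LLM.

```python
def extend_context(query, retrieve, complete, context_extension_rounds=4):
    """Feed each round's completion back in as the next retrieval query."""
    seen = []  # ordered, de-duplicated triplets gathered so far
    current_query = query
    for _ in range(context_extension_rounds):
        new = [t for t in retrieve(current_query) if t not in seen]
        if not new:
            break  # the graph has stopped yielding new information
        seen.extend(new)
        current_query = complete(query, seen)
    return complete(query, seen), seen


# Toy graph: a keyword appearing in the query surfaces its triplets.
FACTS = {
    "billing": [("user", "complained_about", "billing")],
    "user": [("user", "works_at", "acme")],
    "acme": [("acme", "uses", "stripe")],
}


def toy_retrieve(text):
    text = text.lower()
    return [t for key, found in FACTS.items() if key in text for t in found]


def toy_complete(question, triplets):
    # Stand-in for the LLM: restate the triplets as a sentence-like string.
    return "; ".join(" ".join(t) for t in triplets)


answer, seen = extend_context("What about billing?", toy_retrieve, toy_complete)
one_round_answer, one_round_seen = extend_context(
    "What about billing?", toy_retrieve, toy_complete, context_extension_rounds=1
)
```

With one round the toy only finds the billing complaint. With more rounds it walks from billing to the user to the employer to the payment provider, which a single retrieval against the original question would not reach.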
The vector-only holdouts
RAG_COMPLETION, CHUNKS, CHUNKS_LEXICAL, and SUMMARIES are the search types for when the graph is the wrong shape. The simplest case is CHUNKS: vector similarity over text chunks, no LLM completion. This is what a user would build by hand in a notebook. Cognee ships it because sometimes that is the right answer, and because the uniform contract means it costs nothing to keep it in the registry.
CHUNKS_LEXICAL is BM25-style lexical search over chunks, with quoted-string boosting. Useful for queries that include specific terms the user wants to find verbatim — error codes, product names, function signatures. The codebase acknowledges that semantic search is not always the right tool and provides a path for keyword queries.
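For a feel of what lexical scoring with quoted-string boosting looks like, here is a small Okapi BM25 scorer. It is a sketch, not cognee's implementation: bm25_search is a hypothetical name, the k1 and b values are the usual textbook defaults, and multiplying the score by phrase_boost when a quoted string appears verbatim is my assumption about how such a boost could work.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"\w+", text.lower())


def bm25_search(query, chunks, top_k=3, k1=1.5, b=0.75, phrase_boost=2.0):
    """Rank chunks by BM25; chunks containing a quoted phrase get a boost."""
    if not chunks:
        return []
    phrases = [p.lower() for p in re.findall(r'"([^"]+)"', query)]
    terms = set(tokenize(query))
    docs = [tokenize(c) for c in chunks]
    n = len(docs)
    avgdl = (sum(len(d) for d in docs) / n) or 1.0
    # Document frequency: in how many chunks does each term appear?
    df = Counter(t for d in docs for t in set(d))

    scored = []
    for chunk, doc in zip(chunks, docs):
        tf = Counter(doc)
        score = 0.0
        for term in terms:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        # Assumed boost rule: verbatim quoted phrase multiplies the score.
        if any(p in chunk.lower() for p in phrases):
            score *= phrase_boost
        if score > 0:
            scored.append((score, chunk))

    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


chunks = [
    "Error E1234 raised during billing sync",
    "Billing sync finished without error",
    "Unrelated note about onboarding",
]
hits = bm25_search('billing "E1234"', chunks)
```

The chunk that contains the literal error code ranks first, and the chunk with no query terms at all is dropped, which is the behavior you want for verbatim lookups.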
SUMMARIES returns pre-computed document summaries. The summaries are generated during cognify() by extract_graph_and_summarize (the same task that does entity extraction). They are a coarse-grained view of the document, indexed in their own vector collection. SUMMARIES is the search type for "give me the gist, not the details."
The agentic retriever
AGENTIC_COMPLETION is a different beast. It is a ReAct-style agent loop, not just a retriever. The retriever gets skills (named prompts the agent can invoke) and tools (callable functions). The agent reasons about which skill to use, calls the tool, observes the result, and continues until it has an answer. The execute_tool function in cognee/modules/retrieval/agentic_retriever.py:32 enforces per-dataset permissions the same way search does: the agent can only invoke tools the user has permission for. This is a real production constraint: an agent in a multi-tenant deployment cannot be allowed to read across datasets.
I think of AGENTIC_COMPLETION as the search type that escaped the search abstraction and became an agent. The fact that it still conforms to the three-method contract is a sign of the contract's robustness — the agent loop fits inside get_completion_from_context.
Permission filtering: silent on denial
A subtle but important detail: in cognee/api/v1/search/search.py:289-294, dataset names are translated to UUIDs the user has read permission on. The list of allowed datasets is passed to the retriever. The retriever returns results; the dispatcher filters by ACLs; if a result's dataset is not in the allowed list, it is silently dropped.
The "silent on denial" behavior is deliberate. The README's troubleshooting section notes: "Permission Denied on Search → returns empty list rather than error (prevents information leakage)." An attacker who can see "you don't have access to dataset X" can infer that dataset X exists and that they might have access to other datasets. Returning an empty list preserves the security property that denied information is indistinguishable from non-existent information.
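The property is easy to state in code. This sketch uses hypothetical helpers (authorized_dataset_ids, filter_results) and plain dicts for results; it only illustrates the idea that a denied dataset and a nonexistent dataset produce the same observable output.

```python
from uuid import uuid4


def authorized_dataset_ids(requested_names, name_to_id, readable_ids):
    """Resolve names to IDs, keeping only datasets the user may read.

    Unknown names and unreadable datasets are dropped the same way,
    so the caller cannot tell the two cases apart.
    """
    return [
        name_to_id[name]
        for name in requested_names
        if name in name_to_id and name_to_id[name] in readable_ids
    ]


def filter_results(results, allowed_ids):
    """Silently drop any result whose dataset is not in the allowed list."""
    allowed = set(allowed_ids)
    return [r for r in results if r["dataset_id"] in allowed]


# Hypothetical tenant setup: the user can read "mine" but not "secret".
mine, secret = uuid4(), uuid4()
name_to_id = {"mine": mine, "secret": secret}
readable_ids = {mine}

results = [
    {"dataset_id": mine, "text": "Billing runs monthly."},
    {"dataset_id": secret, "text": "Acquisition target: Acme."},
]

denied = filter_results(
    results, authorized_dataset_ids(["secret"], name_to_id, readable_ids)
)
missing = filter_results(
    results, authorized_dataset_ids(["no-such-dataset"], name_to_id, readable_ids)
)
allowed = filter_results(
    results, authorized_dataset_ids(["mine"], name_to_id, readable_ids)
)
```

Both denied and missing come back as empty lists, so a caller probing for dataset names learns nothing from the response.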
The dispatcher, top to bottom
The dispatcher is cognee/modules/search/methods/search.py:40-150. Its flow:
1. log_query writes the query to the relational DB (search.py:79)
2. Emits an OpenTelemetry span cognee.search.authorize (search.py:90)
3. Calls authorized_search (search.py:99) which calls get_authorized_existing_datasets (search.py:182) to filter datasets by user read permission
4. search_in_datasets_context (search.py:213) wraps each dataset search in set_database_global_context_variables so each dataset runs against its own isolated graph/vector DB context
5. Per dataset, calls get_retriever_output (search.py:293), the actual core loop
6. log_result records only the completion text (not the full result_object) to avoid unbounded DB growth (search.py:135-148)
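Step 4's per-dataset isolation can be pictured with Python's contextvars module. This is an analogy rather than a description of set_database_global_context_variables, whose internals are not shown here: current_dataset, search_one, and search_all are hypothetical names, and I am assuming the isolation is context-variable based because the function name suggests it.

```python
import asyncio
import contextvars

# Hypothetical context variable naming the dataset a search runs against.
current_dataset = contextvars.ContextVar("current_dataset", default=None)


async def search_one(dataset_id, query):
    # Each asyncio task runs in its own copy of the context, so this
    # assignment is invisible to sibling tasks and to the caller.
    current_dataset.set(dataset_id)
    await asyncio.sleep(0)  # yield control so the tasks interleave
    return f"{current_dataset.get()}: {query}"


async def search_all(dataset_ids, query):
    # gather wraps each coroutine in a task, which copies the context.
    return await asyncio.gather(*(search_one(d, query) for d in dataset_ids))


results = asyncio.run(search_all(["ds-a", "ds-b"], "billing"))
```

Even though the two searches interleave, each one reads back its own dataset ID, and the value never leaks into the calling context.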
The "log only the completion text" decision is one of those quiet choices that distinguishes production engineering from research code. The raw result_object can be 50–100 KB — in a long-running deployment, persisting every result would be untenable. The completion text is the only thing the user actually needs to see again; the result object can be reconstructed from the query and the dataset state.
The seventeen search types share the same dispatcher, the same contract, the same logging discipline, the same permission filtering. The seventeen values are not redundancy. They are the retrieval layer's coverage of the design space. The brute-force triplet search is the algorithm that won the BEAM benchmark. The other sixteen are the rest of the toolkit — there when you need them, free when you don't.