---
title: "SuperRAG (Managed RAG as a service)"
sidebarTitle: "SuperRAG"
description: "Supermemory provides a managed RAG solution - extraction, indexing, storage, and retrieval."
icon: "bolt"
---

Supermemory doesn't just store your content—it transforms it into optimized, searchable knowledge. Every upload goes through an intelligent pipeline that extracts, chunks, and indexes content in the ideal way for its type.

## Automatic Content Intelligence

When you add content, Supermemory:

1. **Detects the content type** — PDF, code, markdown, images, video, etc.
2. **Extracts content optimally** — Uses type-specific extraction (OCR for images, transcription for audio)
3. **Chunks intelligently** — Applies the right chunking strategy for the content type
4. **Generates embeddings** — Creates vector representations for semantic search
5. **Builds relationships** — Connects new knowledge to existing memories

```typescript
// Just add content — Supermemory handles the rest
await client.add({
  content: pdfBase64,
  contentType: "pdf",
  title: "Technical Documentation"
});
```

No chunking strategies to configure. No embedding models to choose. It just works.

---

## Smart Chunking by Content Type

Different content types need different chunking strategies. Supermemory applies the optimal approach automatically:

### Documents (PDF, DOCX)

PDFs and documents are chunked by **semantic sections** — headers, paragraphs, and logical boundaries. This preserves context better than arbitrary character splits.

```
├── Executive Summary (chunk 1)
├── Introduction (chunk 2)
├── Section 1: Architecture
│   ├── Overview (chunk 3)
│   └── Components (chunk 4)
└── Conclusion (chunk 5)
```

### Code

Code is chunked using [code-chunk](https://github.com/supermemoryai/code-chunk), our open-source library that understands AST (Abstract Syntax Tree) boundaries:

- Functions and methods stay intact
- Classes are chunked by method
- Import statements grouped separately
- Comments attached to their code blocks

```typescript
// A 500-line file becomes meaningful chunks:
// - Imports + type definitions
// - Each function as a separate chunk
// - Class methods individually indexed
```

This means searching for "authentication middleware" finds the actual function, not a random slice of code.
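
As a rough illustration, function-boundary chunking can be sketched with a toy splitter. The real code-chunk library parses a full AST; this regex-based version only conveys the idea of keeping each function intact:

```typescript
// Simplified sketch of function-boundary chunking.
// code-chunk uses a real AST parser; this regex splitter is illustrative only.
function chunkByFunction(source: string): string[] {
  const chunks: string[] = [];
  let current: string[] = [];
  for (const line of source.split("\n")) {
    // Start a new chunk at each top-level function declaration
    if (/^(export\s+)?(async\s+)?function\s/.test(line) && current.length > 0) {
      chunks.push(current.join("\n"));
      current = [];
    }
    current.push(line);
  }
  if (current.length > 0) chunks.push(current.join("\n"));
  return chunks;
}

const file = [
  'import { sign } from "jsonwebtoken";',
  "",
  "function authMiddleware(req) { /* ... */ }",
  "",
  "function logger(req) { /* ... */ }",
].join("\n");

// Imports land in the first chunk; each function becomes its own chunk
console.log(chunkByFunction(file).length); // 3
```

Because each function is its own chunk, a query like "authentication middleware" matches the `authMiddleware` chunk directly instead of a slice that straddles two unrelated functions.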

### Web Pages

URLs are fetched, cleaned of navigation/ads, and chunked by article structure — headings, paragraphs, lists.

### Markdown

Chunked by heading hierarchy, preserving the document structure.
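
A minimal sketch of heading-based chunking, assuming a simplified model where each heading starts a new chunk and the parent heading trail is carried along so context survives the split (illustrative only, not the production pipeline):

```typescript
// Hypothetical sketch: split markdown at headings, attaching the
// heading trail (e.g. ["Guide", "Setup"]) to each chunk for context.
interface Chunk {
  path: string[]; // heading trail from the document root
  text: string;
}

function chunkMarkdown(md: string): Chunk[] {
  const chunks: Chunk[] = [];
  const stack: { level: number; title: string }[] = [];
  let body: string[] = [];

  const flush = () => {
    const text = body.join("\n").trim();
    if (text) chunks.push({ path: stack.map((h) => h.title), text });
    body = [];
  };

  for (const line of md.split("\n")) {
    const m = /^(#{1,6})\s+(.*)/.exec(line);
    if (m) {
      flush();
      const level = m[1].length;
      // Pop headings at the same or deeper level, then push this one
      while (stack.length && stack[stack.length - 1].level >= level) stack.pop();
      stack.push({ level, title: m[2] });
    } else {
      body.push(line);
    }
  }
  flush();
  return chunks;
}

const doc = "# Guide\n## Setup\nInstall it.\n## Usage\nRun it.";
const chunks = chunkMarkdown(doc);
console.log(chunks[0].path.join(" > ")); // "Guide > Setup"
```

Carrying the heading trail means a chunk retrieved in isolation still knows which section of the document it came from.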

See [Content Types](/concepts/content-types) for the full list of supported formats.

---

## Hybrid Memory + RAG

Supermemory combines the best of both approaches in every search:

<CardGroup cols={2}>
  <Card title="Traditional RAG" icon="magnifying-glass">
    - Finds similar document chunks
    - Great for knowledge retrieval
    - Stateless — same results for everyone
  </Card>

  <Card title="Memory System" icon="brain">
    - Extracts and tracks user facts
    - Understands temporal context
    - Personalizes results per user
  </Card>
</CardGroup>

With `searchMode: "hybrid"` (the default), you get both:

```typescript
const results = await client.search({
  q: "how do I deploy the app?",
  containerTag: "user_123",
  searchMode: "hybrid"
});

// Returns:
// - Deployment docs from your knowledge base (RAG)
// - User's previous deployment preferences (Memory)
// - Their specific environment configs (Memory)
```

---

## Search Optimization

Two flags give you fine-grained control over result quality:

### Reranking

Re-scores results using a cross-encoder model for better relevance:

```typescript
const results = await client.search({
  q: "complex technical question",
  rerank: true  // +~100ms, significantly better ranking
});
```

**When to use:** Complex queries, technical documentation, when precision matters more than speed.

### Query Rewriting

Expands your query to capture more relevant results:

```typescript
const results = await client.search({
  q: "how to auth",
  rewriteQuery: true  // Expands to "authentication login oauth jwt..."
});
```

**When to use:** Short queries, user-facing search, when recall matters.
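
The two flags are independent, so presumably they can be combined for short but high-stakes queries. A sketch with a stubbed client that just echoes its parameters (substitute the real Supermemory client from the earlier examples):

```typescript
// Hypothetical stub standing in for the Supermemory client,
// used here only to show the combined request shape.
type SearchParams = {
  q: string;
  containerTag?: string;
  searchMode?: string;
  rerank?: boolean;
  rewriteQuery?: boolean;
};

const client = {
  // Echoes the parameters it was called with
  async search(params: SearchParams) {
    return { params, results: [] };
  },
};

client
  .search({
    q: "how to auth",
    rerank: true,       // trade ~100ms of latency for better ordering
    rewriteQuery: true, // expand the short query for recall
  })
  .then((res) => console.log(res.params));
```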

---

## Why It's "Super"

| Traditional RAG | SuperRAG |
|-----------------|-----------|
| Manual chunking config | Automatic per content type |
| One-size-fits-all splits | AST-aware code chunking |
| Just document retrieval | Hybrid memory + documents |
| Static embeddings | Relationship-aware graph |
| Generic search | Rerank + query rewriting |

You focus on building your product. Supermemory handles the RAG complexity.

---

## Next Steps

<CardGroup cols={2}>
  <Card title="Content Types" icon="file-stack" href="/concepts/content-types">
    All supported formats and how they're processed
  </Card>
  <Card title="How It Works" icon="cpu" href="/concepts/how-it-works">
    The full processing pipeline
  </Card>
  <Card title="Memory vs RAG" icon="scale" href="/concepts/memory-vs-rag">
    When to use each approach
  </Card>
  <Card title="Search" icon="search" href="/search">
    Search parameters and optimization
  </Card>
</CardGroup>