ragfs_core/lib.rs
//! # ragfs-core
//!
//! Core types and traits for the RAGFS (Retrieval-Augmented Generation `FileSystem`) project.
//!
//! This crate provides the foundational abstractions used throughout RAGFS:
//!
//! - **Content Extraction**: [`ContentExtractor`] trait for extracting text from files
//! - **Document Chunking**: [`Chunker`] trait for splitting content into searchable chunks
//! - **Embedding Generation**: [`Embedder`] trait for converting text to vector embeddings
//! - **Vector Storage**: [`VectorStore`] trait for storing and searching embeddings
//! - **Indexing Coordination**: [`Indexer`] trait for managing the indexing pipeline
//!
//! ## Architecture
//!
//! The crate is organized around a pipeline pattern:
//!
//! ```text
//! File → ContentExtractor → Chunker → Embedder → VectorStore
//!                                                     ↓
//!                                         SearchQuery → SearchResult
//! ```
//!
//! ## Key Types
//!
//! | Type | Description |
//! |------|-------------|
//! | [`FileRecord`] | Metadata about an indexed file |
//! | [`Chunk`] | A segment of content with its embedding |
//! | [`ExtractedContent`] | Raw content extracted from a file |
//! | [`SearchQuery`] | Parameters for a vector search |
//! | [`SearchResult`] | A matching chunk with similarity score |
//!
//! ## Key Traits
//!
//! | Trait | Purpose |
//! |-------|---------|
//! | [`ContentExtractor`] | Extract text and metadata from files |
//! | [`Chunker`] | Split extracted content into chunks |
//! | [`Embedder`] | Generate vector embeddings |
//! | [`VectorStore`] | Store and search vector embeddings |
//! | [`Indexer`] | Coordinate the indexing pipeline |
//!
//! ## Example
//!
//! ```rust,ignore
//! use std::path::Path;
//!
//! use ragfs_core::{Chunker, ContentExtractor, Embedder, VectorStore};
//! use ragfs_core::{ChunkConfig, EmbeddingConfig, Error};
//!
//! // Components implement these traits
//! async fn index_file(
//!     extractor: &impl ContentExtractor,
//!     chunker: &impl Chunker,
//!     embedder: &impl Embedder,
//!     store: &impl VectorStore,
//!     path: &Path,
//! ) -> Result<(), Error> {
//!     // 1. Extract content
//!     let content = extractor.extract(path).await?;
//!
//!     // 2. Chunk the content
//!     let chunks = chunker.chunk(&content, &ChunkConfig::default()).await?;
//!
//!     // 3. Generate embeddings
//!     let texts: Vec<&str> = chunks.iter().map(|c| c.content.as_str()).collect();
//!     let embeddings = embedder.embed_text(&texts, &EmbeddingConfig::default()).await?;
//!
//!     // 4. Store in vector database
//!     // ... create Chunk structs with embeddings and store
//!     Ok(())
//! }
//! ```
//!
//! ## Feature Flags
//!
//! This crate has no optional features.
//!
//! ## Related Crates
//!
//! - `ragfs-extract`: Content extraction implementations
//! - `ragfs-chunker`: Chunking strategy implementations
//! - `ragfs-embed`: Embedding generation with Candle
//! - `ragfs-store`: `LanceDB` vector storage implementation
//! - `ragfs-index`: Indexing pipeline coordination
//! - `ragfs-query`: Query parsing and execution

pub mod error;
pub mod traits;
pub mod types;

pub use error::{ChunkError, EmbedError, Error, ExtractError, Result, StoreError};
pub use traits::*;
pub use types::*;