Usage

Installation

To use Rakam Systems, install the package from PyPI with pip:

(.venv) $ pip install rakam-systems

This installs the library together with its dependencies, including faiss and sentence-transformers; the README lists the rest.
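
To confirm that the installation worked, the installed version can be looked up with the standard library. This is a minimal sketch: the helper name installed_version is hypothetical, and the distribution names checked are the ones used in the pip command above and on PyPI.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report what is present in the active environment.
for name in ("rakam-systems", "sentence-transformers"):
    version = installed_version(name)
    print(f"{name}: {version or 'not installed'}")
```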

Creating Vector Stores

To create and manage vector stores, you can use the VectorStores class from the rakam_systems.vector_store module.

Example:

from rakam_systems.vector_store import VectorStores
from rakam_systems.core import VSFile, Node, NodeMetadata

# Initialize a Vector Store
vector_store = VectorStores(base_index_path="path/to/index", embedding_model="sentence-transformers/all-MiniLM-L6-v2")

# Create vector store from nodes
nodes = [
    Node(content="Text data 1", metadata=NodeMetadata(source_file_uuid="file1", position=1)),
    Node(content="Text data 2", metadata=NodeMetadata(source_file_uuid="file2", position=2))
]
vector_store.create_from_nodes("store_name", nodes)
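
For intuition about what a vector store does with these nodes, here is a toy sketch in plain Python. It is not the library's implementation: the class ToyVectorStore and the word-count "embedding" are illustrative stand-ins for a real embedding model and a faiss index. The idea is the same, though: each node's content is turned into a vector, and a query returns the nodes whose vectors are closest to the query vector.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store: keeps (vector, content, metadata) triples."""

    def __init__(self):
        self._entries = []

    def add(self, content, metadata):
        self._entries.append((embed(content), content, metadata))

    def search(self, query, k=1):
        """Return the k entries most similar to the query."""
        query_vector = embed(query)
        ranked = sorted(
            self._entries,
            key=lambda entry: cosine(query_vector, entry[0]),
            reverse=True,
        )
        return [(content, metadata) for _, content, metadata in ranked[:k]]


store = ToyVectorStore()
store.add("Paris is the capital of France", {"source_file_uuid": "file1", "position": 1})
store.add("FAISS indexes dense vectors", {"source_file_uuid": "file2", "position": 2})
top = store.search("capital of France", k=1)
print(top)
```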

Retrieval-Augmented Generation (RAG)

RAG combines vector store search with LLM prompt generation: passages retrieved from the vector store are added to the prompt so that the response is grounded in that context.

Example:

from rakam_systems.generation.agents import RAGGeneration, Agent
from rakam_systems.vector_store import VectorStores

# Initialize Vector Store and Agent
vector_store = VectorStores(base_index_path="path/to/index", embedding_model="sentence-transformers/all-MiniLM-L6-v2")
agent = Agent(model="gpt-3.5-turbo", api_key="your_openai_api_key")

# Create RAG Action
rag_action = RAGGeneration(agent, sys_prompt="System Prompt", prompt="User Prompt", vector_stores=vector_store)

# Execute RAG Action
query = "What is the capital of France?"
result = rag_action.execute(query=query)
print(result)
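
The prompt-assembly step of RAG can be sketched independently of any library. The function build_rag_prompt and its prompt layout are assumptions made for illustration; how RAGGeneration formats its prompt internally is not documented here.

```python
def build_rag_prompt(sys_prompt, query, passages, max_passages=3):
    """Combine a system prompt, retrieved passages, and the user query."""
    # Number the passages so the model (and the reader) can refer to them.
    context = "\n".join(
        f"[{i}] {passage}"
        for i, passage in enumerate(passages[:max_passages], start=1)
    )
    return f"{sys_prompt}\n\nContext:\n{context}\n\nQuestion: {query}\nAnswer:"


prompt = build_rag_prompt(
    "Answer using only the context.",
    "What is the capital of France?",
    ["Paris is the capital of France.", "France is in Europe."],
)
print(prompt)
```

Capping the number of passages keeps the prompt within the model's context window; the value 3 is an arbitrary choice for this sketch.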

Content Extraction

The library provides several content extractors, including ones for PDF and JSON files.

Example for extracting from a PDF:

from rakam_systems.ingestion.content_extractors import PDFContentExtractor

# Initialize PDF Content Extractor
pdf_extractor = PDFContentExtractor(parser_name="SimplePDFParser", output_format="markdown")

# Extract content from a PDF file
vs_files = pdf_extractor.extract_content(source="path/to/file.pdf")
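
As a rough picture of what extracting text from a JSON source can involve, the sketch below flattens a parsed JSON value into "path: value" lines using only the standard library. The function flatten_json is hypothetical and is not the library's JSON extractor, whose behavior may differ.

```python
import json


def flatten_json(value, prefix=""):
    """Yield 'path: value' strings for every leaf in a parsed JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten_json(child, f"{prefix}.{key}" if prefix else str(key))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten_json(child, f"{prefix}[{index}]")
    else:
        yield f"{prefix}: {value}"


document = json.loads('{"title": "Report", "sections": [{"heading": "Intro"}]}')
lines = list(flatten_json(document))
print("\n".join(lines))
```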

Node Processing

To split content into smaller chunks (nodes), use one of the node processor classes.

Example for processing nodes:

from rakam_systems.ingestion.node_processors import CharacterSplitter

# Initialize Node Processor
splitter = CharacterSplitter(max_characters=512, overlap=50)

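# Process Nodes
# vs_files is the result of extract_content in the Content Extraction example
# (assumed here to be a list of VSFile objects)
for vs_file in vs_files:
    splitter.process(vs_file)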
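
To show what the max_characters and overlap settings mean, here is a minimal sketch of character-based splitting with overlap. The function split_characters is illustrative only; CharacterSplitter may handle boundaries differently.

```python
def split_characters(text, max_characters=512, overlap=50):
    """Split text into chunks of at most max_characters, where each chunk
    repeats the last `overlap` characters of the previous one."""
    if max_characters <= 0:
        raise ValueError("max_characters must be positive")
    if not 0 <= overlap < max_characters:
        raise ValueError("overlap must be in [0, max_characters)")

    step = max_characters - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_characters])
        # Stop once this chunk reaches the end of the text.
        if start + max_characters >= len(text):
            break
    return chunks


text = "".join(chr(97 + i % 26) for i in range(1000))
chunks = split_characters(text, max_characters=512, overlap=50)
print([len(chunk) for chunk in chunks])
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one chunk, at the cost of some duplicated text in the index.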

Classification with Vector Stores

Use the vector store to classify queries based on predefined trigger queries.

Example:

from rakam_systems.generation.agents import ClassifyQuery
import pandas as pd

# Sample Data for Classification
trigger_queries = pd.Series(["What is the capital of", "Tell me about"])
class_names = pd.Series(["Geography", "General Info"])

# Initialize Classification Action
classifier = ClassifyQuery(agent=None, trigger_queries=trigger_queries, class_names=class_names)

# Classify a new query
result = classifier.execute("What is the capital of France?")
print(result)
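
The idea behind trigger-query classification is that a new query receives the class of the trigger query it most resembles. The sketch below uses word-set (Jaccard) overlap as a simple stand-in for embedding similarity; the function classify is hypothetical and does not reflect how ClassifyQuery is implemented.

```python
def tokens(text):
    """Lowercase word set with question marks removed."""
    return set(text.lower().replace("?", "").split())


def classify(query, trigger_queries, class_names):
    """Return the class name paired with the most similar trigger query."""
    query_tokens = tokens(query)

    def jaccard(trigger):
        trigger_tokens = tokens(trigger)
        union = query_tokens | trigger_tokens
        return len(query_tokens & trigger_tokens) / len(union) if union else 0.0

    best = max(range(len(trigger_queries)), key=lambda i: jaccard(trigger_queries[i]))
    return class_names[best]


trigger_queries = ["What is the capital of", "Tell me about"]
class_names = ["Geography", "General Info"]
label = classify("What is the capital of France?", trigger_queries, class_names)
print(label)
```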