Unveiling the Power of Multimodal RAG
Evangelism Workspace
Created on February 7, 2024
Transcript
Ryan Siegler - Data Scientist
Unveiling the Power of Multimodal RAG
- KDB.AI Vector Database
- Introduction to RAG
- What is Multimodal Data?
- Multimodal RAG
- Retrieval
- Generation
Agenda
KDB.AI
The vector database that enables the most relevant temporal and semantic search to power language models and anomaly detection at scale.
Start Your 90 Day Evaluation
Learn More
Start for Free
Integrated with Azure ML and OpenAI for developers who require turnkey technology stacks to speed up the process of building and deploying AI applications.
Evaluate large scale generative AI applications on-premises or on your own cloud provider.
- Single container deployment
- Scale to your requirements
- Customize to your dev environment
KDB.AI on Azure ML
KDB.AI Server
Experiment with smaller generative AI projects with a vector database in our cloud.
- 4 GB memory per instance
- 30 GB data storage
- Get started quickly with sample projects
KDB.AI Cloud
Get Started with Flexible Options
Architecture Walkthrough
- Ingest source data
- Embed the source data with an embedding model
- Store the vector embeddings in the KDB.AI vector database
- Query the database to find the most relevant vectors
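The four steps above can be sketched end to end. This is a minimal sketch only: the KDB.AI client API is not shown, and both the embedding model (a character-hash stub) and the "database" (an in-memory NumPy index) are stand-ins, not the real stack.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stub embedding model: hashes characters into a small unit vector.
    A real pipeline would call a sentence-transformer or an API model."""
    vec = np.zeros(8)
    for i, ch in enumerate(text.lower()):
        vec[i % 8] += ord(ch)
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# 1. Ingest source data
documents = ["KDB.AI is a vector database",
             "RAG retrieves relevant context for an LLM",
             "Multimodal data includes text, images, and audio"]

# 2. Embed the source data with the embedding model
embeddings = np.stack([embed(d) for d in documents])

# 3. Store the embeddings (an in-memory dict standing in for KDB.AI)
index = {"vectors": embeddings, "payloads": documents}

# 4. Query: embed the question, rank stored vectors by cosine similarity
def search(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    scores = index["vectors"] @ q  # dot product; vectors are unit-normalized
    top = np.argsort(scores)[::-1][:k]
    return [index["payloads"][i] for i in top]

print(search("what is a vector database?"))
```

In production, step 3 and step 4 would go through the vector database's client rather than a local array, but the data flow is the same.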
Understanding Vector Embeddings
Architecture Walkthrough
Retrieval Augmented Generation (RAG) enables LLMs to work with your own data. It has two key steps:
- Retrieval
- Generation
Retrieval Augmented Generation (RAG)
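The two steps can be sketched as a minimal loop. Both components here are stubs and assumptions for illustration: `retrieve` does a word-overlap lookup where a real system would run a vector-database query, and `generate` formats a string where a real system would prompt an LLM.

```python
# Minimal two-step RAG loop with stubbed components.
KNOWLEDGE = {
    "pricing": "KDB.AI Cloud offers 4 GB memory and 30 GB storage per instance.",
    "rag": "RAG retrieves relevant context, then an LLM generates an answer from it.",
}

def retrieve(query: str) -> str:
    """Step 1 - Retrieval: pick the stored snippet sharing the most words
    with the query (a stand-in for a vector similarity search)."""
    q_words = set(query.lower().split())
    return max(KNOWLEDGE.values(),
               key=lambda doc: len(q_words & set(doc.lower().split())))

def generate(query: str, context: str) -> str:
    """Step 2 - Generation: a real system would prompt an LLM with
    the query plus the retrieved context."""
    return f"Based on: '{context}' -> answer to '{query}'"

question = "how much storage does KDB.AI Cloud offer?"
context = retrieve(question)
print(generate(question, context))
```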
Multimodal Data
Data Representation
The goal is to represent different data modalities in a unified manner within a vector database:
- Embed all data types in a shared vector space
- Use a single multimodal embedding model
Multimodal Embedding Model
Unified Text Embedding
Transform all data into text format. A text embedding model is then used to embed text chunks, image summaries, and audio transcriptions.
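The unified-text idea can be sketched as follows. The summarizer, transcriber, and embedder below are toy stubs standing in for real models (e.g. an image-captioning model, a speech-to-text model, and a sentence embedder); none of these names come from the deck.

```python
# Unified Text approach: convert every modality to text, then use ONE
# text embedding model for all of it. All functions are illustrative stubs.

def summarize_image(image_path: str) -> str:
    return f"Image summary of {image_path}"       # stand-in for a captioning model

def transcribe_audio(audio_path: str) -> str:
    return f"Transcription of {audio_path}"       # stand-in for speech-to-text

def embed_text(text: str) -> list[float]:
    return [float(len(text)), float(text.count(" "))]  # toy 2-d "embedding"

corpus = [
    "Plain text chunk about vector databases",    # text: embed directly
    summarize_image("chart.png"),                 # image -> text summary
    transcribe_audio("talk.wav"),                 # audio -> transcription
]

# One text model embeds everything, so all modalities share one space
vectors = [embed_text(item) for item in corpus]
```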
Architecture Walkthrough
Unified Text
- Summarize images & tables
- Use a text embedding model
- Store the text embeddings in the vector database
Multimodal Embeddings
- The embedding model embeds multiple modalities together
- Store the embeddings in the vector database
- Limitation: few such models are available
- Either method can be used to retrieve embeddings
- Pass retrieved data to LLM to perform RAG
Multimodal Retrieval Methods
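The second method, a single multimodal embedding model, can be sketched with a toy model that maps text and images into the same space. The model below is a stub (CLIP-style models are the real-world analogue); the function, the 2-d space, and the item schema are all assumptions for illustration.

```python
# Multimodal Embedding approach: ONE model places text AND images in the
# same vector space, so a text query can retrieve images directly.
import math

def multimodal_embed(item: dict) -> list[float]:
    """Toy shared-space embedder: maps any modality to a point on the
    unit circle based on its content (images via their pixel-derived
    description here, since this stub cannot see pixels)."""
    content = item["caption"] if item["type"] == "image" else item["content"]
    angle = math.radians(sum(ord(c) for c in content) % 360)
    return [math.cos(angle), math.sin(angle)]

store = [
    {"type": "text",  "content": "quarterly revenue grew 12%"},
    {"type": "image", "caption": "quarterly revenue grew 12%"},  # same meaning
]
vectors = [multimodal_embed(item) for item in store]

# Same meaning -> same location in the shared space (in this toy model),
# which is what makes cross-modal retrieval possible.
assert vectors[0] == vectors[1]
```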
- The user query and retrieved data are passed to an LLM for generation
- Generation: what type of LLM should I use?
- Multimodal LLM: can take multiple data types as input
- Text-based LLM: pass only text-based data, such as text chunks, summaries, and transcriptions
RAG - Generation
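The choice between the two LLM types can be sketched as a simple routing rule: if retrieval returned any raw images, a multimodal LLM is needed; text-only context can go to a text-based LLM. The model names and item schema below are illustrative assumptions, not recommendations from the deck.

```python
# Route retrieved context to the right kind of generation model.

def pick_llm(retrieved: list[dict]) -> str:
    """Return which LLM type can handle the retrieved context."""
    if any(item["modality"] == "image" for item in retrieved):
        return "multimodal-llm"   # accepts images + text as input
    return "text-llm"             # text chunks, summaries, transcriptions only

print(pick_llm([{"modality": "text", "content": "a retrieved chunk"}]))
print(pick_llm([{"modality": "image", "content": "raw image bytes"},
                {"modality": "text", "content": "a retrieved chunk"}]))
```

With the unified-text method, everything retrieved is already text, so a text-based LLM always suffices; raw-image retrieval is what forces the multimodal-LLM branch.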
THANK YOU
rsiegler@kx.com