My RAG AI Isn’t That Hard to Build!

Description

Explore the world of Retrieval Augmented Generation (RAG) AI and its application in building smarter search systems. Jirachai Chansivanon, a Consulting Engineer at MongoDB, demonstrates how to create a RAG AI system using MongoDB and vector search to enable natural language queries. Using the example of searching an anime database, he guides you through the process of data ingestion, embedding creation, and search implementation. Learn how MongoDB Atlas's vector search capabilities simplify the development of applications that understand and respond to complex human language, making information retrieval more intuitive and efficient. Discover how to leverage these techniques to build your own RAG AI applications.

Chapters

  • Introduction and Anime Icebreaker 0:00
  • Introducing RAG AI and Project Goal 1:20
  • What is RAG AI? 2:27
  • Speaker Introduction and Background 2:35
  • RAG AI Explained: Retrieval, Augmentation, Generation 3:39
  • RAG AI Beyond Traditional Databases 6:12
  • Visual Explanation of RAG AI Workflow 7:40
  • Recap and Deep Dive into Vector Databases 9:00
  • Demo Setup: Anime Search Application 11:31
  • Setting up MongoDB Atlas Vector Search 14:18
  • Live Demo: Searching Anime with Natural Language 17:33
  • Data Ingestion and Vectorization Process 23:42
  • Q&A: Model Consistency and Re-embedding 25:56
  • Q&A: Choosing the Right Embedding Model 27:17
  • Live Demo: Building a New RAG AI Project from Scratch 28:24
  • Live Demo: Implementing Search Functionality 36:52
  • Wrap-up and Next Steps: GitHub Repo, Atlas CLI, and Local Deployment 40:01

Transcript

These community-maintained transcripts may contain inaccuracies. Please submit any corrections on GitHub.

Introduction and Anime Icebreaker 0:00

Okay.

Ah, can I?

Okay.

Ah, chotto matte. (Wait a moment.)

I'm so excited.

All right. Good evening, everyone. Welcome to my session. First of all, I have some questions for you. Do you know this anime, or these cartoons?

Anyone know this one? You know this one, right? No? That's okay. It's actually quite old. I only realized that a couple of days ago, when I asked some students from the newer generation and they said, "I don't know this one. What is it?" In Thai, this anime is called น้องสาวผมไม่ได้น่ารักขนาดนั้นหรอก. In Japanese, it's Ore no Imōto ga Konnani Kawaii Wake ga Nai. So you may be wondering how it relates to what we're going to talk about today.

Introducing RAG AI and Project Goal 1:20

So today, I'm going to share something with you.

I'm going to talk about how we create a RAG AI. So what do you like, by the way? Yup. Yup. I ask because I'm going to base the story on this anime. The main character, the girl in the middle, looks like a very good girl: a good student, really high class, something like that. But behind the scenes, in her secret life, she loves anime and playing eroge, yes, 18-plus games. So I thought about how I could help her. Today, I'm going to create a RAG AI that will help Kirino search for anime more easily. But before we do that,

What is RAG AI? 2:27

I would like to talk about RAG AI first. What is it? So...

Speaker Introduction and Background 2:35

Oh, first of all, I forgot to introduce myself. My name is Jirachai Chansivanon. My nickname is Job. I currently work at MongoDB in Singapore as a Consulting Engineer, so I normally work on how to make MongoDB faster, how to install it, how to implement it, things like that. Previously I worked at Microsoft as a Digital Specialist, which may sound confusing. It basically means specializing in certain Microsoft technology areas. I was a specialist for Azure, focusing on applications in Azure: how to install them, how to build them, something like that. But that's the past. Today I'm at MongoDB, and I've found something really fantastic here as well. So, let's move on.

RAG AI Explained: Retrieval, Augmentation, Generation 3:39

So, let's start with RAG. Does anyone know about RAG, or what it stands for? I guess most of you already know. Basically, RAG stands for Retrieval-Augmented Generation. It's a technique for building applications that connect to an LLM, a large language model like GPT, Gemini, or DeepSeek, and let it find or search for something for you based on human language rather than exact keywords. For example, say you need to search for sweaters. You can type, "I'm cold right now, so I'm looking for something to wear to keep my body warm," and your AI can search for a sweater or other clothes for you. So how does that work?

Take a normal application. You have your database here. You interact with the application by searching or browsing, and the application searches the database right away and returns the result to you. That's what a normal application does, right? RAG retrieves the data the same way a normal application does, but it adds something more at this step.

It adds augmentation and generation. What are they? Augment means we use the capability of the LLM, which could be Gemini or whichever model you like, to combine what was retrieved with its base knowledge. If you find some part of the data and need to add more to make it complete, you can use the LLM to augment it. Then it generates the answer back in human language. You have probably seen this in new applications. Today, you can ask ChatGPT to search for something, and ChatGPT goes and searches Google or Bing, right? It's Bing, actually. It searches for you and generates an answer in human language, which is easier for you to read. So that's what RAG does, but

RAG AI Beyond Traditional Databases 6:12

actually, RAG doesn't only work with a normal database. Where is my next slide?

Okay. One thing that makes RAG more special than normal searching is that it can search data such as PDFs, JSON files, images, or anything else that you can convert into numeric representations, that is, vectors. That's what we're going to do with RAG today. Normally this sits in the application part, right? But we're going to talk about how to convert our data and store it in a way that makes the LLM's job easier, because the LLM doesn't just go and grab some data and come back. It needs to search for something similar to what you asked for. For example, say you're looking for a sweater or something to make you feel warmer. You can't just search the database for "warm clothes" by keyword. You need to search for items that are similar in meaning to your input.

So we're going to talk about how to make this process happen in a realistic way, because the LLM does not search word for word; it searches by similarity.
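To make the sweater example concrete, here is a minimal sketch contrasting keyword matching with similarity over vectors. Everything in it is an assumption for illustration: the catalog, the hand-written three-number vectors, and the function names are hypothetical and do not come from the talk's code. A real embedding model would produce the vectors automatically.

```typescript
// Toy catalog. The vectors are hand-written for illustration only
// (axes: warmth, formality, footwear); they are not model output.
interface Item {
  name: string;
  vector: number[];
}

const catalog: Item[] = [
  { name: "wool sweater", vector: [0.9, 0.2, 0.0] },
  { name: "silk tie", vector: [0.1, 0.9, 0.0] },
  { name: "beach sandals", vector: [0.0, 0.1, 0.9] },
];

// Keyword search: matches only when a query word literally appears in the name.
function keywordSearch(query: string, items: Item[]): string[] {
  const words = query
    .toLowerCase()
    .split(/\s+/)
    .filter((w) => w.length > 0);
  return items
    .filter((item) => words.some((w) => item.name.indexOf(w) !== -1))
    .map((item) => item.name);
}

// Dot product of two vectors of the same length.
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

// Vector search: score every item against the query vector, keep the best.
function nearest(queryVector: number[], items: Item[]): string {
  let best = "";
  let bestScore = -Infinity;
  for (const item of items) {
    const score = dot(queryVector, item.vector);
    if (score > bestScore) {
      bestScore = score;
      best = item.name;
    }
  }
  return best;
}

// "I'm cold" shares no word with "wool sweater", so keyword search finds nothing.
const keywordHits = keywordSearch("I'm cold", catalog);

// Pretend an embedding model mapped "I'm cold" to a vector pointing along "warmth".
const semanticHit = nearest([1.0, 0.1, 0.0], catalog);
```

The keyword search comes back empty, while the vector comparison picks the sweater because its vector points in nearly the same direction as the query.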

Visual Explanation of RAG AI Workflow 7:40

So let me switch to this computer instead. I will draw something for you. Well, I need to move this one.

You may be wondering why I don't just use one computer. While I was building this demo, I picked the wrong computer. That's why it ended up like this. The demo should be on this computer, but at that moment I picked the wrong one. Okay, let me share my screen. Well, that's my debug window. Okay, let's start from here.

Let me see where my drawing application is.

Give me a second.

All right, it's here.

Let me re-share my screen.

All right. Okay. Let me recap what I have just

Recap and Deep Dive into Vector Databases 9:00

explained previously. Normally, you have an application, and your application needs to search for something in the database, right? Usually a search goes to a search engine, which matches on keywords. But we will not do that. Instead, we search with a whole sentence, and we use an LLM to convert that sentence into a numeric representation, a list of numbers.

Then we search based on these numbers. Why do we need to convert our sentence into numbers? Because numbers can represent meaning in a wider, more flexible way than exact words.

After we convert the query into numbers, we search the data by numbers as well. That means the data we store in the database is also stored as numbers. To store this kind of data, you need what we call a vector database, and then we can use a search engine to search over the vectors. That's what we are going to do. Today we will not focus on the generation part. We will focus on this part: how we import the data and how we search it. Our database today is, of course, MongoDB, because MongoDB is designed to store JSON-format data and it's also optimized as a vector database. That means you can store the original data along with the vector data, and you can use our search engine, which we call Atlas Vector Search, to search it.

Normally this process is quite complicated, but MongoDB makes it easier. I'm not sure you'll find it as easy as I do, but it is much easier than it used to be. So, let's see what we need to do.

Demo Setup: Anime Search Application 11:31

Okay. So, let me walk through my demonstration. In this system, I'm going to build an application that helps users search for anime more easily using human language. That means we need a database of anime first. So the first thing I need to do is load the data from somewhere else into MongoDB. I chose to get the list of anime from the MyAnimeList website. Then I do one more thing: I convert it to numbers. How? I use an LLM, but this LLM is an embedding model. Basically, an embedding model is another kind of language model, one that specializes in converting your text into numbers, with the number of dimensions depending on the model. That means if you have the synopsis of an anime, it converts the whole story into a numeric representation. In this part, I use a REST API to get the whole story of each anime, ask my embedding model to convert it into numbers, and then insert the numbers along with the full document into MongoDB. This is the first step.

After I insert, I get something like this. Is it here? Oh, no, this one. Here is my original data, and here I have the semantic embeddings. This is what I get by using the model to convert the whole text into numbers like these. That means this data is ready to be searched with vector search. So how are we going to use vector search? Do I need to create a new vector search engine, or install something else? Luckily, MongoDB already has a vector search engine and a text search engine built in. As of now, that engine is only available in the cloud version, but I'm hoping that this year they will let you install it on your local machine.
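The key property of an embedding model described here is that text of any length comes out as a list of numbers with a fixed length. The sketch below shows only that property, using a toy stand-in that hashes characters into buckets. It is an assumption for illustration, it carries no semantic meaning, and `makeToyEmbedder` is a hypothetical name, not something from the talk's repository.

```typescript
// An embedder maps any text to a vector of fixed length.
type Embedder = (text: string) => number[];

// Toy stand-in for a real embedding model (NOT semantic): each character is
// counted into one of `dimensions` buckets, then the counts are scaled to
// unit length so every non-empty text yields a vector of length 1.
function makeToyEmbedder(dimensions: number): Embedder {
  return (text: string): number[] => {
    const vector: number[] = Array.from({ length: dimensions }, () => 0);
    const lower = text.toLowerCase();
    for (let i = 0; i < lower.length; i++) {
      const bucket = lower.charCodeAt(i) % dimensions;
      vector[bucket] = (vector[bucket] ?? 0) + 1;
    }
    const norm = Math.sqrt(vector.reduce((sum, x) => sum + x * x, 0));
    return norm === 0 ? vector : vector.map((x) => x / norm);
  };
}

// A real model such as the one used in the talk returns 768 numbers;
// 8 keeps the toy readable.
const embed = makeToyEmbedder(8);
const shortVector = embed("A pirate story.");
const longVector = embed(
  "A boy sets out to sea, gathers a crew, and searches for a legendary treasure."
);
```

Both `shortVector` and `longVector` have eight entries even though the inputs differ greatly in length, which is why every document in a collection can be compared against every query.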

Setting up MongoDB Atlas Vector Search 14:18

So today, I installed the cloud version of MongoDB on my computer by using the development tools. That means I don't need to connect to the internet at all. I just create my search index here.

Before my session, I created a search index to tell MongoDB vector search which field holds the vectors. I create the search index in MongoDB directly, set the type to vector, and then set the path. The path means which field we are going to use for vector search. Here, I set it to my semantic embeddings field. I need to copy the whole text. Yep. To do this in MongoDB Atlas, after you create your Atlas cluster, which is free, you just create a search index, choose vector search like this, and set the path to the field where you store the vectors. Then you set the number of dimensions. How do we find this? It depends on which model you use. For me, I forgot which one I used for this section. I used...

Oh, Nomic Embed Text. The number of dimensions depends on the model you choose. My Nomic model has 768 dimensions, so here I put 768. To find this number, check your model's documentation; it tells you how many dimensions it returns. Next is similarity. Similarity is basically the mathematical function that helps us match things that are similar. Here we have three choices: Euclidean, cosine, and dot product. You may be thinking, "Now I have to sit through a mathematics class." Actually, yes, if you want a deep understanding of it. But today I'm going to use cosine, because cosine compares the direction of two vectors, which works well for finding text that is similar in meaning. To understand the other functions, you can check the MongoDB documentation.

So, I just set this one as cosine.
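For readers who want the short version of the mathematics class, here is a minimal sketch of the three similarity functions on plain number arrays. The function names are my own, and this is only the textbook math, not how Atlas computes or normalizes its scores internally.

```typescript
// Dot product: sum of element-wise products.
function dotProduct(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

// Euclidean distance: straight-line distance between the two points.
// Smaller means more similar.
function euclideanDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  return Math.sqrt(a.reduce((sum, x, i) => sum + (x - (b[i] ?? 0)) ** 2, 0));
}

// Cosine similarity: dot product divided by both lengths, so only the
// direction matters. 1 means same direction, 0 means perpendicular.
function cosineSimilarity(a: number[], b: number[]): number {
  const denominator = Math.sqrt(dotProduct(a, a)) * Math.sqrt(dotProduct(b, b));
  if (denominator === 0) throw new Error("Cosine is undefined for a zero vector");
  return dotProduct(a, b) / denominator;
}

const base = [1, 0];
const sameDirection = [2, 0]; // twice as long, same direction
const perpendicular = [0, 3];

const cosSame = cosineSimilarity(base, sameDirection);
const cosPerp = cosineSimilarity(base, perpendicular);
const distSame = euclideanDistance(base, sameDirection);
const dotSame = dotProduct(base, sameDirection);
```

Note how `base` and `sameDirection` get a perfect cosine score even though their Euclidean distance is not zero: cosine ignores vector length, while the other two functions are affected by it.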

And then I change the index name to something different, like this. Now your database is ready for vector search and text search.
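The settings chosen in this section (type, path, dimensions, similarity) can be written down as an Atlas Vector Search index definition. The sketch below only builds that definition as a plain object. The field name `semantic_embedding` is an assumption, since the talk shows the field on screen but the transcript does not spell it out, and the helper name is hypothetical.

```typescript
type Similarity = "euclidean" | "cosine" | "dotProduct";

interface VectorField {
  type: "vector";
  path: string; // document field that stores the embedding
  numDimensions: number; // must match the embedding model's output size
  similarity: Similarity;
}

interface VectorIndexDefinition {
  fields: VectorField[];
}

// Build the index definition for one vector field.
function vectorIndexDefinition(
  path: string,
  numDimensions: number,
  similarity: Similarity = "cosine"
): VectorIndexDefinition {
  if (!Number.isInteger(numDimensions) || numDimensions <= 0) {
    throw new Error("numDimensions must be a positive integer");
  }
  return { fields: [{ type: "vector", path, numDimensions, similarity }] };
}

// 768 matches the Nomic embedding model mentioned in the talk.
// "semantic_embedding" is a hypothetical field name.
const animeIndex = vectorIndexDefinition("semantic_embedding", 768);
const animeIndexJson = JSON.stringify(animeIndex, null, 2);
```

The resulting JSON has the same shape as what you fill in through the Atlas UI form shown in the demo, so the two approaches should be interchangeable.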

Live Demo: Searching Anime with Natural Language 17:33

To search, you just type your sentence. But as of now your sentence is still a string, human text. That means we need to convert it into numbers as well. So, if you go to my GitHub, I will... Oops.

It's gone. My GitHub is gone. Let me check. What happened? Hello.

Hey. Anyone here? Riffy? Is it back?

The screen is up. Okay, now it's back. I don't need this doc, actually. So, you can check my source code from here and follow along with me. But as of now, the source code I pushed to GitHub is not finished yet. I will push the rest later. I only just finished it when I came on stage. So, yep.

So, when you go to my repository, you're going to notice that it's not complete. But the whole thing is inside here, in the anime pullers folder. I created this folder to ingest my data into MongoDB, but for now I use it for the demonstration as well. After I convert my data, insert it, and create the index for vector search, I just need to create the search system. In my files, you will notice that I have search.ts here.

I hope this file makes MongoDB search easier to understand. In this process, I just receive the input from the user and convert it into an embedding, a numeric representation. Then I search MongoDB directly, like this. Here is my input. It searches this collection, the anime list, tries to find 500 candidates that could be similar to what I'm searching for, and returns up to 100 search results back to me.

I will show you. Would anyone here like to search for a cartoon? Do you have a cartoon you'd like to search for? Dragon Ball. Dragon Ball, right? Okay, let me try. Instead of searching for "Dragon Ball" directly, I'll search based on its meaning. Let me try.

Give me a moment.

I just need to say it like this. I put in what I need to search for. I'll search for something like "story about..."

What do we call Goku?

I mean, the type of superhuman. A Super Saiyan. Okay, "the story about Super Saiyan." How do you spell Saiyan? S-E-Y, right? S-A-I. Oh.

Y-A-N. Saiyans. "To collect seven balls."

Okay. Huh.

Dragon Ball's awesome.

And this one.

Okay, I will search with a small query like this. Let's see. You're going to think, why? Because I removed this one. Yeah, I call this one. Yep, there it is.

Okay, it returned Dragon Ball. Like this.

I didn't include "Dragon" in my search at all. I was just talking about something like Super Saiyan, right? Or let me try something different, like a story about...

Do you have an idea? One Piece, right? Oh, I think "Luffy" may be a bit too straightforward, so I'm going to try "pirate."

I will accidentally misspell it as well. "Pirate to find the precious..."

I've never searched for this one before. Let's see what I get.

Do I get One Piece at all? Oh, it's here. It may not be the top result, but it's included here as well. So this is what I can do from here.
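The search step described earlier in this section, 500 candidates narrowed to at most 100 results, corresponds to a `$vectorSearch` aggregation stage. Here is a minimal sketch that only builds the stage as a plain object, so it runs without a database. The index name and field path are hypothetical, and this is not the speaker's actual search.ts.

```typescript
interface VectorSearchOptions {
  index: string; // name of the vector search index
  path: string; // field that stores the embeddings
  numCandidates: number; // how many nearest neighbours to consider
  limit: number; // how many results to return
}

// Build a $vectorSearch stage for an already-embedded query.
function buildVectorSearchStage(queryVector: number[], options: VectorSearchOptions) {
  if (options.limit > options.numCandidates) {
    throw new Error("limit must not exceed numCandidates");
  }
  return {
    $vectorSearch: {
      index: options.index,
      path: options.path,
      queryVector,
      numCandidates: options.numCandidates,
      limit: options.limit,
    },
  };
}

// In the real flow the query vector comes from the same embedding model used
// at ingestion time and has 768 numbers; three are shown here for brevity.
const animePipeline = [
  buildVectorSearchStage([0.1, 0.2, 0.3], {
    index: "anime_vector_index", // hypothetical index name
    path: "semantic_embedding", // hypothetical field name
    numCandidates: 500,
    limit: 100,
  }),
];
```

With the MongoDB Node.js driver, a pipeline like this would be passed to `collection.aggregate(animePipeline)`. A larger `numCandidates` generally trades speed for better recall.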

Data Ingestion and Vectorization Process 23:42

Before I end my session, I want to show you how I ingest my data and convert it into vector representations. You can go back and open my start producer, actually. Because I only built this demonstration this afternoon, I needed to ingest around a thousand anime into my computer directly. So instead of just using a for loop, I use Kafka. From here, I'm just pulling the data from MyAnimeList, right? If you go to this file, get anime,

I just get all the anime, looping through the next page and the next page until the end. Then I send them to the next step to embed the data. Once I have a batch of data, I do the same thing I did for searching: I convert it into vector representations and insert it directly into my database.
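The page-by-page pulling described above can be sketched as a small loop. This is a simplified assumption of how such a loop might look: the page shape (`items`, `hasNextPage`) is hypothetical rather than the real MyAnimeList API response, and the Kafka hand-off the speaker uses between fetching and embedding is left out.

```typescript
interface Page<T> {
  items: T[];
  hasNextPage: boolean;
}

type FetchPage<T> = (page: number) => Promise<Page<T>>;

// Keep requesting the next page until the source says there are no more.
// maxPages is a safety cap so a misbehaving API cannot loop forever.
async function fetchAllPages<T>(fetchPage: FetchPage<T>, maxPages = 1000): Promise<T[]> {
  const all: T[] = [];
  for (let page = 1; page <= maxPages; page++) {
    const result = await fetchPage(page);
    all.push(...result.items);
    if (!result.hasNextPage) break;
  }
  return all;
}

// In-memory stand-in for the REST API, used only to exercise the loop.
const fakePages: string[][] = [["anime A", "anime B"], ["anime C"], ["anime D"]];
const fakeFetch: FetchPage<string> = async (page) => ({
  items: fakePages[page - 1] ?? [],
  hasNextPage: page < fakePages.length,
});
```

Calling `fetchAllPages(fakeFetch)` resolves to all four titles in order. In a real pipeline each fetched batch would then go to the embedding step before insertion.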

And with that, it's done. In this process, you won't see the search UI at all, because this all happens on the back end. It's about how we convert the data and how we send it. That's the whole process. You may see other files here as well, because I accidentally inserted the wrong data, and I also included the adult anime here, so I need to deal with it later on. It's quite complex, I mean the way I laid out the source code here. But I hope you can understand it and get some idea of how vector search works.

Q&A: Model Consistency and Re-embedding 25:56

Well, I will open the floor for you to ask about the process, or if you need me to clarify something. Would anyone like to ask about this?

Yes.

Okay, let me... Thank you. Yes. Um, so,

when you embed the data to store in the database, does it need to be the same model that you use to search? And if we decide to change the model, do we have to redo the database? Awesome. That's a very good question. I had the same question before. So, yes, of course: you need to search with the same model that you used to store your data. And if you would like to change the model, yes, that's right, you need to re-embed everything again.

Like what I have done here. You will see that I have one file named re-embed. I changed my mind about the structure of the data I had just embedded, so I needed to start over and ingest my data again. That's right. Okay, thank you so much for your question. Any other questions?
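One way to guard against mixing models is to record which model produced each stored vector and check it before searching. This is my own suggestion, not something shown in the talk; the field names and the second model name below are hypothetical.

```typescript
interface StoredEmbedding {
  model: string; // name of the model that produced the vector
  vector: number[];
}

// A stored vector is only comparable with a query vector when both came from
// the same model and therefore have the same number of dimensions.
function needsReembedding(
  stored: StoredEmbedding,
  queryModel: string,
  queryDimensions: number
): boolean {
  return stored.model !== queryModel || stored.vector.length !== queryDimensions;
}

// Three-number vectors keep the example short; real ones would have 768.
const storedWithNomic: StoredEmbedding = {
  model: "nomic-embed-text",
  vector: [0.1, 0.2, 0.3],
};

const sameModel = needsReembedding(storedWithNomic, "nomic-embed-text", 3);
const otherModel = needsReembedding(storedWithNomic, "some-other-model", 3);
const otherSize = needsReembedding(storedWithNomic, "nomic-embed-text", 4);
```

Only the first check returns false. A changed model name or a changed dimension count both signal that the collection has to be re-embedded, and a different dimension count also requires a new vector index.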

Q&A: Choosing the Right Embedding Model 27:17

Yes.

Yeah, so I think many of us might not know much about text embedding or things like that. Is there any advice on how to choose, which embedding model is suitable for what kind of search? Okay. Thank you so much. That's also a good question. For the model, you need to do some benchmarking, testing with your own dataset, or read through the documentation, because it depends on what kind of dataset you have and what languages your data is in. So you need to check on that. For me, I use Nomic because it's good at both Thai and English. And when I checked the benchmarks, it had good results, and in terms of performance on my computer, it doesn't take much time to convert the text. That's right. Thank you so much for your question.
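Benchmarking on your own dataset can be as simple as writing a few queries whose correct answer you already know and counting how often each model's search returns it in the top k results. The sketch below shows that hit-rate measure. The types, the sample queries, and the two fake search functions are assumptions for illustration; in practice each search function would wrap a real vector search backed by one candidate model.

```typescript
interface LabeledQuery {
  query: string;
  expectedId: string; // the document a good search should return
}

// A search function returns document ids, best match first.
type SearchFn = (query: string, k: number) => string[];

// Fraction of queries whose expected document appears in the top k results.
function hitRateAtK(queries: LabeledQuery[], search: SearchFn, k: number): number {
  if (queries.length === 0) return 0;
  let hits = 0;
  for (const q of queries) {
    const topK = search(q.query, k).slice(0, k);
    if (topK.indexOf(q.expectedId) !== -1) hits += 1;
  }
  return hits / queries.length;
}

const testSet: LabeledQuery[] = [
  { query: "super saiyan collects seven balls", expectedId: "dragon-ball" },
  { query: "pirate looking for treasure", expectedId: "one-piece" },
];

// Fake searchers standing in for two different embedding models.
const modelA: SearchFn = (query) =>
  query.indexOf("pirate") !== -1 ? ["one-piece", "other"] : ["dragon-ball", "other"];
const modelB: SearchFn = () => ["other", "one-piece"];

const scoreA = hitRateAtK(testSet, modelA, 2);
const scoreB = hitRateAtK(testSet, modelB, 2);
```

Here the first fake model finds both expected titles and the second finds only one, so the first would be preferred on this tiny test set. Embedding speed on your own hardware, which the speaker also mentions, would be measured separately.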

Live Demo: Building a New RAG AI Project from Scratch 28:24

Because we still have time, right? I'll try to start over. Five minutes, right? Oh, okay, I will try. I could try to create a whole new project. Of course, this demo could fail, so let's try.

I just want to show you an example of what you can do when you go home today. First of all, just go to MongoDB Atlas. Oops. Yep, my web page hasn't finished loading yet. So, first of all, just go to your MongoDB Atlas.

Give me a second. Why do we need to create a MongoDB cluster first? Because we need a database to store the data, right? So just open MongoDB Atlas and create your first cluster. MongoDB has a forever-free tier that doesn't need your credit card at all. Just create it directly, and then you get your cluster like this. Next, we need to insert our data. We already have our cluster, so the next step is to get the connection string.

From here you can choose any option. I will choose maybe Shell.

So you can try to connect to your database as well.

And I need to change my username.

I just need to check that my connection is valid. So just connect to your database,

and it's... It doesn't work. Oh, it worked. Awesome. That means we now have our first cluster. The next step is to ingest your data, and you need to ingest it with vectors so it's ready for vector search, right? So, I will try to

insert some random data. I will create a new file here. Let me create demo.ts, like this. And then I will insert the story of myself and my friends.

I will insert my name and something about myself into the data. I will start with me.

Name, like, Job.

And I will put my description here.

But this is not my name, so I will use nickname instead. The description will be: the man who is working at MongoDB, lonely in Singapore. Yeah, I need to attack myself first. So sad, no? And my friends usually

(and my friend is him, actually)

visit me at 2:00 a.m., randomly. Yep, I put it like this. Then I will add another record. I will pick my friend. You, I will write about you. So this is my friend's name,

and the description. How should I describe you? Hmm.

Okay. Maybe: a man who was studying abroad, who was a student in Japan. Something like that.

It's quite a small dataset, actually, but I will try to show you how vector search works here. After you have your data, you need to convert it, right? So you build your first converter. On my computer, I already installed Ollama, the application that allows you to run your LLM locally, and I already installed the package to connect to Ollama directly, like this. This is my function to convert my data. So, I will create another function. Oh. Is there somewhere to put the mic?
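The speaker's converter uses a client package for Ollama. As an alternative sketch, the same conversion can be done by calling Ollama's local REST endpoint directly. The assumptions here are that Ollama is listening on its default port 11434, that the `nomic-embed-text` model has been pulled, and that the `/api/embeddings` endpoint is available in your Ollama version. The HTTP function is passed in as a parameter so the code can be exercised without a running server; the type and function names are my own.

```typescript
// Minimal shapes for the HTTP call, so no DOM or Node typings are required.
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

type HttpPost = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<HttpResponse>;

// Ask a local Ollama server to embed one piece of text.
async function generateEmbedding(
  text: string,
  post: HttpPost,
  model = "nomic-embed-text",
  baseUrl = "http://localhost:11434"
): Promise<number[]> {
  const response = await post(`${baseUrl}/api/embeddings`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, prompt: text }),
  });
  if (!response.ok) {
    throw new Error(`Embedding request failed with status ${response.status}`);
  }
  const data = (await response.json()) as { embedding?: unknown } | null;
  const embedding = data?.embedding;
  if (!Array.isArray(embedding)) {
    throw new Error("Response did not contain an embedding array");
  }
  return embedding as number[];
}

// In Node 18+ or a browser, the built-in fetch can be passed as `post`:
//   const vector = await generateEmbedding("a student in Japan", fetch);
```

Injecting the HTTP function is a design choice that keeps the sketch testable; the speaker's own version, as described, goes through the installed client package.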

Okay. I will create one function here, called main, right? And then I will convert my data. Converting the data is very simple. You just loop over it and send each item to the converter directly to get a vector.

And I will call this one.

Copilot generated this for me. So, we need to JSON.stringify first, and send only the description, I guess.

And then you put the data structure here. Give me a moment.

Next step, you insert your data. You need to connect to MongoDB first, so I will use the function I already created, connect DB, and it needs to be awaited. And then I will insert it.

Collection. Thank you, Copilot. And I just insert one,

like this. Of course, we need to include the full original document as well. So we can use the object spread to include all the original data, and then include the embedding here. Right? And done. This is the first part, inserting the data. So, let's see what it looks like.
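The loop just described (embed each description, then insert the original fields plus the embedding) can be sketched as follows. The embedding function and the collection are passed in as parameters so the sketch runs on its own; `InsertTarget` is a hypothetical minimal interface standing in for a MongoDB collection, and the sample people are paraphrased from the demo with a placeholder for the friend's name, which the transcript does not give.

```typescript
interface Person {
  nickname: string;
  description: string;
}

type PersonWithEmbedding = Person & { embedding: number[] };

type EmbedFn = (text: string) => Promise<number[]>;

// Only the one method the loop needs; stands in for a database collection.
interface InsertTarget<T> {
  insertOne(doc: T): Promise<unknown>;
}

// Embed each description, then store the original fields plus the vector.
async function ingestPeople(
  people: Person[],
  embed: EmbedFn,
  collection: InsertTarget<PersonWithEmbedding>
): Promise<number> {
  let inserted = 0;
  for (const person of people) {
    const embedding = await embed(person.description);
    // Object spread keeps every original field next to the new embedding.
    await collection.insertOne({ ...person, embedding });
    inserted += 1;
  }
  return inserted;
}

const people: Person[] = [
  {
    nickname: "Job",
    description: "The man who is working at MongoDB, lonely in Singapore.",
  },
  {
    nickname: "Friend", // placeholder; the real name is not in the transcript
    description: "A man who was studying abroad, who was a student in Japan.",
  },
];
```

In the demo the embedding function is the Ollama-backed converter and the collection comes from the connected MongoDB client. Inserting one document at a time mirrors what the speaker does; for larger datasets a bulk insert would be the usual choice.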

I need to run the demo file.

Oh. Main function. Mhm.

Oh, yeah, that's right. Thank you so much. Yep, it should be this name. Yeah, nickname. All right. And yes.

And hopefully it... Done. Okay, I think it's done, because I didn't print anything. Oh, and you can see my password right here on screen. Awesome. Yep, it's fine. This one I inserted into my local Atlas, actually. For you, just use the normal Atlas, like I described. And when you look here, we should have a new database, no? Why isn't it showing up? This fake database. Oh, it's here. Okay, I got a new collection here. Now you can see that my data, my descriptions, are already converted to vectors, right?

Live Demo: Implementing Search Functionality 36:52

The next step is to search. So I will create another function here for search.

I will refer to my previous source code here. An async function, search, and I will pass q?: string here. Of course, we need to retrieve the input from q, or from the parameters, like this. Completing. Thank you, Copilot. You helped me a lot. Then we just send our input into generate embedding, like this. And then, what next? Just connect to MongoDB. I think I can copy and paste.

Okay. So, we just use the aggregate function on the people collection and return the result. And don't forget to change your path as well. For our path, we use embedding, right? Yep, we use embedding. This one.

And one thing that we forgot: we need to create an index for the vectors as well.

We need to come back here and click create search index.

We put in our path, embedding, and our model's dimension size, 768.

And we're going to use the cosine similarity function.

Yep. Okay, now it's ready. Everything looks good. I will hide everything else as well; I don't need to show it. Then convert the result to an array and console.log the results. Okay, I will show only the nickname, and I will remove the descriptions.
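Showing only the nickname and hiding everything else can be done with a `$project` stage after the `$vectorSearch` stage. The sketch below builds such a pipeline as plain objects. The index name and the candidate and limit numbers are assumptions chosen for a two-document collection, and including the similarity score is my addition rather than something shown in the demo.

```typescript
type Stage = Record<string, unknown>;

// Vector search over the "embedding" field, then keep only the nickname
// (plus the similarity score) in each result.
function buildPeopleSearchPipeline(queryVector: number[]): Stage[] {
  return [
    {
      $vectorSearch: {
        index: "vector_index", // hypothetical index name
        path: "embedding",
        queryVector,
        numCandidates: 10,
        limit: 5,
      },
    },
    {
      $project: {
        _id: 0,
        nickname: 1,
        score: { $meta: "vectorSearchScore" },
      },
    },
  ];
}

// A real query vector would have 768 numbers from the embedding model.
const peoplePipeline = buildPeopleSearchPipeline([0.1, 0.2, 0.3]);
```

With the MongoDB Node.js driver this would be used as `await collection.aggregate(peoplePipeline).toArray()`, and the resulting array is what gets logged to the console in the demo.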

All right. Yes. The moment of truth.

Why is there an error here? Let's see.

What am I missing here?

What am I missing here?

I'm missing parentheses here, and I need to change this one to search.

Okay, let's try to run this one. I will type "Japanese students."

I think that's too straightforward. Maybe "study abroad," "studying abroad."

Let's see the result. Yep. We got my friend first, followed by Job. Yay! Okay. Yay!

Wrap-up and Next Steps: GitHub Repo, Atlas CLI, and Local Deployment 40:01

Time is running out. Sorry. Thank you so much. That's all. It's Riffy's idea, not mine. Okay. Say bye to Singapore. Bye-bye. Thank you. Do you want to say any last words? No, I just want to say, try this out. You can come back to check my repository and try to deploy this with your own MongoDB Atlas cluster. And if you would like to deploy MongoDB Atlas on your own machine, you need to install something we call the Atlas CLI.

Once you install the Atlas CLI, you can deploy the same features as the MongoDB Atlas cloud version on your computer, using atlas followed by deployments and setup. If I remember correctly, you set the type to local, like this one. I'm not sure. You need to install two things: the Atlas CLI and Docker. After that, it will deploy something similar to the MongoDB Atlas cloud version. I forgot the parameters. That means you can run your own MongoDB Atlas on your computer directly. It will create a new Docker container with MongoDB Atlas inside, and you can follow the steps. Then you will get your own MongoDB Atlas on your computer. Enjoy, and happy hacking. Thank you so much. Thank you. Thank you very much. Okay.

Okay.