Fantastic article, Jenny!
I'm currently working on a RAG project at my 9-to-5, testing different chunking and indexing strategies. Having small, overlapping chunks improved the specificity, like you said.
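For anyone curious, the small-overlapping-chunks idea can be sketched in a few lines. This is a minimal illustration, not the project's actual code, and the sizes are arbitrary rather than tuned:

```python
def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into word chunks of `size` words, each overlapping
    the previous chunk by `overlap` words, so context at chunk
    boundaries is not lost."""
    words = text.split()
    step = size - overlap  # assumes size > overlap
    chunks = []
    for start in range(0, len(words), step):
        chunk = words[start:start + size]
        if chunk:
            chunks.append(" ".join(chunk))
        if start + size >= len(words):
            break
    return chunks
```

The overlap means a sentence cut off at one chunk boundary still appears whole in the next chunk, which is where the specificity gain comes from.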
I started exploring MCP to gain more control over database-sourced content and built a simple MCP server prototype over the weekend. It was fun to watch the qwen3:32b model, running locally on Ollama, figure out SQL queries from the database schema and my vague prompt. It even fixed its own broken queries until it retrieved the data I requested. It took me a few hours to get this multi-step loop, where the LLM self-corrects on error, working correctly. Now my mind is racing with ways to apply this idea to other problems. Self-improving learning loops would be like rocket fuel for problem-solving.
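The self-correcting loop has roughly this shape. A minimal sketch, where `ask_llm` is a stand-in for the actual Ollama call (everything here is illustrative, not the prototype's real code):

```python
import sqlite3

def run_with_self_correction(conn, ask_llm, question, max_attempts=5):
    """Loop: ask the model for SQL, run it, and feed any error message
    back to the model until the query succeeds or we give up."""
    error = None
    for _ in range(max_attempts):
        sql = ask_llm(question, error)           # model proposes (or repairs) a query
        try:
            return conn.execute(sql).fetchall()  # success: return the rows
        except sqlite3.Error as exc:
            error = str(exc)                     # failure: loop again with the error
    raise RuntimeError(f"gave up after {max_attempts} attempts: {error}")
```

The key design point is that the database error message itself becomes part of the next prompt, which is what lets the model repair its own broken queries.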
The ability for an LLM to call different tools via MCP (Model Context Protocol) opens up a new path, combining AI capabilities with other content types. I used the FastMCP library by Jeremiah Lowin - worth checking out.
I started using Cursor a few days ago and connected it to my Obsidian vault, and got a very similar experience to what you described.
Thanks for pointing out how RAG is embedded everywhere. I can see that, too, after building one myself. Looks like we are going through similar paths.
Thank you so much for sharing this, Finn! 🙌
Your MCP experiment is fascinating, especially the way the LLM self-corrects broken SQL queries. That kind of looped reasoning gives the model a whole new level of agency. I hadn’t explored FastMCP yet, but now I definitely will. And the fact that you got qwen3:32b running locally on Ollama to deliver SQL results is seriously impressive.
I’ve only been using Cursor to retrieve data the way you described, and in my mind it already feels like a mature tool. But connecting it with a local LLM is definitely next level.
It’s also super affirming to hear you’re having such a similar experience, it really feels like we’re walking parallel paths.
Appreciate you sharing your build and reflections, it’s got my mind spinning on what else might be possible from here. 🚀
What a breakdown of RAG, Jenny! You definitely sparked some ideas I will be experimenting with. Thank you! As always, awesome work. 🙌🙌
Thank you Joel! Glad it sparked some thoughts for you, would love to see it when you have the experiment out 🙏🙌
This is so good, Jenny. Amazing breakdown of how RAG works and how it’s already embedded in tools we use every day (even the ones that don’t advertise it).
Loved how you connected the tech with real use cases, especially the part on chunking strategy and how depth over breadth is what actually builds a functional second brain.
Saving this and planning to revisit your anti-juggling framework once I’m back from vacation.
Thank you Daria! You are so spot on.
Have an amazing trip 🌴
Awesome post! I've also been working on a RAG for my substack content, how funny 😊
Haha that's funny, I'd say great minds think alike! 😉 🙌
And do share it when it goes public, I’d love to check it out!
Will do, thank you!
Amazing breakdown Jenny!
I’ve been putting off getting my hands dirty with RAG. Now I know I have to!
I think the closest thing I could try is to follow your lead by building a chatbot on my website that acts as my customer support agent. It could help visitors book a call, ask about services, or find any relevant information.
Excited to try this!
Also I use Cursor for writing too, lol!
Thanks Wyndo! Yes, definitely a lot of fun to try this!
That’s such a smart use of RAG to turn your chatbot into a support agent for booking calls and answering questions! I’d love to see it when it’s live!
And funny to hear that you use Cursor for writing as well! I thought Claude desktop app was your favorite, but makes total sense since Cursor runs on Claude by default. :)
Great post Jenny!!! I was just thinking about this topic yesterday, so your post is perfect timing. I read every word of it, and I found it super helpful.
It still feels like we're in a super messy era where we have to use a ton of tools and patch things together rather than having our data easily all just be in one place.
I also loved your summary at the end on when to reach for which tools.
Out of curiosity, I have two questions for you:
1. I was talking to someone yesterday, and they mentioned that there are different types of RAG databases, and he particularly recommended a knowledge graph RAG for what I wanted to do. Have you explored different types of databases?
2. Would it ever make sense to use something like Pinecone for someone to put in all of their second brain data?
Thanks so much for the kind words Michael!!! I totally agree, we’re in this strange, messy moment where every tool is both promising and partial. That chaos is frustrating, but the process of figuring it all out is also kind of exciting.
To your questions:
1. Yeah, there are different types of databases you can use with RAG.
The simplest kind is a pure vector database (like Pinecone): content is stored as embeddings, and queries return the most similar results.
Then there are hybrid ones that mix structured data (the values people traditionally store) with semantic search. Most companies probably adopt this type of database.
The third type is the knowledge graph database, which is more relationship-aware. Instead of just asking "what’s similar to this thing?", it also asks "what’s connected to this thing, and how?"
If your data is big and complex, and you care about the relationships between ideas, that knowledge graph style might really help. But it’s also more work to set up and maintain.
I really liked this example for deciding if you actually need a knowledge graph database:
Say you're building something for healthcare. A regular vector database will find stuff similar to your symptoms. But a knowledge graph can show you the whole chain of symptoms → diseases → treatments → outcomes. It gives you a deeper, connected context.
But if you don’t need those links, the knowledge graph db is probably overkill.
(And honestly, for personal use, Obsidian already does this pretty well. No need for a big cloud setup, unless you're building something for others.)
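To make the healthcare example concrete, here is a toy sketch of graph-style retrieval: the graph is just an adjacency dict, and "retrieval" follows explicit edges instead of similarity scores. All the nodes and edges are made up for illustration:

```python
# A tiny hand-written knowledge graph: symptom -> disease -> treatment -> outcome.
graph = {
    "fever":       ["flu"],
    "cough":       ["flu", "bronchitis"],
    "flu":         ["rest", "antivirals"],
    "bronchitis":  ["antibiotics"],
    "rest":        ["recovery"],
    "antivirals":  ["recovery"],
    "antibiotics": ["recovery"],
}

def chains_from(node, graph, path=None):
    """Walk every chain from a start node down to the leaves,
    returning each full path as a list of node names."""
    path = (path or []) + [node]
    children = graph.get(node, [])
    if not children:                 # leaf: this is a complete chain
        return [path]
    chains = []
    for child in children:
        chains.extend(chains_from(child, graph, path))
    return chains
```

A vector database would only hand back nodes that look similar to "cough"; the graph walk hands back the whole cough → disease → treatment → outcome chain, which is the "connected context" being described above.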
2. I agree that Pinecone totally makes sense for a second brain setup, especially with a huge amount of content.
I personally haven't implemented any of those yet. Right now I’m keeping it simple and storing everything locally in JSON, keeping it cheap and fast because my dataset is still small. But when it grows, I’d probably switch to something like Pinecone too. It’s easier to maintain and cheaper than the graph-based options, and still super powerful.
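A minimal sketch of what that local setup can look like. The hand-rolled 2-d vectors below are stand-ins for real embeddings from an embedding model, and the `store` list is exactly the kind of structure you could persist as a plain JSON file:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def search(store, query_vec, top_k=2):
    """Rank stored chunks by cosine similarity to the query embedding
    and return the top_k chunk texts."""
    scored = [(cosine(item["embedding"], query_vec), item["text"]) for item in store]
    scored.sort(reverse=True)
    return [text for _, text in scored[:top_k]]

# Toy "database": in practice each embedding would come from a model,
# and this list would be loaded from / saved to a JSON file.
store = [
    {"text": "note on chunking",  "embedding": [0.9, 0.1]},
    {"text": "note on MCP",       "embedding": [0.1, 0.9]},
    {"text": "note on retrieval", "embedding": [0.8, 0.3]},
]
```

At small scale this brute-force scan is plenty fast; services like Pinecone earn their keep when the store grows to millions of vectors and needs approximate-nearest-neighbor indexing.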
The best tool for a second brain & to generate rather than receive or organize ideas is called an Analog Zettelkasten.
Love it! You are bringing back the origin!
Thanks for sharing
You are welcome :)
Love the call to depth over breadth—constantly hopping between tools fragments your context and makes you start over each time. In my AI PingPong approach, I solve this by creating deliberate, context-preserving hand-offs between GPT, Grok, Gemini, and Claude. By planning with GPT, researching with Grok, validating with Gemini, and polishing with Claude in a tidy loop, you build on each step instead of losing ground. The result is a 20-minute production cycle that reliably hits >98% quality and feels more like a coherent “second brain” than a haphazard tool chain.
https://trilogyai.substack.com/p/ai-ping-pong
Thanks for sharing your approach Stanislav! I’m sure it will greatly benefit the right people.
Love this series, @Jenny Ouyang. Keep going!
Thank you, Samara (hopefully I got your name right)! I will :)
This resonates so much. Underneath all the tools and systems, what really matters is creating something that feels alive and sustainable, something that actually supports us rather than fragmenting our attention further.
Your focus on reducing noise instead of adding more layers is refreshing. It reminds me that the real work is often in pausing long enough to ask, What do I truly need to hold, and what can I let go of?
Thank you for encouraging an approach that feels human first, tool second. It’s the kind of gentle discipline that makes all the difference over time.
Thank you, Benta! That’s exactly what I’m aiming for, really appreciate you seeing it.
Great take on the AI tool overload we all experience! Breaking down RAG as smarter, meaning-based search really clears things up. I love the idea of focusing on one tool that fits your thinking style instead of jumping around. Building context and depth over time makes so much sense — it’s how these tools truly become helpful, not just another distraction. The “anti-juggling” advice is simple but powerful. Definitely inspired to try this approach!
Thanks for the kind words, I’d love to hear what you end up doing with it!
Nice and helpful post!
This is the dream. The problem is, my first brain is already working three jobs: graphic designer, IT support, and full-time anxiety manager.
Building a second one feels like asking a guy who's already juggling chainsaws if he would also like to take up cycling.
Scanned this and am blown away. I've been thinking of something similar -- not to this depth, but a directory of organized content that I could use for retrieval and further brainstorming. This is honestly amazing, and I found some concepts I can use even if I won't go that deep into it. Saved for more in-depth reading later!
Thanks for your kind words, James! Honestly, after exploring so many different paths, I’ve found that just using a small portion in my day-to-day is more than enough :)
For sure. For the most part, I find that AI "power tools" are useful for specific situations, but for day-to-day tasks, I always default to the "smaller," more agile tools.
Very detailed explanation about RAG. Just bookmarked this to read again when I start building my first RAG.
Thanks for reading and bookmarking it, Luan! Really appreciate you finding it useful :)