🎞️ Videos → Making a Machine Readable Me
Description
What happens when AI can understand who we are, through and through? 🤖🧠 Join John F.X. Berns in exploring the cutting-edge idea of building a "Digital Twin" and turning human context into data that AI can read and understand, a new step toward truly personalized AI.
Chapters
- Personalized AI requires balancing useful context with data privacy 0:00
- Seek a middle ground between useless anonymity and dangerous full exposure 1:38
- Tune the strictness of PII masking with tools like Microsoft Presidio 2:43
- Leverage privacy laws like GDPR to request deep social media data exports 3:57
- Raw social media data exports are often scattered and unreadable 5:34
- Use AI to organize raw data into relationship graphs before scrubbing it offline 7:02
- Summary: Download, organize, offline scrub, and securely feed your AI 9:23
Transcript
Personalized AI requires balancing useful context with data privacy 0:00
So, I'm going to...
you don't need to see all that. So, I'm like a lot of the people here
that are talking more about the security, how to get a secure setup to use OpenClaw.
I used to be chief data officer here at Central Group; before that I ran data at Lazada. I've been involved with Lazada data for a long time. I know what can go bad when people get access to data. So, I'm very conservative about it. That being said, I really, really, really want to get in and have my agent know who I am and understand me when it comes to making decisions.
So, it puts you in kind of a precarious position.
You want to give it information, but you don't want it to know too much.
I got too many windows.
Jesus Christ.
He's not here right now. Has he ever been here?
Seek a middle ground between useless anonymity and dangerous full exposure 1:38
So, at one end of the spectrum, you've got things that are complete anonymizers, right? You can take data, you can make it so that nobody understands who the person is. You can know a little bit about them, but if you want to make a graph of people, their connections, and understand the connections between people, that doesn't help you. On the other end, it's the free-for-all. You give them access to your chat, you give them access to your Gmail, you give them access to your files.
That's a tragedy waiting to happen. I think most of these guys probably gave you lots of reasons why that's not a good idea. So, I started to think there's got to be some middle ground
that we can live in, where we can have some sense of security, some sense of anonymity,
and without having to give away all of our secrets. So, I started playing with some tools.
Tune the strictness of PII masking with tools like Microsoft Presidio 2:43
And I've been working with Microsoft Presidio.
It's a Microsoft tool for anonymizing PII. Personally identifiable information.
Yeah, PII. And you can set the settings very strict,
so you have no idea who the person is. But you can also back it off a little bit, so you can expose some information. And that becomes very useful. So, what I want to do, ideally, is to be able to create a graph of people
without exposing their sensitive information to the rest of the world. I don't want to give away your email address, because somebody might call you up with a fake voice agent and ask money from your mother.
I don't know, there's lots of things that could go wrong. So, I'm trying to find a balance between how much we want to mask, how much we want to anonymize, and how much it's okay to release. So, I'm still trying to figure this out. This is a work in progress.
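The trade-off described above, strict masking versus backing it off enough to keep a usable graph of people, can be sketched with the standard library alone. This is not Presidio's API; it is a minimal stand-in, and the names `mask_emails`, `pseudonym`, and `SECRET_KEY` are hypothetical. At the strict level every address collapses into one placeholder; at the relaxed level each address becomes a stable token, so the same person stays recognizable as the same node without the address being exposed.

```python
import hashlib
import hmac
import re

# Deliberately simple email pattern; an assumption, not a full RFC 5322 matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical secret that stays on the local machine. It keeps pseudonyms
# stable across runs while making them hard to reverse without the key.
SECRET_KEY = b"replace-with-a-local-secret"


def pseudonym(value: str) -> str:
    """Map the same input to the same short token without revealing the input."""
    digest = hmac.new(SECRET_KEY, value.lower().encode("utf-8"), hashlib.sha256)
    return "person_" + digest.hexdigest()[:8]


def mask_emails(text: str, level: str = "strict") -> str:
    """Mask email addresses at one of two illustrative strictness levels."""
    if level == "strict":
        # Everyone becomes the same placeholder: safe, but no graph survives.
        return EMAIL_RE.sub("<EMAIL>", text)
    if level == "relaxed":
        # Each address gets its own stable token, so connections remain visible.
        return EMAIL_RE.sub(lambda m: pseudonym(m.group(0)), text)
    raise ValueError(f"unknown level: {level}")
```

With the relaxed level, two messages from the same address share a token, which is what makes it possible to count or link conversations after masking.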
Leverage privacy laws like GDPR to request deep social media data exports 3:57
So, some really great sources of information for this are your social networks.
Unfortunately, it's also a lot of information that you probably don't want to share too widely. And on top of that, they make it really, really hard to get that information. Facebook has set up a really...
hard firewall preventing you from getting access to a lot of information. However, there's the PDPA and GDPR, which force them to give you your information.
All of it. An incredibly deep and rich amount of it. More than you could get from just scraping posts. So, one of the things I did was...
No, that's not it.
Okay. The next tab, with the Facebook. Not that other thing. Okay, well, I'll get to it. This is kind of a ramble. So...
When it resized the screen, I lost all
the screens I had lined up.
Sorry about that.
Raw social media data exports are often scattered and unreadable 5:34
Anyways, you can go to Facebook and you can say, hey, give me a dump of my information. You can tell it what part you want, you can tell it the whole thing. I went and I said, give me all my Facebook messages. So I wanted to see who I talked to, what I talked about, and what my graph of people that I interacted with was.
Facebook did exactly what they said,
and they gave me a full export.
So you wind up getting things like this.
It's just randomly scattered files and folders, and they might have IDs that link to other files.
It's really hard to figure out what it is. It's not human-readable. So what was actually quite easy was...
I went to Claude and I said, okay, read all these files I have in here. Make sense out of it. Build a graph that makes it easy for me to understand.
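A conversation index of this kind can also be built by hand. The sketch below is a minimal illustration, not what Claude generated in the talk. It assumes each thread in the export is a JSON file with a `participants` list of `{"name": ...}` objects and a `messages` list whose entries carry `timestamp_ms`; the actual layout of an export may differ, and the function names are hypothetical.

```python
import json
from collections import Counter
from datetime import datetime, timezone
from pathlib import Path


def _to_date(timestamp_ms: int) -> str:
    """Convert a millisecond Unix timestamp to a UTC date string."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc).date().isoformat()


def summarize_thread(thread: dict) -> dict:
    """Reduce one parsed thread to a single index row."""
    messages = thread.get("messages", [])
    stamps = [m["timestamp_ms"] for m in messages if "timestamp_ms" in m]
    return {
        "participants": sorted(p["name"] for p in thread.get("participants", [])),
        "message_count": len(messages),
        "first_message": _to_date(min(stamps)) if stamps else None,
        "last_message": _to_date(max(stamps)) if stamps else None,
    }


def build_index(export_root: Path) -> list[dict]:
    """Walk the export folder and index every JSON file that looks like a thread."""
    rows = []
    for path in sorted(export_root.rglob("*.json")):
        thread = json.loads(path.read_text(encoding="utf-8"))
        if isinstance(thread, dict) and "messages" in thread:
            row = summarize_thread(thread)
            row["file"] = str(path.relative_to(export_root))
            rows.append(row)
    return rows


def messages_per_person(rows: list[dict], me: str) -> Counter:
    """Count messages in threads shared with each person other than `me`."""
    counts = Counter()
    for row in rows:
        for name in row["participants"]:
            if name != me:
                counts[name] += row["message_count"]
    return counts
```

Calling `messages_per_person(build_index(Path("export")), me="Your Name")` would give a rough "who do I talk to most" table, which is the starting point for a relationship graph.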
Use AI to organize raw data into relationship graphs before scrubbing it offline 7:02
And it did. It came back with a graph. It came back with an index of all the chats I had, who it was with, and links to conversations with them going back to the very first chat I had on Facebook.
Ah, here we go.
So I can go in, and it actually created a nice index of all the conversations I had, who I had them with, how many messages, first message, last message, was it marketplace, was it group, was it personal. So I've got this nice index. I can type in a name.
And yeah, 266 messages with you.
That was with me? Sorry? That was with me? 366 messages with me? Yeah. But we don't message on Facebook very much.
So then you have a list of all the conversations that you have with people, and it creates nice bubbles that you can link through. So this is something that you can just kind of go through and look at, right? But the real power comes from this: you've got all this information, and then you can go through it offline and anonymize it. You can get rid of API keys, credit card numbers, phone numbers, emails, and really tighten up your data. You do that offline, before you get to OpenClaw. Then you put it on, and you've got a graph that you can start to query.
You can go beyond that, you can go to all the chatbots and ask for your information from them.
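The offline scrubbing pass for API keys, credit card numbers, phone numbers, and emails can be sketched with regular expressions. This is a rough illustration using only the standard library, not the tooling used in the talk. All patterns are assumptions: the API key pattern only catches tokens with a few common prefixes, and the phone pattern is deliberately loose, so it will also mask things like dates or order numbers. Card candidates are confirmed with the Luhn checksum to reduce false positives.

```python
import re

# Illustrative patterns only; real data will need tuning.
API_KEY_RE = re.compile(r"\b(?:sk|pk|api|key)[-_][A-Za-z0-9_-]{16,}\b")
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")
PHONE_RE = re.compile(r"(?<!\w)\+?\d[\d ().-]{7,}\d(?!\w)")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def _mask_card(match: re.Match) -> str:
    """Mask a digit run only when it looks like a valid card number."""
    digits = re.sub(r"\D", "", match.group(0))
    return "<CARD>" if luhn_ok(digits) else match.group(0)


def scrub(text: str) -> str:
    """Replace secrets and contact details with placeholders, most specific first."""
    text = API_KEY_RE.sub("<API_KEY>", text)
    text = EMAIL_RE.sub("<EMAIL>", text)
    text = CARD_RE.sub(_mask_card, text)   # cards before phones: both are digit runs
    text = PHONE_RE.sub("<PHONE>", text)
    return text
```

Because everything here runs locally, the raw export never has to leave the machine; only the output of `scrub` would be handed to an agent.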
Summary: Download, organize, offline scrub, and securely feed your AI 9:23
Okay, so Khun John, I actually need to cut you off here soon. But if I can just do a very quick summary of the lessons here: one, we all want to personalize our experience, whether it's OpenClaw or other tools. There's a great opportunity in all this social data that we can download. But then there's a big risk of exposing personal information. So you'll want to scrub that information, and then for your own personal information, there are other types of tools we can use, like Microsoft Presidio, that can help you keep it encrypted and private. Is that correct?
Yeah, to summarize it: find out the information that's important for you to know about who you are. Find the sources of it, find quick ways to download it, use tools to make sense of it, and organize it in a way that makes sense. Then you process it one more step so it's machine-readable, and you scrub it. Then you toss it into OpenClaw, so it can actually know you better without having the PII available.
I know your secrets, yes. Exactly. Okay. If we can have a round of applause for Khun John. Thank you.