Ask HN: Create embeddings efficiently for an AI notes app with E2EE

satyajeetjadhav | 3 points

> The user's notes must be unencrypted and readable as plain text on the server to create embeddings.

Consult a security expert before doing this, but here’s an idea: encrypt each word of the text, send the encrypted tokens over the wire, and then use an embedder trained on text encrypted with that method.

If you use an asymmetric encryption method, you could even throw away the private key.

The result would still be a substitution cypher on words, so it would not resist frequency analysis. It also doesn’t help that, if your users manage to extract the key, they can encrypt text of their choosing to figure out the mapping. But it would protect against people ‘accidentally’ looking at your users’ text.
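To make the trade-off concrete, here is a minimal Python sketch of the per-word idea and its two weaknesses. It is an illustration under assumptions, not a vetted design: HMAC-SHA256 (truncated) stands in for whatever deterministic per-word encryption you would actually use, and the names `encrypt_word`, `encrypt_note`, and the demo key are made up for this example.

```python
import hashlib
import hmac
from collections import Counter


def encrypt_word(key: bytes, word: str) -> str:
    """Map one word to a deterministic keyed token.

    Assumption: HMAC-SHA256 is a stand-in for the real per-word cipher.
    Any deterministic scheme has the same property: same word, same token.
    """
    digest = hmac.new(key, word.lower().encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def encrypt_note(key: bytes, text: str) -> list[str]:
    """Tokenize on whitespace and encrypt each word independently."""
    return [encrypt_word(key, w) for w in text.split()]


key = b"hypothetical-demo-key"  # placeholder, not a real key-management scheme
tokens = encrypt_note(key, "the cat sat on the mat near the door")

# Weakness 1: word frequencies survive, so frequency analysis still works.
most_common_token, count = Counter(tokens).most_common(1)[0]

# Weakness 2: anyone who can encrypt chosen words can build a lookup table.
lookup = {encrypt_word(key, w): w for w in ["the", "cat", "dog"]}
recovered = [lookup.get(t, "?") for t in tokens]
```

The server only ever sees `tokens`, which is enough for an embedder trained on the same mapping, but the token that repeats most often is still the most frequent word, and `recovered` shows how quickly a dictionary of guesses fills in the rest.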

Periodically switching the encryption key wouldn’t be that hard.

Someone | 13 days ago

Which embedding model are you using?

Perhaps pick one with lower memory usage from this list?

https://huggingface.co/spaces/mteb/leaderboard

rahimnathwani | 13 days ago

I’m not sure if there are implementations for browsers, but look into embeddings with homomorphic encryption.

innethread | 14 days ago