Word embeddings in 2017: Trends and future directions

stablemap | 158 points

https://github.com/kudkudak/word-embeddings-benchmarks has a pretty nice evaluation of existing embedding methods. Notably missing from this article are GloVe ( https://nlp.stanford.edu/projects/glove/ ) and LexVec ( https://github.com/alexandres/lexvec ), both of which tend to outperform word2vec on both intrinsic and extrinsic tasks. Also of interest are methods which perform retrofitting, i.e. improving already-trained embeddings; morph-fitting (ACL 2017) is a good example. Hashimoto et al. (2016) shed some interesting insight on how embedding methods perform metric recovery. Lots of exciting stuff in this area.
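For reference, an intrinsic word-similarity evaluation with that benchmarks package looks roughly like this. This is a sketch from memory of its README-style usage, so the fetch_* helpers and the evaluate_similarity signature are assumptions; check the repo for the exact API:

    from web.embeddings import fetch_GloVe
    from web.datasets.similarity import fetch_MEN, fetch_WS353
    from web.evaluate import evaluate_similarity

    # Download pretrained GloVe vectors (can take a few minutes the first time)
    w = fetch_GloVe(corpus="wiki-6B", dim=300)

    # Spearman correlation between embedding similarities and human judgements
    for name, data in {"MEN": fetch_MEN(), "WS353": fetch_WS353()}.items():
        print(name, evaluate_similarity(w, data.X, data.y))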

serveboy | 7 years ago

No mention of StarSpace (from Facebook)? It figures, with the rapid pace of innovation these days.

StarSpace can compute 6 types of entity embeddings, of which word embeddings are just one type. It's a whole family of algorithms.

https://github.com/facebookresearch/Starspace/

visarga | 7 years ago

My question is what are they really good for.

I mean king = queen - woman + man

That's the kind of thing we have ontologies for.

This article mentions that word embeddings are useful inside translators, but from the viewpoint of somebody who wants to extract meaning from text, what use is something that doesn't handle polysemy and phrases?
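For what it's worth, the analogy arithmetic itself is a one-liner once you have pretrained vectors loaded, e.g. with gensim. A minimal sketch, assuming you have downloaded the GoogleNews word2vec binary (the path is an assumption; any word2vec-format file works):

    from gensim.models import KeyedVectors

    # Load pretrained word2vec vectors
    wv = KeyedVectors.load_word2vec_format("GoogleNews-vectors-negative300.bin", binary=True)

    # king ~ queen - woman + man
    print(wv.most_similar(positive=["queen", "man"], negative=["woman"], topn=3))

That still assigns a single vector per surface form, though, so it doesn't answer the polysemy point.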

PaulHoule | 7 years ago

I also think that there is still room for improvement for embeddings based on other contexts, as pointed out in the blog post. Another example from this year is leveraging dictionary entries as external context: http://aclweb.org/anthology/D17-1024 (*)

Selecting context words differently is also an option for improvement. Using dependency structures to "filter" the context window seems to work better than "filtering" by subsampling frequent words, which illustrates that there is room to grow. We may see other ways of selecting context words in the future, given how widely embeddings are used as a building block, especially lately with the StarSpace hype advocating the idea of general-purpose, task-agnostic embeddings.
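To make the dependency-based context idea concrete, here is a minimal sketch in the spirit of Levy & Goldberg's dependency-based word2vec: instead of a fixed linear window, each word's contexts are its syntactic neighbours. It uses spaCy purely for illustration; the model name and the exact context labelling are assumptions:

    import spacy

    nlp = spacy.load("en_core_web_sm")  # assumes the small English model is installed
    doc = nlp("Australian scientist discovers star with telescope")

    for tok in doc:
        # Linear window of +/-2 words, the standard word2vec-style context
        window = [t.text for t in doc[max(0, tok.i - 2):tok.i + 3] if t.i != tok.i]
        # Dependency contexts: children plus the head, labelled with the relation
        dep_ctx = ["{}/{}".format(c.dep_, c.text) for c in tok.children]
        if tok.head is not tok:
            dep_ctx.append("{}-of/{}".format(tok.dep_, tok.head.text))
        print(tok.text, "| window:", window, "| dep:", dep_ctx)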

Or we may also find that the expected improvements are insignificant compared to the gains from the model learned on top of those embeddings for downstream tasks, especially when that model fine-tunes the embeddings for its specific task...

(*) disclaimer: I am a co-author

cgravier | 7 years ago