Topology of Word Embeddings: Singularities Reflect Polysemy

Abstract:

The manifold hypothesis suggests that word vectors live on a submanifold within their ambient vector space. We argue that we should, more accurately, expect them to live on a pinched manifold: a singular quotient of a manifold obtained by identifying some of its points. The identified, singular points correspond to polysemous words, i.e. words with multiple meanings. Our point of view suggests that monosemous and polysemous words can be distinguished based on the topology of their neighbourhoods. We present two kinds of empirical evidence to support this point of view: (1) We introduce a topological measure of polysemy based on persistent homology that correlates well with the actual number of meanings of a word. (2) We propose a simple, topologically motivated solution to the SemEval-2010 task on Word Sense Induction & Disambiguation that produces competitive results.
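
To make the idea of "topology of a neighbourhood" concrete, here is a minimal toy sketch. It is not the paper's measure: it only counts connected components of a punctured neighbourhood at one fixed scale, which is the simplest (0-dimensional) piece of information that persistent homology tracks across scales. The point cloud (two 2-dimensional sheets in a 4-dimensional space, glued at one point), the function name `punctured_components`, and the radii and `eps` values are all assumptions chosen for illustration.

```python
import numpy as np


def punctured_components(points, center, r_inner, r_outer, eps):
    """Count connected components of the eps-neighbourhood graph on the
    points whose distance d to `center` satisfies r_inner < d <= r_outer.

    Hypothetical helper for illustration only (not from the paper).
    """
    d = np.linalg.norm(points - center, axis=1)
    ring = points[(d > r_inner) & (d <= r_outer)]
    n = len(ring)
    parent = list(range(n))  # union-find forest over the ring points

    def find(i):
        # Find the root of i, with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        # Distances from point i to all later points in the ring.
        dist = np.linalg.norm(ring[i + 1:] - ring[i], axis=1)
        for j in np.nonzero(dist <= eps)[0]:
            a, b = find(i), find(i + 1 + int(j))
            if a != b:
                parent[a] = b
    return len({find(i) for i in range(n)})


# Toy "pinched manifold": two planes in R^4 that meet only at the origin.
ticks = np.arange(-20, 21) * 0.1
xs, ys = np.meshgrid(ticks, ticks)
flat = np.stack([xs.ravel(), ys.ravel()], axis=1)
zeros = np.zeros_like(flat)
sheet_a = np.hstack([flat, zeros])  # points (x, y, 0, 0)
sheet_b = np.hstack([zeros, flat])  # points (0, 0, z, w)
cloud = np.vstack([sheet_a, sheet_b])

singular_point = np.zeros(4)                      # where the sheets are glued
regular_point = np.array([1.0, 1.0, 0.0, 0.0])    # lies on one sheet only

n_singular = punctured_components(cloud, singular_point, 0.3, 0.8, 0.15)
n_regular = punctured_components(cloud, regular_point, 0.3, 0.8, 0.15)
print("components around singular point:", n_singular)
print("components around regular point:", n_regular)
```

In this toy setting the punctured neighbourhood of the glued point splits into one piece per sheet, while that of an ordinary point stays in one piece, which mirrors the intuition that a polysemous word sits where several "meaning sheets" touch. Real embeddings are noisy and unevenly sampled, so a fixed `eps` would be fragile there; that is one reason to look at persistence across scales instead.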

Click here to read the pre-print of the paper

Category: Dialog Systems and Machine Learning