The idea behind word vectors is to represent natural-language words like "king" as vectors in \(\mathbb{R}^n\) in such a way that you can do meaningful arithmetic with them. For example,
$$\text{vec}(\text{king}) - \text{vec}(\text{man}) + \text{vec}(\text{woman}) \approx \text{vec}(\text{queen})$$
The Python library gensim implements this.
Requirements
pip install gensim --user
pip install nltk --user
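The Brown corpus used below is not bundled with NLTK; download it once with NLTK's data downloader:
python -m nltk.downloader brown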
Example
An easy-to-use example is
from gensim.models import Word2Vec
from nltk.corpus import brown

# Train word vectors on the sentences of the Brown corpus
model = Word2Vec(brown.sents())

# gensim >= 4 exposes the trained vectors via model.wv
model.wv.most_similar(positive=["woman", "king"], negative=["man"], topn=3)
which, since training is not deterministic, returns something like
[
("scored", 0.9442702531814575),
("calling", 0.9424217939376831),
("native", 0.9412217736244202),
]
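For context, most_similar combines the unit-normalized query vectors and ranks the whole vocabulary by cosine similarity to the result. Here is a minimal sketch of that computation, assuming gensim 4.x, where the vocabulary is exposed as wv.index_to_key:
import numpy as np

def unit(v):
    return v / np.linalg.norm(v)

def analogy(wv, positive, negative, topn=3):
    # Sum the unit-normalized positive vectors and subtract the
    # negative ones (gensim averages instead; the ranking is the same).
    target = unit(sum(unit(wv[w]) for w in positive)
                  - sum(unit(wv[w]) for w in negative))
    # Rank the vocabulary by cosine similarity to the target,
    # skipping the query words, just like most_similar does.
    results = [(word, float(np.dot(target, unit(wv[word]))))
               for word in wv.index_to_key
               if word not in positive and word not in negative]
    return sorted(results, key=lambda pair: pair[1], reverse=True)[:topn]

analogy(model.wv, positive=["woman", "king"], negative=["man"])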
These results show that the Brown corpus (roughly one million words) is too small to learn good word vectors, but the approach works with a bigger corpus.
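For instance, gensim's downloader API ships pretrained Google News vectors (a roughly 1.6 GB download). Loaded as KeyedVectors, they answer the same query with queen as the top match:
import gensim.downloader as api

# Fetches the pretrained Google News vectors on first use (~1.6 GB)
wv = api.load("word2vec-google-news-300")
wv.most_similar(positive=["woman", "king"], negative=["man"], topn=3)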