Welcome to Software Development on Codidact!

Post History

Q&A Why are spaCy word vectors showing unexpected similar words?

0 answers · posted 5mo ago by Asia · edited 5mo ago by celtschk

Question python-3 spacy word-vectors

#3: Post edited by

celtschk · 2025-01-03T09:42:00Z (5 months ago)
Added the actual question to the post instead of referring to the title. Also, added proper capitalization and punctuation.

~~why is spacy word vectors showing unexpected similar words?~~

Why are spaCy word vectors showing unexpected similar words?

as the title says. here is the code I am using
```python
import spacy
import numpy as np
nlp=spacy.load('en_core_web_md')
with open ('data/us.txt') as f:
    text=f.read()
doc=nlp(text)
sentence1=list(doc.sents)[0]
# print(sentence1)
your_word='country'
ms = nlp.vocab.vectors.most_similar(np.asarray([nlp.vocab.vectors[nlp.vocab.strings[your_word]]]), n=10)
words = [nlp.vocab.strings[w] for w in ms[0][0]]
distances = ms[2]
print(words)
```
~~the output I get no matter what word I put in the variable~~
```python
['anti-poverty', 'SLUMS', 'inner-city', 'Socioeconomic', 'INTERSECT', 'Divides', 'handicaps', 'dropout', 'drop-out', 'Crime-Ridden']
```

Why are spaCy word vectors showing unexpected similar words?
Here is the code I am using:
```python
import spacy
import numpy as np
nlp=spacy.load('en_core_web_md')
with open ('data/us.txt') as f:
    text=f.read()
doc=nlp(text)
sentence1=list(doc.sents)[0]
# print(sentence1)
your_word='country'
ms = nlp.vocab.vectors.most_similar(np.asarray([nlp.vocab.vectors[nlp.vocab.strings[your_word]]]), n=10)
words = [nlp.vocab.strings[w] for w in ms[0][0]]
distances = ms[2]
print(words)
```
The output I get no matter what word I put in the variable:
```python
['anti-poverty', 'SLUMS', 'inner-city', 'Socioeconomic', 'INTERSECT', 'Divides', 'handicaps', 'dropout', 'drop-out', 'Crime-Ridden']
```

#2: Post edited by

Alexei · 2025-01-02T08:13:38Z (5 months ago)
code formatting

as the title says. here is the code I am using
import spacy
import numpy as np
nlp=spacy.load('en_core_web_md')
with open ('data/us.txt') as f:
text=f.read()
doc=nlp(text)
sentence1=list(doc.sents)[0]
# print(sentence1)
your_word='country'
ms = nlp.vocab.vectors.most_similar(np.asarray([nlp.vocab.vectors[nlp.vocab.strings[your_word]]]), n=10)
words = [nlp.vocab.strings[w] for w in ms[0][0]]
distances = ms[2]
print(words)
the output I get no matter what word I put in the variable
['anti-poverty', 'SLUMS', 'inner-city', 'Socioeconomic', 'INTERSECT', 'Divides', 'handicaps', 'dropout', 'drop-out', 'Crime-Ridden']

as the title says. here is the code I am using
```python
import spacy
import numpy as np
nlp=spacy.load('en_core_web_md')
with open ('data/us.txt') as f:
    text=f.read()
doc=nlp(text)
sentence1=list(doc.sents)[0]
# print(sentence1)
your_word='country'
ms = nlp.vocab.vectors.most_similar(np.asarray([nlp.vocab.vectors[nlp.vocab.strings[your_word]]]), n=10)
words = [nlp.vocab.strings[w] for w in ms[0][0]]
distances = ms[2]
print(words)
```
the output I get no matter what word I put in the variable
```python
['anti-poverty', 'SLUMS', 'inner-city', 'Socioeconomic', 'INTERSECT', 'Divides', 'handicaps', 'dropout', 'drop-out', 'Crime-Ridden']
```

#1: Initial revision by

Asia · 2025-01-01T19:57:41Z (5 months ago)

why is spacy word vectors showing unexpected similar words?

as the title says. here is the code I am using

import spacy
import numpy as np

nlp=spacy.load('en_core_web_md')

with open ('data/us.txt') as f:
    text=f.read()

doc=nlp(text)
sentence1=list(doc.sents)[0]
# print(sentence1)

your_word='country'
ms = nlp.vocab.vectors.most_similar(np.asarray([nlp.vocab.vectors[nlp.vocab.strings[your_word]]]), n=10)
words = [nlp.vocab.strings[w] for w in ms[0][0]]
distances = ms[2]
print(words)

the output I get no matter what word I put in the variable
['anti-poverty', 'SLUMS', 'inner-city', 'Socioeconomic', 'INTERSECT', 'Divides', 'handicaps', 'dropout', 'drop-out', 'Crime-Ridden']

python-3 spacy word-vectors
