Domain-Based Common Words

Week 4

This week I applied the CBOW (continuous bag-of-words) text-vector method to PubMed data sets. With each unique token in the corpus represented by a 100-dimensional vector, I was able to find the center of the word vectors, and I used Euclidean distance to measure how far every unique word in the corpus is from that center. Unfortunately, I got stuck here: I did not get the expected results, so I wrote Python code to compare the results between ECL and Python. In Python I got what I needed, so we figured out that there was a problem in how CBOW was being applied in ECL. I met with Kevin and Roger many times to solve the problem, and we were able to fix it.
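To make the center-and-distance step concrete, here is a minimal Python sketch. It is not the code I actually ran; it assumes the "center" is the component-wise mean of all word vectors, and the words and 2-dimensional vectors are made-up toy values standing in for the real 100-dimensional CBOW vectors. The function names are hypothetical too.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors (assumed definition of the center)."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


def distances_from_center(embeddings):
    """Map each word to the Euclidean distance between its vector and the center."""
    center = centroid(list(embeddings.values()))
    return {word: math.dist(vec, center) for word, vec in embeddings.items()}


# Toy vectors for illustration only; real vectors would come from a trained CBOW model.
toy_embeddings = {
    "cell": [0.0, 0.0],
    "gene": [3.0, 0.0],
    "protein": [0.0, 3.0],
    "patient": [1.0, 1.0],
}

center = centroid(list(toy_embeddings.values()))
distances = distances_from_center(toy_embeddings)

# Words sorted from closest to the center to farthest from it.
ranked = sorted(distances, key=distances.get)

for word in ranked:
    print(f"{word}: {distances[word]:.3f}")
```

Running the same calculation on a small hand-checkable input like this is also a handy way to compare two implementations, which is how the mismatch between ECL and Python showed up.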

source: https://towardsdatascience.com/an-implementation-guide-to-word2vec-using-numpy-and-google-sheets-13445eebd281
