Domain-Based Common Words

Week 1

This project uses the text vectors bundle (CBOW) in HPCC to find the common words in any dataset. The idea behind using text vectors is their ability to map each unique token in the corpus to a vector, so that relationships between words can be discovered by analyzing word usage patterns. Text vectors map words into a high-dimensional vector space such that similar words are grouped together, and the distances between words can reveal relationships. In the first week of my internship, I reviewed the ECL language, ran the HPCC platform on a virtual machine, and ran different examples in ECL.
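To make the idea concrete, here is a minimal sketch in plain Python. It is not the HPCC text vectors bundle or its ECL API. It shows two things: how CBOW views a sentence as (context words, target word) pairs, and how the distance between two word vectors can be measured with cosine similarity. The sentence, the window size, the function names and the three-dimensional vectors are all made up for illustration; real vectors are learned during training and have many more dimensions.

```python
import math


def cbow_pairs(tokens, window=2):
    """Return (context_words, target_word) pairs, the way CBOW frames training data."""
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:
            pairs.append((context, target))
    return pairs


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical example sentence, split on whitespace.
tokens = "the patient received a dose of the drug".split()
pairs = cbow_pairs(tokens, window=1)

# Hand-picked toy vectors (assumption: NOT learned, only for illustration).
toy_vectors = {
    "drug": [0.9, 0.1, 0.0],
    "dose": [0.8, 0.2, 0.1],
    "the": [0.0, 0.1, 0.9],
}

sim_related = cosine_similarity(toy_vectors["drug"], toy_vectors["dose"])
sim_unrelated = cosine_similarity(toy_vectors["drug"], toy_vectors["the"])

if __name__ == "__main__":
    for context, target in pairs:
        print(context, "->", target)
    print("drug vs dose:", round(sim_related, 3))
    print("drug vs the:", round(sim_unrelated, 3))
```

In this toy setup, "drug" and "dose" point in nearly the same direction, so their similarity is high, while "drug" and "the" are almost orthogonal. A trained model would arrive at this kind of grouping from word usage patterns rather than from hand-written numbers.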
