Which text similarity algorithm should I use to compare the context of Instagram hashtags?

#1
For a study I am comparing companies based on the posts written by their Instagram followers. I apply the following technique:
  • Nike has 1.000.000 followers.
  • 2000 random followers of Nike are selected and the posts created during the last 365 days by these profiles are obtained. The posts of these 2000 Instagram followers form the "company follower profile" of Nike.
  • Company follower profiles are created for 50 different companies. The posts contained in the company follower profile of Nike are compared with the posts contained in the follower profiles of the other 49 companies to measure the distance between Nike and each of the other companies.
So far I have simply created a "bag-of-hashtags" per company follower profile, consisting of all hashtags used in the posts. Cosine similarity was applied to these bags of hashtags to measure the distance between each pair of companies.
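
The current bag-of-hashtags comparison can be sketched roughly as follows. This is a minimal illustration, not my actual code: the toy posts are made up, hashtags are extracted by naive whitespace splitting, and the helper names `bag_of_hashtags` and `cosine_similarity` are hypothetical.

```python
import math
from collections import Counter


def bag_of_hashtags(posts):
    """Count every hashtag (token starting with '#') across a list of post texts."""
    bag = Counter()
    for post in posts:
        for token in post.lower().split():
            if token.startswith("#") and len(token) > 1:
                bag[token] += 1
    return bag


def cosine_similarity(a, b):
    """Cosine similarity between two hashtag count vectors stored as Counters."""
    dot = sum(count * b[tag] for tag, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up toy data standing in for the real company follower profiles.
profiles = {
    "nike": ["morning run #running #fitness", "new shoes #running"],
    "adidas": ["gym day #fitness #running", "match tonight #football"],
    "ikea": ["new sofa #home #interior"],
}

bags = {company: bag_of_hashtags(posts) for company, posts in profiles.items()}
nike_vs_adidas = cosine_similarity(bags["nike"], bags["adidas"])
nike_vs_ikea = cosine_similarity(bags["nike"], bags["ikea"])
```

The weakness of this approach is that two companies only count as similar when their followers use literally the same hashtags, so related hashtags such as #running and #jogging contribute nothing to the similarity.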

However, I want to improve my study by applying the following methodology:
I would like to create a vector space model for all hashtags used in the company follower profiles (300.000 hashtags in total), so that I can measure the 'distance', and therefore the similarity, between any two hashtags.

To do this I want to create a document for each hashtag which consists of the posts in which the hashtag was used. After this I would like to apply a clustering algorithm to the vector space model to detect clusters of hashtags which discuss the same topic (e.g. sport, business, family, etc.).
The final goal is to compare the topic usage in the company follower profiles (e.g. how often Nike followers mention the topic sport) to measure the distance between Nike and the other companies.
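
To make the proposed pipeline concrete, here is a rough sketch under several assumptions: each hashtag document is the bag of non-hashtag words from the posts using that hashtag, the documents are weighted with a smoothed TF-IDF, and the clustering step is a simple greedy threshold clustering used only as a stand-in for whichever clustering algorithm ends up being chosen. All function names, the threshold value and the toy posts are hypothetical.

```python
import math
from collections import Counter, defaultdict


def split_post(post):
    """Split a post into its hashtags and its remaining words (naive tokenizer)."""
    tokens = post.lower().split()
    tags = [t for t in tokens if t.startswith("#") and len(t) > 1]
    words = [t for t in tokens if not t.startswith("#")]
    return tags, words


def hashtag_documents(posts):
    """Build one 'document' per hashtag: the words of all posts that use it."""
    docs = defaultdict(Counter)
    for post in posts:
        tags, words = split_post(post)
        for tag in set(tags):
            docs[tag].update(words)
    return docs


def tfidf_vectors(docs):
    """Weight each hashtag document with a smoothed TF-IDF."""
    n_docs = len(docs)
    doc_freq = Counter(word for bag in docs.values() for word in bag)
    return {
        tag: {w: c * (math.log((1 + n_docs) / (1 + doc_freq[w])) + 1.0)
              for w, c in bag.items()}
        for tag, bag in docs.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def greedy_clusters(vectors, threshold):
    """Assign each hashtag to the first cluster whose seed is similar enough."""
    seeds, cluster_of = [], {}
    for tag in sorted(vectors):
        for idx, seed in enumerate(seeds):
            if cosine(vectors[tag], vectors[seed]) >= threshold:
                cluster_of[tag] = idx
                break
        else:  # no sufficiently similar seed found: start a new cluster
            seeds.append(tag)
            cluster_of[tag] = len(seeds) - 1
    return cluster_of


def topic_shares(posts, cluster_of):
    """Share of a company's hashtag usages that falls into each topic cluster."""
    counts = Counter()
    for post in posts:
        tags, _ = split_post(post)
        counts.update(cluster_of[t] for t in tags if t in cluster_of)
    total = sum(counts.values())
    return {topic: c / total for topic, c in counts.items()} if total else {}


# Made-up toy data standing in for the company follower profiles.
posts_by_company = {
    "nike": ["great run this morning #running", "long run in the park #jogging",
             "family dinner tonight #family"],
    "adidas": ["easy run after work #jogging", "run before breakfast #running"],
    "ikea": ["dinner with family at home #family", "family weekend #home"],
}

all_posts = [p for posts in posts_by_company.values() for p in posts]
vectors = tfidf_vectors(hashtag_documents(all_posts))
cluster_of = greedy_clusters(vectors, threshold=0.2)
shares = {c: topic_shares(p, cluster_of) for c, p in posts_by_company.items()}
nike_vs_adidas = cosine(shares["nike"], shares["adidas"])
nike_vs_ikea = cosine(shares["nike"], shares["ikea"])
```

In this toy example #running and #jogging end up in one cluster because their documents share the word "run", so Nike and Adidas come out as similar even where their followers use different hashtags. With 300.000 hashtags, the pairwise comparisons in this sketch would not scale, so a library implementation with sparse matrices would be needed in practice.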

My question with this (improved) methodology is: Which document similarity algorithm would you advise me to apply?

I found a Medium article describing many different document similarity algorithms: https://medium.com/@adriensieg/text-similarities-da019229c894. I have also recently heard a lot about the new BERT model. However, I think methods such as BERT may be a bit of an overkill for my use case.

Any advice on my question or my methodology in general is extremely welcome. Thanks a lot in advance!