nlp

  1. N

    Which text similarity algorithm should I use to compare the context of Instagram hashtags?

    For a study, I am comparing companies based on the posts written by their Instagram followers. I apply the following technique: Nike has 1,000,000 followers. 2,000 random followers of Nike are selected, and the posts these profiles created during the last 365 days are obtained. The posts of...
  2. P

    I'm a newb to charts, graphs, and R -- Ubuntu 14.04, RStudio, MySQL, CSV files, Data

    Hi guys, I just wrote an introduction with a couple of reasons as to why I am here. Reason 1 is that I'm not sure where the divide is between big data and just a big MySQL database. We have easily over 1 million records right now in our database and I'm not sure if that is considered big data...
  3. P

    Data Scientist

    Hello everyone and thank you for welcoming me into this community! I started my career as a Data Scientist exactly one year ago today. I love my job and I love what I do. I mostly create parsers and natural language processors so my day typically consists of talking with my AI programs and...
  4. bryangoodrich

    Let's crack open that document

    This is a thread where I'm going to do a quick prototype for exploring the Office Open XML format ... that docx thing Microsoft is using. You see, it's really just a ZIP file archive containing some XML stuff. R has all the tools we need to start digging into it! (1) Go Google your favorite...
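
To illustrate the "it's really just a ZIP file archive containing some XML stuff" point from the last thread, here is a minimal sketch. It is written in Python rather than the R used in the thread, and it uses only the standard library. The helper names (`build_toy_docx`, `list_parts`, `extract_text`) are made up for this example, and the archive it builds is a toy stand-in holding only `word/document.xml`, not a complete file that Word would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# WordprocessingML namespace used by the main document part.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def build_toy_docx(text):
    """Build a tiny docx-like ZIP in memory (hypothetical stand-in, not a full Word file)."""
    document_xml = (
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        f"<w:p><w:r><w:t>{escape(text)}</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", document_xml)
    return buf.getvalue()


def list_parts(docx_bytes):
    """Return the names of the files stored inside the archive."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        return zf.namelist()


def extract_text(docx_bytes):
    """Collect the contents of every w:t (text run) element in word/document.xml."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    return [t.text or "" for t in root.iter(f"{{{W_NS}}}t")]


if __name__ == "__main__":
    data = build_toy_docx("Let's crack open that document")
    print(list_parts(data))
    print(extract_text(data))
```

For a real file, the same two reader functions should work on the bytes read from disk, for example `open("report.docx", "rb").read()`, where `report.docx` is a placeholder name.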