
Thread: Analyzing Tagged Content: Is a Multiple Regression Analysis the Right Approach?

  1. #1

    Analyzing Tagged Content: Is a Multiple Regression Analysis the Right Approach?




    I'm working with a relatively small data set that consists of several hundred social media posts, key engagement metrics, and up to 10 content tags that describe the image in each post. We used the Google Vision API along with a manual review to construct the tags. I've linked to an example of what we're working with here (http://imgur.com/vcZkWi9).

    What I'm trying to do: I would like to use a statistically valid methodology to identify which tags, individually or in combination, tend to perform best across the data set. It's easy enough to look at an individual tag and calculate the mean of the KPI, but are there any suggestions on how to evaluate combinations of tags that yield high performance? It wouldn't necessarily need to be all tags in combination; it could be that 3 out of the 10 perform the best.

    What approach would you recommend to understand which tags are most closely associated with the highest mean KPI score? I've been debating whether a multiple regression analysis is best, but I'm looking for some insight on this.

    I've had a tough time finding any other resources online, so any help would be greatly appreciated!

  2. #2
    kiton

    Re: Analyzing Tagged Content: Is a Multiple Regression Analysis the Right Approach?


    As far as I understand, KPI is your dependent variable, i.e., the one you are trying to predict, correct? If so, then note that it is of count type, i.e., a non-negative integer value. This implies using count models, such as Poisson or Negative Binomial regression. Furthermore, your tags are basically words, so you have to quantify them somehow. Otherwise, I don't see a way to analyze them statistically. How many distinct tags do you have? You could possibly code them in some way, or at least use dummy variables (e.g., text: 1=yes, 0=no; cartoon: 1=yes, 0=no; etc.).
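
    To make the dummy-coding idea concrete, here is a minimal sketch in plain Python. The posts, tag names, and KPI values are made up for illustration, and the helpers `dummy_code` and `interaction` are hypothetical names, not part of any library.

```python
# Hypothetical example data: each post has a count-type KPI (e.g. likes)
# and a list of content tags. None of these values come from the real data set.
posts = [
    {"kpi": 120, "tags": ["text", "cartoon"]},
    {"kpi": 45, "tags": ["text"]},
    {"kpi": 300, "tags": ["cartoon", "outdoor"]},
    {"kpi": 80, "tags": []},
]


def dummy_code(posts):
    """Return (sorted tag names, rows of 0/1 indicators, KPI values)."""
    tag_names = sorted({tag for post in posts for tag in post["tags"]})
    rows = [
        [1 if name in post["tags"] else 0 for name in tag_names]
        for post in posts
    ]
    kpis = [post["kpi"] for post in posts]
    return tag_names, rows, kpis


def interaction(rows, tag_names, a, b):
    """0/1 column that is 1 only when both tag a and tag b are present."""
    i, j = tag_names.index(a), tag_names.index(b)
    return [row[i] * row[j] for row in rows]


tag_names, X, y = dummy_code(posts)
cartoon_and_text = interaction(X, tag_names, "cartoon", "text")

print(tag_names)         # column order of the dummy matrix
print(X)                 # one 0/1 row per post
print(cartoon_and_text)  # indicator for the tag combination
```

    The resulting 0/1 columns are what you would pass as predictors, with the KPI as the response, to whatever Poisson or Negative Binomial routine your software provides. A product column like `cartoon_and_text` is one common way to represent a combination of tags, though with only a few hundred posts you can probably afford just a handful of such terms.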

    You can also look at some machine learning techniques (e.g., singular value decomposition (SVD)) for working with text (tag) data, but that is a slightly different ball game.
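
    As a rough illustration of the SVD idea, the sketch below factors a small 0/1 tag matrix with NumPy. The matrix, the tag names, and the choice of k = 2 components are assumptions made up for the example, not something derived from the actual data.

```python
import numpy as np

# Hypothetical 0/1 tag matrix: rows are posts, columns are tags.
tag_names = ["cartoon", "outdoor", "text"]
X = np.array(
    [
        [1, 0, 1],
        [0, 0, 1],
        [1, 1, 0],
        [0, 0, 0],
        [1, 0, 1],
    ],
    dtype=float,
)

# Thin SVD: X = U @ diag(s) @ Vt, singular values in descending order.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

k = 2                            # number of latent components kept (arbitrary here)
post_scores = U[:, :k] * s[:k]   # each post expressed in k latent dimensions
tag_loadings = Vt[:k, :]         # how strongly each tag loads on each component
X_approx = post_scores @ tag_loadings  # rank-k approximation of X

print(post_scores.shape)
print(tag_loadings.shape)
```

    Tags that tend to appear together load on the same component, so the post scores could stand in for the raw dummies when there are many correlated tags. With only about 10 tags this may well be unnecessary.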
