outlier detection

  1. M

    Hello from a newbie

    Hi folks, just checking in as a new member. I am a geologist interested in environmental statistics. I've taken no formal course in this field, so seeing experts in this forum gives me hope of learning more. (The jargon is just overwhelming!) If I am able to contribute data and what I've...
  2. D

    Finding Significant Difference between Linear Regressions

    Hello. I have a scatterplot with two different sets of data plotted. Each individual set has its own linear regression formula and R-squared value. What type of test could I run in Microsoft Excel to get a p-value to see if one dataset's trend is significantly different from that of the other...
  3. P

    finding the best fit and removing outliers with MARS regression

    I am using the regression method called `MARS` (in `R` it is called `earth` and is located in the package `earth`) in order to find the best regression model for my data. I know that this method is suitable for large data sets, can handle `NA` and also decides which variables will be used and...
  4. M

    Multivariate Outlier Detection

    Hi all, I want to compare the results of "classical multivariate Mahalanobis distance" scores of 5 variables for 2 million cases and "minimum covariance determinant based Mahalanobis distance" scores at the 0.001 and 0.01 levels respectively. I also want to see this difference in a plot. In...
  5. D

    Cook's distance cutoff

    Hello, I urgently need to find out what is happening with regard to the following: I am doing a huge number of simple linear regressions. For each regression I want to use an outlier test (`outlierTest(fit)`), an influence index test and influence plots to identify outliers and influential data points. I...
  6. P

    Finding outliers on simple dataset?

    Hello everyone! I have not worked with statistics since high school, but unfortunately I now have to answer this simple question, which has been overwhelming me! I've been googling information on calculating outliers in Excel, but the explanations I have found have not accounted for cases where the...
  7. B

    Assigning outliers in very few datapoints

    Hi, I'm a geology PhD student and I have a problem with making my dataset statistically robust. I'm measuring the element concentrations of my samples using XRF. I have about 900 samples and measured each sample 3 times. Of course there is variance in these 3 measurements belonging to each...
  8. C

    Statistical analysis on online betting industry?

    Hi! I'm currently working for an online betting company and I need some help on how to come up with risk indicators. I believe some statistical tools may help me assess the gravity of the risk of a certain event or user. For example: 1. a certain player bets an average of 100 per bet...
  9. V

    How to compute Cook's distance in R for a subset of observations? (multiple outliers)

    Hello, I really need some help here. I'm working on my final thesis and I really need help to find out whether any of 2 subsets of observations are multiple outliers (jointly influential) using Cook's distance. For example, let's pretend I have data like this...
  10. B

    Two-dimensional outlier detection

    Hi all, I was wondering if you could give me some suggestions for detecting two-dimensional outliers, i.e. for values depending on the variables. A simple example: 15 12 30 11 9 11 15 29 16 18 12 10 150 20 10 15 11 40 18 20 The 150 should be detected as an outlier, whilst the other...
  11. TheEcologist

    [FAQ] How do I remove or deal with outliers?

    How do I remove or deal with outliers? Removing outliers can cause your data to become more normal but, contrary to what is sometimes perceived, outlier removal is subjective; there is no real objective way of removing outliers. The problem, as always, is what the heck does one mean by...