Time invariant independent variable]]>

I have data representing item indices (x axis) and mean ratings (y axis). I want to draw a trendline to understand the tendency of the data. Please look at the following figure. I used the polyfit method in NumPy; the R-squared is 0.05.

How can I interpret this? What else would you suggest to understand the data better?
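An R-squared of 0.05 means the fitted straight line accounts for about 5% of the variance in the mean ratings. As a minimal sketch of where that number comes from, the block below fits a least-squares line by hand and computes R-squared. The data are hypothetical (the poster's data are not shown), and for a degree-1 fit the slope and intercept are the same least-squares values that `numpy.polyfit(x, y, 1)` returns.

```python
# Hypothetical data: item index (x) and mean rating (y); not the poster's data.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [3.9, 3.4, 3.8, 3.5, 3.9, 3.6, 3.7, 3.8]


def linear_fit(xs, ys):
    """Least-squares line y = intercept + slope * x, plus R-squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


slope, intercept, r_squared = linear_fit(xs, ys)
print(f"slope={slope:.4f} intercept={intercept:.4f} R^2={r_squared:.4f}")
```

A low R-squared with a slope close to zero suggests the item index says little about the mean rating; plotting the residuals is a reasonable next step.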

For the confidence interval of the model, I used R's confint(model, level=0.95):

2.5 % 97.5 %

(Intercept) 3.6288915640 3.7105907379

x...

data fitting]]>

Could someone explain to me in layman's terms the following:

I have complete data on a population size of ~37,000. I can show, across the entire population, that there are trends between certain variables and the amount of debt (e.g. age, length of tenure etc).

What I'm trying to do is to establish whether there is a statistically significant variation in the effect of these independent variables on debt between geographical locations.

Partitioning the...

Sufficient sample size]]>

So what problems do I need to look out for, and what diagnostics should I use? Since the data will automatically be heteroscedastic, I am not sure there is any point in testing for that.]]>

0.2 - small

0.5 - medium

0.8 - large

I know that Cohen's interpretation is only a rule of thumb, so I'm not sure if there is a "one" answer.

But what is the interpretation when d is between the values?

Is d = 0.3 small or medium?

Range interpretation

0-0.2 meaningless

0.2-0.5 small

0.5-0.8 medium

> 0.8 large

(0.2+0.5)/2=0.35

(0.5+0.8)/2=0.65

Range interpretation

0-0.35 small

0.35-0.65 medium

> 0.65 large]]>
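The two range tables above can be compared directly. The sketch below encodes each as a small function; the function names are hypothetical, and how the exact boundary values (0.2, 0.35 and so on) are assigned is an assumption, since the tables leave that open.

```python
def label_lower_bound(d):
    """Benchmarks 0.2 / 0.5 / 0.8 read as lower bounds (first table)."""
    d = abs(d)
    if d < 0.2:
        return "meaningless"
    if d < 0.5:
        return "small"
    if d < 0.8:
        return "medium"
    return "large"


def label_midpoint(d):
    """Cut-offs halfway between benchmarks, 0.35 and 0.65 (second table)."""
    d = abs(d)
    if d < 0.35:
        return "small"
    if d < 0.65:
        return "medium"
    return "large"


for d in (0.3, 0.4, 0.7):
    print(d, label_lower_bound(d), label_midpoint(d))
```

Under both readings d = 0.3 comes out as small, while d = 0.4 and d = 0.7 get different labels depending on the table, which is why reporting the value of d itself is safer than reporting only a label.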

In order to analyze the Odds Ratio, the values that were...

Are the sum values 'No' and 'Not informed' in the Odds Ratio analysis correct? See the example.]]>

I'm hoping I can get some help through this forum as I am new to the world of statistics and data analysis. There's a wealth of videos online but it's better to be able to talk these things through with people.

Cheers]]>

F=13.2 P= 0.003

what is tested

reject H0?

conclusions:]]>

I'm searching for a magical symbol, I guess... The issue is the following:

I'm trying to assign a certain string of text a specific value in a new variable. Imagine a situation where I am searching for that string of text in, e.g., people's opinions on whatever, and my task is to separate out those who used a specific swear word, in this case (let's say) "pencil". That's not a problem though. The issue is (and that's what's important) that my task is to separate people who used various...

Syntax if a variable includes a more or less specific string of text - search for some magic]]>

"To explore the relationship between personal characteristics and employment outcome rates, a multiple regression was used. Since many of the personal characteristic variables were coded categorically, a general linear model was used to run the multiple regressions."

Is this just another way of saying multiple linear regression with dummy predictors?]]>

For my thesis I'm doing a factorial Mixed ANOVA in which I look at gender differences. As it's a mixed ANOVA, everyone goes through the same conditions. The problem, however, is that I have 10 men and 26 women in my sample. Therefore the groups that I compare when looking at gender aren't equal. I was wondering if there was a specific test I could do to calculate how problematic this difference in gender ratio is and if there is a test I could do to make up for this discrepancy...

Mixed ANOVA Unequal Gender Ratio]]>

I'm struggling to work out the right functions to use in compute variable to create a new variable. I hope you will be able to help me as I cannot process this in Excel as my data file is too large.

I am trying to create a new variable based on how another variable changes.

The data is related to how often a window opens and closes throughout the day. As an example, I have the following data and I want to create the column in red as my new variable in SPSS...

Compute New Variable based on another variable]]>

I'm super bad at stats and I really need your help.

So here's the problem: I want to compare 2 different ingredients (ingredient A / ingredient B). Each ingredient is described by 4 independent quantitative variables.

I would like to know whether these 2 ingredients are significantly different or not depending on these 4 variables.

How do I check if my data follows a normal distribution? Which test can I use when I have 4 factors? I know how it works, but with 4 factors I don't know which test to use...

Which test should I use?]]>

I am studying the effects of framing on food neophobia and the purchase likelihood of novel foods.

Currently, I have made a pre-test with 4 conditions and 1 control condition, each a picture with a message framed a certain way (emotion-promotion, emotion-prevention, information-promotion, information-prevention, and a neutral frame). I expose participants to only 2 of these conditions, followed by

What statistical analysis and how?]]>

I have 10 patients. We have some proprietary software that has used the data collected from a specific region X of the hearts of each of the 10 patients and summarised it for us. Because of heterogeneity in patients, region X in one patient may have 40 separate...

Calculating a weighted mean/SD of x number of means/SDs?]]>
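When only each group's count, mean and sample SD are available, the overall mean and SD can still be recovered exactly. The sketch below assumes the SDs are sample SDs (n - 1 denominator) and that the per-patient counts are known; the function name and the numbers are hypothetical. The overall variance adds the within-group spread to the spread of the group means around the overall mean.

```python
import math
import statistics


def combine_groups(ns, means, sds):
    """Pool per-group n, mean and sample SD into one overall mean and sample SD."""
    total_n = sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / total_n
    within = sum((n - 1) * s ** 2 for n, s in zip(ns, sds))
    between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    grand_sd = math.sqrt((within + between) / (total_n - 1))
    return grand_mean, grand_sd


# Hypothetical raw measurements for two patients, used only to check the formula.
patient_a = [1.0, 2.0, 3.0]
patient_b = [4.0, 5.0, 6.0, 7.0]
groups = [patient_a, patient_b]

pooled_mean, pooled_sd = combine_groups(
    [len(g) for g in groups],
    [statistics.mean(g) for g in groups],
    [statistics.stdev(g) for g in groups],
)
print(pooled_mean, pooled_sd)
print(statistics.mean(patient_a + patient_b), statistics.stdev(patient_a + patient_b))
```

The two printed lines agree, which shows that the summary statistics carry enough information. Note that this weights every measurement equally, so patients with more measurements count for more; weighting each patient equally is a different analysis choice.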

That is, the ratios between 0 and 1 are infinite and logarithmic, but from 1 to ∞ the ratios are not logarithmic.

For example, the distance on the graph between 0 and 1 is 1cm, so the ratio of 1:1 is 1cm into the graph.

The ratio of 10:1 is 9cm away from 1:1 on the graph, but the reverse is only ~0.7cm away from 1:1.

This gives a skewed perspective.

this effect probably...

Basic question: surely this graph is visually misrepresenting the data?]]>
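The asymmetry described above can be checked numerically. This sketch assumes a linear axis on which one unit corresponds to 1cm, and compares each ratio's distance from 1:1 on that axis with its distance on a log10 axis.

```python
import math

# A ratio and its reverse, plus the 1:1 reference point.
ratios = {"10:1": 10 / 1, "1:1": 1.0, "1:10": 1 / 10}

for name, value in ratios.items():
    linear_distance = abs(value - 1.0)      # distance from 1:1 on a linear axis
    log_distance = abs(math.log10(value))   # distance from 1:1 on a log10 axis
    print(f"{name}: linear={linear_distance:.2f} log10={log_distance:.2f}")
```

On the linear axis 10:1 sits 9 units from 1:1 while 1:10 sits less than one unit away; on the log axis both are exactly one unit away, which is why ratios are usually plotted on a logarithmic scale.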

In a scientific study they say that the low energy diet mice ate 30% less calories but ate the same amount of protein as the high energy diet mice, but the protein intake was in the same proportions in the low and high energy diets so I don't see how it's possible to eat 30% less calories and have the same amount of protein because you either eat 30% less calories...

i don't understand: mice ate 30% less calories but ate the equivalent amount of protein]]>

My homework relates to the

- A disease has a 2% prevalence rate in country X
- A test for the disease has a true positive rate of 0.999 and a false positive rate of 0.01
- Country X has a population of 50 million people with the two largest cities having a population of 2 million and 1 million, respectively.
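The exact homework question is not visible here, so the following is only a sketch of the quantity such exercises usually ask for: the probability of having the disease given a positive test, from Bayes' theorem. It uses the three rates listed above; the variable names are hypothetical.

```python
prevalence = 0.02            # P(disease)
true_positive_rate = 0.999   # P(positive | disease)
false_positive_rate = 0.01   # P(positive | no disease)

# Law of total probability: overall chance of a positive test.
p_positive = (true_positive_rate * prevalence
              + false_positive_rate * (1 - prevalence))

# Bayes' theorem: P(disease | positive).
p_disease_given_positive = true_positive_rate * prevalence / p_positive

print(f"P(positive) = {p_positive:.5f}")
print(f"P(disease | positive) = {p_disease_given_positive:.4f}")
```

Multiplying `p_positive` by a population size (50 million for the country, or 2 million and 1 million for the two cities) gives the expected number of positive tests, assuming everyone is tested and the prevalence is the same everywhere.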

Exercise on conditional probability (Bayes' Theorem)]]>

How do I determine the minimum number of observations needed per category in categorical predictor variables?]]>

Dependent variable: 2 groups, continuers and dropouts

Independent variables: some continuous tests (Wisconsin, Stroop, IPO, EDEQ etc) and age level (4 levels), education level (4 levels).

Which test should I use?]]>

Dimension

Q1 | x | x | x | x |

Q2 | x | x | x | x |

Q3 | x | x | x | x |

Q4 | x | x | x | x |

intervals

4 - x (not satisfied at all)

x - x

x - x

x - 16 (Totally satisfied)

how would you calculate the range?]]>
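One way to fill in the intervals, assuming the four categories should be equally wide, is to divide the span from the minimum total (4) to the maximum total (16) by the number of categories. The function name below is hypothetical.

```python
def equal_width_intervals(minimum, maximum, n_intervals):
    """Split [minimum, maximum] into n_intervals intervals of equal width."""
    width = (maximum - minimum) / n_intervals
    return [(minimum + i * width, minimum + (i + 1) * width)
            for i in range(n_intervals)]


intervals = equal_width_intervals(4, 16, 4)
for low, high in intervals:
    print(f"{low:g} - {high:g}")
```

With a range of 16 - 4 = 12 and four categories, each interval is 3 points wide. Because adjacent intervals share a boundary value (7, 10, 13), a rule is still needed for which category a boundary score belongs to.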

Ok I’m comparing the difference between a couple pediatric risk of mortality scores. I want to take the difference and see if telemedicine or telephone consults made a difference in risk of mortality.

2 of the 4 scores are so skewed we wanted to log transform them. So I log transformed them, then standardized them, then took the difference, then ran a regression. How do I know whether this is adequate? Or whether that’s too many transformations and I’m...

Strategy for diff-in-diff analysis]]>

My dependent variable is a 0/1 outcome (have health insurance or not) and my independent variables are age, sex, education, race dummy variables, immigration status, etc.

I want to see how these independent variables affect the dependent variable for each time period...

regression for repeated cross sectional data?]]>

Premise 1: A 'sampling distribution' of a statistic (e.g., a sample mean) is a piece of knowledge that tells us what we should expect a statistic to be (given some null hypothesis)

Premise 2: A Bayesian 'prior distribution' is a piece of knowledge that we use to describe our degree...

Can a frequentist 'sampling distribution' be interpreted as a Bayesian 'prior'?]]>

*The table in the image is just a piece of the original 324-line table.

]]>

Code:

```
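library(smooth)

# Training set (Dec 2014 to Nov 2019), used to choose the best model
tsdatatr <- ts(mydata$Spend, start = c(2014, 12), end = c(2019, 11), frequency = 12)
# Full series (Dec 2014 to Nov 2020)
tsdata <- ts(mydata$Spend, start = c(2014, 12), end = c(2020, 11), frequency = 12)

# "ZZZ" asks es() to select the error, trend and seasonal components automatically
esmtr <- es(tsdatatr, model = "ZZZ")
esmtr  # prints the chosen model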
```

library smooth]]>

I am currently working with monthly data and am trying to calculate confidence intervals for the monthly average. I have data from 2010 to 2019 and there seems to be some seasonality.

The statement I want to make is: In December 2020, we expect a value of x, which lies with a certainty of 95% between y and z.

For the expected value x, I use the average of the December values from 2010 to 2019.

For the confidence interval, I am not sure:

My initial guess was to use all months (Jan-Dec) to...

Calculating confidence intervals for monthly data]]>
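A statement about a single future December value is usually described by a prediction interval, which is wider than a confidence interval for the mean. The sketch below computes both from ten December values only. The data are hypothetical, and it assumes the December values are independent, roughly normal and free of trend; the critical value 2.262 is the two-sided 95% quantile of Student's t with 9 degrees of freedom.

```python
import math
import statistics

# Hypothetical December values for 2010-2019 (not the poster's data).
december_values = [102, 98, 110, 105, 99, 107, 103, 111, 100, 105]

# Two-sided 95% critical value of Student's t with n - 1 = 9 degrees of freedom.
T_CRIT_DF9 = 2.262

n = len(december_values)
mean = statistics.mean(december_values)
sd = statistics.stdev(december_values)  # sample SD (n - 1 denominator)

# Confidence interval: uncertainty about the long-run December mean.
ci_half = T_CRIT_DF9 * sd / math.sqrt(n)
# Prediction interval: uncertainty about one new December value.
pi_half = T_CRIT_DF9 * sd * math.sqrt(1 + 1 / n)

print(f"mean = {mean:.2f}")
print(f"95% CI for the mean: {mean - ci_half:.2f} to {mean + ci_half:.2f}")
print(f"95% prediction interval: {mean - pi_half:.2f} to {mean + pi_half:.2f}")
```

If the series has a trend, or if other months carry information about December, a seasonal time series model would be a more suitable basis than this simple calculation.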

I have an issue of machine learning/anomaly detection. Indeed, I have a variable Y and several other variables X. The purpose is to quantify the degree of abnormality of the data on Y but I have to take into account the values on the other variables (the relationship between Y and X).

Normally, an anomaly detection algorithm would find anomalies but on the whole data (Y + X), but in my case I want to zoom in on Y because it is a very important variable. If I wanted to quantify the...

How to determine the abnormality of a specific variable by taking into account all the other variables in the data?]]>
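One simple way to score how unusual Y is given X is to model Y from X and look at the size of the residuals. The sketch below is not a full anomaly detection method: it assumes a single X variable and a linear relationship, and uses hypothetical data with one planted outlier. The function name is also hypothetical.

```python
import statistics


def residual_z_scores(xs, ys):
    """Fit y = intercept + slope * x by least squares; return standardized residuals."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # Least-squares residuals average to zero, so their SD measures typical error size.
    spread = statistics.pstdev(residuals)
    return [r / spread for r in residuals]


# Hypothetical data: y is roughly 2 * x, except at x = 6.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.0, 8.1, 9.9, 18.0, 14.1, 15.9]

scores = residual_z_scores(xs, ys)
most_unusual = max(range(len(scores)), key=lambda i: abs(scores[i]))
print(f"most unusual point: x={xs[most_unusual]}, z={scores[most_unusual]:.2f}")
```

With several X variables the same idea applies with any regression model in place of the straight line: the score is how far Y falls from what the model predicts from X.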

How can I

Thank you]]>

I have started to learn simple and multiple regression, and there is one thing I can't understand.

I used a data frame with values of hindrance, inhibition, and negative effect.

When I predict the value of negative effect by simple regression, using inhibition alone, I get its coefficient.

But if I try multiple regression, predicting negative effect from both inhibition and hindrance, the coefficient of inhibition suddenly changes.

(from -10 to -12)

I thought the coefficient shows the change it...

difference in value of same coefficient.]]>
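A coefficient changes when a second predictor is added whenever the two predictors are correlated: the simple-regression slope also absorbs the effect of the omitted predictor. The sketch below shows this on hypothetical data, constructed on purpose so that the slopes mirror the -10 and -12 from the post; it is not the poster's data, and the function names are hypothetical.

```python
# Hypothetical data: negative = 5 - 12 * inhibition + 4 * hindrance, exactly.
inhibition = [1, 2, 3, 4, 5, 6]
hindrance = [1.5, 0.0, 1.5, 2.0, 1.5, 4.0]
negative = [-1, -19, -25, -35, -49, -51]


def centered_sum(us, vs):
    """Sum of products of deviations from the means."""
    mean_u = sum(us) / len(us)
    mean_v = sum(vs) / len(vs)
    return sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))


def simple_slope(xs, ys):
    """Slope of the one-predictor least-squares regression of ys on xs."""
    return centered_sum(xs, ys) / centered_sum(xs, xs)


def two_predictor_slopes(x1, x2, ys):
    """Slopes of the least-squares regression of ys on x1 and x2 (with intercept)."""
    s11, s22, s12 = centered_sum(x1, x1), centered_sum(x2, x2), centered_sum(x1, x2)
    s1y, s2y = centered_sum(x1, ys), centered_sum(x2, ys)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


alone = simple_slope(inhibition, negative)
b_inhibition, b_hindrance = two_predictor_slopes(inhibition, hindrance, negative)
print(f"inhibition alone: {alone:.2f}")
print(f"with hindrance:   {b_inhibition:.2f} (hindrance: {b_hindrance:.2f})")
```

Here hindrance rises by 0.5 for each unit of inhibition, so the simple slope is -12 + 4 * 0.5 = -10. In the multiple regression the coefficient is the change in the outcome per unit of inhibition with hindrance held fixed.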

Can you help me work out whether there is a solution for this scenario or not? If there is a solution, can you explain why there is one, and how it differs from a usual sequence of coin flips?

OK, here we go:

Imagine a scenario:

You do coin flips:

- the outcome for each is 50%; fair coin toss; H=Heads; T=Tails

- in this scenario we determine that the formation "HHT" will appear...

how to weight a coin toss]]>

According to this link,

https://data.library.virginia.edu/interpreting-log-transformations-in-a-linear-model/,

MLR Models: Log Transformation Interpretations and Collinearity]]>

The basic statistical model framework under WIOA is the fixed effect model specified as follows (Sutter [7]): Consider the linear model for our data observations grouped into states j = 1, ..., J, for each quarterly time period t = 1, ..., T: $y_{jt} = a_j + \beta x_{jt} + \varepsilon_{jt}$; $\varepsilon_{jt} \sim N(0, \sigma_y^2)$. (3) The effect of x on y, denoted β, is the primary quantity of interest. After accounting for the...
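As a minimal sketch of what "fixed effect" means in a model like (3), the block below estimates β with the within estimator: x and y are demeaned within each state, which removes the state intercepts a_j, and the demeaned values are then regressed on each other. The data and function names are hypothetical, and this is one standard way to estimate such a model, not necessarily the procedure used in the quoted report.

```python
from collections import defaultdict

# Hypothetical (state, x, y) rows: y = a_j + 2 * x with a_A = 10 and a_B = 0.
rows = [("A", 1, 12), ("A", 2, 14), ("A", 3, 16),
        ("B", 4, 8), ("B", 5, 10), ("B", 6, 12)]


def within_estimator(rows):
    """Estimate beta after demeaning x and y within each state."""
    by_state = defaultdict(list)
    for state, x, y in rows:
        by_state[state].append((x, y))
    sxy = sxx = 0.0
    for observations in by_state.values():
        mean_x = sum(x for x, _ in observations) / len(observations)
        mean_y = sum(y for _, y in observations) / len(observations)
        for x, y in observations:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    return sxy / sxx


def pooled_slope(rows):
    """Least-squares slope that ignores the state grouping, for comparison."""
    return within_estimator([("all", x, y) for _, x, y in rows])


print(f"within (fixed effect) estimate: {within_estimator(rows):.3f}")
print(f"pooled estimate:                {pooled_slope(rows):.3f}")
```

In this example the pooled slope is negative even though y rises by 2 per unit of x inside every state, because the state with larger x values has the lower intercept. The fixed effects absorb that between-state difference.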

"Fixed effect" regression]]>

I've been working on a Stats problem for days and think I've gone wrong somewhere. Something in me says this problem is very straightforward, and I am over complicating it. Maybe someone can have a read of what I've done, and let me know if I'm on the right tracks.

(I am using SPSS)

I have 3 variables -

The question being asked is:

On the right tracks? One-way ANOVA]]>

I know that the chi squared test is an option, but this test does not tell me if the significant difference lies between ASA 1 and ASA 2 or between ASA 2 and ASA 3,...

Is there a statistical test I can use to know between which independent categories the significant difference is situated?]]>
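A common follow-up to a significant overall chi-squared test is to run pairwise comparisons between categories and correct the p-values for multiple testing. The start of the question is cut off, so this sketch assumes a binary outcome counted per ASA class; the counts are hypothetical, and the Bonferroni correction is one of several possible adjustments.

```python
import math
from itertools import combinations

# Hypothetical counts per ASA class: (patients with the outcome, patients without).
counts = {"ASA 1": (10, 90), "ASA 2": (20, 80), "ASA 3": (40, 60)}


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and p-value for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p_value


pairs = list(combinations(counts, 2))
for first, second in pairs:
    chi2, p = chi_square_2x2(*counts[first], *counts[second])
    p_adjusted = min(1.0, p * len(pairs))  # Bonferroni correction
    print(f"{first} vs {second}: chi2={chi2:.2f} p={p:.4f} adjusted p={p_adjusted:.4f}")
```

With small expected counts, Fisher's exact test would be a safer choice for each pair than the chi-square approximation used here.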

I'm having trouble working out/finding an appropriate statistical analysis to use. Basically, I tested a number of tadpoles in a maze, then tested the same tadpoles, as frogs, on the same maze.

As I don't care what the actual times are (e.g., average time might be longer for tadpoles than frogs), I'm having trouble finding an appropriate statistical analysis to...

Statistical test for whether better individual performance in set 1 means better performance in set 2?]]>
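When only the relative ordering of individuals matters and not the actual times, a rank correlation such as Spearman's rho is one natural candidate. The sketch below computes it by ranking each set of times and taking the Pearson correlation of the ranks. The times are hypothetical and the function names are illustrative.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical maze times (seconds); position i is the same individual in both lists.
tadpole_times = [12.1, 15.3, 9.8, 20.4, 14.0]
frog_times = [8.2, 9.1, 7.5, 13.0, 9.9]

print(f"Spearman rho = {spearman(tadpole_times, frog_times):.2f}")
```

A rho near 1 would indicate that individuals who were fast as tadpoles also tended to be fast as frogs, regardless of any overall shift in times between the two life stages.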

One pair is independent. Another pair is dependent. Another pair is mutually exclusive.

I have created joint probability distribution tables for

1st independent pair

2nd dependent pair

3rd disjoint pair.

I'm stuck on how to create a joint probability distribution for all three X, Y, Z?]]>

I am trying to predict the type of parenting style of an individual using a multinomial regression model. In the model, I include an interaction between two nominal variables, but the output is difficult to interpret due to the many categories in both variables.

My interaction is between the country of origin (v1) and the migrant status (v2). The country of origin has 10 categories, while the migrant status has 3 categories (native, first generation, second generation).

Would...

Multinomial regression - Interaction with too many categories]]>

λ is a constant.

- Am I correct to understand that η represents the minimum salary that is taken into account, and that the probability for every single salary is in relation to it? i.e the bigger the minimum salary, the smaller the probability becomes for every salary in the function.
- How can I calculate an...

An estimator for a Cumulative distribution function]]>

1) What is the best test to compare these two groups to determine whether they are significantly different overall from one another (i.e., average happiness)?

2) Also, what's the best test to compare the trends seen in the two groups (e.g., males and females trending in different directions in terms of...

Need help with time series]]>

I was wondering whether anyone knew a solution within R for low signal smoothing of data. I have attached a screenshot showing what I require.

I know that MATLAB offers the curve fitting tool which helps with this, but I was wondering what kind of R function would be required to interpolate between points in the below example - a kind of moving average but dynamic depending on the proximity of values.

Any help would be greatly appreciated.

Thanks,

David]]>

thanks]]>

Currently, I am working on a project where I have a 3D set of data. This data has 2 independent variables and 1 dependent variable. I have figured out how to create a nonlinear fit surface for this set of data; however, I am now trying to understand the assumptions and limitations associated with creating said surface. Understanding these assumptions and limitations will help me understand the overall accuracy of my fit surface to this particular set of data...

Multivariate Non-Linear Regression: Fitted Surfaces and Associated Errors]]>

I will try to be precise and concise.

I'm studying the impact of environment-related technologies on CO2 emissions reduction (dependent variable). To do so, I'm working with panel data: 27 countries and 19 years (2000 to 2018). My panel is in long format.

I have 19 independent variables:

- 9 subsets of patent applications which are highly correlated with each other because some technologies are part of several subsets
- 9 specialization indexes: one for...

problems with panel data model - correlations & time lags]]>