I am new to IRT and, although I have read hundreds of papers, I still cannot wrap my head around...

My question is this: suppose I have response data from 20,000 students on 10 items, and I want to estimate the difficulty of those items.

To do so I have two options: I can either use CTT to calculate the p-values (which represent difficulty) or fit an IRT model to estimate the item difficulty...

Item Response Theory Invariance]]>
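For readers following the CTT option mentioned in the post, here is a minimal sketch of the item p-value, which is simply the proportion of students who answered the item correctly. The tiny response matrix and the function name `item_p_values` are made up for illustration; the poster's real data would have 20,000 rows and 10 columns.

```python
def item_p_values(responses):
    """Return the CTT p-value (proportion correct) for each item.

    responses: list of rows, one per student, each a list of 0/1 scores.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [
        sum(row[j] for row in responses) / n_students
        for j in range(n_items)
    ]


# Hypothetical data: 4 students, 3 items (1 = correct, 0 = incorrect).
responses = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
p_values = item_p_values(responses)
print(p_values)
```

Note that a higher CTT p-value means an easier item, whereas an IRT difficulty parameter is reported on a latent (logit-type) scale where higher values mean harder items, so the two are not directly comparable number for number.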

I am looking at the impact of an intervention on students' knowledge scores (continuous var).

- My study has a control group (50 schools) and an intervention group (another 50 schools) which are randomly selected (cluster RCT) from 6 countries. Knowledge scores are reported at baseline and at post-intervention for both intervention and control groups.

- Here is my model (standard linear regression):

+ Dependent var: post-test score

+ Key independent var: intervention (1) and...

How do I perform generalized estimating equation for clustered data in SPSS?]]>

I generated a random-effects logistic regression model and included within-person predictors by person-mean-centering.

Not all individuals in my dataset have multiple data points, i.e. the individuals gave interviews, and while a large number of them did multiple interviews, many only did one interview.

My research question is whether interviewer characteristics affect how interviewers measure respondent characteristics. To answer the question, I analyze the within-person effects of...

Random-effects regression with within-person predictors]]>
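As a rough illustration of the person-mean-centering described in this post, the sketch below splits a predictor into each person's mean and the deviation from that mean. The function name and the toy records are assumptions, not the poster's data.

```python
from collections import defaultdict


def person_mean_center(records):
    """Split a predictor into person mean and within-person deviation.

    records: list of (person_id, x) tuples, one per interview.
    Returns a list of (person_id, person_mean, within_deviation).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for person_id, x in records:
        totals[person_id] += x
        counts[person_id] += 1

    centered = []
    for person_id, x in records:
        mean = totals[person_id] / counts[person_id]
        centered.append((person_id, mean, x - mean))
    return centered


# Hypothetical data: person "a" gave two interviews, person "b" only one.
records = [("a", 2.0), ("a", 4.0), ("b", 5.0)]
centered = person_mean_center(records)
print(centered)
```

Individuals with a single interview always get a within-person deviation of zero, so they carry no information about the within-person effect, although they still contribute to the between-person part of the model.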

Normality: My sense is that few people concern themselves with this anymore when data sets are large, which suggests there is no need to review it.

Heteroskedasticity: I am unclear what the importance of this is anymore. Some suggest just using White...

Importance of regression assumptions]]>

Dummy variables proc genmod]]>

I'm trying to do some stats on a small dataset but I'm a bit of a novice and could use some expert advice. I basically have 5 samples each from 6 different participants, which have been stored under different conditions before testing. I am comparing to look at the impact storage condition has on the eventual test result. So for each sample I have a 'control', tested on the day it was taken, one stored for 7 days at room temp, one stored for 7 days at 37C, one stored for 14 days at...

Comparing groups - using ANOVA but unsure on various points, help greatly appreciated!]]>

sum(y_i*theta_i), but I'm confused as to why c(y_i,phi) is not included, as this is also a function of the range.

Basically, I am wondering why the orange squared part is not part of the kernel along with the pink square. Can someone help, please? Thank you.

]]>

average should be located to minimise the fraction nonconforming. What would

the value of the fraction nonconforming be under these conditions? (data are normally distributed)

Variance = 77.92

Upper Limit = 20

Lower Limit = 0

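\(\text{minimize } \left(1 - P\left[\frac{0-\bar{x}}{\sqrt{77.92}} < Z < \frac{20-\bar{x}}{\sqrt{77.92}}\right]\right) = \text{minimize } \left(P\left[Z > \frac{20-\bar{x}}{\sqrt{77.92}}\right] + P\left[Z < \frac{0-\bar{x}}{\sqrt{77.92}}\right]\right) = \ldots???\)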

]]>
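A minimal numerical sketch of the question above, assuming the process mean can be set freely while the variance stays at 77.92. The standardization uses the standard deviation (the square root of the variance). By symmetry of the normal distribution, the two tail areas are smallest in total when the mean sits at the midpoint of the limits, which the comparison below illustrates. The function names are made up for this example.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fraction_nonconforming(mean, variance, lower, upper):
    """P(X < lower) + P(X > upper) for X ~ Normal(mean, variance)."""
    sd = math.sqrt(variance)
    below = normal_cdf((lower - mean) / sd)
    above = 1.0 - normal_cdf((upper - mean) / sd)
    return below + above


# Mean at the midpoint of the limits versus a mean shifted off-centre.
centered = fraction_nonconforming(10.0, 77.92, 0.0, 20.0)
off_center = fraction_nonconforming(8.0, 77.92, 0.0, 20.0)
print(centered, off_center)
```

With these inputs the centred process gives a fraction nonconforming of roughly 0.26, and any shift away from 10 increases it.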

I have a questionnaire that looks at people's willingness, motivators and barriers to using anxiety-focused apps. For each section the participants are shown a list of statements and are asked to what extent they agree with each one (Strongly Disagree, Disagree, Neutral, Agree, Strongly Agree).

From all of the results I have been able to create a summary table - below is an example of this summary table for the Willingness section using my pilot data (please ignore spelling mistakes, this...

Ordering Results from Likert Questions]]>

I've been a bit stuck lately on which test to choose or implement for an assignment. A little bit of context: from a dataframe containing survey answers from several hundred thousand people, I needed to pose a scientific question, then analyse the data and make inferences from it. Since the data range from 1972 to 2012 with a lot of information on the participants, I chose to analyse the difference of opinion on abortion through the years, and then based on either...

Comparison of proportions with several variables]]>

I'm looking for a method of running a correlation test on these two measures:

Time to first fixation - a time measure of how long it takes to find a web object

Likert-type scales of satisfaction with how easy it was to find.

It is inspired by this:

Can anybody help with how I would be able to do that (in SPSS)?

I'm not really an educated statistician. I've read something about Likert data having to be ordinal? - not sure I...

Correlation test on Likert-type scales and time measures]]>

I'm really struggling to interpret some of my results. I am exploring whether lie detection abilities can predict victimisation in autistic vs. neurotypical samples.

Firstly, I conducted a correlation analysis between the independent variables to check for multicollinearity. It suggested that the variables correlate differently between the diagnostic groups. In the autistic group, I found no correlations between the variables while in the neurotypical group, some of the variables...

Interpretation help]]>

I'm really new to statistics and struggling a little to get my head around some aspects of the Johansen test for cointegration.

I'm looking at the eigenvectors specifically. There are a number of columns, and it's my understanding that the ratios in the first column result in the greatest cointegration, the second column the second best, and so on.

My question is how do I know which 'asset' to apply the hedge ratios to? I'm assuming each column shows the ratios for a different combination of...

Johansen cointegration test and eigenvectors]]>

But then I thought this. Say higher age lowers income (the DV) and the regression shows this. Say we develop a new program to deal with age (we give older workers new training that leads them to be more successful). Then age...

Control variables]]>

This is what SAS's senior statistician said.

Russell,

As noted in replies to your post in the Statistical Procedures Community, the model with...

Linear Probability Model]]>

In practice, how do you draw a stratified sample with several stratification variables? So far, I am using Excel and it takes a lot of time. Do you use a specific software?

Thank you in advance!]]>
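One way to avoid doing this by hand in Excel is to treat every combination of the stratification variables as its own stratum and sample within each. The sketch below uses only the Python standard library; the column names (`region`, `sex`), the 20% sampling fraction, and the function name are assumptions for illustration, and proportional allocation is just one of several possible allocation rules.

```python
import random
from collections import defaultdict


def stratified_sample(rows, strata_keys, fraction, seed=0):
    """Draw a proportional random sample within each combination of strata.

    rows: list of dicts; strata_keys: names of the stratification variables.
    At least one unit is drawn from every non-empty stratum.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        # A stratum is one combination of values, e.g. ("north", "f").
        strata[tuple(row[key] for key in strata_keys)].append(row)

    sample = []
    for members in strata.values():
        n_draw = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, n_draw))
    return sample


# Hypothetical sampling frame: 40 units, 2 regions x 2 sexes, 10 per cell.
frame = [
    {
        "id": i,
        "region": "north" if i % 2 == 0 else "south",
        "sex": "f" if i % 4 < 2 else "m",
    }
    for i in range(40)
]
sample = stratified_sample(frame, ["region", "sex"], fraction=0.2)
print(len(sample))
```

Fixing the seed makes the draw reproducible, which is useful when the sample has to be documented or audited.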

a) allows missing data AND

b) provides p-values?

Thanks a lot!]]>

I am trying to find out how to do a non-parametric Mann-Kendall trend test to detect monotonic trends.

Here are some examples:

Years: 2014, 2015, 2016, 2017

X events: 43 (44.8%), 68 (42.5%), 55 (40.7%), 16 (19.8%)

Y events: 51 (53.1%), 87 (54.4%), 75 (55.6%), 64 (79.0%)

Z events: 1 (1%), 4 (2.5%), 4 (3.0%), 0

K events: 0, 1 (0.6%), 0, 1 (1.2%)

Thanks]]>
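A minimal sketch of the Mann-Kendall statistic applied to the yearly percentages of X events quoted above. It assumes no tied values (so no tie correction is applied to the variance) and uses the usual normal approximation with continuity correction; with only four time points that approximation is rough, so the p-value should be read as indicative at best.

```python
import math


def mann_kendall(values):
    """Return (S, z, two-sided p) for the Mann-Kendall trend test.

    Assumes no ties; uses the normal approximation for the p-value.
    """
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)  # sign of each later-minus-earlier pair

    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p_two_sided


# Percentages of X events for 2014-2017, taken from the post.
x_pct = [44.8, 42.5, 40.7, 19.8]
s, z, p = mann_kendall(x_pct)
print(s, z, p)
```

For this series every later value is lower than every earlier one, so S takes its most extreme possible value for four points, yet the two-sided p-value still does not fall below 0.05. Four yearly observations are simply too few for the test to have much power.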

Does anyone know how to do such a test? I understand that MAR means the missingness is associated with the predictors, and MNAR that it is associated with Y, but not how to do a formal test of this.]]>

Impact are the regression slopes for dummy variables.

I should say that the excluded reference group here does not seem like a good choice to me: they are less than 16, of whom we have extremely few, and most likely they earn very little. I cannot change it; it was decided by the federal government.

That said I don't see how every dummy variable can be positive. Some have to earn less than others. Is...

Interpreting dummy variables.]]>

I think of Fisher's approach like this:

We choose a test statistic whose distribution is calculated under H0. H0 being a simple hypothesis of preference

We break down the distribution according to significance thresholds

The significance thresholds are defined by what we consider to be an extreme result, that is to say a result which would occur very rarely and would therefore make us doubt the truth of H0.

We calculate the p-value which corresponds to the probability...

Are Fisher's tests of significance mathematically correct?]]>

In my project, I have many variables but a very small sample (non-parametric). I'm trying to prove a link between two variables while correcting for covariates. In short, we are looking at white matter tracts and their links to visuospatial function and quality of life (QoL). We are analyzing 3 tracts, which can either be normal, displaced or ruptured (ordinal) and we have 8 scores for visuospatial abilities. Let's define...

Non-parametric test with confounding factors (many covariates)]]>

I work for a small games studio in Asia. Small enough that it's just me and the devs, so no mathematicians or data scientists on staff to deal with this kind of thing.

We're a B2B company so we don't control most of the sites our games appear on, rather we provide lists on suggested ordering to customers, which are essentially descending...

Bias type / sorting]]>

I want to find a way to quantify the similarity/dissimilarity of different data sets.

One data set contains data points (x,y) of one measurement series. The picture shows 3 exemplary data sets with an exponential fit including a confidence interval of 68%. Which methods are best suited to comparing data like that?]]>

Scenario: 3 treatments (Solutions 1, 2 and 3) with 4 levels each (Red, green, orange and blue). This creates 12 Petri dishes, each with a different treatment. Numerator degrees of freedom (3-1)*(4-1) = 6. In each dish we are measuring how much dye is absorbed in wooden pellets. There are 50 small pellets...

Degrees of Freedom]]>

thanks!]]>

I know how to calculate a Wald value but am unsure about how to use the output to calculate significance. I want to calculate if there is significance between two test scores:

Score 1: 5.38 Standard Error: .81

Score 2: 4.12 Standard Error: .83

df=10

I know that the Wald value is 1.09. What do I compare that to in order to tell whether these two scores are significantly different?

Thank you very much!]]>
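The figures in this post can be reproduced with a short sketch, assuming the two scores are independent so that their squared standard errors add. The statistic is then compared either with the standard normal distribution or, more conservatively, with a t distribution on the stated 10 degrees of freedom, whose two-sided 5% critical value is 2.228. The function name is made up for this example.

```python
import math


def wald_z(estimate_1, se_1, estimate_2, se_2):
    """Wald statistic for the difference of two independent estimates."""
    return (estimate_1 - estimate_2) / math.sqrt(se_1 ** 2 + se_2 ** 2)


z = wald_z(5.38, 0.81, 4.12, 0.83)

# Two-sided p-value from the standard normal distribution.
p_normal = math.erfc(abs(z) / math.sqrt(2.0))

# Two-sided 5% critical value of the t distribution with df = 10.
T_CRIT_DF10 = 2.228
significant_at_5pct = abs(z) > T_CRIT_DF10

print(round(z, 2), p_normal, significant_at_5pct)
```

The statistic of about 1.09 is well below the critical value under either reference distribution, so on these numbers the two scores would not be judged significantly different at the 5% level. If the scores come from the same participants, the independence assumption fails and the covariance between them would have to enter the standard error.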

Non-linear comparisons]]>

I have a high number of variables (around 80) with which to model an intermediate-size sample (around 50 points) using GLMs.

I would like to do an exhaustive search for the "best" model, but using all of the variables in an exhaustive search (or a semi-exhaustive one, like glmulti's genetic algorithm) does not seem to be an option.

Could I somehow perform an exhaustive search of all, say 8-variable models, without having to write a custom script for that? Does anybody know of a package that allows...

Variable restriction in (exhaustive) model selection]]>
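Before looking for a package, it may help to count how many models such a restricted search involves. The sketch below counts the 8-variable subsets of 80 candidates and shows how they could be enumerated lazily; the variable names `x1` to `x80` are placeholders, and no model fitting is done here.

```python
import math
from itertools import combinations, islice

# Number of distinct 8-variable subsets of 80 candidate predictors.
n_models = math.comb(80, 8)
print(n_models)

# Subsets can be generated lazily, so they never all sit in memory at once.
variable_names = [f"x{i}" for i in range(1, 81)]
first_three = list(islice(combinations(variable_names, 8), 3))
for subset in first_three:
    print(subset)
```

The count is about 2.9 × 10^10, so even at a thousand model fits per second a full enumeration would take close to a year. With roughly 50 data points, that scale also makes it very likely that the "best" subset is best by chance, which is an argument for a restricted or penalized approach rather than a truly exhaustive one.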

Probability of dying from Covid today]]>

THE WHOLE THING IS A DREAM

spoiler alert @Dason]]>

I am trying to perform an ordinal logistic regression (or at least I think I am) and I'm a bit stuck.

My variables are all categorical or ordinal:

DV: High, Medium, Low

IV: Parent Gender: 0,1

Child Gender: 0,1

Essentially, I am trying to understand if a parent's gender impacts the rating of their child based on their gender. Ex: If a parent is female (0) do they rate their female (0) children higher than their male (1) children, etc.

I need to determine whether the...

Ordinal Logistic Regression]]>

I need a simple and easy explanation for single and multiple nominal variables and the difference between them

thanks]]>

That evening all the numbers come up, and the brothers win the...

Probability Puzzle - Do angels exist?]]>

I am looking for help with the following topic. I have a set of data in the attachment for a product A, whose current maximum production rate is 5,400 kg. I am now trying to find a new best production rate, and the rule is that I need to take the best 7 days in a row and calculate their average. Calculating the average of the best 7 days in a row is not the problem; the question is how to find out through a...

Find the best 7 days in a row in a set of data]]>
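The rule described in this post (take the 7 consecutive days with the highest total and average them) is a sliding-window maximum. Here is a minimal sketch; the daily figures are invented, since the attachment is not available, and the function name is an assumption.

```python
def best_window(values, window=7):
    """Return (start_index, average) of the consecutive run with the highest mean."""
    if len(values) < window:
        raise ValueError("need at least `window` observations")

    current = sum(values[:window])
    best_sum, best_start = current, 0
    for start in range(1, len(values) - window + 1):
        # Slide the window: add the newest day, drop the oldest one.
        current += values[start + window - 1] - values[start - 1]
        if current > best_sum:
            best_sum, best_start = current, start
    return best_start, best_sum / window


# Hypothetical daily production in kg.
daily_kg = [5000, 5200, 5100, 5300, 5400, 5350, 5250, 5450, 5500, 5150]
start, average = best_window(daily_kg)
print(start, round(average, 1))
```

The returned index is zero-based, so a result of 3 means the best run covers the fourth through tenth days. The same logic can be reproduced in a spreadsheet with a 7-row moving average column followed by a maximum over that column.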

What is the probability of not randomly generating your cousin’s telephone number? Explain.

I understand the denominators - one 8 and 10 to the sixth power - but not the numerators. I don't understand why the...

'not randomly generated' - what does that even mean?!?]]>
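One reading of this exercise, given the denominator quoted in the post (one factor of 8 and 10 to the sixth power), is that a seven-digit number is generated with 8 choices for the first digit and 10 for each of the other six, all equally likely. The event "not generating your cousin's number" is then the complement of hitting one specific number. The truncated post does not show the numerators, so this is an interpretation rather than the textbook's stated solution.

```python
from fractions import Fraction

# Assumed sample space: 8 choices for the first digit, 10 for each of six more.
total_numbers = 8 * 10 ** 6

p_match = Fraction(1, total_numbers)   # generating the cousin's number
p_no_match = 1 - p_match               # complement: any other number

print(p_no_match)         # exact fraction
print(float(p_no_match))  # decimal approximation
```

Under that reading the numerator is simply the count of numbers that are not the cousin's, which is one less than the denominator.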

First, for the predictors that have two levels, PROC GENMOD shows the results associated with the zero level of the predictor. I want to know the probability associated with being in the 1 level (for example, being Hispanic when you are measuring whether one is Hispanic or not). To do this I can just reverse the sign of the slope, right? So if it says -20 for the 0 level of the predictor (not the DV), I can report 20.

I cannot find in the log whether it's predicting the 0 or the 1 level of the DV...

proc genmod binary DV linear probability model]]>

So I have tested the same species of plant under 7 different conditions.

1 = Control (no fungus)

2, 3, 4 = Fungus A at 3 different concentrations.

5, 6, 7 = Fungus B at 3 different concentrations.

I am going to undertake T-Tests to compare my results.

I just want to make sure I have the tails and type of T Test correct.

I have undertaken a 1 tailed paired T - Test.

(because I want to know if the mean of 1 strain...

T-Test help!! Which tail and which type.]]>

( I have also posted in https://stats.stackexchange.com/questions/535284/odds-ratio-or-other-technique)

Before I start I am not well versed in posting on forums so please be patient if I'm going against convention. Plus I am on a steep learning curve with statistics.

I have been presented with some data which originates from a survey filled in by workers who work in the chemical factories (made up data is attached in an excel spreadsheet to illustrate the question - in...

Odds ratio or other technique]]>

I have also posted this in random effects model - meta - analysis - where to start - Cross Validated (stackexchange.com)

I've been given a task to investigate how to carry out the meta analysis on the results from a series of studies. They all report the 'standard mortality rate' with confidence intervals. An example is below ( I am assured that the studies have been selected in line with good...

meta analysis - where to start.]]>

I have a recipe with 6 components (the amount of each component used gives rise to 6 independent variables). The resultant product of this recipe is evaluated with a single numerical value (1 dependent variable). I have prepared and evaluated this recipe 11 times. So I have 11 data sets, each comprising the 6 independent and 1 dependent variable.

I would like to try to pull out of this data, if any correlations...

Not even sure how/where to start....some type of regression needed.]]>

My questionnaire has 24 Likert scale items, asking students if they consider some issues right. The Likert scale has 7 responses, as follows: It’s Never right, It’s Seldom right, Sometimes it’s right, It’s Frequently right, It’s Always right, I don’t know if it’s right or not, I don’t know what this means. I have collected more than 1,000 responses.

1. What do you consider the best statistical test to use?

2. Should I do a test for normal distribution, or is it not necessary for...

LIKERT SCALE - Which statistical test to use? Is a test for normal distribution necessary?]]>

I am running a GEE, and the distribution that best fits my data is the gamma. But the gamma distribution does not accept a value of zero, which is the best result I could get on my repeated scores. Any suggestions?]]>