Please clarify my doubt: when two different data sets have the same mean, are their SD and median also equal?

I've checked with some sample data and got the same numbers. Please explain the logic behind it.

Forgive me, I can't attach the Excel file here.
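A quick counterexample may help here: two data sets can share a mean and still differ in SD and median. The values below are made up for illustration and are not the poster's data.

```python
import statistics

# Two hypothetical data sets chosen so that their means coincide
a = [1, 2, 3, 4, 5]
b = [1, 1, 1, 1, 11]

mean_a, mean_b = statistics.mean(a), statistics.mean(b)
median_a, median_b = statistics.median(a), statistics.median(b)
sd_a, sd_b = statistics.stdev(a), statistics.stdev(b)  # sample SD

print("means:  ", mean_a, mean_b)
print("medians:", median_a, median_b)
print("SDs:    ", round(sd_a, 3), round(sd_b, 3))
```

If the sample data happened to give matching SD and median, that is likely a property of those particular data sets rather than a consequence of the equal means.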

I'm doing a study of 2000+ participants and would like to test some continuous variables for normality (I know from the histogram that my age distribution is bimodal). I applied the Lilliefors test in R and got a p-value of <0.0001, but I'm rather confused about how to interpret this. This link suggests a low p-value confirms normality https://towardsdatascience.com/6-wa...al-distribution-which-one-to-use-9dcf47d8fa93 but everywhere else I...

How to interpret Lilliefors test?

I'm trying to test the association between two clinical signs in a sample. Everyone in the sample has the disease, but not everyone shows these signs.

The ideal goal would be to test the diagnostic characteristics of sign A being found (which is dichotomous), but I imagine this is not possible to test since I do not have any healthy controls (all cases have the underlying disease)...

Statistical test choice

I tried to calculate the minimum sample size for multiple linear regression.

I tried to check the sample size for predictors=4, effect size f=0.2/d=0.2, sig.level =0.05, power=0.8

1. When I checked the power of the entire model (F power) n=304

2. When I checked the power of one coefficient (t power) n=198

3. When I checked the power of one coefficient with Bonferroni correction (t power) n=281 (sig.level = 0.05/4)

I'm probably doing something wrong as I get a smaller sample size when...

Minimum Sample size for multiple linear regression

Is beta risk dependent on the alpha risk? If alpha risk is bigger, is beta risk then smaller? Or does it depend only on the size of the sample and the variability? I'm getting really confused :/
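One way to see the relationship is a small sketch for a one-sided z-test with known sigma, where beta can be written as Phi(z_(1-alpha) - effect * sqrt(n) / sigma). The function name `beta_risk` and the numbers plugged in (effect 0.5, sigma 1, n 20) are assumptions chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def beta_risk(alpha, effect, sigma, n):
    """Type II error rate of a one-sided z-test (known sigma) for a true shift `effect`."""
    z_crit = NormalDist().inv_cdf(1 - alpha)  # rejection threshold on the z scale
    return NormalDist().cdf(z_crit - effect * sqrt(n) / sigma)


# Hypothetical setting: with effect, sigma and n held fixed, vary alpha only
for alpha in (0.01, 0.05, 0.10):
    print(alpha, round(beta_risk(alpha, effect=0.5, sigma=1.0, n=20), 3))
```

In this simplified setting beta depends on all of alpha, sample size, variability and the true effect: with the others held fixed, a larger alpha gives a smaller beta.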

The company assembles a team of industry experts to determine if they are well-hedged across the range of possible occurrences. The task force consults industry information sources that are recognized standards, utilized by all the competitive financial-services companies, and the team...

Let's play ... "Guess the y-axis!"

I am trying to figure out this probability problem for a board game with a drafting mechanic.

There are 26 characters: 10 are of the kind called A and 16 are of the kind called B. 6 characters are to be randomly picked for a drafting pool.

1. What is the chance of exactly four of the six being of kind A? How do you calculate this?

2. What is the chance of more than four of the six being of kind A? How do you calculate this?

I prefer to use Excel or SPSS, but I can also download R or Python if necessary. Thanks in advance.
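Drawing 6 of 26 characters without replacement is a hypergeometric setup, so both questions can be sketched with binomial coefficients. The helper name `hypergeom_pmf` is made up for this example; the numbers 26, 10 and 6 come from the post.

```python
from math import comb


def hypergeom_pmf(k, total=26, kind_a=10, draws=6):
    """P(exactly k of the drawn characters are of kind A), drawing without replacement."""
    return comb(kind_a, k) * comb(total - kind_a, draws - k) / comb(total, draws)


p_exactly_four = hypergeom_pmf(4)
p_more_than_four = hypergeom_pmf(5) + hypergeom_pmf(6)  # "more than four" means 5 or 6

print(f"P(exactly 4 of kind A)   = {p_exactly_four:.4f}")
print(f"P(more than 4 of kind A) = {p_more_than_four:.4f}")
```

In Excel, the first quantity should correspond to `=HYPGEOM.DIST(4, 6, 10, 26, FALSE)`.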

I'm looking for a formula that accounts for a time shift that follows a distribution (Weibull/lognormal).

For example:

I have "hits" that fall into a week.

W1: 1 1 1 1 1 1

Total hits W1=6

W2: 1 1 1 1

Total hits W2=4

W3: 1 1 1 1 1

Total hits W3=5

W4: 1 1 1 1 1 1 1

Total hits W4=7

Each week is related to a volume

W1 = 1000

W2 = 2000

W3 = 1500

W4 = 1700

The fraction would be Y(Wx) = Total hits / Volume
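As a sketch, the unshifted fraction Y(Wx) for the four example weeks can be computed as below. The dictionary names are chosen for this example, and the time shift itself is not modelled here.

```python
# Weekly hit totals and volumes taken from the example above
hits = {"W1": 6, "W2": 4, "W3": 5, "W4": 7}
volume = {"W1": 1000, "W2": 2000, "W3": 1500, "W4": 1700}

# Y(Wx) = total hits / volume, with no time shift applied
fraction = {week: hits[week] / volume[week] for week in hits}

for week, y in fraction.items():
    print(f"Y({week}) = {y:.4f}")
```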

Now for each hit there is a time shift, let's say 1 week. And 1 week is...

Time shift with distribution

I know that this is due to temporal autocorrelation since regressors that were close together in time have a huge covariance, but I can't quite understand what exactly the problem is with this and what to do about it. Isn't the AR model...

time series data and regression

I have collected data on offenders’ number of offences (0,1,2,3,4,5) (DV) in the previous year in different correction centres (level 2 as site) and thought about using multilevel Poisson regression (e.g., GLMER in R) but a preliminary result showed that the data were overdispersed. I tried re-grouping the DV data into a binary variable (0 vs. 1+) and using binary logistic regression model, and...

Logistic or Poisson regression?
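A crude first look at overdispersion is to compare the sample variance of the counts with their mean, since a Poisson variable has the two equal. The counts below are invented for illustration, and this check ignores the site-level (multilevel) structure described in the post, so treat it only as a rough sketch.

```python
import statistics

# Hypothetical offence counts (0-5) for illustration only, not the poster's data
counts = [0, 0, 0, 0, 0, 0, 1, 0, 5, 4, 0, 0, 3, 0, 0, 5]

mean_count = statistics.mean(counts)
var_count = statistics.variance(counts)  # sample variance
dispersion_ratio = var_count / mean_count  # roughly 1 under a plain Poisson model

print(f"mean = {mean_count:.3f}, variance = {var_count:.3f}, ratio = {dispersion_ratio:.2f}")
```

A ratio well above 1 hints at overdispersion, although a model-based check on the fitted multilevel model would be more appropriate.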

If the relationship between two variables is linear and positive:

a) the constant has a positive value, greater than 3.5

b) the coefficient of variation has a negative value

c) Spearman rank correlation coefficient has a negative value close to zero

d) the Kendall rank correlation coefficient has a positive value

The Spearman coefficient is used to determine the connection between

a) 2 normally distributed non-numerical variables

b) a numerical and a non-numerical ordinal scaled variable

c) 2 binary variables

d) 2 non-numerical (one is ordinally scaled) variables (one does not know the distribution form)

The second column pertains to the number of

1. Ordinary One-Way ANOVA

2. One Way ANOVA with Repeated Measures...

What Statistical Treatment can I use for this type of data?

A total of 619 participants were given 4 types of test and later, based on their scores, were categorised as pass, clinical or double invalid. Can these data be used for statistical analysis? If so, can you suggest which test to use?

We want to know if scores from 4 different types of test can predict the outcome (pass, clinical or double invalid).

Today I ran a chi-square test that turned out to be significant (p<0.05), but when I plotted the residuals they were all >1.96 or <-1.96.

I'm going to check the data again since I worry there might be something wrong in my file, but I was just wondering if this is a possible scenario and, if so, how would you interpret the residuals?
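This scenario is possible. As a sketch, in a 2x2 table every adjusted standardized residual has the same magnitude, equal to the square root of the chi-square statistic, so a significant test puts all four cells beyond ±1.96. The table below is hypothetical, and the post does not say which table size or residual type was used.

```python
from math import sqrt

# Hypothetical 2x2 table of observed counts (not the poster's data)
table = [[30, 10],
         [15, 25]]

row_tot = [sum(row) for row in table]
col_tot = [sum(col) for col in zip(*table)]
n = sum(row_tot)

chi2 = 0.0
adjusted = []  # adjusted standardized residuals, same shape as `table`
for i, row in enumerate(table):
    adjusted.append([])
    for j, obs in enumerate(row):
        exp = row_tot[i] * col_tot[j] / n
        chi2 += (obs - exp) ** 2 / exp
        se = sqrt(exp * (1 - row_tot[i] / n) * (1 - col_tot[j] / n))
        adjusted[i].append((obs - exp) / se)

print("chi-square:", round(chi2, 3))
for row in adjusted:
    print([round(r, 3) for r in row])
```

For larger tables the residuals differ from cell to cell, so having every cell beyond ±1.96 is less automatic there, but it is not impossible.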

Stats are very much my weak point so apologies if this is a bit of a basic explanation! I've conducted a systematic review and would like to represent the data visually. I believe a forest plot is used for this; however, the studies do not report a common statistic, which I think you need for this (please correct me if I'm wrong!).

The statistical analyses reported give me outcomes in the form of differing combinations of d test, df test, f test, t test and z...

Visual Representation of Systematic Review

I'm a rookie in quantitative data analysis. I'm working on a survey of student responses across a number of schools. There are about 8 questions, and in the data there are YESes and NOs but also quite a few item non-responses dotted about. (I'm pretty sure there was a box for YES and one for NO, so some didn't cross either.)

I'm doing some volunteer impact research and so I want to choose a defensible methodology to deal with item non-responses, but ideally I would choose a methodology that...

Item non-response

There is no statistically significant relationship between fasting and weight loss.

1 The chart (see way below) is a visual representation of my data. The circled points indicate prolonged periods of fasting and show a downward trend in weight (i.e. weight loss occurred) during these periods.

...Which correlation coefficient is best suited to dichotomous variables (and why do my results feel intuitively wrong)

a. There are 2 groups, an experimental group (n = 21) and a control group (n = 22).

b. There are 7 dependent variables, each with its own measure. Some of the DVs are related to one another: a) there are 2 measures of mental shift/attention, b) 2 measures of working memory, c) 2 measures of fluid processing, and d)...

Please help settle a disagreement, One-Way vs. Repeated Measures ANOVA

My supervisor provided me with a dataset for which I am computing the variables. In the questionnaire, items were rated on a 7-point Likert scale (1 = fully disagree; 7 = fully agree).

In my data set, all DV value labels are set up as below. However, the value labels seem strange to me. For instance, value "2" is skipped.

Should I make all value labels equal (1=1; 2=2; 3=3; 4=4; 5=5; 6=6; 7=7)? Or is this a normal way of labeling?

Value 1 = Label "Fully disagree"

Value 3 =...

Compute Likert-Scale Measures

I'm quite a beginner in this field, but now my research requires some methodology and I thought I'd create a topic; maybe somebody has had a similar issue before.

I have some data on health-related features, including:

- BMI (scale)

- Current diseases (categorical)

- Physical activity (scale, how long the participant does sport in a week, in hours)

- Tobacco use (scale, how often the participant smokes in a day)

- Alcohol use (scale, consumed alcoholic beverages in the past...

Clustering of behavior related data

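class Gaussian(BaseKernel):

    def _compute_weights(self):
        # Without boundary correction every point gets unit weight
        if not self.fix_boundary:
            return 1.
        weights = np.zeros(self.data.shape[0])
        for i, d in enumerate(self.data):
            weights  # the rest of this statement is missing from the excerpt
        return weights[:, None]

    def __call__(self, x_test):
        # Pairwise differences between test points and data points
        distances = x_test[None, :] - self.data[:, None]
        pdfs =...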

Gaussian kernel density weight question (in Python)

Methods

Fixed-effect model (state fixed effect):

Y...

Fixed effects regression

I am new to this forum. I am a French post-doc in marine ecology and I am especially interested in trophic relationships.

I am currently analyzing time series. I have to admit that time series are definitely one of my biggest Achilles heels; I am actually afraid of them.

Anyway, let's get to the point. My question is the following: how can I test whether the standard deviation (SD) changes over time in a time series?

Let's take an example. One investigated the depth...

Test temporal change in standard deviation

First post here. I was trying to use the life tables option under survival analysis in SPSS. Somehow, it shows me only scale variables and none of the nominal variables! Which is surprising as the Status variable is nominal. I checked the coding in the variables section and all looks well, and all the nominals show up in other analyses that I tried.

Which field of statistics does this fall under? I am trying to learn this.

With the Patriots being cruelly eliminated from the playoffs, much to the delight of others, at least I have my numbers to watch and provide entertainment.

The nonlinear odds-to-probability derivations for the remaining non-Patriots teams were determined for winning the AFC and NFC Championship from current odds...

NFL postseason probabilities

I'm struggling with finding an appropriate test for analyzing if responses are different based on certain properties.

In detail, participants had to recognize differences in images. Each participant received multiple images and in each image multiple differences were hidden. Each difference has certain properties and we would like to analyze if differences were recognized or not based on the properties of the differences. I was thinking of binomial logistic regression with the...

Choice of Test (Responses based on Properties)

The coefficient value for the Hours field is given as 1.5046...

Would like to know Derivation of Coefficient for Below Dataset

The Chebyshev empirical rule, used for a normal variable, says that:

a) Approximately 90% of the values in the interval x+3s

b) Approximately 5% of the values are in the ... Interval

c) Approximately 95% of the z-values are in the Interval -2 and 2

d) Approximately 68% of the values are outside the interval -3 and 3
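To check the options, the share of a normal variable lying within k standard deviations of the mean can be computed directly and compared with the distribution-free Chebyshev lower bound 1 - 1/k². This is a sketch using only the standard library; the variable names are chosen for this example.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

# Empirical rule: probability of falling within k standard deviations of the mean
coverage = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

# Chebyshev's inequality: lower bound valid for any distribution with finite variance
chebyshev_bound = {k: 1 - 1 / k**2 for k in (2, 3)}

for k, p in coverage.items():
    print(f"normal, within {k} SD: {p:.4f}")
for k, b in chebyshev_bound.items():
    print(f"Chebyshev, within {k} SD: at least {b:.4f}")
```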

30 medical students have rated the visibility of 40 brain anatomical structures (each structure has been rated on a Likert scale of 1-5, where 1 corresponds to "not visible" and 5 to "clearly visible").

I am trying to figure out the appropriate statistical test for assessing the correlation among the ratings of the different students. I am not even sure "correlation" is the right test here, any advice for statistical analysis of...

What will be the right statistical approach for this design please?

Based on the image attached above: If the p value is less than 0.05, does this mean the results show a significant influence of ecotype/salt conc. on rosette width? Or does it mean the null hypothesis can be accepted (with 95% certainty) if the p value is below 0.05?

It would also be appreciated if someone could tell me if my conclusion at the bottom of the image is correct.

I am using PAlaeontological STatistics (PAST).

I am a bit of a novice at this so forgive my ignorance.

Essentially I have two groups of data. One is for people with exon 19 mutations, the other for those with exon 21 mutations.

Each group has a different sample size, with variables including overall survival etc. The overall survival data is a continuous variable.

My boss has asked me to calculate the means with confidence intervals for each group, which I have been able to do.

She also wants p values to compare the OS between the 2...

Obtaining P values for 2 non-parametric continuous variables

I was wondering if you could help me with this. I thought I'd be able to work this out but my stats knowledge is very jaded now.

If the mortality rate from an illness is 2% and I want to conduct a trial comparing treatment X with placebo, with a view to finding out whether treatment X reduces mortality from that illness, how do I determine what sample size I will need to be able to say whether or not treatment X has a mortality benefit?
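The sample size depends on how large a reduction the trial should be able to detect, which the post does not state. As a sketch, the common normal-approximation formula for comparing two proportions is shown below; the 1% mortality under treatment, the two-sided alpha of 0.05 and the 80% power are all assumptions made for illustration, and `n_per_group` is a name invented for this example.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(p_control, p_treat, alpha=0.05, power=0.80):
    """Approximate per-arm sample size for a two-sided comparison of two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance_sum = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    return ceil((z_alpha + z_power) ** 2 * variance_sum / (p_control - p_treat) ** 2)


# Hypothetical target: treatment X halves mortality from 2% to 1%
n_halving = n_per_group(0.02, 0.01)
print("participants per arm:", n_halving)
```

Smaller assumed reductions drive the required sample size up quickly, so the choice of a clinically meaningful difference matters more than the formula itself.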

Month 1: the person ran a two sample t-test to compare an average

Month 2: the person ran a two sample t-test with updated treatment and control groups

Month 3: the person ran a two sample t-test with updated treatment and control groups

Month 4: etc.

I know there are...

Interim Analysis

I am not sure if this is the right platform for my question. But I don't have many options!! I am working on a project where I am reading a bunch of sensor readings (numbers).

For example, the initial values I observe are 160, 161, 162 !!! (160 being the least and 162 being the max)

On the occurrence of an event, I observe 163, 164, 165!!! (163 being the least and 165 being the max)

On the occurrence of an event again, I observe 166, 167, 168!!! (166 being the least and 168...

Increasing the marginal difference between values

Recent new member here. Have already learned some really useful things from this forum by searching through old threads, so I'm glad I stumbled across this site! I have a question of my own that I was hoping to get some help with.

We've run a clinical study. Prospective observational study looking at patients undergoing invasive electrophysiology study and ablation (tubes being stuck in a vein in a patient's leg, taken round to the heart, and cauterisation inside the heart) for...

Test for independent associations between patient characteristics and events