I have 46 dissertations from students. I asked 46 students and 4 experts to grade the dissertations on an 8-point scale (1 to 8).

Each student has to evaluate at least 10 different dissertations, and each expert has to evaluate at least 20.

In the end, each dissertation is graded by at least 10 students and 2 experts.

The hypothesis I want to test is: are the grades given by students and the grades given by experts **consistent**?

I tried computing the correlation between the students' evaluations and the experts' evaluations, but I am not sure that is the right test. Can you please help me with this problem? What type of test should I use?
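To make the correlation attempt concrete, here is a minimal sketch of one way it could be set up: average the student grades and the expert grades per dissertation, then compute a rank correlation between the two sets of means. Everything specific here is an assumption for illustration: the grades are made up (they are not my real data), the helper names `average_ranks`, `pearson` and `spearman` are hypothetical, and the choice of Spearman rank correlation on per-dissertation means is only one possible setup, not necessarily the right test.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up grades (1 to 8) for five hypothetical dissertations.
student_grades = {
    "D1": [6, 7, 6, 5],
    "D2": [3, 4, 2, 3],
    "D3": [8, 7, 8, 7],
    "D4": [5, 5, 4, 6],
    "D5": [2, 3, 3, 1],
}
expert_grades = {
    "D1": [6, 5],
    "D2": [4, 3],
    "D3": [7, 8],
    "D4": [4, 4],
    "D5": [2, 2],
}

ids = sorted(student_grades)
student_means = [mean(student_grades[d]) for d in ids]
expert_means = [mean(expert_grades[d]) for d in ids]

rho = spearman(student_means, expert_means)
print(f"Spearman correlation of mean grades: {rho:.3f}")
```

Note that a correlation like this only shows whether students and experts tend to order the dissertations in a similar way. It would stay high even if one group graded systematically higher than the other, which is part of why I am unsure it answers the consistency question.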

Regards,

Truong