# Point Biserial Correlation in R (what goes wrong?)

#### Unbiased_Togepi

##### New Member
Somehow the values of my point biserial correlation are not between -1 and 1 which suggests something went wrong. What could it be? The point biserial correlation is the last step of the R script posted below. Maybe there is a function which can calculate rpb?

Under 'Download Dataset' on https://datadryad.org/stash/dataset/doi:10.5061/dryad.723m1 you can download 'MycoDB_version4.csv' which is the dataset which I used for my calculations. I could not attach the csv. If you don't want to risk the download then the input I can tell you that 'SelectFertPNum' is a dichotomous vector which consists of 0's and 1's and 'SelectEffSiz' is a continuous variable.

# Meta-analysis phosphor and AM (Joost Visser's Bachelorproject Biology)

#Install packages
install.packages("ltm")

library(sciplot)
library(tidyverse)
library(robumeta)
library(metafor)
library(dplyr)
# Biserial correlation library
library(ltm)

# Set working directory to the correct folder
setwd("/Users/joostvisser/Desktop/Visser_Joost_Bachelor_Project_Major_Biology_University_of_Amsterdam/Visser_Joost_AM_Data/Visser_AM_Raw_Data")

# Convert csv-file to data.frame
df <- read.csv("MycoDB_version4.csv",header=TRUE,",")

str(df)
nrow(df)

# Select relevant studies (Sterilized, AM-fungi and lab studies)
STERyesIndices <- 'STERyes' == df[,35]
AMIndices <- 'AM' == df[,36]
LabIndices <- 'lab' == df[,37]

# Replace TRUE indices with 1's
for(i in 1:length(STERyesIndices)){
if (STERyesIndices == 'TRUE') {
STERyesIndices = 1}}

for(i in 1:length(AMIndices)){
if (AMIndices == 'TRUE') {
AMIndices = 1}}

for(i in 1:length(LabIndices)){
if (LabIndices == 'TRUE') {
LabIndices = 1}}

# Generate vector where all three criteria are met by adding them up
SelectThree <- STERyesIndices + AMIndices + LabIndices

# Finding indices which meet all three criteria through the which function
SelectRowInd <- which(SelectThree == 3)

# Select the right entries of both explanatory and response variables
SelectFertP <- df[SelectRowInd,32]
SelectFertN <- df[SelectRowInd,32]
SelectCtrlMass <- df[SelectRowInd,18]
SelectTrtMass <- df[SelectRowInd,21]
SelectTrtReps <- df[SelectRowInd,22]
SelectEffSiz <- SelectCtrlMass / SelectTrtMass

SelectFertPNum = NA
SelectFertPNum[SelectFertP == 'Pyes'] <- 1
SelectFertPNum[SelectFertP == 'Pno'] <- 0

# Calculate Biserial Correlation
IndSelectFertPNumTrue <- which(SelectFertPNum == 1)
XpNum <- sum(SelectEffSiz[IndSelectFertPNumTrue])
XpDenom <- length(IndSelectFertPNumTrue)
Xp <- XpNum / XpDenom

IndSelectFertPNumFalse <- which(SelectFertPNum == 0)
XqNum <- sum(SelectEffSiz[IndSelectFertPNumFalse])
XqDenom <- length(IndSelectFertPNumFalse)
Xq <- XqNum / XqDenom

n <- length(SelectFertPNum)

Pp <- XpNum/n
Pq <- XqNum/n

XBarI <- sum(SelectEffSiz)/n

SumSq <- (SelectEffSiz - XBarI)^2

S <- sqrt(SumSq/(n-1))

rpb <- Xp-Xq/S*sqrt(Pp*Pq)

Last edited: