Syntax error: Error in sqliteExecStatement

#1
Hi guys,

I have the following problem:

I am trying to do a multinomial logistic regression on a very large dataset.
My dataset has 6 columns (5 independent variables and 1 dependent variable) and 176,483 rows (observations).

The function in R which does exactly what I need is multinom().
However, because my dataset is too large, R quickly runs into memory issues and gives me the following complaint: too many (1402960) weights and memory exceeded.

One suggestion I received to resolve this problem was to use a database system and execute SQL statements on it.

Now I am using the RSQLite package in R and I am trying to "translate" my algorithm into SQL statements.

This is where I encounter another problem - my code has a syntax error that I cannot correct. My code looks like this:


library(RSQLite)
m = dbDriver("SQLite")
dbfile = "multinomial.db"
con = dbConnect(m, dbname = dbfile)

f <- read.csv("data.csv", header=TRUE, sep=";")

if(dbExistsTable(con, "mydata")) dbRemoveTable(con, "mydata")
dbWriteTable(con, "mydata", f, row.names=FALSE, column.names=TRUE)

library(PivotalR)

script <- paste("SELECT madlib.mlogregr_train('mydata',
'mydata_output',
'Col6',
'ARRAY[1, Col1, Col2, Col3, Col4, Col5]',
0,
'max_iter=20, optimizer=irls, precision=0.0001')",
"SELECT * FROM mydata_output")

result <- dbSendQuery(con, script)


Everything runs OK until the last command. When I run the last line of code I get the following error:

Error in sqliteExecStatement(conn, statement, ...) :
RS-DBI driver: (error in statement: near "(": syntax error)


Do you know how to solve this error?
Any help would be very much appreciated!
 
#2
I guess, the problem is a type mismatch when using the variable script.

Try to type the sql code directly in dbSendQuery().

Consuli