rmongodb: Return count of all instances within a document

Lazar

Phineas Packard
#1
I have a MongoDB database with 21,000 documents. The fields are: _id, dateOfSpeech, speaker, speechType, and speechTranscript. I want to construct a query on the speechTranscript field that returns the number of times a given pattern occurs in a document.

So far I have only been able to construct a query that returns documents where the pattern is present. The query looks like:

Code:
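# Set up the query filter. The regex is unanchored, so each stem already matches
# longer words; a trailing "*" would only make the last letter optional.
reg_1 <- list("transcript" = list("$regex" = "[Ii]ndigenous|[Aa]borig|[Tt]orres"))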
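# Load rmongodb and connect (assumes a local MongoDB instance with default settings)
library(rmongodb)
mongo <- mongo.create()
# Open a cursor over the matching documents, returning only the listed fields
cursor <- mongo.find(mongo, ns = "mongodevdb.transcript",
                     query = reg_1,
                     fields = mongo.bson.from.JSON('{"primeMinister":1, "releaseDate":1, "title":1}'))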
# Step through the cursor and collect each document as a row
res <- NULL
while (mongo.cursor.next(cursor)) {
  value <- mongo.cursor.value(cursor)
  Rvalue <- mongo.bson.to.list(value)
  res <- rbind(res, Rvalue)
}
# Release the cursor
err <- mongo.cursor.destroy(cursor)
# Convert the collected rows to a data.frame
res <- as.data.frame(res)
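
For reference, here is a minimal sketch of the kind of per-document count I am after, done on the R side rather than in the query. It assumes the transcript text has already been pulled into R as a character vector. The helper name count_matches and the sample strings are made up for illustration; gregexpr is base R.

```r
# Hypothetical helper: count non-overlapping regex matches in each string.
count_matches <- function(text, pattern) {
  matches <- gregexpr(pattern, text, perl = TRUE)
  # gregexpr() returns -1 when nothing matches, so count only positive positions
  vapply(matches, function(m) sum(m > 0), integer(1))
}

pattern <- "[Ii]ndigenous|[Aa]borig|[Tt]orres"

# Made-up sample text standing in for the transcript field
transcripts <- c(
  "Indigenous Australians and Torres Strait Islanders; indigenous policy.",
  "No relevant terms here."
)

counts <- count_matches(transcripts, pattern)
print(counts)
```

This gives one count per document, but it means transferring every transcript to R first, which is what I was hoping to avoid by doing the count in the query itself.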