I have a large data set (23 million records, ~9 GB) that I am working with in R, and I am trying to figure out the best way to draw a sample from it.

The plan I have right now is:
1) Break the dataset into smaller pieces of roughly 4 million records (about 1.5 GB) each
2) Draw a random sample from each piece
3) Combine the samples from the 6 pieces into one dataset for the analysis
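
For concreteness, here is a minimal sketch of the three steps above in base R. It assumes the data sit in a comma-separated file with a header row and no quoted fields containing commas or line breaks. The function name `sample_in_chunks`, the small demo file, and the chunk size and sampling fraction are placeholders made up for illustration, not something tested on the real data.

```r
# Hypothetical helper (not from any package): read a CSV piece by piece,
# draw a simple random sample from each piece, and stack the samples.
sample_in_chunks <- function(path, chunk_rows, frac) {
  con <- file(path, open = "r")
  on.exit(close(con))

  # Read the header line once so every piece gets the same column names.
  header <- strsplit(readLines(con, n = 1), ",", fixed = TRUE)[[1]]

  pieces <- list()
  repeat {
    # Step 1: pull the next piece of at most chunk_rows lines into memory.
    lines <- readLines(con, n = chunk_rows)
    if (length(lines) == 0) break
    chunk <- read.csv(text = lines, header = FALSE, col.names = header)

    # Step 2: sample the same fraction of rows from each piece, without replacement.
    keep <- sample(nrow(chunk), size = round(frac * nrow(chunk)))
    pieces[[length(pieces) + 1]] <- chunk[keep, , drop = FALSE]
  }

  # Step 3: combine the per-piece samples into one data frame.
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

# Small stand-in for the real file: 1000 rows, read 200 at a time.
set.seed(42)
demo_path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:1000, x = rnorm(1000)), demo_path,
          row.names = FALSE, quote = FALSE)

smp <- sample_in_chunks(demo_path, chunk_rows = 200, frac = 0.1)
nrow(smp)  # 5 pieces x 20 sampled rows each = 100
```

Taking the same fraction from every piece keeps each piece's share of the combined sample proportional to its size. For the real file, `chunk_rows` would be about 4e6, and only one piece plus the accumulated samples would be in memory at a time.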

Does this method seem sound? Also, I have only a rudimentary understanding of R, so if anyone has advice on how to actually do this, please let me know.