SQLite select from based on Pivot of another table

Lazar

Phineas Packard
#1
Ok I have a database that includes two tables. A datatable that looks like:
Code:
     a           b           c           d            e           f           g          h           i
1  -2.1772586  0.54904839 -0.09064777 -0.30904735  0.902645747  0.25056056 -0.07167768  0.5477423 -0.06540162
2   0.8191437 -0.29184975  0.55046901 -0.14786375 -1.365494264 -0.04492767  0.53076043  0.6104837 -0.42441851
3   0.8557512 -0.08171739  0.14697551 -0.86271463  1.075134669 -1.95978783 -0.63262255  0.8531504 -0.58840064
4   0.3146546 -1.83335248 -0.94553173  0.29263775 -1.291303249  0.85612084 -1.38643377 -0.5012100  1.88951237
5  -0.4509690 -0.95023522  0.98457069  0.01282259 -0.002325744  0.48373935 -1.16434104  0.8342430 -0.40744770
6  -1.1979267 -0.01669569 -1.18147487  0.71695038 -0.320249844 -0.38894801 -0.08486285 -0.2450969  1.18439034
7  -0.4765806  0.15399498  0.22767422  0.29262747 -0.732538913 -2.43904768  1.20040915 -1.9010115  0.21889756
8  -1.7473071  0.98079991 -0.23776151  1.26167730 -0.643288192  1.36559614 -1.24053485 -1.0701088  0.30020588
9  -1.3483016 -0.91244697 -0.83976553  1.25269789 -0.709669386  2.08028882  0.38032189  1.9496645  0.38533128
10 -1.0991387 -0.61460572  0.75277683  0.91604754  0.847483549  0.12921561  0.56796830  0.3605553  0.69194487
and a metadata table that looks like:
Code:
  variable      descritpion
1        a some description
2        b some description
3        c some description
I want to select columns from the datatable based on the variable column in the metadata table. Such that I end up with:
Code:
    a           b           c
1  -2.1772586  0.54904839 -0.09064777
2   0.8191437 -0.29184975  0.55046901
3   0.8557512 -0.08171739  0.14697551
4   0.3146546 -1.83335248 -0.94553173
5  -0.4509690 -0.95023522  0.98457069
6  -1.1979267 -0.01669569 -1.18147487
7  -0.4765806  0.15399498  0.22767422
8  -1.7473071  0.98079991 -0.23776151
9  -1.3483016 -0.91244697 -0.83976553
10 -1.0991387 -0.61460572  0.75277683
Thus I need to do something like:

SELECT [pivot variable.metadata] FROM datatable;

Anyone know how to do the pivot of the metadata.variable column in SQLite?
 

bryangoodrich

Probably A Mammal
#2
So if I understand you correctly, you want to select some variable that meets a certain description (say c, d, f from a, b, c, ..., z) from metadata, and then select those columns from the data table that have columns matching those selected from metadata?
 

Lazar

Phineas Packard
#3
Correct more or less. The metadata is actually a view that selected cases from a larger metadatafile. So I have the view which consists of a description of all the columns I want from the data table. I then need to use the metadata view to select from the data table. I should note that the view is about 2000 rows.
 

bryangoodrich

Probably A Mammal
#4
So you need a way of saying "select columnX from data_table where X is rowX in metadata_table" right?

As I've said before, usually these sorts of pivots require manual case statements. Since SQLite doesn't have a built-in pivoting operation and I've never seen an easy facility for adding functionality besides building your own C/C++ API to handle it (which now seems like a good project to do!), I'd approach this from "how can R take this information and write my sql statement for me?" That's what I end up doing when I have to automate an ugly big statement from other information. I could prototype something, but not right now.
 

Lazar

Phineas Packard
#5
Yeah I am using R with RSQLite to write huge long queries. I suppose that will have to do. Not really ideal but ok I suppose.
 

bryangoodrich

Probably A Mammal
#6
Well either you'd have to write some low-level C code to work with the SQLite API and procedurally do the process you want or you'll have to let SQL figure out the best approach, even if it takes awhile (it will!), it'll get done with good memory management. That's the thing you can always depend on with databases lol The problem is coming up with a logic in R that will write your SQL for you, but I suspect it won't be too bad to do. I usually think of this problem in terms of regular expressions. I'll often do something like

Code:
sql <- "SELECT * FROM some_table WHERE some_field NOT IN (EXCLUDE_LIST)"
sql <- gsub("EXCLUDE_LIST", EXCLUDE_LIST_R_VECTOR, sql)
So think in terms of that. You want to create a big ugly sql statement that will fill certain values based on values in an R data type (you can pull that from a separate query). So you write a template string and then use a series (or loop or whatever) of gsub statements that will convert your template string into the specific sql statement. Once that is done (test it on a smaller case!!!), you just have to submit your query and wait awhile ... a long while lol