Simulated datasets get a hard time in the real world, as it is difficult to build a simulation which accurately captures the range of values and “dirtiness” of real data. However, simulated sets cannot be beaten when testing out new methodologies. Right now, I am interested in simulating genotype-phenotype test sets for case-control studies. I spent the weekend reading about various simulation tools, and have decided to try GenomeSIMLA. I will try to keep the blog updated on how I go, but first of all I have to get the beast to compile.
The good guys at the Ritchie Lab (Penn State) have provided GenomeSIMLA as source code, and I have managed to get it to compile on Ubuntu 15.10. The configure script ran smoothly, but make ran into trouble:
A quick Google search led me to the following Stack Overflow comment:
Taking a closer look at the error message, I noticed:
template spinlock spinlock_pool< M > ::pool_[ 41 ]
Putting two and two together, I guessed that the definition of the constant M in the random.h header was clashing with the template parameter of the same name in the Boost library. So I did something a little naughty and edited the genomeSIMLA files to rename the constant M.
After running make, I got the following errors:
So I was one step closer, but not quite there yet. I jumped into the random.cpp file and renamed M there too, so that it was consistent with the declaration in random.h. Woohoo, success! make and make install completed without errors.
Now to test the simulation software… I will keep the blog up to date.