Category Archives: Uncategorized
Docker: plugging into the jupyter/datascience-notebook
Again, quick notes for myself, as this is subtly different depending on whether I am on a Windows host or a Linux host. Install: pulling down the datascience-notebook image is easy-peasy: $ docker pull jupyter/datascience-notebook Starting the notebook: Again, this
Docker – RStudio server
Quick note for myself on installing and starting RStudio via Docker:
install: $ docker pull rocker/rstudio
starting: $ docker run -d -p 8787:8787 rocker/rstudio
accessing (from browser): localhost:8787
username and password are both: rstudio
Using bcftools to merge and filter VCF files
GOALS: to merge genotype calls from separate VCF files (e.g. one VCF file per sample) into one master VCF file with a column for each sample, and to filter this master VCF file and extract regions of interest. EDIT: have edited

Quantum mechanics & purgatory
“Quantum mechanics theorises that there is unlimited number of universes. Which means, there is infinite amount of me making this talk and, infinite amount of you… listening. Think about that.” “Andrea”, Limitless, season 1, episode 18. #hilarious!


Manipulating VCF Files
In this post we are going to cover some very basic scenarios for manipulating VCF files. We will begin with a (very) brief description of the VCF format, then look at how to extract targeted regions using tabix. In a future post

A quick sound bite on predictive analytics
Was reading this post, and this little quote jumped out: “I now know the probability of an outcome [and] what can be done to influence it in the direction that’s positive for me,” whether that be preventing customer churn or

Signal extraction: Fourier transform
Stretching my brain back a decade, I used to work a lot with Fourier transforms (FT), which are foundational methods for analysing MRI / NMR data. FT is so powerful and versatile that it is widely used across all physical sciences

Adoption of R in Business Intelligence
Liked this post on R & BI. Arguably, the initial learning curve for R is much, much steeper than for slick software like Tableau. But I think this misses the point. The adoption of R is about embracing a different type
MySQL Restores: waiting… waiting… waiting…
There are gazillions of posts with basic instructions for using mysqldump. But unless you go looking for them, you will find very few that comment on the shockingly slow restore times. However, there are some good hints and tips: http://loopj.com/2009/07/06/fast-mysql-backup-restore/ http://vitobotta.com/smarter-faster-backups-restores-mysql-databases-with-mysqldump/ Interesting
Thoughts on Kaggle Home Depot Interview
No original content here, but an interesting snapshot of ideas from the team that took third place in Kaggle’s Home Depot competition. Some thoughts on this: The team clearly put in a lot of effort to understand their data. Perhaps