Category Archives: Science@Work

Logistic regression: moving from significant results to meaningful insight

Association studies are a favourite tool for geneticists to understand the genetics that determine our health. It is now routine to test every mutation among your patients for association with a given trait (for example, BMI or breast cancer).

SQL-like power with R’s data.table package

I had an interesting little problem today that involved extracting data from one table based on information from another table. In SQL-speak, it was a full cross join with a GROUP BY and a HAVING clause. It is a job that

Clustering Gene Expression Data: adding layers to a heatmap

So you have some data, perhaps a lot of data, but you’re not quite sure what to do with it… where do you start? Is it interesting data? Would it tell a good story, or just be a confusing mess? This

Trial & Error – creating my first R package

Creating my very first R package has been an interesting, frustrating and challenging experience. Overall, though, the high level of documentation (largely from Hadley Wickham and co.) helped me overcome the frustrations and the challenges without too much pain.

Data warehousing & breaking the rules of a star schema

I would really appreciate some feedback on this. If you have a few minutes to spare and don’t mind sharing your thoughts, then I would like to hear from you. The Schema: We are creating a data warehouse to store

SQL vs. BioMart for querying the human genome

A huge part of my job is to add context and build layers of information on top of the genetic mutation datasets that we have amongst our groups. If we want to understand the importance of genetic mutations on human

Databases for finding human protein-coding genes

After approx. 4 months with the Merriman group, I am beginning to get a handle on how they operate and the typical questions that they are interested in. At a very high level, a typical workflow might involve finding interesting