Statistics for Biologists - The Chi Squared Test

Hi guys, and welcome to another Mr. Pollak biology video. This time we're looking at statistics. This is for the A2-level ISAs or EMPAs. We're going to start off by looking at the chi-squared test today, so I hope you enjoy this. This has been quite a hotly requested video, so let's get stuck in.

So here are our objectives: we're going to understand when to use chi-squared, we're going to try and apply the chi-squared test, and we're going to interpret the results to see what on earth we're actually talking about.

So what is chi-squared? Well, chi-squared is a statistical test that we apply to categoric data. So if you've got data that's grouped in categories, and you're looking at the numbers of individuals or the number of outcomes within those categories, then you're going to want to use chi-squared. What it does is look for differences between what you observe and what you would expect, either according to chance or another pre-existing theory.

So when do we use chi-squared? Well, in your student data book or your student statistics book, if you're doing AQA (you get this in your exam, in your ISA), you have this great flow chart that tells you exactly when to use each of the three statistical tests that you have to be able to apply. So here's chi-squared, and basically: are you counting things? If you're counting stuff, then you want to be using chi-squared. It's quite easy, just follow that through. In the paper they will ask you why you've chosen it, and you can basically just write out what's in that box. Nice and straightforward.

So let's look at applying this. The stages that we should go through for each application are: we should always write the null hypothesis first, work out what values we should expect, apply the equation, identify what our degrees of freedom are, and then interpret our result.

Let's look at the null hypothesis first. What on earth is this thing called the null hypothesis? Basically, it's saying that there is no known association between
variables, and in this case we should expect to see no difference whatsoever between the results that we get and those that are expected by chance or pre-existing theory. (The slide has cut off "pre-existing theory", but it definitely should say that, sorry.) So this can get quite confusing, but if you remember to always state the idea that there's no difference in your null hypothesis, you will be absolutely fine.

So our second step is to work out what we expect to get. Now, if the chance of every outcome in our test is equal, then all you do is take the number of trials that you're going to do and divide by the number of possible outcomes. If there's an unequal chance, such as in genetics, you divide the number of trials in the same ratio as you'd expect for the outcomes.

So let's have a look at specific examples of this. First, equal chance, like rolling a die: if you roll it 60 times, there are six possible outcomes, so 60 rolls divided by 6 is 10, so 10 would be your expected value for each face. For unequal chance, like in genetics, let's look at eye color in the children of heterozygous parents. So if both parents are capital B, little b (heterozygous for eye color), then according to Mendelian genetics 75% of the offspring would have brown eyes and 25% would have blue eyes. So if you're looking at a hundred children, you'd expect 75 of them to be brown-eyed and 25 to be blue-eyed.

So the last thing we're going to look at before we get stuck into applying the equation is this idea of degrees of freedom. Now, with statistics at this sort of level, we don't really need to understand it in a great deal of detail, only that in chi-squared the number of degrees of freedom is the number of possible outcomes minus one. So basically, other than the result that you get, how many other possible outcomes could there have been? So if you're rolling a die, there are six possible outcomes, so five degrees of freedom: 6 minus 1 is 5. Tossing
a coin: two possible outcomes, heads or tails, so one degree of freedom: 2 minus 1 is 1.

So here's our equation for chi-squared. You get this again in your little booklet of statistical data from AQA. Chi-squared is equal to the sum of (O minus E) squared, all over E, that is χ² = Σ (O − E)² / E, where O is the observed frequencies from your data and E is the frequencies that we expect, either by chance or due to pre-existing theory. I always think the best way to apply this is to put your data in a table like this. So we've got columns for observed, expected, observed minus expected, observed minus expected squared, and then observed minus expected squared divided by E, and then sigma, the sum, in the corner. This way just helps you break down the equation into little bite-sized steps, and it's easier for the examiner to spot if you go wrong somewhere and maybe give you some method marks. It's easier for you to identify if you've gone wrong somewhere as well.

So here we go: how do we interpret our results? Well, if the value that you get for chi-squared is less than the critical value that you get from your data table, which we'll look at later, we accept our null hypothesis. So if chi-squared is less than the critical value, there's not much difference between our results and what we'd expect by chance or pre-existing theory, so our null hypothesis is absolutely fine. However, if our value of chi-squared is greater than the critical value, then we reject the null hypothesis. We can't say for certain what's going on, really; all we know is that our data is different to what we would expect by chance or pre-existing theory.

So again, how do we apply it? We write the null hypothesis, we work out the expected values, we apply the equation, we identify the degrees of freedom, and we interpret the result.

Let's do an example. Let's look at tossing a coin. So let's imagine a student tosses a coin 100 times. What's our null hypothesis going to be? Remember: no difference. So there will be no difference between the number of heads
and the number of tails. That's fine. We're going to set out a table like this: there's heads and there's tails. So let's imagine we get these results: 46 heads and 42 tails. We would expect 100 divided by 2 as our expected, so 50 and 50. We do observed minus expected and we square it (remember that when you square a negative number you get a positive number), then we divide by the expected, and we sum the lot in the corner. So our chi-squared value for this example is 1.6.

So then we go on to actually interpreting our results. For this we need this table from your student data sheet. The chi-squared value is 1.6. We have one degree of freedom, because a coin has two sides: 2 minus 1 is 1. We look at this row from the table and we compare our chi-squared value to that critical value. Our value for chi-squared is less than the critical value at p = 0.05. Remember, p = 0.05 means that we're operating at 95 percent confidence: a difference big enough to exceed the critical value would turn up by chance less than 5 percent of the time. So our chi-squared value is less than the critical value, so we accept the null hypothesis in this situation: there is no difference between the number of heads and tails. Well, there is no statistically significant difference.

Let's do another, more complex example, this time using genetics. In this example we're going to imagine that a scientist has crossbred two agouti mice. Agouti is like a stripy hair color, and that is a heterozygous trait. He observes the following offspring: 36 black mice, which are homozygous recessive; 93 agouti mice, which are heterozygous; and 48 brown mice, which are homozygous dominant. So we're asking: according to Mendelian genetics, are these results what we would expect? Let's have a look at our null hypothesis: there will be no difference between the numbers of different colored mice and those
expected by Mendelian genetics. So that's our "no": no difference between what we see and what we expect. Nice and straightforward. But how do we work out what we would expect? Well, we need to do a genetic cross, so there's our little Punnett square. These are the results we would expect if we cross two heterozygous mice together: we'd expect 25% of the offspring to be homozygous dominant, 50% to be heterozygous and 25% to be homozygous recessive. Now, to work out the expected numbers, we need to apply these percentages, or these ratios, to the actual numbers involved in our experiment. So if we sum the numbers of offspring that we see in the experiment, 36 plus 93 plus 48, that gives us 177. So 25% brown would be 44.25, 50% agouti would be 88.5, and 25% black would be 44.25 again. So those are the results that are going to go in the expected column of our table.

So let's have a look at that table. We've got brown, agouti and black; there are the results that we observed, and that's our expected. We smash through the calculations: observed minus expected, square it, divide by the expected, and sum it in the bottom right. So our chi-squared here is 2.085.

So now we need to analyze that, and we need our critical value table again. There it is. Our chi-squared value was 2.085. For our degrees of freedom, we have three possible outcomes, so that's two degrees of freedom (remember, n minus 1), so we need this row for our critical value. Again, our value of chi-squared here is less than the critical value, so once again we accept the null hypothesis.

So let's summarize this. A chi-squared test looks at whether your data is significantly different to expected values, and it applies to categoric data. If the chi-squared value is less than the critical value, we accept the null hypothesis, and if our chi-squared value is greater than the critical value, we reject the null hypothesis. Thanks
very much for watching, guys. I hope that's been helpful. We'll come back and look at the other two statistical tests, that's Spearman's rank and standard error with 95% confidence limits, later on. But for the time being, like, comment and subscribe. Thank you very much.
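
The steps walked through in the video (work out the expected values, apply the equation, find the degrees of freedom, compare with the critical value) can be checked with a short Python sketch. The function names are illustrative and not from the video, and the lookup holds the standard chi-squared critical values at p = 0.05, which the video reads from the student data sheet rather than quoting. The sketch reruns the die and agouti mouse figures given above.

```python
# Minimal sketch; helper names are illustrative, not taken from the video.

# Standard chi-squared critical values at p = 0.05, keyed by degrees of freedom.
# (Assumed lookup: the video reads these from a data sheet without quoting them.)
CRITICAL_VALUES_P05 = {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49, 5: 11.07}


def expected_counts(total, ratio):
    """Split `total` trials across the outcomes in the given ratio."""
    ratio_sum = sum(ratio)
    return [total * part / ratio_sum for part in ratio]


def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def accept_null(chi2, n_outcomes):
    """True if chi2 is below the p = 0.05 critical value.

    Degrees of freedom are the number of possible outcomes minus one.
    """
    degrees_of_freedom = n_outcomes - 1
    return chi2 < CRITICAL_VALUES_P05[degrees_of_freedom]


# Equal chance: 60 rolls of a six-sided die, so 10 expected per face.
die_expected = expected_counts(60, [1, 1, 1, 1, 1, 1])

# Unequal chance: agouti x agouti cross, expected 1 : 2 : 1
# (brown : agouti : black), applied to the 177 offspring observed.
mice_observed = [48, 93, 36]
mice_expected = expected_counts(sum(mice_observed), [1, 2, 1])
mice_chi2 = chi_squared(mice_observed, mice_expected)

print(die_expected)                                 # [10.0, 10.0, 10.0, 10.0, 10.0, 10.0]
print(mice_expected)                                # [44.25, 88.5, 44.25]
print(round(mice_chi2, 3))                          # 2.085
print(accept_null(mice_chi2, len(mice_observed)))   # True
```

With two degrees of freedom the critical value at p = 0.05 is 5.99, so a chi-squared of 2.085 falls below it and the null hypothesis is accepted, matching the conclusion in the video.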

19 thoughts on “Statistics for Biologists – The Chi Squared Test”

  1. In Spearman's rank, if I were to accept the null hypothesis because the value of r is less than the critical value, would I say that this means there is a greater than 5% probability that the results occurred due to chance? Thanks sir!

  2. Hi there, could you explain to me what you mean by chance?
    Chi squared:
    - If you reject the null hypothesis, what does that mean in terms of chance/probability (i.e. p<0.05 or p>0.05), and what does p mean in the context of the question?
    How would you know if p is above or below 0.05?
    - If you accept the null hypothesis, what does that mean in terms of chance/probability (i.e. p<0.05 or p>0.05), and what does p mean in the context of the question?
    How would you know if p is above or below 0.05?

    Spearman rank correlation coefficient:
    - If you reject the null hypothesis, what does that mean in terms of chance/probability (i.e. p<0.05 or p>0.05), and what does p mean in the context of the question?
    How would you know if p is above or below 0.05?
    - If you accept the null hypothesis, what does that mean in terms of chance/probability (i.e. p<0.05 or p>0.05), and what does p mean in the context of the question?
    How would you know if p is above or below 0.05?

    Standard error and 95% confidence limits:
    - If you reject the null hypothesis, what does that mean in terms of chance/probability (i.e. p<0.05 or p>0.05), and what does p mean in the context of the question?
    How would you know if p is above or below 0.05?
    - If you accept the null hypothesis, what does that mean in terms of chance/probability (i.e. p<0.05 or p>0.05), and what does p mean in the context of the question?
    How would you know if p is above or below 0.05?
