Motivating logistic analyses

Why linear analyses may be less desirable

Let's assume that, for all participants in a population, the probability of a correct response on each trial (theta) is 0.8...
theta = 0.8;
Participant 1 completed 2 trials of the task...
Ppant01.n = 2;
Given this, and the probability of a correct response on each trial, we can use the binomial formula to compute the probability of observing k=0, k=1, or k=2 correct responses to the n=2 trials...
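For reference, the binomial formula used here is:

```latex
\Pr(k \mid n, \theta) = \binom{n}{k}\,\theta^{k}\,(1-\theta)^{n-k},
\qquad \text{e.g. } \Pr(2 \mid 2, 0.8) = \binom{2}{2}(0.8)^{2}(0.2)^{0} = 0.64.
```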
Ppant01.k = (0:Ppant01.n)';
Ppant01.Pdf = binopdf(Ppant01.k, Ppant01.n, theta);
figure;
bar(Ppant01.k, Ppant01.Pdf);
ylim([0,1]);
xlabel('k: the number of correct responses');
ylabel('Pr(k): probability of observing k correct responses');
title('Participant 1');
Now let's assume there are two more participants who respond to n=3 and n=5 trials...
% Participant 2
Ppant02.n = 3;
Ppant02.k = (0:Ppant02.n)';
Ppant02.Pdf = binopdf(Ppant02.k, Ppant02.n, theta);
figure;
bar(Ppant02.k, Ppant02.Pdf);
ylim([0,1]);
xlabel('k: the number of correct responses');
ylabel('Pr(k): probability of observing k correct responses');
title('Participant 2');
% Participant 3
Ppant03.n = 5;
Ppant03.k = (0:Ppant03.n)';
Ppant03.Pdf = binopdf(Ppant03.k, Ppant03.n, theta);
figure;
bar(Ppant03.k, Ppant03.Pdf);
ylim([0,1]);
xlabel('k: the number of correct responses');
ylabel('Pr(k): probability of observing k correct responses');
title('Participant 3');
As we can see from the graphs, there is a probability of 0.640 that participant 1 gets 2/2 correct responses, a probability of 0.512 that participant 2 gets 3/3 correct responses, and a probability of 0.410 that participant 3 gets 4/5 correct responses. Indeed, for each participant, this is the most likely outcome.
So, let's use a linear approximation to estimate the value of theta (the probability of a correct response on each trial across the population). To do this, we just 1) estimate the probability of a correct response for each individual participant by dividing the number of correct trials (k) by the total number of trials (n), and 2) average these individual estimates across participants...
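Written out explicitly, this linear approximation is:

```latex
\hat{\theta}_i = \frac{k_i}{n_i},
\qquad
\bar{\theta} = \frac{1}{N}\sum_{i=1}^{N}\hat{\theta}_i
= \frac{1}{3}\left(\frac{2}{2} + \frac{3}{3} + \frac{4}{5}\right) \approx 0.9333.
```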
k = [2;3;4];
n = [2;3;5];
p = k./n
p = 3×1
    1.0000
    1.0000
    0.8000
pMean = mean(p)
pMean = 0.9333
We can even compute linear t-based confidence intervals for this estimate as follows...
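Explicitly, this is the standard t-based interval, using the mean, standard deviation, and critical t-value for N = 3 participants:

```latex
\bar{\theta} \pm t_{0.975,\,N-1}\,\frac{s}{\sqrt{N}}
= 0.9333 \pm 4.303 \times \frac{0.1155}{\sqrt{3}}
\approx [0.6465,\ 1.2202].
```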
pStdErr = std(p)./sqrt(numel(p));
pCI = pMean + pStdErr .* tinv([0.025;0.975],numel(p)-1)
pCI = 2×1
    0.6465
    1.2202
Whoops! Our confidence intervals include impossible values of theta, since a probability cannot exceed 1.
What is more, while it was most likely that participant 3 would score 4/5 correct responses, it was still quite likely that they would have scored 5/5. If that had happened, our estimate of theta would have been 1, and the confidence intervals would have had zero width.

What logistic analyses do differently

Rather than computing a single point estimate for theta for each participant and then averaging these point estimates together (as above), logistic analyses consider a range of plausible values of theta for each participant given their values of n and k.
This involves computing a likelihood function that, for each participant, represents how plausible each value of theta is, given the observed data. As above, we do this for the case where participant 1 gets 2/2 correct responses, participant 2 gets 3/3 correct responses, and participant 3 gets 4/5 correct responses.
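The (n+1) scaling in the code below may look mysterious. It comes from a standard Beta-function identity: the binomial likelihood integrates to 1/(n+1) over theta, so multiplying by (n+1) normalises each likelihood function to unit area.

```latex
L_i(\theta) = (n_i+1)\binom{n_i}{k_i}\theta^{k_i}(1-\theta)^{n_i-k_i},
\qquad
\int_{0}^{1}\binom{n}{k}\theta^{k}(1-\theta)^{n-k}\,d\theta = \frac{1}{n+1}.
```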
% the (n+1) factor normalises the likelihood to integrate to 1 over [0,1]
Ppant01.Likelihood = @(theta) binopdf(k(1),n(1),theta) .* (n(1)+1);
figure;
fplot(Ppant01.Likelihood,[0,1]);
xlabel('theta: probability of a correct response');
ylabel('L(theta): likelihood of theta given the data');
title('Participant 1');
Ppant02.Likelihood = @(theta) binopdf(k(2),n(2),theta) .* (n(2)+1);
figure;
fplot(Ppant02.Likelihood,[0,1]);
xlabel('theta: probability of a correct response');
ylabel('L(theta): likelihood of theta given the data');
title('Participant 2');
Ppant03.Likelihood = @(theta) binopdf(k(3),n(3),theta) .* (n(3)+1);
figure;
fplot(Ppant03.Likelihood,[0,1]);
xlabel('theta: probability of a correct response');
ylabel('L(theta): likelihood of theta given the data');
title('Participant 3');
As we can see, the peak of these likelihood functions agrees with the point estimates for theta that we computed using the linear approximations above. However, the likelihood functions contain much more information in that they express a range of plausible values of theta. Furthermore, the width of this range represents how confident we should be in the values of theta for each participant - notice that this range is narrower for participant 3 because they responded to more trials and so their data are more informative. In contrast, the linear approximation weights data from all participants equally, regardless of the number of trials.
[It is also worth noting that mixed-effects models can make further use of these individual likelihood estimates by allowing a model to estimate specific values of theta for each participant that vary systematically around a population average… but that is a story for a different time].
So, how do we combine the likelihood functions from each participant to obtain a group-level likelihood estimate for the value of theta? Easy... we just multiply the likelihood functions together and renormalise so that the result still integrates to 1. When we do this, we get the following.
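Multiplying the three likelihoods, the binomial coefficients and normalising constants only rescale the curve, so up to normalisation the group likelihood is:

```latex
L(\theta) \propto \theta^{2}\cdot\theta^{3}\cdot\theta^{4}(1-\theta)
= \theta^{9}(1-\theta),
\qquad
\frac{d}{d\theta}\ln L(\theta) = \frac{9}{\theta} - \frac{1}{1-\theta} = 0
\;\Longrightarrow\; \theta = \frac{9}{10} = 0.9.
```

This closed-form maximum matches the estimate the optimiser returns below.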
% normalising constant: the area under the product of the individual likelihoods
Group.c = integral(@(t) Ppant01.Likelihood(t).*Ppant02.Likelihood(t).*Ppant03.Likelihood(t),0,1);
Group.Likelihood = ...
@(theta) Ppant01.Likelihood(theta) .* Ppant02.Likelihood(theta) .* Ppant03.Likelihood(theta) ./ Group.c;
figure;
fplot(Group.Likelihood,[0,1]);
xlabel('theta: probability of a correct response');
ylabel('L(theta): likelihood of theta given the data');
title('Group');
Now, we can use some optimisers to obtain the maximum likelihood estimate for theta, as well as the 95% confidence intervals. These confidence intervals are taken to be the values of theta that bound the central 95% of the area under the normalised likelihood curve.
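Formally, the interval bounds solve:

```latex
\int_{0}^{\theta_{\mathrm{lo}}} L(\theta)\,d\theta = 0.025,
\qquad
\int_{0}^{\theta_{\mathrm{hi}}} L(\theta)\,d\theta = 0.975.
```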
EstimOpts = optimoptions('fmincon','Display','off');
% the linear constraints [-1;1]*theta <= [0;1] keep theta within [0,1]
Group.MaxLikelihood = fmincon(@(theta)-Group.Likelihood(theta),...
0.5,[-1;1],[0;1],[],[],[],[],[],EstimOpts);
fprintf('Maximum likelihood estimate for theta = %f;%c',Group.MaxLikelihood,10)
Maximum likelihood estimate for theta = 0.900000;
% lower bound: the value of theta below which 2.5% of the area lies
Group.CI95_Lo = fmincon(@(lim)(integral(@(t)Group.Likelihood(t),0,lim)-0.025).^2,...
0.5,[-1;1],[0;1],[],[],[],[],[],EstimOpts);
% upper bound: the value of theta below which 97.5% of the area lies
Group.CI95_Hi = fmincon(@(lim)(integral(@(t)Group.Likelihood(t),0,lim)-0.975).^2,...
0.5,[-1;1],[0;1],[],[],[],[],[],EstimOpts);
fprintf('95%% confidence intervals for theta = [%f, %f];%c',Group.CI95_Lo,Group.CI95_Hi,10)
95% confidence intervals for theta = [0.587220, 0.977167];
Notice how our confidence intervals are now properly bounded by 0 and 1. Also notice that the maximum likelihood estimate (0.9) is closer to the true value of theta (0.8) when compared to the linear estimate calculated previously (0.933). Specifically, the linear estimate produced a 16.7% error whereas the logistic analysis returned an error of 12.5%. However, it turns out that we can do even better than this!...
When running a Bayesian logistic analysis rather than a frequentist logistic analysis, it is most common to use an estimate called the 'conditional expectation'. The conditional expectation is the mean value of theta weighted by the likelihood function (i.e., the expected value of theta under the normalised likelihood).
In this case, the conditional expectation produces a much lower estimation error than the maximum likelihood estimate, being only 4.17% off the true value. Not bad considering it is based on just 10 trials across 3 participants.
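Because the normalised group likelihood is proportional to theta^9 (1-theta), i.e., a Beta(10, 2) density, the conditional expectation can also be obtained in closed form:

```latex
\mathbb{E}[\theta] = \int_{0}^{1}\theta\,L(\theta)\,d\theta
= \frac{10}{10+2} = \frac{5}{6} \approx 0.8333,
\qquad
\text{error} = \frac{|0.8333 - 0.8|}{0.8} \approx 4.17\%.
```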
Group.ConditionalExpectation = integral(@(theta) Group.Likelihood(theta).*theta, 0,1);
fprintf('Conditional expectation for theta = %f;%c',Group.ConditionalExpectation,10)
Conditional expectation for theta = 0.833333;