# Logistic regression

3 minute read

Published:

In this short tutorial we will implement two different methods for finding the maximum likelihood estimate (MLE) of a logistic regression model (McCullagh and Nelder, 1989), using Python. The MLE will be obtained using the algorithms:

• Iteratively Reweighted Least Squares (IRLS)
• Expectation Maximization (EM) via Polya-Gamma data augmentation

We start by loading some useful Python libraries.

import numpy as np
from scipy.sparse import dia_matrix # For handling diagonal matrices
from numpy.linalg import inv        # For computing the inverse of a matrix


## Logistic regression

We start by describing the logistic regression model. Let $\textbf{y} = (y_1,\dots,y_n)$ be a vector of binary indicators depending on a set of covariates $\textbf{x}_i \in \mathbb{R}^p$, for $i=1,\dots,n$ so that

independently for $i=1,\dots,n$, with $\beta \in \mathbb{R}^p$. The inverse of logit function in the above equation is equal to

In Python, we can define the inverse logit function as follow

def invlogit(eta):
return(1.0/(1.0+np.exp(-eta)))


### Iteratively Reweighted Least Square (IRLS)

Let $\textbf{X} = (\textbf{x}_1,\dots,\textbf{x}_n)^T$ be the design matrix and let $\pi = (\pi_1,\dots,\pi_n)$ be the vector containing the probabilities of having a positive outcome. At the $t$-th step the IRLS algorithm iteratively updates the coefficient $\beta$ as follow

where $\textbf{W}^{(t)} = \text{diag}(\pi_1^{(t)}(1-\pi_1^{(t)}),\dots,\pi_n^{(t)}(1-\pi_n^{(t)}))$ and where $\text{logit}(\pi_i^{(t)}) = \textbf{x}_i \beta^{(t)}$. We proceed in this way until convergence.

In the following chunk of code, we provide a simple implementation in Python of these updating equations. We remark that this code might be highly inefficient and it serves only for didactical pourposes.

def IRLS_logit(y, X, max_iter = 500):

# Initialization
n,p = X.shape
W    = dia_matrix((n,n))        # W is initialized
beta = inv(X.T * X) * X.T * y   # OLS estimate as initialization

# IRLS
for iter in range(max_iter):
pi = invlogit(X * beta)                # Evaluate the probabilities
W.setdiag(np.multiply(pi,(1 - pi)).A1) # Set the diagonal
# Updating beta
beta_star = beta + inv(X.T * W * X) * X.T * (y - pi)
# Check for convergence
error = max(abs((beta_star - beta)/beta_star))
if error < 1e-10:
print("Convergence reached after",iter+1,"iterations")
return({'Estimate':beta_star,'Iter':iter})
# If the convergence criterium is not satisfied, continue
beta = beta_star
print("Maximum iteration reached without convergence")


### Expectation Maximization

Following Scott and Sun (2013) we can obtain a different iterative procedure based on a Polya-Gamma data augmentation. At the $t$-th step the EM algorithm iteratively updates the coefficient $\beta$ as follow

where $\Omega^{(t)} = \text{diag}\left(\frac{\tanh(\eta_1^{(t)}/2)}{2\eta_1^{(t)}},\dots,\frac{\tanh(\eta_n^{(t)}/2)}{2\eta_n^{(t)}}\right)$. We proceed in this way until convergence. The Python code is

def EM_logit(y, X, max_iter = 500):

# Initialization
n,p   = X.shape
Omega = dia_matrix((n,n))        # Omega is initialized
beta  = inv(X.T * X) * X.T * y   # OLS estimate as initialization

# EM
for iter in range(max_iter):
eta = (X * beta).A1                   # Evaluate the probabilities
Omega.setdiag(np.tanh(eta/2)/(2*eta)) # Set the diagonal
# Updating beta
beta_star = inv(X.T * Omega * X) * X.T * (y - 0.5)
# Check for convergence
error = max(abs((beta_star - beta)/beta_star))
if error < 1e-10:
print("Convergence reached after",iter+1,"iterations")
return({'Estimate':beta_star,'Iter':iter})
# If the convergence criterium is not satisfied, continue
beta = beta_star
print("Maximum iteration reached without convergence")


## Simulation study

We simulate some fictitious data to check whether our algorithms are working properly. We assume that

independently for $i=1,\dots,n$, with $n=50,000$. The entries of $x_i$, are sampled uniformly over $(-2,2)$. In Python, we get

n        = 50000
X        = np.matrix([np.repeat(1,n),np.random.uniform(-2,2,n)]).T
beta_sim = np.matrix([1,1]).T
pi_sim   = invlogit(X * beta_sim)
y        = np.random.binomial(1,pi_sim)


Then, we estimate the MLE via IRLS

out_IRLS = IRLS_logit(y,X)
out_IRLS['Estimate']


and via the EM algorithm

out_EM = EM_logit(y,X)
out_EM['Estimate']


In both cases we reached convergence in a reasonable time.

Tags: