Glaucoma is a leading cause of blindness and has no cure
60M affected worldwide
Glaucoma is the leading cause of irreversible blindness worldwide
$2.5B in annual US healthcare spend
Glaucoma testing and treatment drives 10M physician visits in the US each year
1 in 8 go blind even with treatment
Determining the speed of vision deterioration early is critical to preventing blindness
Today glaucoma detection is uncertain and friction-filled
Is the patient's vision getting worse ("progressing")?
Visual field tests that detect a patient's sensitivity to light at 54 distinct points are a critical factor in determining the right treatment strategy for glaucoma.
Three key factors complicate interpreting results to determine progression:
Test Variation: Visual field tests depend on patient responses to flashes of light. Patient fatigue, lack of focus, or learning effects can cause variation in the results independent of any change in vision
Lack of a "gold standard" metric: Researchers have developed a range of metrics for assessing patient vision, but research has shown that these measures often disagree, leaving clinicians to make a judgment call between them
Too manual: Today clinicians line up paper printouts of visual field results to compare them over time and make assessments of patient progression
Foresight aims to bring simplicity and certainty to glaucoma detection
More accurate detection
Leverage machine learning on 13K unique eyes across 5 leading US eye institutes
Streamlined data consumption
Eliminate the clutter to help clinicians determine the right treatment strategy
Put patient data in context
Give clinicians confidence in recommendations by placing data in relation to patient history and peers
Data
Understanding how data on patient vision is structured
Visual field of right eye (total deviation values shown)
note: black rectangle represents blind spot
A visual field is the core data structure for our product
Numbers in the eye-shaped matrix represent a patient's sensitivity to light at 54 distinct points. Higher numbers represent stronger vision. There are three standard metrics for visual fields, which are included in our dataset:
Raw sensitivity: values of each tested point are listed in decibels in the sensitivity plot. Higher numbers mean the patient was able to see a more attenuated light, and thus has more sensitive vision at that location
Total deviation: values are deviations in sensitivity from the expected values for a specific age. Positive values represent areas of the field where the patient can see dimmer stimuli than the average individual of that age. Negative values represent decreased sensitivity from normal.
Pattern deviation: total deviation values corrected for generalized decreases in visual sensitivity. It is useful in cases where there is both localized depression due to glaucoma, as well as globally depressed vision across the eye due to other pathologies such as cataracts.
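For intuition, here is a minimal sketch of how pattern deviation can be derived from total deviation. The HFA uses a specific "general height" estimate; the 85th-percentile choice below is an approximation, and the function is ours:

```python
import numpy as np

def pattern_deviation(total_dev: np.ndarray) -> np.ndarray:
    """total_dev: array of total deviation values (dB), blind spot excluded."""
    # Estimate the generalized (diffuse) depression, e.g. from cataract,
    # as a high percentile of the total deviations (an approximation of
    # the HFA's "general height")...
    general_height = np.percentile(total_dev, 85)
    # ...and remove it, leaving only the localized loss typical of glaucoma.
    return total_dev - general_height
```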
Our dataset consists of 831K visual fields from 177K unique patients spanning 5 US-based eye institutes.
We used eyes from the Glaucoma Research Network visual field collection (831,240 fields from five sites, without clinical data)
The dataset was filtered for patients with at least five reliable SITA-Standard 24-2 fields, resulting in 90,713 fields from 13,156 eyes included in the study.
Research & Analysis
Leveraging insights from clinicians, glaucoma research and machine learning techniques to develop Foresight
#1: User research
#2: Data filtering
#3: Data exploration
#4: Data normalization
#5: Label generation
#6: Model development
Results
Benchmarking our model against leading research standards
200 expert-labeled patients used to benchmark our models
200 patients chosen randomly across three proxy-label categories: 50 unanimously "stable", 50 unanimously "progressing", and 100 where the algorithms disagreed
A human ophthalmologist (glaucoma expert) examined the visual fields of these patients and generated the "ground truth".
Ground Truth: 134 of the 200 patients were labeled "progressing" or "stable"; the remaining 66 were boundary cases
Foresight classifiers have good F1 scores with lower class-bias
Encouraging F1 scores: Our top 4 classifiers had F1 scores in line with leading research algorithms (i.e., 0.90 or above)
Lower Class-Bias: Our classifiers did not overpredict "progressing" on patients identified as "too close to call" by the glaucoma specialist, while VFI and PLR did.
We focused on 2 key metrics to evaluate models
F1 score: the harmonic mean of precision and recall
Class-Bias: the tendency of a classifier to overpredict one class; this refers to one specific form of "bias" as used in the machine learning literature.
Second Approach
Using *just* the 134 labeled eyes
Can we use just the 134 labeled eyes to build a classifier?
Yes! We explored a Support Vector Classifier (SVC), Random Forest, and K Nearest Neighbors. We used 5-fold cross-validation repeated over 1,000 iterations to obtain an unbiased estimate of model performance.
Our best model is an SVC that uses just the 52 pattern deviation values and age to achieve a mean F1 score of 0.94
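A minimal sketch of this setup in scikit-learn, assuming a feature matrix of the 52 pattern deviation values plus age (the kernel and other hyperparameters shown are our assumptions):

```python
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

def evaluate_svc(X, y, n_repeats=1000):
    """X: (n_eyes, 53) array of 52 pattern deviations + age; y: 0/1 labels."""
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    # 5-fold CV repeated many times gives a low-variance performance estimate.
    cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=n_repeats, random_state=0)
    scores = cross_val_score(model, X, y, cv=cv, scoring="f1")
    return scores.mean(), scores.std()
```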
How can we avoid overfitting?
Given the small size of the labeled dataset, overfitting was a concern. To address this problem, we generated synthetic data.
The figure on the left shows a patient with six visual fields. We anchor the first two fields and drop fields three and four to create a synthetic data point, assuming that the final classification of stable/progressing still stands.
By subsampling the visual fields of a patient in this way, we were able to generate more than 2,300 points from only the 134 eyes, and we achieved an even better average F1 score of 0.95.
We expect this model to generalize better to unseen data.
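A minimal sketch of the subsampling scheme (the function name and the minimum series length are illustrative):

```python
from itertools import combinations

def synthetic_series(fields, min_len=4):
    """fields: one eye's visual fields in chronological order.

    Keeps the first two fields as anchors and enumerates subsets of the
    remaining fields; each long-enough series inherits the eye's label.
    """
    anchors, rest = fields[:2], fields[2:]
    samples = []
    for k in range(1, len(rest) + 1):
        for subset in combinations(rest, k):
            series = anchors + list(subset)
            if len(series) >= min_len:
                samples.append(series)
    return samples
```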
Abstracts for some of the work we shared today have been submitted to the American Academy of Ophthalmology. We will continue working with our advisors.
We hope to take our models to the next level by labeling a larger portion of our dataset and exploring the two approaches further.
We hope to apply the synthetic data approach to early detection of glaucoma progression.
Our Team
Combining forces to push forward thinking on data-driven glaucoma detection
Loris D'Acunto
Surabhi Gupta
Vikram Hegde
Amin Venjara
Our Advisors
Helping us chart the course and guiding us through the intricacies of applying machine learning to glaucoma
Dr. Osamah Saeedi, MD
Assistant Professor of Ophthalmology
University of Maryland
Dr. Tobias Elze
Instructor in Ophthalmology
Harvard Medical School
Joyce Shen
Lecturer
UC Berkeley School of Information
Alberto Todeschini
Lecturer
UC Berkeley School of Information
#1: User Research
In order to understand how to improve glaucoma detection via machine learning, we interviewed five ophthalmologists and optometrists -- the target users of our product.
Our objectives for the interviews were to understand:
Current state detection processes: Each physician has her or his own personal "algorithm" for how to determine progression. We wanted to learn from these experience-based processes to inform our model development, but also identify potential points for improvement.
Pain points: Key frustrations ophthalmologists and optometrists face today when trying to determine progression in glaucoma-suspect patients.
Concept feedback: Gathering unprompted and prompted reactions to paper prototypes of product concepts
Key learnings from the interviews included:
Test variation is a key challenge: Visual field tests have a high degree of variance due to patient focus, technician skills and also a learning effect. All of these complicate the ability to get a clear picture of progression
Primarily manual, paper-based process: Determining progression today is a very manual process. Clinicians typically line up printouts from visual field machines and manually look for patterns. Electronic Medical Record (EMR) systems are starting to make inroads but primarily digitize existing paper processes rather than delivering real leaps forward in analytics
Clinicians want the raw data: Due to test variability and the lack of a gold-standard metric, clinicians have grown accustomed to looking at the raw data to sanity-check the results of existing algorithms. Making raw data easily available is important for securing a clinician's trust
Advanced algorithms have limited adoption today: While clinicians are aware of progression algorithms from recent large research studies (e.g., AGIS, CIGTS), these are not well understood and are rarely used clinically, as they require extensive manual calculations. Instead, clinicians tend to stick with the methods they learned during their training
Greyscale images of the eye are important for patient education: Clinicians regularly use greyscale images to help patients understand their diagnosis and progression. Patient education is important to ensure treatment adherence and attendance at follow-up visits
The next steps for user research would include testing our web-based prototype with clinicians and expanding our research to include patients, to test how well the product enables patient education.
#2: Data Filtering
The initial dataset had 177,172 patients with 831,240 visual fields (VF) from 5 different institutions. We focused our analysis on the most commonly used method: SITA (see above) on the 24-2 field pattern (which measures 24 degrees temporally and 30 degrees nasally and tests 54 points spaced 6 degrees apart). The stimulus was size III (a code for a particular standard size), white on a white background. These criteria were selected for us by our ophthalmologist advisor as current best clinical practice.
From the raw data, we removed tests with a high false positive rate (greater than 20%), and we kept only eyes with at least 5 studies. The criterion for grouping studies was [patient-id, eye (OD or OS)]; OD is the medical term for the right eye and OS for the left. Thus each eye's (left or right) studies were considered separately. After this filtering process, we ended up with a final dataset of 90,713 visual fields from 13,156 eyes.
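A minimal sketch of these filtering steps, assuming the raw data lives in a pandas DataFrame with (hypothetical) columns patient_id, eye, and false_pos_rate:

```python
import pandas as pd

def filter_fields(df: pd.DataFrame) -> pd.DataFrame:
    # Drop unreliable tests: false positive rate greater than 20%.
    reliable = df[df["false_pos_rate"] <= 0.20]
    # Group by [patient_id, eye] and keep only eyes with at least 5 studies.
    n_studies = reliable.groupby(["patient_id", "eye"])["patient_id"].transform("size")
    return reliable[n_studies >= 5]
```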
#3: Data Exploration
Glaucoma is known to be more prevalent in older patients, and our data confirms this. As the histogram below shows, the prevalence of glaucoma increases with age, with the average patient age being 67 years. The fall-off in prevalence beyond the peak possibly reflects two trends: patients reaching the end of their life span before glaucoma occurs, and saturation of glaucoma incidence in the glaucoma-susceptible population.
The typical HFA test can be tedious for most subjects, and this may result in errors (false positives or negatives), a situation that can be exacerbated in older patients. The average duration of the test is more than 6 minutes, but a higher rate of errors may prolong it.
As expected, there is a correlation between the duration of the test and the age of the patients.
The results provided by the HFA include false positive and false negative error rates as well as both global indices and pointwise measures. Some of them are:
Mean Deviation (MD): a global index of the deviation from age-matched controls. Patients without glaucoma have MD values close to zero, while patients with an MD more negative than -10 dB are classed as having advanced glaucoma. As the corresponding figure shows, the majority of patients had a negative MD, indicating glaucoma-related vision loss.
Raw Sensitivity (pointwise metric): values of each tested point in decibels.
Total Deviation (pointwise metric): the deviation of each point in the visual field compared to control subjects of that specific age. Patients without glaucoma have values close to zero.
Pattern Deviation (pointwise metric): the total deviation after reducing deviations uniformly across the visual field. The rationale is that this accounts for global deficits in VFs caused by other conditions such as cataract, leaving the typical localized glaucoma-related deficit in place.
Spatial correlation
The picture below, on the left, shows the structure of the retinal nerve fiber bundles, visible after digital enhancement of a photograph taken with a blue filter. On the right, the same structure is manually highlighted in red. The retinal nerve fibers transmit visual information from the retina to the brain.
Overlapping the structure of the retinal nerves with the visual field matrix suggests a complex spatial correlation between the points.
The following plots display the spatial correlation obtained by analyzing the correlation matrices of the raw sensitivities, the total deviations, and the pattern deviations. The two lines without data correspond to the position of the blind spot.
Correlation matrix of the total deviations
Correlation matrix of the pattern deviations
Visual field deficits in glaucoma are caused by damage to the nerve fibers that conduct nerve impulses to the brain. The complex spatial correlations between VF points described above result in patterns of visual field loss that cannot be summarized by a single metric. This was the rationale for our decision to use pointwise data rather than a global index as the input to our classifiers: a global index typically loses most of the spatial information encoded in the sensitivities of the VF points.
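A minimal sketch of how such a pointwise correlation matrix can be computed, assuming an array with one metric (e.g., total deviation) per field:

```python
import numpy as np

def pointwise_correlation(values: np.ndarray) -> np.ndarray:
    """values: (n_fields, 54) -> (54, 54) matrix of pairwise correlations."""
    # Columns are VF points; the two blind-spot points have (near-)constant
    # values and come out as empty rows/columns, as in the plots above.
    return np.corrcoef(values, rowvar=False)
```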
#4: Data Normalization
Different patients have different numbers of visual fields, but our machine learning models require a fixed-length input.
Moreover, there was a lot of variation in the temporal range of the VF recordings: one patient's fields may span 10 years, another's only 2. We therefore had to normalize the time dimension in our training dataset, and we explored a few different solutions:
We divided the visual fields for each patient into two groups (first half and second half), calculated the difference in pointwise means between the two groups, and then divided the delta by the time spanned by the visual fields.
Like the previous approach, but without dividing by time.
Instead of taking all the visual fields, we took the first and the last two visual fields.
The second approach gave us the best results with our classifiers.
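A minimal sketch of this winning (second) approach, assuming each eye's fields arrive as a chronologically ordered (n_visits, 54) array:

```python
import numpy as np

def halfwise_delta(fields: np.ndarray) -> np.ndarray:
    """fields: (n_visits, 54) -> fixed-length (54,) feature vector."""
    mid = len(fields) // 2
    first, second = fields[:mid], fields[mid:]
    # Pointwise change between the two halves; the first approach would
    # additionally divide this by the time spanned by the fields.
    return second.mean(axis=0) - first.mean(axis=0)
```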
#5: Label Generation
In order to apply a classifier to the dataset, we had to assign a label to every eye that met our inclusion criteria (described previously). We used two labels: (1) Stable: vision in the eye is stable, i.e., not decreasing at a measurable rate; (2) Progressing: vision in the eye is measurably getting worse. Note that it is quite possible for the left eye of a patient to have severe glaucoma while the right eye is disease-free (or vice versa). Hence each eye (left or right) was considered separately as a single sample.
We used 6 methods to determine the same metric - the progression of visual field loss (or lack thereof):
AGIS Score - Advanced Glaucoma Intervention Study
CIGTS Score - Collaborative Initial Glaucoma Treatment Study
Mean Deviation - (provided by the HFA)
VFI index - calculated using an open source R package
PLR - Pointwise Linear Regression
PoPLR - Permutation of Pointwise Linear Regression
The first four algorithms evaluate the severity of vision loss (how bad is it?). The last two measure "progression" of visual loss (is it getting worse with time?).
Since the quantity we are actually interested in is progression, i.e., whether there is a measurable rate of loss, for the severity measures we computed the rate of severity increase across multiple VF studies to obtain a progression measure.
None of these indices (except MD) is readily available in the data. For the rest, we either implemented the algorithms ourselves (AGIS, CIGTS) or used an open source package (the remaining indices): the R package "visualFields" developed by Ivan Marin-Franch.
One important finding from generating the labels is that the six different approaches can produce six different verdicts. The following table is an extract from the table we developed for all of our roughly 10,000 eyes.
For machine learning, we needed a single label for each eye, so we used a majority voting system: the winning vote (progressing or stable) became the training label for our classifier.
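A minimal sketch of the vote (column names are illustrative; with six voters a 3-3 tie is possible, and the tie-break toward "stable" shown here is our assumption):

```python
import pandas as pd

ALGOS = ["agis", "cigts", "md", "vfi", "plr", "poplr"]

def majority_label(votes: pd.DataFrame) -> pd.Series:
    """votes: one row per eye, one 0/1 column per algorithm (1 = progressing)."""
    return (votes[ALGOS].sum(axis=1) > len(ALGOS) / 2).astype(int)
```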
After the labeling we analyzed the data using Principal Component Analysis. It is clear from the following set of pictures that there are two distinct regions (one for stable and one for progressing) as well as a large overlap between the two clusters. The first chart plots the stable patients first, the second plots the progressing patients first.
Chart plotting the stable patients first
Chart plotting the progressing patients first
The following chart presents the final result with the "boundary" patients, i.e., patients who are neither measurably stable nor measurably progressing.
From the figures it's clear that the "boundary" patients are those in the intersection between the two sets. This makes the data very difficult to study with unsupervised clustering: both groups of patients are losing their sight, but those classified as "stable" are losing it at an imperceptibly slower rate than the "progressing" ones. Essentially, it is human perception that divides a patient into progressing or stable; in reality there is a continuum between the two states, and they overlap.
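For reference, a minimal sketch of how such a PCA projection can be produced (X is a feature matrix with one row per eye, y the majority-vote labels):

```python
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

def plot_pca(X, y):
    # Project the pointwise features down to the first two components.
    pcs = PCA(n_components=2).fit_transform(X)
    for label, name in [(0, "stable"), (1, "progressing")]:
        mask = y == label
        plt.scatter(pcs[mask, 0], pcs[mask, 1], s=5, alpha=0.5, label=name)
    plt.xlabel("PC 1")
    plt.ylabel("PC 2")
    plt.legend()
    plt.show()
```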
#6: Model Development
We applied several different classifiers to our dataset, in particular:
Logistic Regression
Random Forest
Extreme Gradient Boosting
Support Vector Classifier
Vanilla Neural Network
Convolutional Neural Network
Our goal in modelling was not to come up with a perfect classifier for the proxy label we used (the majority vote label). Our hypothesis instead was that the proxy label reflected a strong "true" signal corrupted by noise from our imperfect labeling, so we tuned our hyperparameters to deliberately underfit to the noisy label. The tuning was carried out by assuming that within each category (stable and progressing) there exists, given sufficient eyes, a normal distribution of eyes; we felt that 10,000 eyes was a large enough population to give us a reasonably accurate normal distribution within each category. We then started fitting models to the training set using the simplest and least expressive model possible (for a random forest, the hyperparameter we used was the maximum depth of the trees in the forest) and increased the max-depth hyperparameter until we had a reasonably accurate normal distribution of eyes in each of the stable and progressing categories. In other words, we chose the simplest model that resulted in a normal distribution of eyes in each class. By underfitting in this fashion, our hypothesis was that we would pick out the hypothesized strong signal in the data without being affected by the noise introduced by our proxy labels.
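A minimal sketch of this tuning loop for the random forest case. The exact normality criterion is not spelled out above, so the D'Agostino-Pearson test below is one possible stand-in:

```python
from scipy import stats
from sklearn.ensemble import RandomForestClassifier

def shallowest_normal_depth(X, y, max_depth=20, alpha=0.05):
    """Return the smallest max_depth whose per-class score distributions
    look roughly normal, i.e., the simplest model that underfits gracefully.
    X: feature matrix; y: 0/1 numpy array of proxy labels."""
    for depth in range(1, max_depth + 1):
        clf = RandomForestClassifier(n_estimators=200, max_depth=depth, random_state=0)
        scores = clf.fit(X, y).predict_proba(X)[:, 1]
        # Test normality of the predicted-score distribution within each class.
        if all(stats.normaltest(scores[y == c]).pvalue > alpha for c in (0, 1)):
            return depth
    return max_depth
```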
In the rest of this section we discuss the challenges we faced in developing some of the models, particularly the convolutional neural network.
The structure of the Neural Network is described in the following table. To avoid overfitting we added 4 dropout layers with decreasing dropout probabilities, from 0.5 to 0.2.
We trained the Neural Network for 20,000 epochs, since we found that more training didn't improve the accuracy of the classifications. We used Keras with a TensorFlow backend, and we noticed that the model loss on the validation set was smaller than the loss on the training set. We believe we have a reasonable explanation for this (the following is adapted from the Keras documentation):
1. During the calculation of the validation set accuracy, regularization mechanisms such as Dropout are turned off
2. The training loss is the average of the losses over an epoch. A model is typically worse at the start of an epoch than at the end, so the loss averaged across the epoch will typically look worse than the validation loss measured at the end of the epoch (unless there is strong overfitting).
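A minimal sketch of a network consistent with the dropout structure described above (four dropout layers, rates decreasing from 0.5 to 0.2); the layer widths and input size (54 VF points) are assumptions, since the original table is not reproduced here:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(54,)),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),                    # dropout rates decrease 0.5 -> 0.2
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.4),
    layers.Dense(32, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(16, activation="relu"),
    layers.Dropout(0.2),
    layers.Dense(1, activation="sigmoid"),  # stable vs. progressing
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```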
For the Convolutional Neural Network we had to make the visual field matrix rectangular. We manipulated the data as described in the following picture, adding zero-valued padding where needed:
The following table describes the structure of the classifier, which we trained for 300 epochs; more training didn't improve the network's ability to correctly classify the patients. In this case too we added dropout layers to avoid overfitting, with a dropout probability of 0.2 for both layers.
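A minimal sketch of the zero-padding step; the 8x9 grid and per-row offsets are our assumptions about the 24-2 layout, which the picture above defines exactly:

```python
import numpy as np

ROW_LENGTHS = [4, 6, 8, 9, 9, 8, 6, 4]   # tested points per row (sums to 54)
ROW_OFFSETS = [2, 1, 0, 0, 0, 0, 1, 2]   # leading zero cells per row

def to_grid(field: np.ndarray) -> np.ndarray:
    """field: flat array of 54 values -> (8, 9) zero-padded matrix."""
    grid = np.zeros((8, 9))
    i = 0
    for row, (n, off) in enumerate(zip(ROW_LENGTHS, ROW_OFFSETS)):
        grid[row, off:off + n] = field[i:i + n]
        i += n
    return grid
```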
Classifier Strengths
Encouraging F1 scores: Our top 4 classifiers had F1 scores in line with leading research algorithms (i.e., 0.90 or above)
Lower Class-Bias: Our classifiers did not overpredict "progressing" on patients identified as "too close to call" by the glaucoma specialist, while VFI and PLR did. On the set of 66 "too close to call" patients, PLR and VFI identified 63/66 (95%) and 59/66 (89%) patients as progressing, which suggests a very strong tendency to overpredict in cases that were not clear even to human experts.
Classifier Metrics
Class-Bias: The term class-bias in our case refers specifically to one form of the term bias as used in machine learning literature. In machine learning, bias (i.e. ML-bias) refers to any systematic deviation from the true model across many training sets. The bias we are referring to here (class-bias) is the tendency of a classifier to overpredict one class or the other.
Unless the class-bias is overwhelmingly in favor of one class, clearly demarcated cases tend to be predicted correctly despite class-bias. A good way to detect any hidden class-bias is therefore to look at the examples straddling the boundary between "progressing" and "stable". Class-bias (if any) in these cases cannot be ignored, since fully one third of all eyes submitted to a human expert for labelling were classified as "boundary" eyes.
To ensure that our determination of hidden class-bias was statistically sound, we conducted a chi-squared analysis to determine whether the distribution of predictions for the "boundary" eyes differed significantly from the distribution of the ground truth for the clearly demarcated eyes. A large p-value implies no statistically significant difference in distributions, indicating a balanced classifier; a small p-value shows a statistically significant difference, indicating a biased classifier.
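A minimal sketch of such a test using scipy (the function and variable names are ours, and the exact construction of observed and expected counts is our reading of the description above):

```python
import numpy as np
from scipy.stats import chisquare

def class_bias_pvalue(boundary_preds: np.ndarray, clear_truth: np.ndarray) -> float:
    """boundary_preds: 0/1 predictions on the 66 boundary eyes;
    clear_truth: 0/1 expert labels on the 134 clear-cut eyes."""
    n = len(boundary_preds)
    observed = [(boundary_preds == 0).sum(), (boundary_preds == 1).sum()]
    # Expected counts: the clear-cut ground-truth proportions scaled to n eyes.
    p_stable = (clear_truth == 0).mean()
    expected = [n * p_stable, n * (1 - p_stable)]
    return chisquare(observed, f_exp=expected).pvalue
```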
Many of the ophthalmological indices had p-values less than 0.05, indicating a very strong class-bias towards either the "progressing" or the "stable" label irrespective of F1 score. The machine learning classifiers, on the other hand, showed no such class-bias.
F1 score: the harmonic mean of precision (what fraction of a predicted label is really that label) and recall (what fraction of a particular label is actually predicted).
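In formula form, with TP, FP, and FN denoting true positives, false positives, and false negatives:

```latex
\mathrm{precision} = \frac{TP}{TP + FP}, \qquad
\mathrm{recall} = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}
```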