Introduction

The goal of the Open Image Archives Committee is to make recommendations that have the potential to significantly improve the number, size, and quality of open image archives. This document provides a description of several open image archive “use cases” which will be used as a reference for the development and evaluation of recommendations.

Use Case 1: Algorithm Development

Imaging biomarker developers have a critical need to work with as large and diverse a collection of imaging data as possible, as early as possible in the development cycle. This need spans a wide range of potentially useful imaging datasets, including synthetic and real clinical scans of phantoms and clinical imaging datasets of patients with and without the disease or condition being measured. It is also important to have sufficient metadata (i.e. additional clinical information) to develop the algorithms and to obtain early indications of the performance of the full algorithm or of its components. To further illustrate the needs of algorithm developers, an example follows that outlines the data needed to develop an early stage lung cancer therapy assessment algorithm that performs a volumetric analysis of lesion burden in computed tomography (CT) scans. While many potential datasets would be useful in this setting, this list is intended to capture a core set of data for the development of a robust algorithm.

A core set of data needed to develop a CT lung cancer therapy assessment algorithm is:

(1) A set of CT images with known ground truth (e.g. FDA anthropomorphic phantom).

a. This dataset would ideally consist of real or simulated CT scans of a collection of physical objects with known volumetric characteristics.

b. A range of object sizes, densities, shapes, and lung attachment scenarios should be represented.

c. A range of image acquisition characteristics should be obtained including variation in slice thickness, tube current, and reconstruction kernel.

d. Metadata must contain the location and volumetric characteristics of all objects and any additional information on their surrounding or adjacent environment.

(2) A set of clinical CT images where outcome has been determined.

a. This dataset would ideally consist of longitudinal CT scans of a large and diverse collection of patients using many different image acquisition devices and image acquisition parameters.

b. The location and volumetric assessment of all lesions within each longitudinal CT acquisition must be established by an independent method, such as the assessment of multiple expert readers. This should include the localization and volumetric estimation of new lesions.

c. Metadata should at a minimum contain the location and independent volumetric assessment of all lesions, including the location of new lesions. Additional information on the variance of the independent volumetric assessment should also be available.

d. Additional metadata, such as the clinical characteristics of the patients (e.g. age, gender), classification of lung cancer (e.g. small cell) and lesion types (e.g. solid, non-solid), lesion attachment scenarios (e.g. lung pleura, major vessels), and lung cancer therapy approach, magnitude and duration, would also be useful to algorithm developers as they determine the strengths and weaknesses of different algorithmic methods.

All metadata should be stored in an electronic format easy to manipulate, such as within an XML schema.
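As an illustration of the metadata described above, the following minimal sketch stores the location of one lesion, the independent volume estimates of several readers, and the mean and variance of those estimates in XML. The element and attribute names (scan, lesion, location, volumes, reading, summary), the identifiers, and all numeric values are hypothetical; the text above does not define a schema.

```python
import statistics
import xml.etree.ElementTree as ET


def lesion_element(lesion_id, location_mm, reader_volumes_mm3):
    """Build one <lesion> element from independent reader volume estimates.

    The element and attribute names are illustrative assumptions only.
    """
    lesion = ET.Element("lesion", id=lesion_id)
    x, y, z = location_mm
    ET.SubElement(lesion, "location", x=str(x), y=str(y), z=str(z), unit="mm")

    # One <reading> per independent reader.
    volumes = ET.SubElement(lesion, "volumes", unit="mm3")
    for reader, volume in reader_volumes_mm3.items():
        ET.SubElement(volumes, "reading", reader=reader).text = f"{volume:.1f}"

    # Summary of the independent assessment. statistics.variance is the
    # sample variance and needs at least two readings.
    values = list(reader_volumes_mm3.values())
    summary = ET.SubElement(lesion, "summary")
    summary.set("mean", f"{statistics.mean(values):.1f}")
    summary.set("variance", f"{statistics.variance(values):.1f}")
    return lesion


# Hypothetical scan with a single lesion measured by three readers.
scan = ET.Element("scan", patient="P001", timepoint="baseline")
scan.append(
    lesion_element(
        "L1",
        (112.5, 87.0, 140.0),
        {"reader_a": 520.0, "reader_b": 540.0, "reader_c": 500.0},
    )
)
xml_text = ET.tostring(scan, encoding="unicode")
print(xml_text)
```

Keeping the individual readings next to the summary lets an algorithm developer recompute the variance or exclude a reader without returning to the original annotations.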

Given a set of algorithm development data as described above, algorithm developers will typically subdivide the data into an internal development collection and a set of data used to assess algorithm performance during development. Identification of two subsets of data that are similar in characteristics would also be of benefit to algorithm developers.
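One possible way to obtain two subsets with similar characteristics is a stratified split: cases are grouped by the characteristics of interest and each group is divided between the two subsets. The sketch below assumes hypothetical case records with a lesion type and a slice thickness; the function name, field names, and the 50/50 split are illustrative choices, not part of the recommendation.

```python
import random
from collections import defaultdict


def stratified_split(cases, key, dev_fraction=0.5, seed=0):
    """Split cases into development and assessment sets, stratum by stratum.

    `key` maps a case to the characteristics that should be balanced.
    """
    strata = defaultdict(list)
    for case in cases:
        strata[key(case)].append(case)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    development, assessment = [], []
    for _, members in sorted(strata.items()):
        rng.shuffle(members)
        cut = round(len(members) * dev_fraction)
        development.extend(members[:cut])
        assessment.extend(members[cut:])
    return development, assessment


# Hypothetical cases: four per combination of lesion type and slice thickness.
combinations = [(t, s) for t in ("solid", "non-solid") for s in (1.0, 5.0)]
cases = [
    {"id": i, "lesion_type": lesion_type, "slice_mm": slice_mm}
    for i, (lesion_type, slice_mm) in enumerate(combinations * 4)
]

development, assessment = stratified_split(
    cases, key=lambda c: (c["lesion_type"], c["slice_mm"])
)
print(len(development), len(assessment))
```

With real data the strata could be built from any of the metadata fields listed above, such as lesion attachment scenario or reconstruction kernel, provided each stratum contains enough cases to divide.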

Use Case 2: FDA acceptance of imaging biomarkers as clinical endpoints in clinical trials

Imaging biomarker developers have a critical need to work with as large and diverse a collection of imaging data as possible, as early as possible in the development cycle. This need spans a wide range of potentially useful imaging datasets, including synthetic and real clinical scans of phantoms and clinical imaging datasets of patients with and without the disease or condition being measured. It is also important to have sufficient metadata (i.e. additional clinical information) to develop the algorithms and to obtain early indications of the performance of the full algorithm or of its components. To further illustrate the needs of algorithm developers, an example follows that outlines the data needed to develop an early stage lung cancer therapy assessment algorithm that performs a volumetric analysis of lesion burden in computed tomography (CT) scans. While many potential datasets would be useful in this setting, this list is intended to capture a core set of data for the development of a robust algorithm.

A core set of data needed to develop a CT lung cancer therapy assessment algorithm is:

(1) A set of CT images with known ground truth (e.g. FDA anthropomorphic phantom).

a. This dataset would ideally consist of real or simulated CT scans of a collection of physical objects with known volumetric characteristics.

b. A range of object sizes, densities, shapes, and lung attachment scenarios should be represented.

c. A range of image acquisition characteristics should be obtained including variation in slice thickness, tube current, and reconstruction kernel.

d. Metadata must contain the location and volumetric characteristics of all objects and any additional information on their surrounding or adjacent environment.

(2) A set of clinical CT images where outcome has been determined.

a. This dataset would ideally consist of longitudinal CT scans of a large and diverse collection of patients using many different image acquisition devices and image acquisition parameters.

b. The location and volumetric assessment of all lesions within each longitudinal CT acquisition must be established by an independent method, such as the assessment of multiple expert readers. This should include the localization and volumetric estimation of new lesions.

Use Case 3: Replacement of diameter measurement in RECIST by volumes and work towards FDA acceptance

When RECIST was defined (reference), the assessment group had to decide which measurement should serve as the basis for the final clinical classification of disease state development (progression, stability, response to treatment). Volumes were already under investigation at that time. However, volumetric tumor measurement was then impractical (lesion markup was a highly manual process, and thick image slices made volumetric assessment imprecise), so diameter was chosen as the way forward. RECIST has since gone through several rounds of refinement with regard to the clinical validity of the classification and its usefulness in different phases of clinical drug development. Today RECIST 1.1 is the accepted standard for clinical therapy response assessment in oncology trials. Meanwhile, further developments in imaging techniques (higher spatial resolution, thinner slices) and in algorithm development have made volume measurement feasible. Studies are available (reference to Merck) that demonstrate a higher sensitivity of volume measurements than of diameter measurements with regard to change detection. It is therefore in the interest of the relevant stakeholders to work towards recommendations for datasets that enable the FDA to accept the use of volume instead of diameter for RECIST.
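A simple geometric illustration of why volume can respond more strongly to change than diameter: under the simplifying assumption that a lesion is a sphere, volume scales with the cube of the diameter. The sketch below converts fractional diameter changes into the implied fractional volume changes. The spherical model and the function names are assumptions made for illustration; real lesions are irregular, and this calculation is not a substitute for the studies referred to above.

```python
import math


def sphere_volume_mm3(diameter_mm):
    """Volume of a sphere with the given diameter: V = pi * d^3 / 6."""
    return math.pi * diameter_mm ** 3 / 6.0


def volume_change_for_diameter_change(diameter_change):
    """Fractional volume change implied by a fractional diameter change,
    assuming the lesion stays spherical."""
    return (1.0 + diameter_change) ** 3 - 1.0


# -30% and +20% are the RECIST diameter thresholds for partial response
# and progressive disease; +10% is a change below the progression threshold.
for change in (-0.30, 0.10, 0.20):
    implied = volume_change_for_diameter_change(change)
    print(f"diameter {change:+.0%} -> volume {implied:+.1%}")
```

Under this model a 10% increase in diameter, which is below the RECIST progression threshold, already corresponds to a volume increase of about 33%, and the 20% threshold corresponds to about 73%.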

A core set of data needed to prove the validity of volume assessment compared to diameter measurements, based on high-quality CT images of mixed slice thickness (1-5 mm) from lung cancer cases acquired in several clinical trials by different pharmaceutical companies and academic consortia, is:

(1) A set of clinical CT images with known diameter measurements and RECIST assessment.

a. This dataset would ideally consist of longitudinal CT scans from different clinical trials sponsored by pharmaceutical companies and from comparable trials of publicly funded research (e.g. LIDC).

b. A range of disease states, therapeutic interventions, and RECIST-based clinical outcomes should be covered.

c. All diameter measurements must be available.

d. A real-life range of image acquisition devices and image acquisition parameters should be represented.

e. Metadata must contain a basic set of additional clinical data that supports clinical case identification and validation.

f. The location and volumetric assessment of all lesions within each longitudinal CT acquisition must be established by an FDA-accepted method, such as a multi-reader approach.
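To make concrete why all diameter measurements must be available, the following simplified sketch derives a target-lesion response category from sums of diameters using the RECIST 1.1 thresholds: a decrease of at least 30% from baseline for partial response, and an increase of at least 20% and at least 5 mm from the smallest sum on study for progressive disease. It deliberately ignores non-target lesions, new lesions, and the lymph node criteria, so it is an illustration rather than a complete RECIST implementation; the function name and example values are hypothetical.

```python
def classify_target_response(baseline_sum_mm, nadir_sum_mm, current_sum_mm):
    """Simplified target-lesion response category from sums of diameters.

    baseline_sum_mm: sum of target lesion diameters at baseline
    nadir_sum_mm:    smallest sum recorded so far (including baseline)
    current_sum_mm:  sum at the time point being assessed
    """
    if current_sum_mm == 0:
        return "complete response"

    # Progression is judged against the nadir and requires both a relative
    # (>= 20%) and an absolute (>= 5 mm) increase.
    increase = current_sum_mm - nadir_sum_mm
    if increase >= 0.20 * nadir_sum_mm and increase >= 5.0:
        return "progressive disease"

    # Partial response is judged against baseline (>= 30% decrease).
    if current_sum_mm <= 0.70 * baseline_sum_mm:
        return "partial response"

    return "stable disease"


print(classify_target_response(100.0, 100.0, 65.0))
print(classify_target_response(100.0, 60.0, 75.0))
print(classify_target_response(100.0, 100.0, 95.0))
```

A dataset that also carries independent volumetric assessments for the same lesions would allow diameter-based categories of this kind to be compared directly with volume-based change at each time point.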

Example Use Case for Open Image Archives
