Version: 17.0.0 | Published: 30 Oct 2023 | Updated: 562 days ago

Genomics England - Secondary Data - Cancer Specific Curated Datasets - Pilot

Dataset

Documentation

Description:
Genomics England is striving to improve the clinical data provided to its researchers. We understand the value of accurate and granular clinical data, especially in the context of cancer. To deliver this, we are planning a series of pilot datasets that incorporate additional clinical data provided by the Public Health England cancer registry (NCRAS). Genomics England will aim to deliver cancer-specific datasets, with the initial focus on providing a broad pathological understanding, incorporating data points such as molecular mutations and resection margins from pathology reports. The focus will then extend to radiological imaging reports and, finally, to live/up-to-date clinical data. In addition, we are including the date each participant was last seen alive (data provided up to October 2020) and dates and causes of death to aid with outcomes.
It must be stressed that this work is a development process, and we are working in unison with NCRAS to progress it. Whilst we do not possess the extensive experience and resources of Public Health England, we are developing a natural-language-based algorithm for focused data extraction. NCRAS has a team dedicated to curating clinical data, and the gold standard remains the NCRAS curated tables. However, for this dataset to improve and move forward, Genomics England is keen for feedback and for you to highlight areas for improvement. You will note subtle differences in the structure of the table compared with the curated NCRAS tables, and thus additional data dictionaries have been provided.
Genomics England hopes to continue developing this uncurated live dataset with feedback and looks forward to hearing your thoughts. Please reach out to us with related thoughts and suggestions via the Genomics England Service Desk, including "cancer_specific_datasets_pilot" in the title of your enquiry.
With the addition of the new pathology_reports dataset introduced in v16, the aml_path_reports and testes_path_reports datasets have been deprecated in v17.

Coverage

Typical Age Range:
0-152

Provenance

Temporal

Accrual Periodicity:
QUARTERLY
Distribution Release Date:
30 March 2023
Start Date:
09 April 2008
End Date:
30 December 2022

Accessibility

Access

Access Service:
More information about the Genomics England Research Environment can be found here:
https://www.genomicsengland.co.uk/about-genomics-england/research-environment/
https://research-help.genomicsengland.co.uk/display/GERE/1.+The+Genomics+England+Research+Environment
Genomics England 100k participants have consented to longitudinal lifetime follow-up and to safe recontact through our clinical network. BRST (Bioinformatics Research Services) is a team of bioinformaticians who know the dataset inside out and provide consultancy projects on a case-by-case basis. Our network of clinical and medical experts can be made available on a case-by-case basis. Researchers have the opportunity to work with and access the GeCIP network, a community of world-leading experts in specific cancers and rare diseases.
Access Request Cost:
Fees depend on the type of access required. Raw data is not eligible for export. Summary-level data may be exported provided that the export is approved through the Genomics England Airlock Process.
Delivery Lead Time:
2-6 MONTHS
Jurisdictions:
GB-GBN
Data Controller:
PUBLIC HEALTH ENGLAND
Data Processor:
GENOMICS ENGLAND

Format and Standards

Languages:
en
Formats:
Multiple Formats Available

Enrichment and Linkage

Observations

Statistical Population | Population Description | Population Size | Measured Property | Observation Date
Findings | Rare Disease - Number of genomes | 73,517 | Count | 30 March 2023
Findings | Cancer Germline - Number of genomes | 32,753 | Count | 30 March 2023
Findings | Cancer Tumour - Number of genomes | 17,003 | Count | 30 March 2023
Persons | Cancer Participants | 15,624 | Count | 30 March 2023
Persons | Rare Disease Participants | 72,874 | Count | 30 March 2023