Version: 17.0.0 | Published: 30 Oct 2023 | Updated: 571 days ago

Genomics England - Cancer

Dataset

Documentation

Description:
Cancer data are presented either at the level of the patient's cancer diagnosis ("disease type") or at the level of tumour-specific sample details, for participants in the Cancer arm of the 100,000 Genomes Project.

Data relating to cancer participants:
  • cancer_participant_disease: For each cancer participant in the 100,000 Genomes Project, this table includes data about their cancer disease type and subtype.
  • cancer_participant_tumour: For each cancer participant's tumour in the 100,000 Genomes Project, this table contains data that characterises the tumour, e.g. staging and grading; morphology and location; recurrence at time of enrolment; and the basis of diagnosis.
  • cancer_participant_tumour_metastatic_site: For each cancer participant in the 100,000 Genomes Project, this table contains the site of their metastatic disease in the body (if applicable) at diagnosis.
  • cancer_care_plan: For a proportion of cancer participants in the 100,000 Genomes Project, this table contains information from their NHS cancer care plan on their treatment and care intent, in particular outcomes of MDT meetings and coded connected data (e.g. diagnoses from scans).
  • cancer_surgery: For a proportion of cancer participants in the 100,000 Genomes Project, this table contains details of the surgical procedures performed, as well as the specific location of the intervention.
  • cancer_risk_factor_general: For a proportion of cancer participants in the 100,000 Genomes Project, this table contains data on general cancer risk factors, namely smoking status, height, weight and alcohol consumption. This table was compiled with input from GeCIP members.
  • cancer_risk_factor_cancer_specific: For a proportion of cancer participants in the 100,000 Genomes Project, this table contains data on specific risk factors related to particular cancer types. This table was compiled with input from GeCIP members.
  • cancer_invest_imaging: For a proportion of cancer participants in the 100,000 Genomes Project, this table contains coded data on imaging investigations (characterising the scan, its modality, anatomical site and outcome), as well as the outcome of the imaging report in free-text form.

Data derived from or relating to tumour samples:
  • cancer_invest_sample_pathology: For a proportion of cancer participants in the 100,000 Genomes Project, this table contains full pathology reports and other related data on and from their tumour samples around diagnosis and characterisation of the cancer. Please note that much of this information is also found in the clinic_sample and cancer_participant_tumour tables.
  • cancer_specific_pathology: For a proportion of tumours from cancer participants in the 100,000 Genomes Project, this table contains pathology data specific to that participant's cancer type. This may provide additional data to the cancer_invest_sample_pathology and cancer_participant_tumour tables.
  • cancer_systemic_anti_cancer_therapy: For a proportion of tumours from cancer participants in the 100,000 Genome
Is Part Of:
100K Primary Data

Coverage

Spatial:
UK
Typical Age Range:
0-150
Follow Up:
OTHER
Physical Sample Availability:
  • DNA
  • TISSUE
Pathway:
Linked datasets cover secondary care.

Provenance

Origin

Purposes:
  • CARE
  • STUDY
  • OTHER
Sources:
  • ELECTRONIC SURVEY
  • EPR
  • LIMS
  • OTHER
Collection Situations:
  • CLINIC
  • IN-PATIENTS
  • OUTPATIENTS

Temporal

Accrual Periodicity:
QUARTERLY
Distribution Release Date:
30 March 2023
Start Date:
01 January 2014
End Date:
01 January 2019
Time Lag:
2-6 MONTHS

Accessibility

Access

Access Service:
More information about the Genomics England Research Environment can be found here:
https://www.genomicsengland.co.uk/about-genomics-england/research-environment/
https://research-help.genomicsengland.co.uk/display/GERE/1.+The+Genomics+England+Research+Environment
Genomics England 100k participants have consented to longitudinal lifetime follow-up and to being recontacted safely through our clinical network. BRST (Bioinformatics Research Services) is a team of bioinformaticians who know the dataset inside out and provide consultancy projects on a case-by-case basis. Our network of clinical and medical experts can be made available on a case-by-case basis. Researchers have the opportunity to work with and access the GeCIP network, a community of world-leading experts in specific cancers and rare diseases.
Access Request Cost:
Fees depend on the type of access required. Raw data is not eligible for export. Summary-level data may be exported provided that it is approved through the Genomics England Airlock Process.
Delivery Lead Time:
2-6 MONTHS
Jurisdictions:
GB-GBN
Data Controller:
GENOMICS ENGLAND
Data Processor:
GENOMICS ENGLAND

Usage

Data Use Limitations:
GENERAL RESEARCH USE
Data Use Requirements:
  • ETHICS APPROVAL REQUIRED
  • PROJECT SPECIFIC RESTRICTIONS
  • PUBLICATION MORATORIUM
Resource Creators:
  • The 100,000 Genomes Project Protocol v3, Genomics England. doi:10.6084/m9.figshare.4530893.v3. 2017. Publications that use the Genomics England Database should include an author as: Genomics England Research Consortium. Please see publication policy.

Format and Standards

Vocabulary Encoding Schemes:
  • HPO
  • ICD10
  • NHS NATIONAL CODES
  • ODS
  • OPCS4
  • READ
  • SNOMED CT
  • OTHER
Languages:
en
Formats:
Multiple Formats Available

Enrichment and Linkage

Qualified Relations:
  • HES Accident and Emergency
  • HES Admitted Patient Care
  • HES Outpatient Care
  • Diagnostic Imaging Dataset (DID)
  • Systemic Anti-Cancer Therapy Data Set (SACT)
  • National Radiotherapy Dataset (RTDS)
  • Cancer Registration (AV) tables
  • Cancer waiting times (CWT)
  • Lung Cancer Data Audit (LUCADA)
  • PHE Diagnostic Imaging Dataset (NCRAS_DID)
  • Patient Reported Outcome Measures (PROMs)
  • Mental Health Minimum Data Set (MHMDS)
Derivations:
Not Known

Observations

Statistical Population | Population Description | Population Size | Measured Property | Observation Date
Findings | Rare Disease - Number of genomes | 73,517 | Count | 30 March 2023
Findings | Cancer Germline - Number of genomes | 32,753 | Count | 30 March 2023
Findings | Cancer Tumour - Number of genomes | 17,003 | Count | 30 March 2023
Persons | Cancer Participants | 15,624 | Count | 30 March 2023
Persons | Rare Disease Participants | 72,874 | Count | 30 March 2023