METADATA
Project

| field_name | description | example | notes | data_format |
|---|---|---|---|---|
| project_id | Short unique project identifier or acronym | ocular-microbiome | required | string |
| project_name | Full name of the research project | Ocular Microbiome Project | required | string |
| project_goal | Brief summary of project aim or objective | Investigate the microbial composition on the surface of the human eye. | required | string |
| project_type | Main methodological approach (e.g., metagenomic, genomic, RNA-Seq) | Metagenomic | required | string |
| project_design | Description of project design, experiment type or sampling scheme | Cross-sectional observational study of healthy and diseased eyes | recommended | string |
| bioproject_accession | NCBI BioProject accession number (if available) | PRJNA123456 | recommended | string |
| contact_name | Project coordinator, PI, or submitter | Lisa A. Neuhold | required | string |
| contact_email | Email address of coordinator/submitter |  | required | string |
| library_count | Number of libraries | 10 | recommended | numeric |

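The `notes` column marks which Project fields are required. A minimal sketch of how a submission could be checked against that list follows. The helper name `missing_required` and the dict-based record are assumptions made for illustration, not part of the specification.

```python
# Project fields marked "required" in the notes column of the table above.
REQUIRED_PROJECT_FIELDS = (
    "project_id",
    "project_name",
    "project_goal",
    "project_type",
    "contact_name",
    "contact_email",
)


def missing_required(record, required=REQUIRED_PROJECT_FIELDS):
    """Return the required fields that are absent or blank in `record`.

    Hypothetical helper: `record` is assumed to be a plain dict keyed by
    field_name, e.g. one row read from a CSV or TSV submission sheet.
    """
    missing = []
    for field in required:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            missing.append(field)
    return missing


# Example record built from the example column; contact_email is left out.
project = {
    "project_id": "ocular-microbiome",
    "project_name": "Ocular Microbiome Project",
    "project_goal": "Investigate the microbial composition on the surface of the human eye.",
    "project_type": "Metagenomic",
    "contact_name": "Lisa A. Neuhold",
}
print(missing_required(project))  # ['contact_email']
```

The same function can be reused for the other tables by passing a different `required` tuple.
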
Center

| field_name | description | example | notes | data_format |
|---|---|---|---|---|
| center_id | Unique code or acronym for the center | JHU | required | string |
| center_name | Full name of the parent center, lab, or institution | Johns Hopkins University | required | string |
| contact_name | Name of submitting PI or contact person | Laura Ensign | required | string |
| contact_email | Primary contact’s email |  | required | string |

Individual

| field_name | description | example | notes | data_format |
|---|---|---|---|---|
| individual_id | Unique identifier for the individual/subject | OMI00001 | required | string |
| individual_alias | Internal study label or code | 52 | recommended | string |
| age | Age at time of sampling (years) | 35 | recommended | numeric |
| birth_year | Year of birth (YYYY) | 1990 | recommended | numeric |
| consent_code | IRB or consent identifier | IRB2024-001 | required | string |
| disease_status | Overall health status or diagnosis | healthy | recommended | string |
| ethnicity | 0=Non-Hispanic, 1=Hispanic | 0 | recommended | categorical |
| gender | Self-identified gender | Female | recommended | string |
| has_autoimmune_disease | Autoimmune disease presence | 1 | 0=No, 1=Yes | binary |
| has_diabetes | Diabetes mellitus status | 0 | 0=No, 1=Yes | binary |
| has_hypercholesterolemia | Hypercholesterolemia status | 1 | 0=No, 1=Yes | binary |
| has_ocular_allergies | Ocular allergy presence | 0 | 0=No, 1=Yes | binary |
| has_ocular_trauma_history | History of ocular trauma | 0 | 0=No, 1=Yes | binary |
| has_pets | Household pet exposure | cats and dogs | recommended | string |
| has_refractive_error | Presence of refractive error | 1 | 0=No, 1=Yes | binary |
| has_refractive_surgery_history | History of refractive surgery | 0 | 0=No, 1=Yes | binary |
| has_rosacea | Rosacea diagnosis | 1 | 0=No, 1=Yes | binary |
| has_systemic_allergies | Systemic allergy presence | 0 | 0=No, 1=Yes | binary |
| longitudinal_sampling | Repeated sampling indicator (yes/no) | yes | recommended | string |
| race | 0=White, 1=Black, 2=Asian, 3=Other | 0 | recommended | categorical |
| sex | Biological sex at birth (M, F, Other, Unknown) | M | required | string |
| smoking_status | Smoking status category | current | 0=never, 1=former, 2=current | categorical |
| uses_anti_depressants | Antidepressant medication use | 0 | 0=No, 1=Yes | binary |
| uses_anti_epileptics | Anti-epileptic medication use | 0 | 0=No, 1=Yes | binary |
| uses_anti_hypertensives | Antihypertensive medication use | 1 | 0=No, 1=Yes | binary |
| uses_asa | Aspirin (ASA) use | 0 | 0=No, 1=Yes | binary |
| uses_antibiotics | Antibiotic exposure (systemic or topical) | penicillin | drug name or binary depending on encoding | string/binary |
| uses_antihistamines | Antihistamine use | 0 | 0=No, 1=Yes | binary |
| uses_artificial_tears | Artificial tear use | 1 | 0=No, 1=Yes | binary |
| uses_contact_lenses | Contact lens use status | 1 | 0=No, 1=Yes or lens type if available | string/binary |
| uses_eye_mask | Eye mask use | 0 | 0=No, 1=Yes | binary |
| uses_fish_oil | Fish oil supplementation | 0 | 0=No, 1=Yes | binary |
| uses_lid_wipes | Lid hygiene wipe use | 1 | 0=No, 1=Yes | binary |
| uses_makeup_routinely | Routine makeup use | 0 | 0=No, 1=Yes | binary |
| uses_multivitamins | Multivitamin use | 1 | 0=No, 1=Yes | binary |
| uses_nsaids | NSAID use | 0 | 0=No, 1=Yes | binary |
| uses_probiotics | Probiotic supplementation | 1 | 0=No, 1=Yes | binary |
| uses_statins | Statin medication use | 0 | 0=No, 1=Yes | binary |
| other_medication_or_condition | Other reported condition or medication | NA | optional | string |

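Several Individual fields store integer codes whose meaning is given only in the description or notes column. The sketch below shows one way to turn a coded record into readable labels. It assumes values have already been parsed to integers where codes are used, and the function name `decode_individual` is hypothetical. Free-text values such as a pet description or a drug name pass through unchanged.

```python
# Code tables copied from the Individual table above.
ETHNICITY = {0: "Non-Hispanic", 1: "Hispanic"}
RACE = {0: "White", 1: "Black", 2: "Asian", 3: "Other"}
SMOKING_STATUS = {0: "never", 1: "former", 2: "current"}
YES_NO = {0: "No", 1: "Yes"}

CATEGORICAL = {
    "ethnicity": ETHNICITY,
    "race": RACE,
    "smoking_status": SMOKING_STATUS,
}


def decode_individual(row):
    """Return a copy of `row` with known integer codes replaced by labels.

    Hypothetical helper. Values that are not a known code (for example
    "cats and dogs" in has_pets, or a label such as "current") are kept as-is.
    """
    decoded = {}
    for field, value in row.items():
        if field in CATEGORICAL:
            codes = CATEGORICAL[field]
        elif field.startswith(("has_", "uses_")):
            codes = YES_NO
        else:
            codes = {}
        decoded[field] = codes.get(value, value)
    return decoded


example = {
    "individual_id": "OMI00001",
    "ethnicity": 0,
    "race": 2,
    "smoking_status": "current",
    "has_diabetes": 0,
    "has_pets": "cats and dogs",
    "uses_antibiotics": "penicillin",
    "uses_statins": 1,
}
print(decode_individual(example))
```

Because `smoking_status` appears in the table with a label as its example but integer codes in its notes, the lookup accepts either form.
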
BioSample

| field_name | description | example | notes | data_format |
|---|---|---|---|---|
| biosample_id | Unique identifier for the sample | OMS00001 | required | string |
| biosample_alias | Alternate code or name used at collection site | S89 | recommended* | string |
| project_id | Short unique project identifier or acronym | ocular-microbiome | required | string |
| individual_id | Unique subject/individual identifier (linked to subject table) | OMI000123 | recommended* | string |
| center_id | Unique identifier for associated (collection or extraction) center | JHU | recommended | string |
| visit_number | Number/label of visit or collection event | 1 | recommended | numeric |
| collection_date | Date of sample collection (ISO 8601) | 2024-05-07 | required | date |
| biosample_type | Anatomical or environmental site type | Microbial Community Standard, Human Ocular Surface, OMR-110, Air Control | required* | string |
| biosample_site | Anatomical site of origin | Skin, Lid, Conj | required | string |
| biosample_material | Material type of sample | swab, tissue, tear | required | string |
| biosample_side | Laterality of sample | left, right, NA | recommended* | string |
| collection_method | Method used for sample collection | swab, wash, filter paper | recommended | string |
| collection_time | Time of day of collection (AM/PM) | 5pm |  | string |
| collection_medium | Fluid in which the biosample is stored | ANE (Amies Transport Medium), BSS (Balanced Salt Solution) | recommended | string |
| collection_device_lot | Collection device batch | BG001 | optional | string |
| collection_device_id | Device identifier or lot number for collection device | DEV001-2023 | optional | string |
| swab_type | Brand/model/type of swab used for sample collection | FloQ, Isohelix, Puritan | optional | string |
| swab_kit_lot | Manufacturer and lot for swab/kit | Qiagen-12345 | optional | string |
| swab_lot_number | Lot number of the swab used | A12345 | optional | string |
| storage_duration | Time from collection to processing (ISO 8601) | PT2H30M | optional | string |
| storage_temperature | Sample storage temperature after collection (with units) | -80 °C | optional | string |
| preservation_method | Method or reagent used for preservation | RNAlater, ethanol | optional | string |
| stabilizing_fluid | Type of fluid used to stabilize the collected sample | OMR | optional* | string |
| stabilizing_fluid_lot | Lot number of stabilizing fluid used | BD801 | optional* | string |
| microbial_standard | Indicates if a microbial community standard was used | MCS | optional* | string |
| replicate_type | Replicate type | biological | optional | string |
| batch_number | Identifier for batch | Batch5 | optional* | string |
| plate_number | Identifier for plate | Plate2 | optional* | string |
| plate_location | Identifier for plate location | B3 | optional* | string |
| lysozyme_treatment | Indicates if lysozyme treatment was applied during processing | yes | optional* | bool |
| host_depletion_performed | Indicates host depletion step performed | yes, no | recommended | bool |
| host_depletion_method | Specific host depletion method applied | MolYsis Basic5, NEBNext | recommended* | string |
| host_depletion_kit | Kit(s) used for host depletion | NEB, Molzym, HostZero | optional* | string |
| extraction_date | Date of DNA/RNA extraction (ISO 8601) | 2024-05-08 | recommended | date |
| extraction_kit | Brand/model of DNA extraction kit | Qiagen DNeasy Blood_Tissue Kit, Zymo Micro Prep Kit, Qiagen PowerSoil Pro Kits, MasterPure Gram Positive DNA Purification Kit | recommended* | string |
| extraction_kit_lot | Lot number for DNA extraction kit | L20240155 | optional | string |
| extraction_protocol_version | Version or code/identifier of extraction protocol | v1.2 | optional | string |
| extraction_protocol_modifications | Details of any protocol modifications | Increased elution volume to 100 µl | optional | string |
| sample_volume | Total volume or mass collected (with units) | 200 µl | recommended* | string |
| elution_volume | Final elution volume (with units) | 50 µl | optional | string |
| nucleic_acid_concentration | DNA/RNA concentration (with units) | 20 ng/µl | recommended* | string |
| dna_yield | Measured DNA quantity (with units) | 100 ng | optional | string |
| dna_qc_metrics | DNA quality control metrics | 260/280 ratio 1.85, Qubit 22 ng/µl | optional | string |
| biosample_accession | External accession IDs (BioSample, SRA, ENA) | SAMN12345678 | recommended | string |
| geo_loc_name | Geographic location where the sample was collected | USA: Maryland: Baltimore | Should include country, state/province, and city if known; standardized names recommended | string |

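The BioSample table uses ISO 8601 for both calendar dates (`collection_date`, `extraction_date`) and durations (`storage_duration`). The following is a rough sketch of parsing both with the Python standard library. The standard library has no ISO 8601 duration parser, so `parse_duration` is a hypothetical helper that handles only a subset: whole days, hours, minutes, and seconds.

```python
import re
from datetime import date, timedelta

# Subset of ISO 8601 durations: integer days, hours, minutes, seconds.
# Years, months, weeks, and fractional values are deliberately not handled.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(text):
    """Parse a duration such as 'PT2H30M' into a timedelta (hypothetical helper)."""
    match = _DURATION.match(text)
    if match is None or not any(match.groupdict().values()):
        raise ValueError(f"unsupported ISO 8601 duration: {text!r}")
    parts = {name: int(value) for name, value in match.groupdict().items() if value}
    return timedelta(**parts)


def days_between(collection_date, extraction_date):
    """Whole days from collection to extraction, both given as YYYY-MM-DD."""
    start = date.fromisoformat(collection_date)
    end = date.fromisoformat(extraction_date)
    return (end - start).days


print(parse_duration("PT2H30M"))                  # 2:30:00
print(days_between("2024-05-07", "2024-05-08"))   # 1
```

`date.fromisoformat` raises `ValueError` on malformed input, so the same functions can double as format checks for these columns.
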
Library

| field_name | description | example | notes | data_format |
|---|---|---|---|---|
| library_id | Unique experiment identifier | OMX00001 | required | string |
| library_title | Title or brief name of experiment | Ocular Microbiome RNA-Seq Batch 1 | recommended | string |
| library_description | Short summary of experiment purpose/design | RNA-seq to profile eye surface samples | recommended | string |
| biosample_id | Related sample ID | OMS00001 | required | string |
| library_name | Name/identifier of the prepared sequencing library | LIB001-A | required | string |
| library_strategy | Sequencing strategy (controlled vocabulary) | rna-seq | required | string |
| library_source | Source material for library | mRNA | recommended | string |
| library_selection | Method of nucleic acid selection/enrichment | polyA selection | recommended | string |
| library_layout | Read layout (single/paired-end) | paired-end | required | string |
| library_prep_kit | Kit used to prepare sequencing library | NEBNext Ultra II RNA | recommended | string |
| library_prep_kit_lot | Lot number of library prep kit | RN2023-02354 | recommended | string |
| library_prep_protocol_version | Version or code for the library prep protocol | v2.0 | recommended | string |
| library_prep_date | Date of library preparation (ISO 8601) | 2024-05-09 | recommended | date |
| index_barcodes | Index/barcodes used for multiplexing, comma-separated | CGATCAGG,ACTGTGTA | recommended | string |
| sequencing_platform | Sequencing platform manufacturer/model | illumina | required | string |
| sequencer_model | Sequencer instrument model | novaseq | required | string |
| sequencing_date | Date sequencing run performed (ISO 8601) | 2024-05-12 | required | date |
| sample_count | Number of samples in the library | 10 | recommended | numeric |
| design_description | Additional details or rationale for experiment design | Comparative RNA-Seq of left/right eye samples over 3 timepoints | recommended | string |

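Each Library row points back to a BioSample through `biosample_id`, so a submission is only consistent if every referenced sample exists. Here is a minimal sketch of that check, assuming each table has been loaded as a list of dicts. The helper `find_orphans` and the identifiers `OMX00002` and `OMS00099` are made up for illustration.

```python
def find_orphans(children, parents, key):
    """Return the child rows whose `key` value matches no parent row.

    Hypothetical helper: `children` and `parents` are lists of dicts,
    for example Library rows and BioSample rows keyed on biosample_id.
    """
    known = {row[key] for row in parents if key in row}
    return [row for row in children if row.get(key) not in known]


biosamples = [{"biosample_id": "OMS00001"}]
libraries = [
    {"library_id": "OMX00001", "biosample_id": "OMS00001"},
    {"library_id": "OMX00002", "biosample_id": "OMS00099"},  # no such sample
]

for row in find_orphans(libraries, biosamples, key="biosample_id"):
    print("library without a matching biosample:", row["library_id"])
```

The same call works for the other shared keys in this schema, such as `individual_id` between BioSample and Individual.
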
Sample

| field_name | description | example | notes | data_format |
|---|---|---|---|---|
| omc_id | Unique identifier for the sequencing run | OMR00001 (jhu00001) | required | string |
| sample_id | Unique center identifier for the sequencing run | Molzym_Ultra_Deep_Air_Ctr_3 | required | string |
| sample_accession | Public database run accession (e.g. NCBI SRA) if deposited | SRR00001 | required | string |
| library_id | Associated experiment identifier linking to the experiment table | OMX00001 | required | string |
| sample_name | Short title or name for the run | Ocular_Run01_2024-05-12 | recommended | string |
| sample_description | Summary or purpose of the sequencing run | Non-human read run for sample OMS00001 | recommended | string |
| avg_spot_length | Average read length (specify units) | 150 bp | recommended | numeric |
| adapter_sequences | Adapter sequences used | AGATCGGAAGAGC | recommended | string |
| original_file_id | Original raw read file name or ID | OMS00001_R1.fastq | recommended | string |
| total_reads | Total number of read pairs in the original file | 45000000 | required | numeric |
| total_bases | Total number of bases in the original file | 4500000000 | required | numeric |
| unmapped_reads | Total number of reads not aligned to the reference(s) |  | required | numeric |
| original_read_file_size | File size of original read file (e.g. MB or GB) | 10 GB | recommended | string |
| original_read_file_checksum | Checksum (e.g. MD5 or SHA256) of original read file | d41d8cd98f00b204e9800998ecf8427e | recommended | string |
| non_human_read_file_id | Processed non-human read file name or ID | OMS00001_R1_nonhuman.fastq | optional | string |
| non_human_reads | Number of non-human read pairs | 4500000 | optional | numeric |
| non_human_read_file_size | File size of non-human read file | 1 GB | optional | string |
| non_human_read_file_checksum | Checksum of non-human read file | e339a515c5ed4f561237b1799335c30b | optional | string |
| analysis_date | Date the sample was analyzed (ISO 8601) | 2024-05-12 | required | date |
| reference_genome | Reference genome version(s) used for human read detection | hs38DH, chm13v2.0 | optional | string |
| microbial_database | Microbial database(s) used for classification | krakendb-2020-08-16-all_plus_eupath | optional | string |
| human_read_detection_software | Software (name and version) used for human read detection | minimap2 2.30-r1287 | optional | string |
| non_human_read_classification_software | Software (name and version) used for non-human read classification | KrakenUniq 1.0.2 | optional | string |
| base_quality_metrics | Sequencing base quality metrics | 90% Q30, Q20 | optional | string |
| data_processing_pipeline | URL or name of processing pipeline (include version/date) |  | optional | string |
| sequencing_control | Sequencing run control(s) used (e.g. spike-ins) | PhiX spike-in 1% | optional | string |
| release_status | Data release status | public | optional | string |
| release_date | Date of public or controlled release (ISO 8601) | 2024-06-30 | optional | date |
| sequencing_lane | Illumina sequencing lane | L001 | optional | string |

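The two checksum fields expect a hex digest of the read file, with MD5 or SHA256 named as options. One possible way to compute such a digest with Python's `hashlib` is sketched below. The function name `file_checksum` and the chunk size are illustrative choices, not requirements of the schema.

```python
import hashlib


def file_checksum(path, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of the file at `path`.

    Hypothetical helper. The file is read in chunks so that large FASTQ
    files do not need to fit in memory. `algorithm` is any name accepted
    by hashlib.new, e.g. "md5" or "sha256".
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, `file_checksum("OMS00001_R1.fastq")` would produce a value suitable for `original_read_file_checksum`. The example value in the table, `d41d8cd98f00b204e9800998ecf8427e`, is the MD5 digest of empty input, so a real file should yield a different string.
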
Analysis

| field_name | description | example | notes | data_format |
|---|---|---|---|---|
| sample_id | Unique identifier for the sequencing run | OMR00001 (jhu00001) |  | string |
| taxonomy_name | The scientific classification name of an organism | Cutibacterium acnes |  | string |
| abbreviation | The shortened form of taxonomy_name |  |  | string |
| taxonomy_id | A unique identifier that corresponds to a specific taxonomic classification within NCBI | 1747 |  | numeric |
| taxonomy_lvl | The rank or level of classification within the taxonomic hierarchy, such as kingdom, phylum, or species | S (species) |  | string |
| assigned_read | The number of sequencing reads assigned unambiguously to this specific taxonomic group by the non_human_read_classification_software | 10000 |  | numeric |
| added_reads | The number of additional reads assigned to this specific taxonomic group | 1000 |  | numeric |
| microbial_reads | assigned_read + added_reads for this specific taxonomic group | 11000 |  | numeric |
| total_microbial_read | Sum of microbial_reads over all species in this sample | 10000000 |  | numeric |
| fraction_microbial_reads | microbial_reads / total_microbial_read | 0.1 |  | numeric |
| avg_fraction_microbial_reads | Average fraction_microbial_reads assigned to this specific taxonomic group across multiple samples | 0.15 |  | numeric |

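The last few Analysis fields are derived from `assigned_read` and `added_reads`. This short sketch shows how they relate for one sample, using the field names as listed in the table. The function name `add_read_fractions` and the second taxon row are assumptions added for illustration, and the numbers are not real results.

```python
def add_read_fractions(rows):
    """Return copies of per-taxon rows for ONE sample with derived fields added.

    Hypothetical helper. Each input row needs assigned_read and added_reads.
    microbial_reads          = assigned_read + added_reads
    total_microbial_read     = sum of microbial_reads over all rows
    fraction_microbial_reads = microbial_reads / total_microbial_read
    """
    enriched = [
        dict(row, microbial_reads=row["assigned_read"] + row["added_reads"])
        for row in rows
    ]
    total = sum(row["microbial_reads"] for row in enriched)
    for row in enriched:
        row["total_microbial_read"] = total
        # Guard against an empty sample to avoid division by zero.
        row["fraction_microbial_reads"] = row["microbial_reads"] / total if total else 0.0
    return enriched


# Illustrative rows only; read counts are invented for the example.
sample_rows = [
    {"taxonomy_name": "Cutibacterium acnes", "taxonomy_id": 1747,
     "assigned_read": 10000, "added_reads": 1000},
    {"taxonomy_name": "Staphylococcus epidermidis", "taxonomy_id": 1282,
     "assigned_read": 8000, "added_reads": 1000},
]

for row in add_read_fractions(sample_rows):
    print(row["taxonomy_name"], row["microbial_reads"], row["fraction_microbial_reads"])
```

Averaging `fraction_microbial_reads` for one taxon over several samples would then give `avg_fraction_microbial_reads`.
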