ONETOOL supports several types of data in a range of file formats. A given run, however, requires only the input file(s) relevant to the analyses specified for that run.
Input Data Types
Although ONETOOL is designed mainly for family-based big data analyses, it can also analyze data from unrelated individuals.
Types of data supported:
Sample
- Family samples (related individuals)
- Independent samples (unrelated individuals)
Phenotype
- Binary
- Continuous
Variant
- genotype - common and rare SNPs
- genotype probability/dosage - imputed variants
Note
The terms ‘sample’, ‘subject’ and ‘individual’ are used interchangeably.
Note
The terms ‘variant’, ‘SNP’ and ‘genotype’ are used interchangeably.
Note
The terms ‘trait’ and ‘phenotype’ are used interchangeably.
Input File Types
The input file types that can be used for a ONETOOL run are listed below by data type, together with the expected file extension.
Data Type | File Type | Extension |
---|---|---|
sample | PLINK FAM file | .fam |
phenotype | PLINK Phenotype file | .pheno |
variant | Variant Call Format (VCF) file | .vcf |
variant | PLINK BED/BIM file | .bed/.bim |
variant (dosage) | IMPUTE2 file | .impute2/.impute2_info |
sample + variant | PLINK PED/MAP file | .ped/.map |
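As a quick illustration of the table above, the extension-to-data-type mapping can be written as a small lookup. This is only a sketch for organizing input files before a run; the names `EXTENSION_TO_DATA_TYPE` and `data_type_for` are hypothetical helpers and are not part of ONETOOL.

```python
from pathlib import Path
from typing import Optional

# Extension -> data type, copied from the table above.
# This dictionary is an illustrative helper, not an ONETOOL API.
EXTENSION_TO_DATA_TYPE = {
    ".fam": "sample",
    ".pheno": "phenotype",
    ".vcf": "variant",
    ".bed": "variant",
    ".bim": "variant",
    ".impute2": "variant (dosage)",
    ".impute2_info": "variant (dosage)",
    ".ped": "sample + variant",
    ".map": "sample + variant",
}


def data_type_for(filename: str) -> Optional[str]:
    """Return the data type implied by the file extension, or None if unknown."""
    return EXTENSION_TO_DATA_TYPE.get(Path(filename).suffix.lower())
```

For example, `data_type_for("study.fam")` gives `"sample"`, while a file with an extension that is not in the table gives `None`.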
Additional files for a specific analysis:
The two main input file sets are the ‘VCF set’ and the ‘PLINK set’.
The VCF set consists of a PLINK format family file (.fam) and a Variant Call Format file (.vcf).
The PLINK set consists of three files (i.e., .fam, .bed, and .bim) that are used to run PLINK.
Additional phenotypes and covariates can be supplied through an optional input file (.pheno) with either set of input files.
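Before launching a run, it can help to confirm that a complete input set is present. The sketch below checks which of the two sets described above is available. It assumes that all files of a set share a common path prefix (for example `study.fam`, `study.bed`, `study.bim`); that naming convention and the helper names `REQUIRED_EXTENSIONS`, `available_input_sets`, and `has_pheno_file` are assumptions made for illustration, not ONETOOL behavior.

```python
from pathlib import Path

# Required extensions per input set, as described in the text above.
# Assumption: every file in a set shares the same path prefix.
REQUIRED_EXTENSIONS = {
    "VCF set": (".fam", ".vcf"),
    "PLINK set": (".fam", ".bed", ".bim"),
}


def available_input_sets(prefix: str) -> list:
    """Return the names of the input sets whose files all exist for this prefix."""
    found = []
    for set_name, extensions in REQUIRED_EXTENSIONS.items():
        if all(Path(prefix + ext).is_file() for ext in extensions):
            found.append(set_name)
    return found


def has_pheno_file(prefix: str) -> bool:
    """Report whether the optional .pheno file is present for this prefix."""
    return Path(prefix + ".pheno").is_file()
```

An empty list from `available_input_sets` means that neither set is complete for the given prefix, so at least one required file is missing.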
SCRIPT file
ONETOOL can be run in two ways: with options given directly on the command line, or through a script file (.script).
A script file includes the input file name(s) and all command-line options selected for a ONETOOL run.
$ onetool --script test.txt