Galaxy | Published Page | Create GTrack file from unstructured tabular data

Create GTrack file from unstructured tabular data

This tool structures unformatted tabular data: the user specifies the necessary metadata through simple selection boxes, and the tool infers further properties of the data where possible.

To do this, the user selects the column names for the table. This enables the GTrack header expander to expand the headers automatically, effectively converting the file to a GTrack file.

In this example, the raw Human Papillomavirus (HPV) data used to create the GTrack file was generated and published by Kraus et al. (2008). The dataset is available at: http://hyperbrowser.uio.no/test/static/hyperbrowser/files/tutorial/HPV_sites.xls.

The document consists of four columns: Sample ID, Chromosome ID, Strand ("+" or "-") and Coordinates.

Part 1 - Create GTrack file

In the "HyperBrowser track processing" tool menu, select "Format and convert track"
Select "Create GTrack file from unstructured tabular data"
Copy the raw dataset from the Excel document and paste it into the "Type or paste in tabular file:" box
Select "Tab" as "Character to use to split lines into columns"
Select "1" as "Number of lines to skip (from front)"
- This is done to remove the header row from the tabular file
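
The split and skip settings above correspond roughly to the following minimal Python sketch. It is not the tool's implementation; the function name split_tabular and the sample rows are hypothetical and only mirror the column layout described earlier.

```python
def split_tabular(text, delimiter="\t", skip_lines=1):
    """Split pasted text into rows of columns, dropping the first lines."""
    rows = []
    for line in text.splitlines()[skip_lines:]:
        if not line.strip():
            continue  # ignore blank lines, e.g. a trailing newline after pasting
        rows.append(line.split(delimiter))
    return rows


# Hypothetical rows in the four-column layout (not taken from HPV_sites.xls).
pasted = (
    "Sample ID\tChromosome ID\tStrand\tCoordinates\n"
    "S1\tchr1\t+\t1000\n"
    "S2\tchr8\t-\t2500\n"
)
rows = split_tabular(pasted)
print(rows)
```

With skip_lines=1 the header row is dropped, so only the data rows remain for the column naming that follows.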

To specify the column names in the table, you first have to:

Select "Select individual columns" as "Column selection method"
Select "--custom--" as "Select the name for column #1"
In the "Type in a custom name for column #1" box, write "sample"
- This column is not a reserved GTrack column, but rather a custom column that can contain any text
Select "seqid" as "Select the name for column #2"
- This column is the sequence ID, in this case the chromosome IDs
Select "strand" as "Select the name for column #3"
Select "start" as "Select the name for column #4"
Notice that the field "Current track type" is updated dynamically according to the columns that have been selected, using the mapping defined in the GTrack specification document (see the tool "GTrack specification" under "GTrack tools")
Choose "Yes" for "Select a specific genome?"
Select "Human Mar. 2006 (hg18/NCBI36)" as "Genome build"
Select "1-indexed, end inclusive" as "Indexing standard used for start and end coordinates"
- This is usually the case for data copied from unspecified tabular data
Select "Yes, auto-correct to the best match in the genome build"
Press the "Execute" button
Click the eye icon to see the result data
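
Two of the steps above, the dynamically updated "Current track type" and the choice of indexing standard, can be illustrated with a short Python sketch. It rests on assumptions: the table covers only the basic, non-linked track types as we understand them from the GTrack specification (which remains the authoritative source), and the function names infer_track_type and to_zero_based_half_open are hypothetical. The coordinate helper converts to the 0-indexed, end-exclusive convention used by formats such as BED; it is not a description of what the tool does internally.

```python
# Basic GTrack track types keyed by which core columns are present:
# (has_start, has_end, has_value). Linked types, which use an "edges"
# column, are left out of this sketch.
TRACK_TYPES = {
    (True, False, False): "points",
    (True, True, False): "segments",
    (False, True, False): "genome partition",
    (True, False, True): "valued points",
    (True, True, True): "valued segments",
    (False, True, True): "step function",
    (False, False, True): "function",
}


def infer_track_type(column_names):
    """Return the basic track type implied by the selected column names."""
    cols = set(column_names)
    key = ("start" in cols, "end" in cols, "value" in cols)
    return TRACK_TYPES.get(key)  # None if no core column is present


def to_zero_based_half_open(start, end=None):
    """Convert 1-indexed, end-inclusive coordinates to 0-indexed, end-exclusive.

    The start shifts down by one. The end keeps its numeric value, because
    dropping inclusiveness (+1) cancels the index shift (-1).
    """
    return (start - 1, end)


print(infer_track_type(["sample", "seqid", "strand", "start"]))
print(to_zero_based_half_open(1000))
```

For the columns chosen in this tutorial (sample, seqid, strand, start), only "start" is a core column, so the sketch reports a points track.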

Part 2 - Do the analysis

To see where in the human genome the HPV sites are located, we will run a HyperBrowser analysis as follows:

Expand the "History" element by clicking its name, then click the "perform HyperBrowser analysis" button
Select "Human Mar. 2006 (hg18/NCBI36)" as "Genome build", if not already selected
In the "First track" box, select "From history" and then the newly created GTrack file
In the "Second track", select "Genes and gene subsets" as the first level, then "Flanks" as the second level and finally "refseq exons upstream 1kb"
In the "Analysis" box, select "Hypothesis testing" as "Category" and the question "Located inside?"
Click the "Start analysis" button
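
The "Located inside?" question asks, in essence, whether the points of the first track fall inside the segments of the second track more often than expected. The Python sketch below computes only the observed count on a single chromosome; it does not reproduce HyperBrowser's test statistic or null model. The function name count_points_inside and all coordinates are hypothetical, and segments are assumed to be 0-indexed, end-exclusive, sorted and non-overlapping.

```python
from bisect import bisect_right


def count_points_inside(points, segments):
    """Count points that fall inside any segment on one chromosome.

    Assumes segments are (start, end) pairs, 0-indexed and end-exclusive,
    sorted by start and non-overlapping.
    """
    starts = [start for start, _ in segments]
    inside = 0
    for p in points:
        i = bisect_right(starts, p) - 1  # last segment starting at or before p
        if i >= 0 and p < segments[i][1]:
            inside += 1
    return inside


# Made-up coordinates for illustration only.
segments = [(100, 200), (500, 650), (900, 1000)]
points = [150, 200, 499, 500, 649, 950, 1200]
observed = count_points_inside(points, segments)
print(observed)
```

A hypothesis test would then compare this observed count with the counts expected under a chosen null model, which is the part the HyperBrowser analysis handles.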

You may import the history by clicking the "import history" button below. You will see an overview of the files and parameter settings in the tools.

Galaxy History | Create GTrack file from unstructured tabular data

References

Kraus, I., Driesch, C., Vinokurova, S., Hovig, E., Schneider, A., von Knebel Doeberitz, M., & Durst, M. (2008). The Majority of Viral-Cellular Fusion Transcripts in Cervical Carcinomas Cotranscribe Cellular Sequences of Known or Predicted Genes. Cancer Research, 68(7), 2514–2522. doi:10.1158/0008-5472.CAN-07-2776

About this Page

Author

hb-superuser
