For a quick summary, here is the link to the pull request: https://bitbucket.org/galaxy/galaxy-central/pull-request/335/added-transpose-tool/diff
So let's go over the files mentioned in the diff of the pull request and see what is actually going on:
tools/filters/transpose.py
Now the file itself carries a fair amount of boilerplate. This is there to follow Galaxy's conventions for error handling and function calling. They prefer that you create a main method and call it through an "if __name__" check.
The core of the code is what actually transposes the data. The files are assumed to be tab-delimited (we could even fix this "bug" later by improving the tool). The user need only input a file through Galaxy's interface and this tool can be run on the data.
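To make that structure concrete, here is a minimal sketch of what such a script could look like. It is not the code from the pull request: the function name transpose_rows, the argument order, and the padding of short rows are my own assumptions for illustration.

```python
import sys


def transpose_rows(rows):
    """Turn rows into columns; short rows are padded with empty strings."""
    width = max((len(r) for r in rows), default=0)
    padded = [r + [""] * (width - len(r)) for r in rows]
    return [list(col) for col in zip(*padded)]


def main(argv):
    # Assumed argument order: input path first, then output path.
    if len(argv) != 3:
        sys.stderr.write("usage: transpose.py INPUT OUTPUT\n")
        return 1
    in_path, out_path = argv[1], argv[2]
    # Read the tab-delimited input, one list of fields per line.
    with open(in_path) as src:
        rows = [line.rstrip("\r\n").split("\t") for line in src]
    # Write the transposed table back out, still tab-delimited.
    with open(out_path, "w") as dst:
        for row in transpose_rows(rows):
            dst.write("\t".join(row) + "\n")
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

Keeping the transposing in its own function, separate from main, means it can be checked without touching any files.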
Now, you may be wondering "What if someone does not use tab-delimited data?" or "How do people know that the data is supposed to be tab-delimited?"
This is all answered in the XML file:
tools/filters/transpose.xml
Now the entirety of the XML file is important for feature addition in Galaxy.
Line 1 of this file specifies the tool id (just a unique identifier - does not get referenced anywhere else), the name of the tool (a name you want users to recognize the tool by in Galaxy's interface), and a version for the tool (since it is new, 1.0.0). The tag opened on this line is closed on the last line of the file by its matching closing tag.
Line 2 of the XML calls for a description of the tool. This is appended (with a preceding space) to the tool name in Galaxy's interface to give an extremely brief description of what the tool does and from where. So here we just say you can transpose data from a file (as opposed to the inputting of data, manually).
The next few lines (Galaxy's specifications do not require any particular length) give the command interpreter and the actual command line call Galaxy will be making. Galaxy supports all kinds of interpreters for scripting (Perl and Python are the only ones that come to mind). So here, since we are using Python, we use "python" as the interpreter argument and then enclose our command. The first argument of the command is the Python file itself. After this we have identifiers (signified by $) that point to parameters specified later in the XML markup - input and output, which just reference files.
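If the $ identifiers seem abstract, this small sketch shows the idea of filling them in. Galaxy does this substitution with its own templating; I am only using string.Template from Python's standard library as a stand-in, and the command text and file paths are made up for illustration.

```python
from string import Template

# Hypothetical command template, modelled on the description above.
command = Template("transpose.py $input $output")

# Hypothetical paths standing in for the dataset files Galaxy would supply.
line = command.substitute(input="/tmp/dataset_1.dat", output="/tmp/dataset_2.dat")

# The interpreter named in the XML is placed in front of the command.
print("python " + line)
```

The point is just that the names after the $ signs have to match the names given to the input and output in the rest of the XML.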
So let's talk about those files since they are in the next 2 sections (inputs and outputs). We have one input. The arguments utilized here are "format", "name", "type", and "label". The name identifier is what references back to the command line call. Format is an optional argument, used with type="data", that keeps users from selecting datasets of an inappropriate format. So the tool, as it stands, only works on tabular (or tab-delimited) data. Lastly, there is a label argument. In the GUI representation of the tool, the label is displayed just before the field for the argument.
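Putting the tool line, description, command, inputs, and outputs together, the XML has roughly the shape below. This is a sketch and not the file from the pull request: the id, name, description text, label, and the attributes on the output are my assumptions. I have wrapped it in a few lines of Python that parse it with the standard library, so you can see which attribute is which.

```python
import xml.etree.ElementTree as ET

# Hypothetical tool definition; the id, name, label and description text
# are illustrative and not copied from the pull request.
TOOL_XML = """
<tool id="transpose" name="Transpose" version="1.0.0">
  <description>data from a file</description>
  <command interpreter="python">transpose.py $input $output</command>
  <inputs>
    <param format="tabular" name="input" type="data" label="File to transpose"/>
  </inputs>
  <outputs>
    <data format="tabular" name="output"/>
  </outputs>
</tool>
"""

tool = ET.fromstring(TOOL_XML.strip())
param = tool.find("inputs/param")

# Attributes from the first line of the XML.
print(tool.get("id"), tool.get("name"), tool.get("version"))
# Text that gets appended to the tool name in the interface.
print(tool.findtext("description"))
# Interpreter and the command it runs.
print(tool.find("command").get("interpreter"), tool.findtext("command"))
# The single input: its name ties back to $input in the command.
print(param.get("name"), param.get("format"), param.get("label"))
```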
Lastly, Galaxy has a help markup for their XML files. The first specification within this help section is a reference to a tool within Galaxy that can convert data to being tab-delimited. Essentially, this generalizes the tool by allowing any form of delimited data to be used as data can be converted to tab-delimited and then converted back. While tedious, users can create workflows that will conduct this task for them, if so desired. Next in the help section, there is an example. Just in case a user is unsure of what transposing actually does to their data, there is a simple markup that shows a before/after transposition on a small piece of data.
tool_conf.xml.main

test-data/transpose_in1.tabular
Now here is where we provide information about those test tags in the XML that you may have noticed that I skipped talking about earlier. Galaxy has a built-in function that mines the XML files for running functional and unit tests - effective for making sure crazy bugs are not
test-data/transpose_out1.tabular

And that's the story of my second pull request to Galaxy.
Music listened to while blogging: Ellie Goulding