For this post I will be reflecting upon my team's progress thus far and chapter 9 of Teaching Open Source.
Much to my surprise, chapter 9 of Teaching Open Source is merely a one-page chapter about how the oft-espoused "Release Early, Release Often" motto in FOSS should be applied to more than just the software in FOSS projects. The textbook itself is released early and released often, as it is an experimental, open source textbook. The book even has its own mailing list, like the one I've mentioned for Galaxy many times in the past.
Now let's get into my team's work on Galaxy. Last time, I left off with a code segment that I thought could be made more efficient than the Python map(...) function I originally used to create the Transpose tool for Galaxy. Jake and Jacob, of Team Rocket, took the lead after I set up the framework (with the functional testing, the map function, and the XML markup). The code below is what they came up with.
The logic behind this is pretty simple. We want anyone who uses Galaxy to be able to transpose tab-delimited data (if the data is not tab-delimited, Galaxy provides conversion tools). A common problem is that someone is working with genetic data that is far too big to load into memory all at once. We work around this by using for-each loops. A Python file object hands out its lines one at a time, so once Python has made a pass through the loop body, where the for-each here is "for each line in the input file", it no longer needs to access previous lines. We still need to confirm this with the documentation, but since the previous lines are not stored in any variable, Python's memory management (reference counting in CPython) should free each line as soon as the loop variable moves on to the next one.
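To make the line-by-line idea concrete, here is a minimal sketch of one way a transpose could be streamed. This is my own illustration and not the code Jake and Jacob wrote: the function name `transpose_streaming`, its parameters, and the one-pass-per-column strategy are all assumptions. It never holds more than one input line in memory, at the cost of re-reading the input file once for every column.

```python
def transpose_streaming(in_path, out_path, delimiter="\t"):
    """Transpose a delimited text file without loading it all into memory.

    Hypothetical helper for illustration only; not the Galaxy tool's code.
    """
    # First pass: find the widest row so we know how many output rows to write.
    n_cols = 0
    with open(in_path) as src:
        for line in src:
            n_cols = max(n_cols, len(line.rstrip("\n").split(delimiter)))

    with open(out_path, "w") as dst:
        # One pass over the input per column: column `col` of the input
        # becomes row `col` of the output.
        for col in range(n_cols):
            with open(in_path) as src:
                first = True
                for line in src:
                    fields = line.rstrip("\n").split(delimiter)
                    # Short (ragged) rows are padded with an empty field.
                    value = fields[col] if col < len(fields) else ""
                    if not first:
                        dst.write(delimiter)
                    dst.write(value)
                    first = False
            dst.write("\n")
```

Calling `transpose_streaming("input.tsv", "output.tsv")` (file names made up here) would write the transposed table to the second file. The trade-off is time for memory: a file with many columns gets read many times, so something like this only makes sense when the data really does not fit in memory.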
On my end, I have just been keeping our team's fork of the galaxy-central master branch in sync, so that when we want to push some changes we will be able to without any hassle.
Music listened to while blogging: Schoolboy Q