Clayton Turner: An Introductory Guide: January 2014

Wednesday, January 29, 2014

Free Civ Building and Team Update

This post is coming in at an awkward time. It is currently snowing in Charleston, SC - a rare sight. So rare that the computing resources on campus for the computer science department decided to go out at the same time (not sure if it's related, since there have been emails about transformer issues). Regardless, it has been a frustrating couple of days: my personal research is MIA because of this transformer issue, it is impossible to check assignments on the CS sites, and I am having to rely on word-of-mouth from fellow students to figure out what I should be doing outside of the obvious.

So for this post I will be reflecting upon chapter 5 of Teaching Open Source as well as giving a short update on my team's work.

Teaching Open Source: Chapter 5
Again, this chapter starts from an early knowledge stage when it comes to open source software. For example, the chapter explains how a large codebase can be manually built, tested, and used to form an executable - an important concept. The step that I feel most people have issues with when building open source projects is that missing dependencies seemingly always arise. Knowing how to check for these dependencies and download them becomes the most important skill at that point, and the details change from language to language. For example, with C, you can check the makefiles for dependencies that may be missing, though they will not always list everything. The go-to solution should be checking the project's README or its online build documentation, as open source projects should have their installation instructions fleshed out. If you still can't build the software, you are not out of options. You can Google for solutions - if the project is popular enough, you may stumble across a publicly available mailing list where someone has asked the same, or a similar, question. And if your problem hasn't come up, then you can be the one to ask that question on a mailing list (or IRC) and potentially save someone else the same grief later on down the road, all while fixing your own problem.

So I decided to finally install freeciv (a free and open source game inspired by Sid Meier's Civilization 2). There were a large number of dependencies that had to be installed. The process of installing a dependency and then checking what else was needed was pretty tedious. I'm curious as to whether or not there would be a way to run a configuration file that automatically found missing dependencies - I suppose that would pose a problem considering all the different operating systems and how they name packages differently from time to time. Eventually, I was able to get the configure script to run to completion - at the end it even said "now type make" to finish the installation. I worked through this by hand since the freeciv information in the Teaching Open Source book is a bit out of date, and I used sudo apt-get commands where Teaching Open Source leans on the yum command.
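
As a rough idea of what an automatic check could look like, here is a minimal Python sketch that reports which command-line tools are missing from the PATH. The function name and the list of tools are my own assumptions for illustration, not freeciv's actual requirements, and it only looks for executables; it does not detect missing development libraries, which is most of what a real configure script checks for.

```python
import shutil


def missing_tools(required):
    """Return the names in `required` that cannot be found on the PATH."""
    return [name for name in required if shutil.which(name) is None]


if __name__ == "__main__":
    # Hypothetical list of build tools, chosen only for illustration.
    wanted = ["gcc", "make", "pkg-config"]
    missing = missing_tools(wanted)
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All listed tools were found.")
```

Mapping each missing tool to a package name would still have to be done per operating system, which is the naming problem mentioned above.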

Team Update
We have selected a bug, as noted in my last post. We are writing up an experience report on our wiki, located here. The experience report covers what Galaxy is, what Galaxy is used for, the history of Galaxy and its members, and the bug we have selected, and it will soon include a timeline of events for our group for the rest of the semester. We plan on fixing a few bugs and implementing a feature, as Jake and I have enough experience with Galaxy to add features with relative ease.

Music listened to while blogging: Childish Gambino

Monday, January 27, 2014

Subversion Under Control

There are a few things that will be tackled and reflected upon for this post: version control for my open source project, bug selection, project building, and Chapter 4 of Teaching Open Source.

VERSION CONTROL

First, version control. Instead of just using Subversion, like my last team project did, we will be handling both subversion and git, which leads me to assume that we can use whichever we prefer - for me this will easily be git. I had a blog post in the past where I compared and contrasted subversion and git, but I will touch on some of the highlights now as a refresher. A subversion reference can be found here and a git reference similar to the one I used when I learned git can be found here. Git and subversion have the same goal, version control, but they have different feels, and those differences really show in whichever GUI you use - granted, this is circumvented by just using the command line.

The terminology in git is a lot easier for me to grasp since it is the first version control system to which I became accustomed. In subversion, a "commit" is the final step in having your changes exposed to everyone else on your project. In git, a "commit" is the penultimate step, with a "push" being what exposes your changes to your partners. This allows commits to be sent out in batches or changes to be made at the last second. It really is not a big difference, just more of a preference.
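
The commit/push distinction can be sketched with a toy model in Python. This is only an illustration of the workflow difference; the class names are made up, and nothing here reflects how either tool is actually implemented.

```python
class SharedRepo:
    """Stands in for the repository that teammates read from."""

    def __init__(self):
        self.history = []


class SvnStyleClient:
    """Toy model: a commit is immediately visible to everyone."""

    def __init__(self, shared):
        self.shared = shared

    def commit(self, message):
        self.shared.history.append(message)


class GitStyleClient:
    """Toy model: commits stay local until they are pushed."""

    def __init__(self, shared):
        self.shared = shared
        self.local = []  # commits that have not been pushed yet

    def commit(self, message):
        self.local.append(message)

    def push(self):
        # Send the whole batch of local commits to the shared repository.
        self.shared.history.extend(self.local)
        self.local.clear()
```

In this model, two calls to `GitStyleClient.commit` leave `SharedRepo.history` untouched until `push` is called, while every `SvnStyleClient.commit` shows up in the shared history right away.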

BUG SELECTION
Most of the bugs in Galaxy are pretty abstract and have multiple people working on them. We, as a team, wanted to find a bug that was non-critical so that we would not end up working on a bug that gets fixed quickly by someone else because they have a head start on us or because it is critical to their personal work. So after much perusal, we stumbled upon this bug on Galaxy's Trello page. Essentially, we just need a way for comments to be allowed for this specific tool and this can even be abstracted for use with other tools in Galaxy. When conducting this bug fix we will also be contributing documentation for how to use our fix, as it will have a user input component as that is required for Galaxy.

PROJECT BUILDING
Everyone in my group has been able to build Galaxy, as their build documentation is top-notch. A simple install of Mercurial and a command within Mercurial builds Galaxy and grabs all dependencies. The only outside dependency that has to be installed is Python, and that is already bundled into a lot of Linux distributions by default, so it really is not even an outside dependency. Mercurial is its own version control system, but I think we will just be using subversion or git to manage our version control since we, as a group, are more familiar with those and we do not want to have to jump over hurdles just to get our files in order.

TEACHING OPEN SOURCE -> CHAPTER 4
The introduction is really a PSA as to why software developers should use version control. You do not want to have dozens of versions of a file - it's so hard to compare which one is the best one and make sure you didn't forget someone's revisions from an older version. Version control is great (and I'd suggest using a form of version control even for writing collaborative papers - Dropbox usually is good for this as conflicts can be settled easily). Additionally, with version control, you can go back and check previous versions, so you can essentially undo someone's edits if their ideas turn out to be just crummy.

The chapter then jumps into the basics of subversion, explaining how to make changes (e.g. svn add, svn delete, svn copy, svn move), conduct updates (i.e. svn update), commit changes (i.e. svn commit), etc.

Music listened to while blogging: Krizz Kaliko & Tech N9ne

Wednesday, January 22, 2014

Joining the Project

The bulk of this post will be reflecting on my experiences when joining the Galaxy project, and the rest will be reflections upon readings that will be elaborated upon when the time comes.

The first thing I did when joining the Galaxy project was sign up for two mailing lists: Galaxy-Dev and Galaxy-Commit. There are other mailing lists that they have, like Galaxy-User and Galaxy-France. Galaxy-User is for users to submit questions and problems that arise when using the Galaxy software (so a Stack Overflow-ish setup) and Galaxy-France is for French-speaking users only, it seems.

I've been receiving emails for a few days relating to problems people have had when developing with Galaxy (such as meshing issues that don't make sense) and emails relating to features that people are working on or proposing. The community seems to be extremely active, as I have a new email every 30 minutes or so from a user that is committing or asking about a change.

For example, there was a thread where a user was asking whether or not numpy and scipy were included in the Galaxy suite or whether they could be incorporated. Swiftly, a developer responded with a quick description of how to add modules that require dependencies such as numpy and scipy. There is a flag that can be flipped that pretty much says "If you want to use this tool, you need to have this installed - and it can be installed through Galaxy like this." A link here shows how this is marked up in XML.

Quick refresher: Each tool within Galaxy has an XML markup and a scripted backend (typically python). Having the dependencies in XML will force users to recognize that more needs to happen or at least just let the user know, if they desire, that there are other dependencies being utilized.
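
To make the idea concrete, here is a minimal Python sketch that reads the dependencies out of a tool's XML. The tool definition is a made-up example, and the `requirements`/`requirement` element names and the `type="package"` attribute are my assumption about how Galaxy marks this up; the linked documentation is the authority on the real format.

```python
import xml.etree.ElementTree as ET

# Hypothetical tool definition, written only for illustration.
TOOL_XML = """
<tool id="example_tool" name="Example Tool">
  <requirements>
    <requirement type="package">numpy</requirement>
    <requirement type="package">scipy</requirement>
  </requirements>
  <command>python example_tool.py $input $output</command>
</tool>
"""


def list_requirements(xml_text):
    """Return the dependency names declared in a tool's XML markup."""
    root = ET.fromstring(xml_text)
    return [req.text.strip() for req in root.findall("./requirements/requirement")]


if __name__ == "__main__":
    print(list_requirements(TOOL_XML))
```

Because the dependencies sit in the markup rather than being buried in the script, anything that reads the XML can see them without running the tool.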

Additionally, I am making an effort to join the Galaxy IRC. The Galaxy IRC is on the irc.freenode.net server and the channel name is #galaxyproject. After perusing different pieces of IRC software I ended up landing on xchat. I was trying to download a different IRC client, but I hit installation issues with two different clients, and it turned out that Mint came packaged with xchat already, so I just hopped onto it. It was trivial to connect to freenode and then join the #galaxyproject chatroom. I sat in the chat for about 15 minutes without seeing any activity, but there were 24 other people in the channel, so if I had a question, I'm sure there would have been a quick response.
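
For the curious, what a client like xchat does when joining a channel can be sketched as a handful of plain-text IRC protocol lines. The Python function below only builds those lines; it does not open a network connection, and the function name, nickname, and real name are hypothetical. A real client sends these over a TCP connection to the server and waits for the server's welcome reply before joining, which this sketch skips.

```python
def irc_join_lines(nick, channel, realname="Example User"):
    """Build the protocol lines a client sends to register and join a channel."""
    return [
        f"NICK {nick}\r\n",                   # choose a nickname
        f"USER {nick} 0 * :{realname}\r\n",   # register username and real name
        f"JOIN {channel}\r\n",                # enter the channel
    ]


if __name__ == "__main__":
    for line in irc_join_lines("guest123", "#galaxyproject"):
        print(repr(line))
```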

This was all done after perusing this irc how-to site and this irc tutorial site.

Furthermore, this response is reflecting upon Chapter 3 of Teaching Open Source. This goes perfectly hand-in-hand with what I am currently doing in joining Galaxy. The first things brought up are the issues of community and getting accustomed and comfortable with said community. Right away with Galaxy, I saw that one of their email lists is entirely for French speakers. This does mean that the primary language of Galaxy developers is English, but there is probably going to be a strong divide - say a French developer is working on what I am working on, then there is probably a pretty slim chance I will know before it's too late. So that is something I will have to tackle.

The book makes a mention of the idea of "synthetic third culture." Really this means that the culmination of people from different cultures starts to dismantle the individual cultural identity of the users and, thus, a synthetic culture starts to arise, especially as those from more and more different cultures join the community. This makes sense as individualistic culture at this point is not really relevant to the others and new humor, phrases, etc. will arise with collaborators that communicate at a high volume. It could almost be described as a postulate because of how close it seems to common sense.

And next, this is broken down into "Qualities of a Community" - this is really just a "what makes up this synthetic culture?" section. So the qualities are:
Focus - What is the actual interest of the community? Not the broad goal, but the main effort of the software (i.e. the main focus)
Maturity and History - This almost speaks for itself - newer projects have newer, younger communities that will have to undergo different experiences in regard to development speed and releasing where older projects have older communities that may have plateaued with a stable product release and are really just working on maintenance (though this still doesn't have to be the case)
Type of Openness - How open is their source? Is there core software kept under wraps with open peripherals that allow for interaction (almost api-like)? How old is the latest source? Is it even possible for anyone and everyone to contribute?
Commercial Ties - Sponsors are important to consider because their funding can steer the direction of a project. Sponsors aid by giving funds, technology, legal protection, people, etc.
Subgroups - Is the community one fluid being or is it broken down into several, key subgroups?
Skills - A community's skills depend on the skills of the individuals in a community - self-explanatory.
Mentoring and Training - Some communities will train newer members and get them to a point where they can catch up and start contributing meaningfully.

The chapter further discusses communication in various forms:
Synchronous - Live, concurrent communication. Includes instant messaging (and IRC) and forms of audio chat (like Mumble or Teamspeak).
Asynchronous - Non-simultaneous communication. Includes email, wikis, etc.

This segues perfectly into the section on Wikis. I have discussed my team's wiki in the past and it is being developed to explain to people what open source software we are working with, what we will be contributing, and what that really entails in the grand scheme of the open source project. The rest of the Teaching Open Source chapter focused on topics I have already broken down when considering my own project (IRC and mailing lists). In the following week I, and the rest of my group, will be checking out the bugtrackers within Galaxy - I bring this up because that is what the end of the third chapter refers to.

Music listened to while blogging: Childish Gambino

Monday, January 20, 2014

FOSS Experiences and Reflections

For this post, I'll be reflecting about my experience with the installation of a virtual machine, installing a piece of open source software, and my use of that software.

Initially, I downloaded Ubuntu 12 from the Ubuntu site and was going to plug it in to virtualbox. Unfortunately, my virtualbox application is corrupted and cannot uninstall correctly, but there is always vmware. So I used my Ubuntu .iso to start a new virtual machine in vmware. For some reason, vmware cannot detect the Ubuntu .iso and other people have had this issue with no resolution. Rather than sweat over a problem so minor, I just used an old Linux Mint image I still had residing on my computer.

The open source software I will be testing is RMH Homebase. So I downloaded the zip file containing all the information that I need to test RMH Homebase, or so I thought. Looking up more information in the README of RMH Homebase revealed that I would need to install PHP5 and MySQL, among other dependencies. So let's check out some commands:

% sudo apt-get install mysql-server mysql-client
% sudo apt-get install apache2
% sudo apt-get install php5
% sudo apt-get install phpmyadmin

MySQL handles the backend database. Apache provides the local web server instance. PHP5 runs the application's server-side code. phpMyAdmin is a web interface, itself written in PHP and served through Apache, for administering the MySQL databases.

Installing Apache lets you use localhost for local server instances. (http://localhost)

I had to do some navigation and tests to figure out where my localhost files went. /var/www/ is the location to drop files. I created a file "test.php" which loads perfectly fine and gives a blank webpage, which is way better to see than a "We can't find that resource" sort of page. I reloaded the Apache server a few times while doing this, so I am not certain whether or not a reload is required after putting new files in (as far as I can tell it should not be, since Apache reads files from this directory on each request).

So next I had to move the RMH Homebase (rmh15) files to this /var/www/ directory.

% sudo cp -R ~/Desktop/rmh15 /var/www/

This command will recursively copy all the files from the source to my localhost directory.
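
For comparison, the same recursive copy can be sketched in Python with the standard library. The helper name is my own, and unlike `cp -R`, `shutil.copytree` raises an error if the target directory already exists, so treat this as an illustration rather than a drop-in replacement.

```python
import shutil
from pathlib import Path


def copy_tree(source, destination_parent):
    """Copy `source` and everything under it into `destination_parent`."""
    source = Path(source)
    target = Path(destination_parent) / source.name
    # Walks the whole directory tree, like the -R flag on cp.
    shutil.copytree(source, target)
    return target
```

Calling `copy_tree` with the rmh15 folder and `/var/www/` would mirror the command above, though writing to `/var/www/` would still need root privileges.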

When trying to finish the marriage between phpMyAdmin and apache, I stumbled upon a past student's blog from the College of Charleston and it is from there that I found and used this command:

% sudo ln -s /usr/share/phpmyadmin /var/www/phpmyadmin
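
What this command does is create a symbolic link, so that /var/www/phpmyadmin points at the real phpMyAdmin files without copying them. A minimal Python sketch of the same idea follows, assuming a Linux system; the function name is hypothetical.

```python
import os


def link_into_docroot(real_dir, link_path):
    """Create a symbolic link at `link_path` that points to `real_dir`."""
    os.symlink(real_dir, link_path, target_is_directory=True)
    # Resolve the link to show where it actually leads.
    return os.path.realpath(link_path)
```

Apache then serves the linked directory as if it lived under /var/www/, which is why localhost/phpmyadmin starts working.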

Logging in to phpMyAdmin was tricky at first as the username is "root" (the MySQL administrator account) rather than the user you are logged in to the machine as, but it makes sense. It seems like I can browse everything with phpMyAdmin, viewing the databases manually or running a SQL query to view information.

I could log straight into the RMH database by navigating to localhost/rmh15/, as well. Following the readme, I was unable to run the dbInstall.php file as directed. The output received when running this is as follows:

Installing Tables...
connected...
database selected...

No database selecteddbWeeks added...
Could not create dbSchedules table: No database selected

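So I'm not sure what the problem is yet. "No database selected" is the error MySQL gives when a query runs without a default database, so the "database selected..." message may have been printed even though that step failed (for example, if the database the script expects does not exist yet). I will investigate this further soon.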

Lastly, I would like to close with remarks about the work on my team project. We have officially chosen Galaxy as our project and I have provided a snippet of our "Project Decision" page from our team wiki. The page uses Ohloh to detail the usage of different languages in open source projects, as well as some summarization points about the project-at-hand.

Music listened to while blogging: Lily Allen & Kanye West

Wednesday, January 15, 2014

My FOSS Preferences

This blog post is going to be a bit short because I just got off of my plane from Toronto and I'm a bit tired, but I need, for my group, to choose 3 open-source projects that I would like to participate in.

An issue with this is that I could not contact my group members effectively while I was in Canada (except Jake, since he was there). We, as a group, decided that we want to work on Galaxy and I have described that software system in the past. So I will just pick out two other projects that I would not mind working with while providing reasons.

If, for some reason, we were not able to work on Galaxy, then I would really like to work on Weka. Weka is an informatics platform for performing intensive data analysis, something I am accustomed to as I have, along with Jake's cooperation, developed my own version of this, Learn2Mine, and the link to it can be found on the right. It would be enlightening to work on a project and contribute to one that is very similar to mine and Weka is written in Java so I know I would not have issues working with it and producing novel code.

Lastly, I would want to work on Firefox. The reason for this is that Albert, a group member, worked with Firefox last semester and he could take the lead, at least in the beginning, if we used Firefox for our project. I know the rest of the group, including me, could come up to speed rather quickly. It would be rough as there exists a learning curve since C++ isn't my forte and building Firefox and working with it requires knowledge of that language.

I would like to close this blog post with remarks about software development from Software Development: An Open Source Approach by Allen Tucker, Ralph Morelli, and Chamindra de Silva. The opening to this book is much like any other software engineering book in that it focuses on why open source is important to the computer science community, a topic I reflected upon a couple of blog posts back. Additionally, however, this book talks about pitfalls and common issues with software development. For example, it stresses the importance of team programming versus individual programming. It is important for a team to be up to speed about every aspect of a project, even if you do have designated experts for certain areas. It is also important to note the differences between top-down and bottom-up development: every team member should know how the problem is being approached, because different skill sets and thought processes suit different approaches. Really, though, if the team uses the right process to conduct development, it should be fine. I have worked using Agile development in the past and it is my favorite way to undergo development as I have not had pitfalls with it as of yet. Other development forms are important to consider too: spiral, waterfall (the modified version), etc. In the end, Free and Open Source Software should embody the following qualities: 1) Low cost 2) Freely available 3) Global public good 4) IT public service 5) Political neutrality 6) Easily customizable 7) No vendor lock-in.

Tuesday, January 14, 2014

IC4E: The Toronto Experience

This post was written over the course of a few days.

I'll preface this by saying this is my first time being out of the country and it really shows.

I had to get up bright and early at 4:30 AM on Sunday morning in order to make a flight to get here to Toronto, CA (Canada, not California). This really resulted in me being dead tired and not doing too much on the first day of being here.

My second day in Toronto, we went to the Royal Ontario Museum and walked around the city for a while to become acquainted with it. In order to get there, we traveled by subway (my first time doing that). It's really interesting to see how different everything is when I'm not loafing around in just South Carolina. Advertisements for bottles of "pop" list volumes in milliliters instead of fluid ounces, for example, and signs give the speed limit as "100 km/hr". It's perplexing to me. I know what a meter is and, thus, know what a kilometer is. But I really have no idea how 100 km/hr translates over to miles/hr. I could do the math relatively easily (about 62 mph, apparently), but in the moment, it's just... different.

I'm consistently working on my presentation for the conference (I give a talk tomorrow afternoon) and, while I believe I am ready content-wise, I am a bit afraid I might run into some issues with timing. We're given 10-15 minutes for our presentation and a 3-5 minute Q&A. I have practiced a few times, really just hammering home the content and saying what I want to say, but I keep breaking 15 minutes. I will have this worked out before I present, but it is a little nerve-racking.
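
Going back to the speed limit sign for a second, the conversion really is a one-liner. A quick Python sketch, using the definition of a mile as exactly 1.609344 kilometers (the function name is just for illustration):

```python
KM_PER_MILE = 1.609344  # exact, by definition of the international mile


def kmh_to_mph(kmh):
    """Convert a speed in kilometers per hour to miles per hour."""
    return kmh / KM_PER_MILE


print(round(kmh_to_mph(100), 1))  # 62.1
```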

So now let's talk about the actual bulk of the conference. A lot of the presentations were really not that interesting to me since this conference is actually a combination of three conferences, with the other two focusing on economics/marketing and business. Those presentations really did not appeal to me, but I still sat through them and learned some things. Some of the moderators for the conference sessions were not assertive enough when it came to time. We were supposed to have 10-15 minute presentations and I listened to quite a few 30-40 minute presentations.

When it was time for me to present, I was a bit nervous, but as with any presentation, that nervousness dissipated within seconds of starting to talk. I felt that my presentation was going to work wonders because Jake, Dr. Anderson, and I worked really diligently to get it to a really nice spot. Most presentations we watched were text-heavy, with entire paragraphs and abstracts on slides, and a lot of the presenters just read straight off of their PowerPoints. Our presentation had three or fewer bullet points per slide, and those were only topical; I really just used the images to make the points I wanted to make. I believe slides like that should be supplemental to the actual "talking" that the presenter is doing.

When I was first called up to present, I was introduced as "Professor Turner", which I found funny because, well, I'm just an undergraduate student. What makes this even better, however, is the fact that we were awarded "Best Paper" of the session we participated in directly after my presentation (Facebook Picture With Certificate). Overall, this experience was enlightening and a fantastic adventure. There are still presentations I will be attending throughout the day, but this is the last day of the conference. Our flight leaves at 7PM tomorrow so I will get some time to explore Canada even more.

Friday, January 10, 2014

Open Source Technologies

After a brief hiatus from my blog during the holidays, I have finally returned to regularly posting.

One of the main reasons for this frequent posting is for my Software Engineering Practicum (CSCI 462) class at the College of Charleston, but I will also be using this blog to update on my own personal research (my next post will be reflecting upon this for a bit as I am attending a conference).

The project that I completed during my last semester for software engineering is geared to be presented at the School of Sciences and Mathematics poster session in the upcoming months, but, for the time being, I have a new group to work with for the new semester.

My team:

Me
Jake Dierksheide
Jacob Song
Albert Nardonne

We talked and have decided that we would all like to work on the Galaxy project (just like my team from last semester did). I believe this semester will go over a lot easier because Jake and I have a lot of experience working with Galaxy as our own research utilizes a local instance of Galaxy that we customize and expose to the Web. Since we, as a group, want to work on Galaxy we came up with the team name "Team Rocket" - this name should not necessitate much explanation but we were all happy with it.

A few readings that I have done for Software Engineering encompass a few relatively simple topics. For example, before class the other day I had no notion of what a "planet" was when it came to software, but it is really nothing more than an aggregator, written in Python, for blog feeds. Also, I read the first two chapters of an open source textbook for teaching open source projects, explaining the significance of open source, etc., and the book can be located here.

Some examples of things I have gathered from this book so far: contributing to FOSS (free and open source software) projects is a great way to gain experience as a developer (even if just contributing documentation), add to a resume, and become associated with an open source community, which is huge in this day and age. Also, version control is really at the heart of FOSS because it is what allows multiple developers to work on the same parts of projects without running into conflicts that waste time and bog down the development process. Ultimately, these introductory chapters are talking about the application of software development techniques that I picked up throughout last semester (and before that in my own work), such as the notion that no code is bug-free. Additionally, the software development lifecycle is an important concept to keep in mind - code has to evolve or else be a slave to devolution that results in a crummy piece of software that could only, at best, become a piece of clunky legacy software.

My next blog will be coming to Blogger from Toronto, Canada as Jake Dierksheide and I (under the guidance of Dr. Paul Anderson) are presenting at the 2014 5th International Conference on E-Education, E-Business, E-Management, and E-Learning (IC4E 2014) and details about the conference can be located here. I will be presenting on Tuesday during the afternoon session.

Music listened to while blogging: Spotify radio (based off of Hopsin)
