Wednesday, September 25, 2013

General Work and Reflection

This post will be unlike a lot of my past posts. I will be reflecting on work from most of my classes, rather than just Software Engineering, because nothing major has happened on our Galaxy project since my previous blog post.

For the Galaxy project, we presented the findings and results of our first deliverable to the class. This included an overview of the project, how we ran the built-in tests, and our experience. Galaxy ships with Python files that can be run to conduct tests. One example is "run_functional_tests.py", which runs all of the functional tests built into Galaxy (a run that takes hours). Optionally, you can specify a parameter that runs only a specific subset of the functional tests. We will definitely be doing that whenever we go through our test cases, because waiting six hours every time we want to run the testing suite would be too much of a hassle.
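For our own future reference, the subset invocation is something along these lines (from memory, so the flag name, and whether it goes through the wrapper script or the Python file directly, may be off; my_tool_id is just a placeholder):

sh run_functional_tests.sh -id my_tool_id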

In some of my other classes, we have been working on various projects. In Bioinformatics, I am currently working on a partnered assignment where we have to find the longest common subsequence of two strands of DNA (a generic sketch of the standard approach appears at the end of this post). It's odd because our Python code works on small versions of the problem and on the examples given, but it does not produce the correct answer when we submit it against a larger problem. It's very unlikely that the site we are submitting to has errors, but I'm starting to question it at this point. In Advanced Algorithms, we just got back tests on the computational complexity and applied analysis of various algorithms; we looked at topics such as searching and sorting algorithms, big O, big Θ, and big Ω, recurrence relations, and more. In Programming Language Concepts, we have been studying regular expressions, grammars, compilers (every part: Scanner -> Parser -> ...), C, and pointers. In Public Speaking, I am currently preparing a speech whose objective is to inform the audience about the marvels of AI and how it will affect everyone in the future. Aside from that, work on my Bachelor's Essay has not gone into full swing yet, as we are still in a heavy development phase for Learn2Mine. We are in the process of beefing up the site's security, as there is an exploit we found that we are vigorously working to patch.
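For reference, that textbook dynamic-programming approach looks roughly like this (a generic sketch, not our actual submission):

def longest_common_subsequence(a, b):
    # table[i][j] holds the LCS length of a[:i] and b[:j]
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # walk back through the table to recover one actual subsequence
    i, j, pieces = len(a), len(b), []
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            pieces.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(pieces))

print(longest_common_subsequence("GATTACA", "GTCAGA"))  # one valid answer of length 4, e.g. "GTCA"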

Monday, September 23, 2013

First Deliverable Experience (Galaxy)

Working on Galaxy always turns out to be an interesting experience. Getting it installed on my virtual machine was pretty seamless once I figured out which dependencies were needed. Python was pre-installed with Linux Mint 15 so I did not need to worry about getting a compatible version of Python.

Unfortunately, my team and I do not have access to our team SVN repository yet, so we have been working in separate areas, still checking out files through SVN. I used my personal directory mentioned in an earlier blog post (./playground/Turner/galaxy/galaxy-dist/...)*.

Mercurial is needed to initialize the Galaxy files. It can be installed very simply using the command:

sudo apt-get install mercurial

Mercurial is invoked using the "hg" command (Hg is the chemical symbol for mercury - a nifty little easter egg).

Once Mercurial is installed, you can run Galaxy. Mercurial is not needed to extract (or build) the source code, but it is needed to actually run the program. To start Galaxy, you navigate inside the /galaxy-dist/ folder and use the command:

sh run.sh

This runs a bash script on the local machine (hence the "sh" for shell): the /galaxy-dist/ folder contains a file named "run.sh" whose commands start the Galaxy server.

This starts Galaxy, and you can then reach it by typing "localhost:8080" into a browser. Perhaps you have something else using port 8080, though. To change the port on which Galaxy runs, edit the "universe_wsgi.ini" file in the /galaxy-dist/ folder. There are lines in this file that read as follows:

# The port on which to listen.
port = 8080

If you change this to 8081, for example, then you would navigate to "localhost:8081" to find the Galaxy interface. In order for these changes to take effect, you have to stop the running Galaxy instance (stopping the sh run.sh command). This can be done with a keyboard interrupt (typically ctrl+c) in the terminal that started Galaxy. If, for some reason, Galaxy was started with & (which runs it in the background, so a keyboard interrupt is not possible), then you have to find the process number in order to stop Galaxy.

This can be done through the following process:

top

Using the "top" command allows you to view all the processes currently running on your machine. Galaxy is instated through a Python process. Finding the process number for a large Python process is the key here. Once found, make note of the number. Then the following command can be run to stop Galaxy:

sudo kill -9 $processnumber

where $processnumber is the process number of the Python process. kill -9 is typically frowned upon because it gives the process no chance to clean up after itself. My simple solution is to not run Galaxy in the background: just keep it somewhere you can easily stop it. An alternative is to issue the restart command to Galaxy by navigating to the /galaxy-dist/ directory and running:

sh run.sh --reload

This allows Galaxy to reload (restart).
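One more note on the kill route: rather than scanning top by hand, pgrep can pull the process number directly (a sketch; the paster.py pattern is my assumption about what run.sh launches under the hood, so adjust it if your process list shows something different):

pgrep -fl python
pgrep -f paster.py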

All things considered, I think my team is really starting to understand Galaxy and I believe this testing project is going to go super-smoothly.

Music listened to while blogging: alt-j

*Update: We received access to our team repository as I was writing this blog so this will be updated accordingly.
The repository is now located at: https://svn.cs.cofc.edu/repos/CSCI362201302/team3/
Really, the only difference is that instead of using /playground/Turner/ as the base of operations, we will be using /team3/.

Wednesday, September 18, 2013

FOSS Project Decision: Galaxy

As I mentioned in my previous post, I am a part of a team for work on a project. This project is centered around a single FOSS Project. We have selected Galaxy as the project we are going to use.

Galaxy is probably one of the best choices for this project for a multitude of reasons. For starters, I am not completely unfamiliar with Galaxy, as I know some of its inner workings. Most of the implementation in Galaxy is done in Python and XML. A local Galaxy instance makes calls to the machine's terminal (or command line) through "tools", a term Galaxy has coined.

These tools are merely abstractions of Python files.

The XML is used to set up the abstracted user interface that Galaxy produces. All of the inputs, text fields, dropdown boxes, checkboxes, etc. are specified through the XML. The XML file also references the Python file that will be run, and it lists the command-line arguments. For example, if you want an input to follow the Python file name, you put that input right next to the file name (space-separated, as command-line arguments always are). Below, I will detail an example of a small XML file:

<command interpreter="python">example.py $input1 $input3 -z "$results" $putin</command>
<inputs>
      <param name="input1" size ="4" value="1054" type="integer" label="Number Input"/>
      <param name="input3" type="integer" label="A Second Number Input"/>
      <param name="putin" type="select" display="radio" label="Select One">
            <option value="option1">This One</option>
            <option value="2option">Or This One</option>
      </param>
</inputs>
<outputs>
      <data format="csv" name="results" from_work_dir="results.csv" label="CSV Results"/>
</outputs>

<tests>
      <test>
      </test>
</tests>
<help>
</help>

So, as you can see, this is a very flexible system. As long as your Python file is the first argument in the command interpreter, you are fine. I used different input names following the call (the names do not matter, but usually a convention is adopted for readability). The parameters are pretty self-explanatory: two integer inputs and a radio-button selection. The values produced are indexed by their command-line position, so getting the first integer is as simple as reading sys.argv[1] (the Python file name itself sits at [0]). The radio-button value comes through as either "option1" or "2option" in this case, even though the GUI presents the choices as "This One" and "Or This One".
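To round out the example, the example.py side of that tool might look roughly like this (a sketch; the argument positions just mirror the command line in the XML above, and what the script does with them, summing the two numbers, is made up for illustration):

import csv
import sys

# argument order mirrors the <command> line: example.py $input1 $input3 -z "$results" $putin
input1 = int(sys.argv[1])       # "Number Input"
input3 = int(sys.argv[2])       # "A Second Number Input"
# sys.argv[3] is the literal "-z" flag
results_path = sys.argv[4]      # path Galaxy substitutes for $results
putin = sys.argv[5]             # "option1" or "2option" from the radio buttons

# write the CSV file that Galaxy picks up as the "results" output
with open(results_path, "w") as handle:
    writer = csv.writer(handle)
    writer.writerow(["sum", "choice"])
    writer.writerow([input1 + input3, putin])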

One piece of functionality I do not know much about, but which will become increasingly important for this project, is the tests area of the XML file. Hopefully this is where I can specify all the test cases that need to pass in order for a tool to be considered 'functional'.
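From poking around other tools' XML, I believe a filled-in test looks roughly like the following (unverified on my part; the expected-output file would live in the tool's test-data directory):

<tests>
      <test>
            <param name="input1" value="1054"/>
            <param name="input3" value="46"/>
            <param name="putin" value="option1"/>
            <output name="results" file="expected_results.csv"/>
      </test>
</tests>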

Considering all things, Galaxy may seem convoluted at first, but it is useful for a wide variety of purposes. It can be extended to do just about anything, since it merely makes command-line calls. So with Galaxy, the galaxy's the limit.


Music listened to while blogging: Travis Barker, Notorious B.I.G.

Monday, September 16, 2013

FOSS Project Experience and On Visual Formalisms

For this post I will be reflecting upon my group's FOSS (Free and Open Source Software) project experience and briefly reflecting upon On Visual Formalisms.

My Team: Boolean Bombers

  • Cam Spell
  • Logan Minnix
  • Rob Hambrick
  • Tyrieke Morton
  • and Me!
In class we worked extremely effectively and efficiently, so the group could not have been put together any better. Currently, we are looking at the following FOSS projects for use in CSCI 360: Tor, Celestia, and Galaxy. I have a personal preference for Galaxy, as I have a significant biology background (my Data Science concentration is in Molecular Biology). Additionally, Galaxy does not have as much of a learning curve as some other FOSS projects. I feel this way because Galaxy is largely written in Python (a language everyone starts with here at the College of Charleston) with some XML on the side.

The On Visual Formalisms article reminded me of the daunting experience I had in my Introduction to Abstract Algebra class here at the College of Charleston. It had a lot to do with set theory and the mapping of functions across different sets, which is not a hard topic in and of itself, but having it double as the introduction-to-formal-proofs class is what makes me feel that this memory should be repressed. I did take away all the knowledge from that class and I am able to apply it to everything I know today. I took that class before ever thinking about taking a computer science course, so whenever I see people using union operators or saying that functions map 1-1 onto R^N, I know what it means in a heartbeat. That being said, the set data structure was the easiest to understand when I stepped into Programming II (Java). I have digressed a bit here, but I feel that is almost the point of a blog sometimes.

All in all, graphs, sets, and anything of the like have become something of a strength of mine. I may never have heard the terms 'hypergraph' or 'Euler circles' before, but it was pretty easy to pick up on what the author was trying to convey: put simply, applying set theory principles to graph theory. In graph theory, you can connect points and add direction to the connections if desired. That is pretty basic, though. With Euler circles you can actually relate three or more points at a time; in fact, you are relating sets (picture crazily-shaped Venn diagrams). You can use operations to check intersections, complements, etc. In hypergraphs, shapes, locations, distances, and sizes do not matter. If you go to the article I hyperlinked at the beginning, you can see some of the abnormal-looking graphs. Regardless, at the end of the day, a hypergraph represents everything you can do with just one set, while Euler circles are ways to relate entire sets through structure.

But why? Why do we care? Can this help us in software development? Well, the immediate example that comes to my mind is the implications for diagramming. You could fully represent a class, with the superclasses (or interfaces) it implements, show a hierarchy, and gain meaning from the different types of graphs you are using. You could break down the people involved in a company in this manner. There may be a Person interface from which everyone inherits. Customers may be allowed to perform certain functions with the company (such as put in requests). Employees, however, could be broken up based upon their positions, gender, etc. Between all the people you could have attributes you would represent in a graph, such as isMarriedTo, livesWith, etc., and these arrows would connect the different subclasses. Essentially, this feels like an alternative way to represent a class diagram, to put it in UML terms. This most recent example is actually what higraphs are used for.

Music Listened to While Blogging: Kanye West

Tuesday, September 10, 2013

The Mythical Man-Month

For this response, I will be responding to the following article: The Mythical Man-Month

For starters, I think it is important to note that just because a program works does not mean it is done. I could write a program that solves the traveling salesperson problem (just a hypothetical, simplistic example, unfortunately) in polynomial time, but if there is no documentation, no way for anyone to use my results, and no way for someone to read my code, then what is the point? It would be much better to have this program written with ample documentation, readability, etc., along with directions for setting up and extending it (assuming open source). Also, just because the program works does not mean it is the best. What if the program I wrote only works on my souped-up computer that has terabytes of space? What if someone wants to use it on a lighter rig? There is no way that anyone with a reasonably priced computer would be able to use it. Memory space, I/O devices, and computer time are all important things to consider in code. Getting the program working is merely part of the overall process that is software development and engineering.

Let's say you do get the program working. Well, how long did it take? Did it take longer than you expected? The answer is most likely yes. The Mythical Man-Month cites the optimism of programmers, and this makes sense in this context. At the beginning of a project, you tend to overestimate your own planning and programming skills. That is understandable, though, because if you do not look at yourself and your skills in a good light, how would you ever get a job or have the motivation to do anything with those skills? Being an optimist produces results.

I had never heard the term man-month and did not even know what it meant. A man-month is a unit used to estimate and charge for effort on a software project (the number of people working on the project times the number of months to complete it). It makes perfect sense that this is a terrible measure in general. The author points out that the two variables are not interchangeable: if this were a good measure, I should be able to double the speed of a project's completion by doubling the number of people working on it. It just does not work that way. The relationship looks more logarithmic than linear (obviously this is not always true; I feel it is just the typical case), because you can reach a point where you have so many developers that one more will not add to the speed or quality of the project. You can also have a project that is doomed never to complete due to bad design and poor requirements elicitation early on; such projects will not make it to completion unless major changes are made. You cannot simply hire someone else and get a linear relation between months and workers. As the author puts it, sometimes no number of added people can affect the time at all (the analogy being that no matter how many women are assigned, bearing a child still takes a fixed nine months).
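If I remember the book correctly, part of Brooks's intercommunication argument can be made concrete with a little arithmetic: n workers have n(n-1)/2 pairwise communication channels, so coordination overhead grows much faster than head count. A quick toy illustration:

# communication channels grow roughly quadratically with team size
for n in (2, 5, 10, 20):
    print("%d people -> %d communication channels" % (n, n * (n - 1) // 2))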

The next few sections of The Mythical Man-Month pretty much boil down to a few things I have already stated here, plus a few others. Never undercut the time you think you need on a project. Always leave time for debugging and fixes; even "perfect" elicitation of requirements can still result in a number of bugs and issues. Make sure each member of your team fits the role they have on the project, and note that anybody can act in any role, but defining primary roles is always important (e.g., a Software Architect can aid in the writing of a system test). This fits into the 'surgical team' analogy, where roles have been divided up. But even in surgery anything can happen, anything can go wrong, and improvisation can become a necessity, as mentioned above. The improvisation point is a little different from the author's intention, but I feel it is a corollary that needs to be added.

Considering everything, it is important to have a well-defined role in a project. It is crucial to have realistic goals and deadlines (not that optimistic cannot also be realistic). It is vital to be able to understand other parts of the project. I added this last point because I feel it just needs to be said: if I am a database administrator, I can work all day and really not know what is happening on the front end of my application, but is that what we really want? Of course not. I would need to understand what is happening in the application; the nature of the security of the information may not be evident until I understand the rest of it. All things said, the man-month is a ridiculous idea, and this myth has been confirmed.

Music listened to while blogging: Robin Thicke & Sublime

Monday, September 9, 2013

The Future of Software Engineering and Programming

When formulating this response, I considered two articles in addition to my own thoughts and opinions:
The Future of Programming
Lifecycle Planning

The Future of Programming, by Robert Scoble, had me thinking before I even started to read or watch the video. I saw that the article was talking about programming in the cloud (using Cloud 9 as its example). Last year, I purchased a Chromebook and searched low and high for a cloud-based IDE that would let me execute and store code reliably. Cloud 9 was the platform I eventually landed on. Cloud 9 is not the greatest IDE; for example, hidden characters can get included in files and cause issues. My problem was hidden characters encoding spacing, which, since I was using Python at the time, resulted in terrible, relentless issues only solvable by using vi. Which brings me to my next point: the terminal. Cloud 9 builds a lightweight virtual machine for each of its users. This has a multitude of benefits, including added security (it is not possible for one user to tamper with another), terminal usage (great for developers, like me, who move around using the command line a lot), and structured file management (as is inherent with a full virtual machine). I only used Cloud 9 for Python development, but, from the video, it is evident that Cloud 9 is a multi-purpose IDE. I could develop a full-fledged project with a group of collaborators because of the cloud-based nature of this program. The cloud-based environment allows for live collaboration, which means version control is not an issue because merge conflicts just will not happen if all the coding is done on Cloud 9. Additionally, Cloud 9 is not slow (unless you have not logged in for a few weeks and they have to reboot your virtual machine). I was able to run pretty heavy-duty genetic algorithms on Cloud 9 when I was in my Data Mining class. All things considered, programming in the future is not going to require all these base installs. We are slowly moving to a cloud-based world, and cloud-based solutions are going to be the future. As a user, do you want to navigate to a site, download a program, install it, and then have the ability to run it? Or, as a user, would you rather just navigate to a site? The answer is obvious.

When doing Software Engineering, it is always important to consider your design model. Most everyone involved in any form of Software Engineering is familiar with the waterfall model. The waterfall model, essentially, flows like this:

Software Concept <-> Requirements Analysis <-> Architectural Design <-> Detailed Design <-> Coding and Debugging <-> System Testing

My biggest problem with this model is that it is impossible to have a complete idea of the Software Concept at the beginning of the project. Also, and more importantly, Requirements Analysis is a tedious problem to break down into all of its atomic elements from the beginning. Software developers do not know the specific details of each feature from the very start; it is a continuous process of finding out new things about the requirements, whether from limitations you discover later, a client that did not detail everything correctly, etc. While the waterfall model is bidirectional, climbing back up the waterfall is always described as 'being difficult' (going left, the way I detailed the model). There is later mention of an overlapping waterfall model, but, really, that is just an attempt to grab the good parts of the spiral model and add them to the waterfall: you can go back phases and elicit more requirements/functions/tests/etc., all without ruining your model (more on spiral later).

Next, I will address the Code-and-Fix model. This model is pretty funny to me because it really is the model programmers use when they first start coding. You do not really know the structure of your application, and you just build in new features without good documentation or test cases. It really is the trial-and-error, 'hope I remember' form of coding. It is a bad way to code because you never remember everything. Unfortunately, there are a few pieces of my big project right now where I have adopted this approach, but they are slowly undergoing major change in order to meet documentation and requirements-analysis standards.

Next, the spiral model, I feel, is the most popular and efficient approach to software development. This is because, as I have previously mentioned, you will never be able to elicit all your requirements at the beginning. A continually developed set of requirements is the way to go about a project: new features can be added as needed, and tests can be developed alongside the software as new cases or exceptions are thought up. All in all, starting on an extremely small scale and steadily growing is how most projects tend to go, for these very reasons.

The evolutionary prototyping model is interesting because it really means you are having trouble with the requirements. I feel this model is good for showing the customer what their software will look like in the end, without all the tedious backend development, because the customer may simply not know what they actually desire. The process runs into issues with time management, though, because a lot of time can get wasted pretty quickly on designs that the customer rejects.

The Staged Delivery and Design-to-Schedule models are extremely similar, so I will consider them together. The heart of each of these models seems to be rooted in the same place as Agile development. With deliveries staged throughout the design process, the developers never fall behind, and if a customer does have a problem with a requirement, it can be fixed earlier rather than later. This, I feel, is the preferred method of software development and engineering. Conducting specific tests on specific pieces of software built in iterations is the best way to do it: you can set up small test cases and build features that satisfy exactly those tests, and do this rather quickly because the problem has been broken up in an almost atomic manner (a toy example of what I mean by a small test case follows below).
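To show the flavor of what I mean by a small test case, here is a feature-sized test that pins down one tiny piece of behavior (a generic sketch, not tied to any particular project):

import unittest

def apply_discount(price, percent):
    """The small feature under test: apply a percentage discount to a price."""
    return round(price * (1 - percent / 100.0), 2)

class DiscountTests(unittest.TestCase):
    def test_ten_percent_off(self):
        self.assertEqual(apply_discount(50.00, 10), 45.00)

    def test_zero_percent_changes_nothing(self):
        self.assertEqual(apply_discount(19.99, 0), 19.99)

if __name__ == "__main__":
    unittest.main()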

Considering everything, the type of software development and engineering cycle you use will depend on the type of project you are conducting, but there are reasons to use and not to use each of the models. What does this mean for the future of programming, though? With cloud resources becoming easier and easier to consume, does that mean the software development and engineering cycle could become the same way? Yes, yes it does. The mere fact that you can collaborate with someone on a single file from the opposite side of the world, at the same time, without running into version-control conflicts is amazing. If I am a customer, I can test a product's features (provided I have a slight amount of domain expertise) without having to be in a physical meeting with the software team. Meetings can also take place over the cloud and could be even more efficient than in the past. The future is now and we should all embrace it.

Music listened to while blogging: Mysonne & Tech N9ne

Thursday, September 5, 2013

CS 360 Homework 7

For this post, I will be responding to a few articles:

The Magical Number Seven, Plus or Minus Two
Having taken cognitive psychology classes in the past, I was very familiar with the topic at hand here. The article talks about how you can only hold about seven things in your working memory at once (plus or minus two, depending on the person or situation). This feels like a pretty arbitrary schematic for describing working memory, though. What would actually constitute one of these 'things'? A better way to describe working memory, I feel, is through the Central Executive, an idea posited by Alan Baddeley in 1974 to model working memory (it has been refined since then). So instead of just remembering 7±2 things, you have three main areas in which you can process: the phonological loop (language), the visuospatial sketchpad (visual semantics, often abbreviated "the pad"), and the episodic buffer (short-term, episodic memory). This article refers to Miller's law; Miller's law has stood the test of time and is generally accepted, but there tends to be plentiful evidence against the 7±2 figure from outside cognitive experiments. I believe Miller's law should just be taken as a rule of thumb for working memory, not as actual factual evidence. Additionally, Miller's law touched on things that Baddeley's working memory model covers, but Baddeley's model just feels more elegant and has a lot more backing.

Security and Privacy Vulnerabilities of In-Car Wireless Networks:
A Tire Pressure Monitoring System Case Study
It is a known issue that cars can be hacked and terrible things can occur. The most common example when talking about cars being hacked is the tire pressure monitoring system, which, according to this article, can be attacked from as far as 40 meters away. The authors clearly detail how they conducted their experiment, with ample graphs and descriptors. They also clearly followed the scientific method: they found a problem, tested it with two different tire pressure monitoring systems, and reached conclusions, all of which are clearly outlined. Honestly, I wish the conclusion of this experiment had been different. It is a little nerve-wracking knowing that the wireless signals being bounced around my car have barely any encryption (some signals having none).

Planning for Failure in Cloud Applications
Cloud applications. They're great, they're awesome, and they rock. You can access them from anywhere, anytime. If something seems this good, then there are bound to be some downsides. I have my own cloud-based application (Learn2Mine) and it goes down occasionally, but 95% of the time it goes down because we rely on outside resources to keep our system safe and sound. Specifically, things as simple as Google APIs can sometimes take us down if their authentication hiccups, which is known to happen. Other times, Portal (at CofC) will be down, which restricts users from 2/3 of the cloud application. This article has me reconsidering our 'failure' pages for when services or certain operations do not work. It is a lot less shocking to see a "this page is temporarily down and will be back up soon" type of page than a "500 INTERNAL SERVER ERROR - HIDE YOUR KIDS AND WIVES, WE DON'T KNOW WHAT'S HAPPENING" kind of message. These different pages really carry different connotations; in my eyes, if developers have taken the time to customize an error page like that, then they probably know what they are doing. Also, I had not considered creating 'retry' blocks of code. I have a NoSQL database that I reference, and if a call fails, we currently just assume the user may not have been created, so we could probably run into some weird errors if this kind of hiccup were to hit our application.
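Something along these lines is what I have in mind; a minimal sketch, where fetch_user stands in for a hypothetical call to our NoSQL store:

import time

def with_retries(call, attempts=3, delay=1.0):
    """Retry a flaky call a few times before letting the error surface."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise                # out of retries
            time.sleep(delay)        # brief pause before the next try

# hypothetical usage: user = with_retries(lambda: fetch_user(user_id))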

Music listened to while blogging: J. Cole

Wednesday, September 4, 2013

CS 360 - Requirements Engineering

For this post, I will be responding to questions/scenarios about Requirements Engineering from Sommerville's 9th edition of Software Engineering.

Using the technique suggested here, where natural language descriptions are presented in a standard format, write plausible user requirements for the following functions:
-> An unattended petrol (gas) pump system that includes a credit card reader. The customer swipes the card through the reader then specifies the amount of fuel required. The fuel is delivered and the customer's account debited. (customer = user)
  • Alert USER whenever credit card is declined
  • Alert USER whenever credit card is not read correctly
  • Alert USER if card reader is not working correctly
  • Alert USER if amount of gas in the pump system is lower than their specified gas amount
  • Stop flow of fuel whenever the USER inputted amount has been reached
  • Start pumping fuel whenever USER initiates pumping (mechanism not specified)
  • Print out receipt when finished pumping fuel
  • Deduct funds from the account of the USER upon finishing pumping
  • Alert USER if funds are not sufficient for the purchase
-> The cash-dispensing function in a bank ATM

  • Report insufficient funds to USER
  • Report empty ATM
  • Report invalid input for cash retrieval
  • Report insufficient amount of specific denominations
  • Dispense cash whenever prompted
  • Deduct funds from account upon dispensing cash

-> The spelling-check and correcting function in a word processor

  • Underline incorrectly spelled words
  • Analyze syntactical meaning of sentence with corrected word
  • Allow USER input to add new words
  • Allow USER input to add new definitions
  • Allow USER to override corrections


Suggest how an engineer responsible for drawing up a system requirements specification might keep track of the relationships between functional and non-functional requirements. 
Using diagrams can be very helpful in these types of situations. Drawing up UML diagrams, for example, is an extremely fruitful way to go about this. Sequence diagrams are useful for representing functional requirements as well as some non-functional requirements. Any function that has a lifeline within a sequence diagram would be a functional requirement, because it is something explicitly required of the system: an explicit, specified method. The time that the function is allowed to take could be a potential non-functional requirement; perhaps there is a requirement that a user be notified within 5 seconds of something occurring, which would be non-functional. The sequence diagram is just one example - all types of UML diagrams have their time, place, and meaning.

Using your knowledge of how an ATM is used, develop a set of use cases that could serve as a basis for understanding the requirements for an ATM system.
This is not an exhaustive list of requirements, by any means:
  • A USER withdraws a given amount of money
  • A USER deposits a given amount of money
  • A USER inputs an incorrect pin
  • A USER deposits a check that can be read
  • A USER deposits a check that cannot be read
  • A USER inserts too many bills for a deposit (30+)
  • A USER inserts too many checks for a deposit (30+)
  • The ATM is out of cash
  • The ATM does not have sufficient funds for the USER's withdrawal
  • A bank card cannot be read
  • A bank card is left in the ATM
  • A bank card is invalid
  • A bank card is flagged as a potentially stolen card
Use cases stemming from these requirements can proceed as follows (again, not exhaustive):
1. A user goes to the ATM. The user inserts their bank card into the ATM. The user keys in their pin. The user deposits checks (no more than 30). The user confirms the check totals on the screen. The user finalizes the deposit. The bank card is returned to the user.
2. A user goes to the ATM. The user inserts a fraudulent bank card. The ATM reports that the bank card is invalid. An external company is notified (i.e. the police).
3. A user goes to the ATM. The user inserts their bank card into the ATM. The user keys in their pin. The user attempts to withdraw more cash than the ATM has left. The ATM reports that it does not have sufficient funds. 

Music listened to while blogging: J. Cole & STRFKR

Monday, September 2, 2013

Subversion Repository Experience

So for this post I will be blogging about my experience setting up a Subversion repository for my CS 360 (Software Engineering) class. We are setting these repositories up for future work in the class (they will definitely be talked about in later posts), and setting up the software itself is a learning experience.

In the past, I worked with Git under the brands of GitHub and BitBucket, and Subversion (SVN) has me missing them. I have done a little work with Git repositories at the command line, but, typically, I used a GUI for all those troubles because GUIs make dealing with these version control systems really simple (the specific GUI I used back then can be obtained with the "sudo apt-get install git-gui" command on any flavor of Linux). The only issue I ever had with those GUIs was the merge tools they provided. There are other tools for that, though, and that is beside the point here.

SVN confused me at first because I am using Linux Mint as my system. I read on a few Mint support sites and on Stack Overflow that Mint has had issues with SVN in a few of its releases. Luckily, 15 (Olivia) was not of that nature, and from that point it was smooth sailing to install SVN. Cloning the class repository was also straightforward and easy to do with just a few commands. The only thing that threw me for a loop was noticing that the introduction I was reading never described a "pull" command. Typically, I've found, committing a change is not enough to expose your results to the web. With SVN, it seems, committing is the final step of the entire process (and describing the commit is just as easy). I want to read into what implications this has for merging files, for future reference, because with the Git tools I am familiar with, you have to commit, possibly merge, and then do a final push to expose your changes to the web.
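Roughly, the two workflows compare like this (the file name and commit messages are placeholders; the repository URL is our team one from the class):

svn checkout https://svn.cs.cofc.edu/repos/CSCI362201302/team3/
svn add somefile.py
svn commit -m "describe the change"    # with SVN, the commit itself publishes the change

git add somefile.py
git commit -m "describe the change"    # with Git, the commit is still local...
git push origin master                 # ...the push is what publishes it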

All in all, setting up the repository was a good learning experience because I now know how to use another version control system. The difficulties were minor, as this experience was not meant to cause too much trouble.

Here is a link to the repository.

Music listened to while setting up repository and blogging: 50 Cent, Ozzy Osbourne, & Wale

Sunday, September 1, 2013

CS 360 Homework 5

For this post I will be responding to 6 articles:

An Investigation of Therac-25 Accidents (Nancy Leveson & Clark Turner)

Rather than break down each article and talk about them individually, I will, rather, break my response up into different points, citing examples and arguments from the articles. Citations will be using last names (distinctions will be made for the two Leveson articles).

So, really, what is going on here? Software is failing and killing people. This is a complete violation of the dependability principles I listed in my previous blog post, namely the safety and reliability principles. Who is to blame for these accidents? Developers? Users? The software itself? While we could play the blame game for hours trying to debate who really is at fault, I will end that argument early and say that these issues are everyone's fault.

In the Therac-25 accidents (Leveson & Turner), there are multiple facets from which you can approach the issue at hand. The developer plays a part in the blame because of the terrible interface design that resulted in cryptic, meaningless error messages, and because their first few attempts to patch these life-threatening bugs failed. At least they tried, but there are fundamental flaws that Leveson & Turner detail. The issue that sticks out the most to me is the flawed unit testing: with the right unit tests, it would be a lot more difficult for bugs to creep up and rear their heads. Additionally, Leveson & Turner state that documentation should not be an afterthought. As a programmer, I wholeheartedly understand how drab annotating software and writing ample documentation can be, but I also understand good software engineering practices. You have to have good documentation, you have to document as you code, and you have to make sure it is good enough that even Joe Schmo, who happens to be an okay programmer, can read it and know exactly what is going on with the code.

The article written by Paul Roberts (FDA:...) states that software quality is becoming a more and more emphasized interest in the eyes of the FDA. This makes perfect sense considering all of the tragedies in these articles. Roberts talks about an AED containing a vulnerability that allowed unsigned updates to be pushed through to the device, so anyone with working knowledge of how these devices work could potentially, silently take the life of anyone relying on one. Obviously, this is an enormous problem. The issue mirrors one I saw in a TED Talk (All Your Devices Can Be Hacked ~ Avi Rubin), which showed how many devices can be hacked to perform duties and operations that should not be allowed. For example, a car could be hacked to do things as innocuous as changing the radio station all the way up to manipulating the signals coming from the tire pressure gauges. The implications of software written without considering the principles of software engineering are always terrible. To reflect upon an earlier blog post, maybe there should be some sort of certification or test required before people can work on software that could threaten lives. Essentially, employers should make sure they know whom they are getting in bed with before hiring someone to work on major projects, so the burden is shared with project leaders and employers whenever software does not work as expected.

There are other issues that can arise with software projects. In development, for instance, there may be a terrible amount of inefficiency. Take the Sentinel project: so many problems arose, as Nagesh details, and those problems lay within requirements that should have been clearly outlined at the beginning of the project. This project failed on the same level as the projects mentioned previously, but the consequences here are of a different nature and caliber. In the radiation incidents the consequences involved the taking of lives, but here the consequences tend to fall around the loss of lots of money and time. While it is obvious that the radiation incidents had the worse consequences, the Sentinel project still fell into the realm of inefficient and terrible software engineering practice. The very same idea is recapitulated with the spacecraft incidents. Software was the cause of a lot of those accidents, such as with the Ariane 501: bad software caused a lot of the crashes, whether it had to do with bad programming (engines failing) or with user error (reporting in different units, imperial versus metric). There were also less harmful faults, such as the SOHO issue where communication was lost for four months. Really, all the issues being examined here caused either the loss of life or the loss of lots of money.

All things considered, good software engineering principles lead to good software that behaves as expected, meaning it was delivered at the target cost, within the target time, and works with the efficiency the customer expects. There are also a significant number of user errors that can easily be glossed over (as with the units error) that should be caught during requirements elicitation. Good requirements lead to good software.


Music listened to while reading/blogging: Ellie Goulding & Jay-Z
TV watched while blogging: It's Always Sunny in Philadelphia