Monday, March 17, 2014

Capstone: Walkthrough of the Tutorial for Learn2Mine

So Jake and I have cranked a lot of work out for Learn2Mine recently. The bulk of this work has been related to finishing the tutorial for Learn2Mine and getting any issues with it settled.

When you navigate to Learn2Mine's home page you can select "Take the Tutorial!" and jump straight into the tutorial (as I've explained in previous posts). The tutorial introduces users to the three components of Learn2Mine, and part of that is creating a Galaxy/RStudio account.

So let's stop right here. Creating an RStudio account used to require running a separate tool, which was an extra source of confusion for users. That step is gone: their Galaxy account is their RStudio account now. The only issue we currently have is that users cannot change their passwords on either of the components of Learn2Mine - something that will be solved soon.

So back to the tutorial. Jake did a great job making images that explain the different sections of Learn2Mine and Galaxy and these snapshots can be seen below:


So now the users of the site can get started with the tutorial. The first section is the "Basic R Tutorial" section. Here, users are asked very simple programming questions. For example, the first problem asks users to create 2 integers (type declarations aren't needed in R, so users can simply say "x = 1234" or "x <- 1234" if they desire) and perform some mathematical operations on those variables. The users are to do this in RStudio.
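To give a rough idea of what that first problem looks like in RStudio, here's a minimal sketch. The variable names and the particular operations are my own picks for illustration, not the exact wording of the tutorial problem. One thing worth knowing: a bare 1234 is stored as a double in R, so if you want a true integer you add the L suffix.

```r
# Illustrative only: names and operations are assumptions, not the tutorial's text.
# No type declarations are needed, and both assignment forms work.
x = 1234
y <- 56

# Some basic math on the two variables
total     <- x + y    # addition
product   <- x * y    # multiplication
quotient  <- x %/% y  # integer division
remainder <- x %% y   # modulo

# A bare number is a double; the L suffix makes a true integer
z <- 1234L
class(x)  # "numeric"
class(z)  # "integer"
```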

They have the option to submit this first problem right away if they want, and we give them a quick how-to on submitting when they click the "How Do I Submit?" button in the bottom left of the tutorial pages. Let's say the user has moved on to one of the harder tutorial lessons, though. What if the user is having a hard time because of a lack of exposure to R and needs just a little jumpstart on how to start their coding?

Well, I worked hard on developing JavaScript code (using jQuery) that takes example code that I created and prints it out to users (either all at once or line-by-line). This is especially useful in the last coding section of the tutorial. Users are asked the following (page available here):

Write your own knn function called my.knn that takes 3 arguments: a training file, a testing file, and a k value. The function should use the training file to develop a set of Euclidean distances in order to find the nearest neighbors for the records (rows) in your testing data file. Finally, you want to return a 1-column matrix which contains the correct labels for the testing data. The nth row in the test labels matrix corresponds to the nth row in the testing dataset. Though your kNN algorithm should be generalizable to many different datasets, you may use the training and testing datasets we provide for you below in order to test your function.

Below this we give users a Training Set and a Testing Set for this problem (which I completely made up - they are meant to be simple). We also give users the signature for the function (as our automatic, instant grading is based on matching the signature first and foremost). Users can either click the Hint button to get insight into how to tackle the problem, or they can reveal the answer line-by-line or all at once. I made a note on the page that the code we provide is not the most efficient - and it was never meant to be. Rather, the code was written so that users should be able to read it and understand it without needing a lot of comments. If the users can do this, then the tutorial was successful, because the user then understands R code and can get their hands dirty with actual R lessons on the site.
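For anyone curious what a function like this might look like, here's a rough sketch. To be clear, this is not the code shown on the tutorial page. It's my own illustration under a few assumptions: the training and testing data have already been read into data frames (rather than passed as file paths), the label is the last column of the training data, and the testing data holds only the feature columns. The little datasets at the bottom are made up for this post and are not the ones on the site.

```r
# Hypothetical sketch, NOT the solution from the Learn2Mine tutorial page.
# Assumes: `train` is a data frame with numeric features and the label in the
# last column; `test` is a data frame with the same feature columns only.
my.knn <- function(train, test, k) {
  n.features <- ncol(train) - 1
  train.x <- as.matrix(train[, 1:n.features, drop = FALSE])
  train.y <- as.character(train[[n.features + 1]])
  test.x  <- as.matrix(test[, 1:n.features, drop = FALSE])

  # One predicted label per testing row, returned as a 1-column matrix
  labels <- matrix("", nrow = nrow(test.x), ncol = 1)
  for (i in seq_len(nrow(test.x))) {
    # Euclidean distance from testing row i to every training row
    diffs <- sweep(train.x, 2, test.x[i, ])
    dists <- sqrt(rowSums(diffs^2))
    # Labels of the k closest training rows
    nearest <- train.y[order(dists)[1:k]]
    # Majority vote; on a tie, the label that sorts first wins
    votes <- table(nearest)
    labels[i, 1] <- names(votes)[which.max(votes)]
  }
  labels
}

# Made-up data for illustration (not the datasets from the tutorial)
train <- data.frame(x = c(1, 1, 2, 8, 9, 9),
                    y = c(1, 2, 1, 8, 8, 9),
                    label = c("a", "a", "a", "b", "b", "b"))
test <- data.frame(x = c(1.5, 8.5), y = c(1.5, 8.5))

predicted <- my.knn(train, test, 3)
print(predicted)
```

Like the code on the site, this favors readability over speed: it loops over the testing rows one at a time instead of building the whole distance matrix at once.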

Once users have finished those R lessons, they can move on to the tutorial tool lesson. Here, users perform k-NN with our built-in k-NN tool on Galaxy. Users just have to provide a training dataset, a testing dataset, and a k-value (the same three arguments as the function we had them write). The tool then runs, gives users an HTML output, and automatically grades the lesson upon submission.

Music listened to while blogging: Childish Gambino & Tech N9ne
