CS 371P (OOP) Reflections: September 2012

Sunday, September 30, 2012

Week 5 (9/24-9/30) - Cruising Along

For over a week I've been up against deadline followed by deadline. Following the submission of the Collatz lab on Wednesday of Week 3, I had to write an assembler for LC-3 for EE 460N (the ECE Computer Architecture class), which took me all the way until Sunday night, with the project due Monday of Week 4. That Friday, I had a computer graphics project due, which took almost all of the rest of Week 4, but I also had to find some time to work on the Voting project, since pair programming isn't exactly something you can cram in at the last minute...

At the end of Week 4, I attended the ACM LAN party right after turning in my graphics project and then got started on the next computer graphics project due Monday. OOP due Wednesday. Graphics due Friday. At least I got a cool 2D Billiards game out of it.

But by Friday's OOP class I had pulled so many nearly-all-nighters (3-3.5 hours of sleep per night) that I was exhausted and fell asleep in lecture. I'm glad I wasn't called on. Or at least, I hope I wasn't called on...

Anyway I caught the gist of the lecture, a discussion about the next programming lab. I've written a memory manager in CS 439, and studied an implementation of a memory manager in EE 312, so I'm sure this project won't be too difficult to understand. But I know better than to think that it will be a short project.

With the midterm coming up, I can feel the panic rising. I'm not actually panicking yet, but I feel like I should be. This past Wednesday, I understood everything on the quiz and yet somehow misinterpreted everything, and got 0 right answers. That's at least a good learning experience. For Friday's quiz, I did the reading in what I thought was sufficient detail. Turns out even when you do the reading, if you happen to focus on the wrong details, you're going to get everything on the quiz wrong. So essentially, there go two more of my dropped quizzes.

Now's when that familiar fear/nerd rage is going to come in handy. Armed with no project deadlines this week, I can actually take some time to study up for all of my midterms. Also, I will take care not to forget about my projects.

I can only hope things work out.

See you all on the other side of the midterm!

Sunday, September 23, 2012

Week 4 (9/17-9/23) - Source Code Organization

Definitely had a rough week. It was interesting to learn about the way exceptions are handled in C++. It still seems like the Java way of dealing with exceptions is a little bit better. If you're going to go through all the trouble of inventing a system that could cause your program to halt if used incorrectly, at least in Java, the use of Checked Exceptions requires you to tell someone about it. In C++ if you miss catching an exception that you didn't know about, your program halts/crashes and there's nothing you can do about that once it's in production except to track down the cause of the problem.
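
To make the contrast concrete, here is a minimal C++ sketch (not code from the class or the project). The names parse_count and try_parse_count are made up for illustration. Nothing in the signature of parse_count warns the caller that it can throw, and an exception that nobody catches ends the program through std::terminate. One common defense is to catch at a boundary and turn the failure into an ordinary return value.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: parses a non-negative decimal count.
// Nothing in the signature says that it can throw.
// (Overflow on very long inputs is ignored for brevity.)
int parse_count(const std::string& text) {
    if (text.empty())
        throw std::invalid_argument("empty input");
    int value = 0;
    for (char c : text) {
        if (c < '0' || c > '9')
            throw std::invalid_argument("not a digit in: " + text);
        value = value * 10 + (c - '0');
    }
    return value;
}

// Boundary function: converts any exception into a false return value,
// so an exception we did not anticipate cannot terminate the program.
bool try_parse_count(const std::string& text, int& out) {
    try {
        out = parse_count(text);
        assert(out >= 0);
        return true;
    } catch (const std::exception&) {
        return false;  // standard-library style failures
    } catch (...) {
        return false;  // anything not derived from std::exception
    }
}
```

Calling parse_count("4x") directly from main, with no try block anywhere, would end the program; try_parse_count("4x", n) simply returns false.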

I still haven't quite gotten a handle on the daily quizzes. I forgot to do the reading but for some reason I had convinced myself that the quiz over the reading wouldn't be until Monday morning. I definitely won't be making that mistake again. Luckily there are dropped quizzes to make up for the "duh" moments like this.

The quiz mentioned the canonical way to build classes in C++. However, beyond mentioning the canonical class methods, we didn't go into a whole lot of depth, although I wish we had. So, I went and looked up an article that would tell me a bit more (http://www.cplusplus.com/articles/y8hv0pDG/). It's nice to see a write-up, but I hope that we still visit these topics in more detail during the lectures.

Project 2

The project was going pretty well until Saturday. About 9 pair-programming hours into the project and we thought we had the problem solved. Then we submitted to UVa and got a Runtime Error. We figure something went wonky with our assertions. Maybe a first troubleshooting technique would be to comment out all the assertions and see where that gets us. We definitely need to do more acceptance testing. Perhaps "random" inputs will help us discover the Runtime Error on our end. We definitely plan to take advantage of the public tests repo to help us figure out this problem.

Source Code Organization

Something that still feels wrong about the structure of the projects is that the executable code is being written in a giant header file. Now I know this isn't the software engineering class, but shouldn't we be paying a little more attention to good coding style?

I mean, a good way of handling a submission would be to let us turn in whatever source files we want for the actual submission, and provide a Makefile that will generate the required executable that they will use for testing. Makefiles aren't all that hard, but we could even be provided with a template Makefile and given instructions on how to use it. Considering that the class is already having us learn Git, it would make sense to have us learn Makefiles as well, which is another, more ubiquitous, and equally useful technology.

Anyway since my partner, Tri, and I wanted to be a little more organized, we made our class definitions in separate header files which we include in the Voting.h file.

We will have to remember to move those definitions directly into our source file before submitting our project.

Sunday, September 16, 2012

Week 3 (9/10-9/16) - Collatz Wrap-Up

As the first project wrapped up, my biggest frustration was figuring out how to write valid acceptance tests essentially using the very logic that should be under test. I settled on the reasonable [I think] assumption that if my solution worked for Sphere at any point then the acceptance tests I generated from that particular version of the code would be valid. I wrote acceptance tests that would exercise both large and small ranges, and which would run long enough that I would be able to see a real improvement when I implemented my metacache. I decided to save a lazy cache for later ("if I get to it") because I've done similar work with dynamic programming and a metacache is something new.

Many unit tests later I finally got my metacache to work and it is very satisfying to see how blazingly fast the program becomes when you make that change. I turned the program in with a less-than-optimal metacache, which was accepted by V2 of the Sphere problem.

However, after turning in the project, I pushed the cache to the max source file size, and only got a little less than 2x improvement from the submitted version, which is really small compared to the almost 20x improvement between no cache and the first version of the cache.

CACHING

So I wanted to take a moment to reflect on what I learned about caching from this project. Rule of thumb: a good time to use caching is when all or part of a result will not change as you perform more computations. In other words, if you've got some computation with a lot of intermediate results, and all of those results can be entirely determined from the inputs to some algorithm, it may be beneficial to save the results. A cache takes up space, but it can save a lot of time, and since memory is so abundant these days, it's not something you ALWAYS need to worry about. Sometimes a little bit of information can save you a lot of time. Looking up a cached result is a constant time lookup (using an array or hashtable), and even if that only gives you some information, that can save a lot of time.

LAZY CACHE

Since I've done a bit of dynamic programming the idea of a lazy cache seemed pretty familiar. Basically, you store intermediate results while you perform other computations, so that you can reuse that information when you need a particular result again.
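
As a minimal sketch of that idea for Collatz cycle lengths (illustrative only, not the project code; the class name LazyCollatz and the choice of std::unordered_map are assumptions), each length is stored the first time it is computed and reused on every later request:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Lazy cache: a cycle length is stored the first time it is computed.
class LazyCollatz {
public:
    // Number of terms in the Collatz sequence from n down to 1, inclusive.
    int cycle_length(std::uint64_t n) {
        assert(n > 0);
        if (n == 1)
            return 1;
        auto hit = cache_.find(n);
        if (hit != cache_.end())
            return hit->second;  // already known: no recomputation
        std::uint64_t next = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        int length = 1 + cycle_length(next);
        cache_[n] = length;      // remember every intermediate result
        return length;
    }

    // How many results have been stored so far.
    std::size_t cached_entries() const { return cache_.size(); }

private:
    std::unordered_map<std::uint64_t, int> cache_;
};
```

Asking for the cycle length of 10 also stores the lengths for 5, 16, 8, 4, and 2 along the way, so a later query for 22 stops computing as soon as its sequence reaches 10.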

Storing intermediate computations is similar to storing the result of a method call in a local variable, when your code uses that result more than once. You as a programmer might know that the method will return the same thing each time you call it, but most of the time the compiler won't be able to tell that nothing has changed and the program will actually call that method again. So, it is best to save a local copy of the result and use it in various places. Much less expensive than a function call.

I admit that working with a Lazy Cache might have given me an opportunity to learn something about the types offered by the STL, but after seeing the dramatic improvement my program got from using the Metacache, I thought that optimizing the Metacache would give even better results, more quickly.

EAGER CACHE

On the other hand, you could spend the time calculating everything before the program even does anything useful. This is best when you can afford to have a long startup time and your program will be running for a long time. Otherwise, you should think carefully before doing this.
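
A minimal sketch of the eager approach, again for Collatz cycle lengths (the class name EagerCollatz and the limit parameter are assumptions for illustration). All the work happens in the constructor; afterwards every query is a single array lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Eager cache: every cycle length below `limit` is computed at startup.
class EagerCollatz {
public:
    explicit EagerCollatz(std::size_t limit) : table_(limit, 0) {
        assert(limit >= 2);
        table_[1] = 1;
        for (std::size_t i = 2; i < limit; ++i) {
            std::uint64_t n = i;
            int steps = 0;
            // Walk the sequence until it drops below i,
            // where the answer is already in the table.
            while (n >= i) {
                n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
                ++steps;
            }
            table_[i] = steps + table_[n];
        }
    }

    // Constant-time lookup; valid for 0 < n < limit.
    int cycle_length(std::size_t n) const {
        assert(n > 0 && n < table_.size());
        return table_[n];
    }

private:
    std::vector<int> table_;
};
```

The startup cost grows with the limit, which is why this only pays off when the program runs long enough to make many lookups.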

METACACHE

Here's why a metacache is cool. In some applications, a lot of things that you might want to compute during the course of running the program don't actually change between runs of the program. Wouldn't it make sense to do some of this computation-intensive work before the program even runs? Why not store the results in the program itself so that the only person who has to do the computations is the person writing the program?

In early programming classes, they tell you that storing results in your code is a bad thing to do because it causes your code to rely on work that you did outside the runtime of the program (not kosher for many programming contests, or for understanding algorithmic material). Sure you could make a lookup table to solve every problem, but then your computer isn't really doing the work for you, or you've moved the work to a different program, which seemingly defeats the purpose of writing the original program. However, in a problem like the Collatz problem where the range of inputs is so vast, you have more of a problem to deal with. Storing those results doesn't take the work from your algorithm, but it does reduce the time-intensive work that needs to be done.

In this iteration of the Collatz problem (finding maximums), many of the results can be folded into bins which represent a large number of inputs, so the cache is not even as large as storing every result would be.
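
Here is a toy sketch of the binning idea, not the metacache from the submitted project. The bin size of 10, the range of 1 to 50, and all the names are assumptions chosen to keep the table readable; a real metacache would use much larger bins generated by a separate program. Each table entry is the maximum cycle length within one bin, so a query only computes directly for the partial bins at the edges of the range.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Direct computation, used for the partial bins at the edges of a range.
int cycle_length(std::uint64_t n) {
    assert(n > 0);
    int length = 1;
    while (n != 1) {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        ++length;
    }
    return length;
}

// Metacache: maximum cycle length for each bin of BIN_SIZE consecutive
// inputs, computed ahead of time and pasted into the source.
// Bin k covers [k * BIN_SIZE + 1, (k + 1) * BIN_SIZE].
const int BIN_SIZE = 10;
const int BIN_COUNT = 5;
const int BIN_MAX[BIN_COUNT] = {20, 21, 112, 107, 110};

// Maximum cycle length over [lo, hi],
// for 1 <= lo <= hi <= BIN_SIZE * BIN_COUNT.
int max_cycle_length(int lo, int hi) {
    assert(1 <= lo && lo <= hi && hi <= BIN_SIZE * BIN_COUNT);
    int best = 0;
    int n = lo;
    while (n <= hi) {
        bool at_bin_start = (n - 1) % BIN_SIZE == 0;
        if (at_bin_start && n + BIN_SIZE - 1 <= hi) {
            // Whole bin inside the range: one table lookup covers it.
            best = std::max(best, BIN_MAX[(n - 1) / BIN_SIZE]);
            n += BIN_SIZE;
        } else {
            // Partial bin: fall back to computing this one value.
            best = std::max(best, cycle_length(n));
            ++n;
        }
    }
    return best;
}
```

With this layout, a query over 5 to 25 computes eleven values directly (5 through 10 and 21 through 25) and answers the eleven-to-twenty stretch with a single lookup.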

In my case, a first-draft metacache improved runtime by nearly 20x. A second pass at optimizing the cache brought that to over 30x improvement from the non-cached version. That's awesome. This is a great take-away from this project.

Monday, September 10, 2012

Eclipse 4.2 is UGLY (simple "kinda fix")

So if you're like me and you wanted to have a real IDE for the sake of working in a nice environment for C++ without having to learn a new tool, and if you're like me and you wanted to have the latest and greatest of CDT tools for Eclipse, then you needed to download Eclipse 4.2.

If you're anything like me, you found the new theme very ugly and unpleasant to work with. Lots of unused dead space and a weird color scheme that seems to hide the separation of toolbar and content pane in inactive panes.

wat.

(^ If you haven't seen that video it's good for a laugh.)

Luckily, there is a bug open that is working to get this resolved. Hopefully.
https://bugs.eclipse.org/bugs/show_bug.cgi?id=362423

I won't bother posting screenshots because there are some posted in the bug.

Here's a simple fix, from one of the replies:

In the 'Appearance' preference page you can switch to the 'Classic' theme.

I'm much happier now, although the Classic theme still seems like a thrown-together and not quite polished version of the old theme.

I hope that helps someone else =]

Referenced blog: http://www.jroller.com/andyl/entry/eclipse_3_8_vs_4

Sunday, September 9, 2012

Week 2 (Mon 9/3 - Sun 9/9) - Settling Into a Workflow

This week was another chaotic week for me... There was no class on Labor Day, and Tuesday and Wednesday morning were lectures and trying to squeeze in as much work as possible, because I was flying out to Redmond on Wednesday afternoon for an interview with Microsoft's Visual Studio team on Thursday morning. I returned to Austin on Friday night.

Commence a weekend of trying to catch up for class on Monday.

Unfortunately, the majority of my time so far has been spent on OOP, so I haven't had that much of a chance to start projects or do readings for my other classes. Good news about that, the deadline for the first project in this class will be much sooner than for my other classes, so I've got some time to get my other classes figured out before my time away from school comes back to bite me in the butt.

As several other people have mentioned, the threat of an all-or-nothing grade for this assignment is making this a very stressful project indeed. I don't expect a zero, but I can't shake the feeling that I'm going to miss something important and get totally screwed.

Some good news, I finally figured out that I had an invalid assertion somewhere that was causing some of Sphere's inputs to get rejected by the assertion. Now Sphere is accepting my solution and I can move on to implementing a cache.

Labor Day Git Session

Labor Day was definitely much-needed time to get some reading done for class and to get some extra practice with Git. One of the things I had been meaning to do was to source control my .bashrc and .bash_aliases (and various other settings files) so that I can easily transfer them between my various Linux boxes/VMs and the lab computers. Since I had some settings on several different systems, this gave me a chance to try out Git's branching and merging features.

From trying to use Git's branching and merging features, I've come away with some impressions about Git compared with Mercurial. Anyone who has spoken to me in class probably knows I prefer Mercurial over Git, mostly because Mercurial integrates better with Windows, but I've definitely come to like some aspects of Git.

One thing which really stood out as a clear advantage that Git has over Mercurial is that branches in Git can be deleted, and in fact they often are. This is one of the major changes in workflow that I'll have to get used to. Git encourages creating a branch to work on an issue, and then when the issue is resolved, merge the branch into master and then delete the new branch. In Git you can keep your experimental branches to yourself.

In Mercurial, the whole idea around using branches is somewhat different because branches are around forever and are part of your commits, so they are pushed to the remote repository when you do a push. You can't keep experiments to yourself, or overwrite the history of the repository, so once you push to a Mercurial repository, it is there forever, even if you abandon whatever you were working on. This makes collaborating on experiments a much more difficult, costly, or embarrassing affair in Mercurial. If you want to create a "branch" in a Mercurial repository to do an experiment, Mercurial encourages you to clone the repository into a different directory, work on your changes there, and then, if you like what you did, merge your changes with the main branch. Branches in Mercurial are good for keeping track of collaborative work that spans a long period of time, or keeping an active Dev branch and then merging into the main branch when everything has been stabilized (à la many corporate systems for version control management).

I would definitely say based on the above that if creating branches to work on issues is your preferred way of doing things, then Git is the system for you. However, I have found that Mercurial gets along just fine without that and since there end up being more branches in Mercurial repositories, the branches tend to have more utility in Mercurial.

The big thing that's going to keep me with Mercurial though when I'm working primarily by myself and/or on Windows is the simple fact that Mercurial is written in Python and so is naturally cross-platform. This makes it a wonderful experience on Windows. Git for Windows bothered me so much that I'm now primarily working in a Linux VM or on the lab computers.

I was so excited when I learned about making command aliases in Git because that means a little less typing for me. However, it seems that the Windows version of Git is not able to deal with command aliases. Thanks Git. -.-

Settings Repository

Well anyway the happy result of my experiments with Git is that now my Settings repository is available on Git. If you'd like to check it out and take any of the snippets for yourself, feel free to have a look.

https://github.com/DemiReticent/settings

I rely heavily on aliases that will mostly only make sense on my systems, since I use them to quickly switch to specific directories in my file structure.

However, I do like the colorful prompt I've come up with. This started when I tried out Arch Linux for the first time and saw that the default prompt is colored, which makes it very easy to differentiate your prompt from the output in between, which also happens to make it easier to see when the output of a command begins and ends.

Here's a screenshot of my prompt and the specific lines you'll need to make this work on your system as well. (Assuming bash obviously.)

Project Progress

Now that I've gotten more comfortable with Git and I have my Linux VM up and running, I'm making a lot more progress. One of the most frustrating parts of the project is writing the unit tests. Since this project is primarily focused on computations that would be tedious by hand, properly testing the evaluation functions will require one of three things: trusting that our program already computes the right answers (which becomes useful later, once the unit tests have been generated from the quick-and-easy implementation and the implementation is changed to use a cache), finding some table of solutions online, or computing the answers by hand.

I'm still working out the best way to do this, but either way I'm making some progress with adding unit tests to my project.

Does anyone else think it is funny that we can technically see other people's unit tests and acceptance tests? I'm not planning to look at anyone else's until I've done my fair share of my own work, but it could definitely be helpful to add some extra tests if we're allowed to borrow other people's unit tests and acceptance tests as extra assurance that our programs are behaving correctly.

Monday, September 3, 2012

Git on Windows, MSysGit Context Menu, and Git Bash

Well, as I mentioned in my last post, installing Git on a Windows machine can be a bit of a hassle, and if you're not careful during the installation (like me) you can end up with a context menu which is cluttered with Git commands that you'll rarely use unless you use Git for every file on your computer....

Here's a quick forum post which describes how you can clear those nasty context menu options out of your Windows Explorer context menu.

http://stackoverflow.com/questions/2459763/how-do-i-remove-msysgits-right-click-menu-options

Now, one of those commands (Git Bash) which opens the Git shell at the current folder location, is extremely useful. Unless you plan on using TortoiseGit for everything in your file system, you probably want to have it, because starting up the Git Bash from your start menu and then navigating to the folder you want is a pain in the butt.

CMD Solution

Well the obvious answer is to add Git to your path. For me, this was:

D:\Program Files (x86)\Git\bin

If you have Cygwin installed and you have Cygwin's version of Git installed (which is probably out of date), you definitely want to make sure that the Git path appears before the Cygwin path. For me, this looks something like:

D:\Program Files (x86)\Git\bin;D:\cygwin\bin;

One caveat: Git will incessantly annoy you with warnings that the terminal is not fully functional. Don't know what to do about that. If you can live with it, then you don't need the following solution.

Cygwin Solution (Hack)

If you've got Cygwin installed I've got a solution for you.

Cygwin's shell and Git Bash operate in entirely different environments but they treat paths similarly (using the Unix-style forward slashes as path delimiters and using the drive letter as part of the path).

If you add Cygwin to your PATH (on my box, D:\cygwin\bin), you can use those Unix tools from the Windows command shell (not just in your Cygwin Terminal environment). The particular tool we're interested in is the "pwd" command.

Navigate to your Git repo on disk.

For me this is E:\Users\Doug\Dropbox\Classes\12f\cs371p-oop\projects\cs371p-collatz

Shift-Right-Click and select "Open command window here" (see below)
Type pwd at your prompt, to get the directory name you're currently in according to Cygwin.

E:\Users\Doug\Dropbox\Classes\12f\cs371p-oop\projects\cs371p-collatz>pwd

/cygdrive/e/Users/Doug/Dropbox/Classes/12f/cs371p-oop/projects/cs371p-collatz

Copy the part starting with your drive letter, leaving off the /cygdrive prefix (e.g. /c/, or /e/ in my case)
Run "Git Bash" from your Start Menu
Type "cd " and paste the path you copied. Hit Enter and you should see something like the following:

Doug@DEFIANT ~

$ cd /e/Users/Doug/Dropbox/Classes/12f/cs371p-oop/projects/cs371p-collatz

Doug@DEFIANT /e/Users/Doug/Dropbox/Classes/12f/cs371p-oop/projects/cs371p-collatz (master)

Congratulations, you're now exactly where you want to be, in your Git Bash, and you're ready to Git away.
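
The copy-and-trim step is simple enough to automate. As a minimal sketch (the function names are made up for illustration, and the code assumes an absolute Windows path with a drive letter), the two conversions look like this in C++:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Converts a Windows path such as E:\Users\Doug into the form
// Git Bash expects, /e/Users/Doug.
std::string to_git_bash_path(const std::string& windows_path) {
    assert(windows_path.size() >= 2 && windows_path[1] == ':');
    assert(std::isalpha(static_cast<unsigned char>(windows_path[0])));
    std::string result = "/";
    result += static_cast<char>(
        std::tolower(static_cast<unsigned char>(windows_path[0])));
    for (std::size_t i = 2; i < windows_path.size(); ++i)
        result += (windows_path[i] == '\\') ? '/' : windows_path[i];
    return result;
}

// Strips Cygwin's /cygdrive prefix, which is the manual copy step above.
// Paths without the prefix are returned unchanged.
std::string cygwin_to_git_bash_path(const std::string& cygwin_path) {
    const std::string prefix = "/cygdrive";
    if (cygwin_path.compare(0, prefix.size(), prefix) == 0)
        return cygwin_path.substr(prefix.size());
    return cygwin_path;
}
```

Either function produces the string to paste after "cd " in Git Bash.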

Yes it's convoluted but you should be fine as long as you keep your command window open while you're working. Otherwise you'll have to go through the whole process again.

Okay fine, it's definitely more fun to hack around with things like this than to actually use them. I'm thinking that I'll work on a general solution based on the below section.

General Solution to Context Menu

I haven't quite figured out an actual solution to the context menu problem yet, but I'm thinking the following article may yield a solution after some study:

http://www.howtogeek.com/howto/windows-vista/how-to-clean-up-your-messy-windows-context-menu/

Basically, what I want is to get rid of most of the context menu options and be left with just "Git Bash", which only appears when you "Shift-Right-Click," like when you want to launch a command window at the current location:

I'll give that a shot later. Maybe...

Sunday, September 2, 2012

Week 1 (Wed 8/29 - Sun 9/2)

While the first week of classes usually feels pretty slow, this semester it's been pretty hectic. This is my first time living off campus, and commuting to campus from my apartment every day (sometimes multiple times a day) has been exhausting. I'm trying to work out how I can best spend the time between my classes, which I've never really had to do before since I could just go back to my dorm to work for a while. Additionally, with my new fitness goals this semester, I've been going to the gym more often -- and that hasn't done much to help the fatigue.

Normally, I would wish that Labor Day would come later on in the semester, but this semester I'm happy the first weekend is actually the 3-day weekend.

Preliminary Thoughts on the Class

The class has been pretty fair so far. I'm happy that the class will involve a considerable amount of coding in C++, and I'm also happy that we're not completely abandoning Java, which is far and away my most comfortable language.

I think that learning OOP concepts in a language-agnostic environment will be a good learning experience. By comparing and contrasting C++ and Java we'll be able to explore the key ideas of the OOP paradigm, rather than focusing on how one language happens to deal with things.

From what I know, the OOP implementations in C++ and Java also differ by a considerable amount, and I'm hoping we'll get around to discussing the design decisions that went into each language's implementation of OOP, and what tradeoffs were made in those implementations.

In Class

I was skeptical that the first thing we did in class was to compare a Hello World program in C++ and Java, but I was pleasantly surprised with the level of detail with which we treated that discussion. Seeing the parallels and the differences between the languages, especially in how they deal with symbols and namespaces, was very interesting.

Even though I knew that you could technically leave out the import statements in Java and still have access to all of the libraries, it hadn't occurred to me that the Java import command was analogous to the C++ using namespace... and that there is no Java language feature which is analogous to C++ #include, because package linking in Java, IIRC, is handled at the compile command. I haven't used the command line to compile Java code for a very long time, and I rarely use packages which are not part of the Java standard library, so I'm honestly not sure how that works anymore.
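
On the C++ side, the difference can be sketched like this (an illustrative example, not code from the lecture; the function names are made up). The #include lines are what make the declarations available at all, while the using-declarations only shorten names that are already available, which is the role import plays in Java.

```cpp
#include <cassert>
#include <sstream>  // makes the declaration of std::ostringstream visible
#include <string>   // makes the declaration of std::string visible

// Without any using-declaration, names are fully qualified, much like
// writing java.util.ArrayList in Java without an import.
std::string greet_qualified(const std::string& name) {
    std::ostringstream out;
    out << "Hello, " << name << "!";
    return out.str();
}

// The using-declarations below only shorten the names. Removing the
// #include lines above would still break this function.
std::string greet_with_using(const std::string& name) {
    using std::ostringstream;
    using std::string;
    ostringstream out;
    out << "Hello, " << name << "!";
    string result = out.str();
    assert(result == greet_qualified(name));
    return result;
}
```

Both functions return the same string; only the spelling of the type names differs.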

The class definitely got interesting fast when we started discussing the project. I had heard of the Collatz Conjecture before because of a Relevant XKCD, but I hadn't really thought about the problem too much since then.

The fact that any input whose sequence terminates at 1 must first reach a power of 2 seems like a very powerful result. Interestingly, before Prof. Downing made that point, I was thinking about that fact in terms of a binary RegEx (numbers whose binary representation matches /^10*$/). When Prof. Downing observed that the powers of two act like magnets for the inputs to the sequence, that suddenly made the result seem even more significant.
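
The binary pattern translates directly into a bit trick. As a minimal sketch (the function names are made up for illustration), a number whose binary form is a 1 followed only by 0s has exactly one set bit, so clearing its lowest set bit leaves zero:

```cpp
#include <cassert>
#include <cstdint>

// True when n is a power of two, i.e. its binary form is 1 followed by 0s.
// n & (n - 1) clears the lowest set bit; a power of two has only one.
bool is_power_of_two(std::uint64_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// First power of two reached by the Collatz sequence starting at n.
// From there the sequence only halves until it reaches 1.
std::uint64_t first_power_of_two(std::uint64_t n) {
    assert(n > 0);
    while (!is_power_of_two(n))
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
    return n;
}
```

Starting from 3, the sequence runs 3, 10, 5, 16, so 16 is the "magnet" it lands on.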

I'm definitely looking forward to working on the project, once all of the administrative stuff is out of the way.

Foot in Mouth

As a result of my embarrassing post on the class forum requesting specifics on the already well-defined requirements for the first project of the class, I realized that I've been so scatterbrained since moving into my new apartment that I have evidently forgotten how to scroll down (and/or how to read).

To be fair, I looked for those guidelines on the syllabus, and the Sphere problem description, and everywhere else I could think of... Except I never actually SCROLLED DOWN the project page. *FACEPALM*

At any rate, I admit my mistake and I know I won't neglect to read the instructions in the future.

The Project so Far

So, following that incident, I decided it was time to collect my thoughts and just dig in to the project so that I could make some headway and encounter some real problems that I might actually need to ask about.

Startup

I had absolutely no trouble upgrading my GitHub account to a GitHub Student account. It was ready immediately after I provided my school email address.

The Sphere account was also exceedingly easy to set up. I even tried the first problem (http://www.spoj.pl/problems/TEST/) to get a feel for how the system would work.

From using BlogSpot for a blog before, I know that there isn't (or at least, wasn't) a good way to put source code into blog posts here, so I searched around and found Gist (https://gist.github.com/), which lets you format code snippets complete with syntax highlighting and provides an embed link so that you can show your code on any web page, which works really nicely for putting them directly into BlogSpot posts.

So I'm now using the fact that I solved the first problem on Sphere to try pasting a code snippet here. Feel free to comment on anything about this code. I'm still struggling to find the best way to do stream I/O in C++, since there are apparently a lot of ways to read data from streams and the best method varies by the situation.

(Embedded Gist: my C++ solution to the TEST problem.)
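
For readers without the Gist handy, here is a minimal sketch of one way to solve TEST. It is not necessarily the code from the Gist. It assumes the task as stated on Sphere: copy integers from input to output and stop at the first 42. The function names echo_until_42 and run_on are made up for illustration. Using the stream's own state as the loop condition is one of the simpler ways to do stream I/O in C++.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Copies integers from `in` to `out`, one per line, stopping (without
// printing it) at the first 42, at end of input, or at malformed input.
void echo_until_42(std::istream& in, std::ostream& out) {
    int n;
    while (in >> n && n != 42)
        out << n << '\n';
}

// Convenience wrapper for trying the function on an in-memory string.
std::string run_on(const std::string& input) {
    std::istringstream in(input);
    std::ostringstream out;
    echo_until_42(in, out);
    return out.str();
}
```

For a submission, main would just call echo_until_42(std::cin, std::cout). Taking the streams as parameters keeps the function easy to unit test.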

GitHub

My source control of choice has been Mercurial for a long time, since it was built with cross-platform use in mind, and it certainly feels that way to use it. It may not be as fast or as flexible as Git, but the fact that it works so smoothly on both Windows and Unix systems, and has a really clean, unified GUI option, as well as a complete CLI, has made it a good choice for a lot of school projects so far.

I have also used Git in the past, but only when working on projects that were started by other people. I hadn't yet set it up on my new computer and I had seriously forgotten how much of a pain in the butt it was to get Git set up on Windows. And even after the setup phase is complete, it's certainly not the easiest tool to use on Windows. That's fine, since I can use it without a problem when I'm remotely logged into my Linux account, or when I'm in the lab.

Still I feel like Git is a somewhat less intuitive tool to use. All the flexibility it offers has caused some of the commands at the core of the workflow to feel a bit clunkier and less intuitive than they could be.

At the end of the day, though, I know that a working knowledge of Git will be very powerful, so I'm going to look at this as a new challenge and really dive into Git this semester. Don't have much of a choice, either, since it's required!

Progress

So far I've set up the repository, edited the README (learned some simple markdown syntax for that), and created issues for all the requirements, and a few other items that came up as I was doing the above.

Issue Tracking

It bothers me that GitHub's issue tracker doesn't have a built-in concept of priority or severity, which is pretty integral to an issue-tracking system. I can see why so many open source projects on GitHub opt to use BugZilla or some other issue tracker.

I learned how to close issues via commit messages. I haven't had a chance to do so yet, but I'm looking forward to giving it a try.

Workflow

Luckily I have some experience with committing often and using an issue tracker, so this is one aspect of the required workflow for the class projects that won't feel utterly foreign to me. Unfortunately, I haven't worked the rest of it out yet, so I've felt really cluttered while working on this class so far.

Confusion

A few items have confused me about this project so far...

The requirements state that we're supposed to write unit tests before writing any code, but it seems that the grader's repository for the unit tests isn't up yet. I'm not sure whether I should be writing the unit test code, keeping it in my own repository, and then pushing it to the unit test repository later, especially since that would seem to make the most sense for actually running the unit tests on the code. That would also let me get started on the project sooner rather than later.
If we are to complete this project in C++, why are we provided with starter Java files on the project page?