Apache Mahout – Machine Learning Libraries

Very cool libraries with numerous machine learning algorithms implemented.  Also, a new book from Manning on the topic was just released.

Posted in Uncategorized | Leave a comment

Use vi as a hex editor

Yesterday I had the pleasure of using vi to work with the Hadoop edit log after it became corrupted.   In the process, I also discovered a gotcha…

As always, make a backup of the target file before making edits.

To switch vi into hex mode, open the file as you normally would.  Once it opens, press the escape key and type the following, then press enter:
:%!xxd

After you have made your changes, press the escape key again and type the below followed by enter (this will exit from hex mode):
:%!xxd -r

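It is imperative that you exit hex mode before saving!  Otherwise, the hex dump text being displayed (offsets, hex data, and the ASCII column) will be saved in place of the binary content, which isn’t what you want.  Note also that xxd -r rebuilds the file from the hex column only, so changes made to the ASCII column are ignored.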
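
The same round trip can be run outside the editor, which is a handy way to confirm that a dump converts back cleanly before touching the real file.  The following is only a sketch: it assumes xxd is installed (it ships with Vim), and the file name edits.sample is a made-up stand-in, not a real Hadoop edit log.

```shell
#!/bin/sh
# Sketch of the xxd round trip outside vi. File names are hypothetical.

workdir=$(mktemp -d)
target="$workdir/edits.sample"

# Create a small binary sample file to stand in for the real target.
printf 'HADOOP\001\002\003\n' > "$target"

# Always back up the target before editing.
cp "$target" "$target.bak"

if command -v xxd >/dev/null 2>&1; then
    # Dump to the offset / hex / ASCII text form that :%!xxd displays.
    xxd "$target" > "$target.hex"
    # (Edit the hex column of "$target.hex" here.)
    # Convert the dump back to binary, as :%!xxd -r does.
    xxd -r "$target.hex" > "$target.new"
    # With no edits made, the result should match the backup byte for byte.
    if cmp -s "$target.bak" "$target.new"; then
        roundtrip=ok
    else
        roundtrip=mismatch
    fi
else
    roundtrip=skipped   # xxd is not available on this machine
fi

echo "round trip: $roundtrip"
# Remove "$workdir" when finished.
```

If the result is anything other than ok on an unedited dump, the conversion itself is the problem and the real file should be left alone.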

Posted in Uncategorized | Leave a comment

BiblioLabs Comcast Business Class Commercial

I was browsing through some files from last year and found this fun video.

BiblioLabs Comcast Commercial

Posted in Uncategorized | Leave a comment

The graveyard is full of people who are indispensable

I heard a quote earlier tonight that made an impression.  It was “the graveyard is full of people who are indispensable.”  Good advice.

Posted in Uncategorized | Leave a comment

Configure Mule ESB to use more than 4GB of memory

The other day I ran into a problem getting Mule to use more than 4GB of memory, even though it was running on 64-bit hardware and a 64-bit OS.  It was a bit of a pain to figure out since a few key steps aren’t intuitive.

First, the Mule installation is at:

/usr/java/apps/mule-standalone-2.2.1

All commands below are provided relative to this base directory.

The top of ./conf/wrapper.conf should look like this:

#********************************************************************
# System Properties
#********************************************************************
# Location of your Mule installation.
wrapper.java.additional.1=-Dmule.home="%MULE_HOME%"
wrapper.java.additional.1.stripquotes=TRUE
wrapper.java.additional.2=-Dmule.base="%MULE_BASE%"
wrapper.java.additional.2.stripquotes=TRUE
wrapper.java.additional.3=-server
wrapper.java.additional.4=-XX:MaxPermSize=512m
wrapper.java.additional.5=-XX:+CMSPermGenSweepingEnabled
wrapper.java.additional.6=-XX:+CMSClassUnloadingEnabled
wrapper.java.additional.7=-Xmx35840m

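Next, scroll down to find a line similar to the following.  Ensure that it’s set to 0, so that the wrapper does not add its own maximum heap setting and the -Xmx value from wrapper.java.additional.7 is the one that applies: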

# Maximum Java Heap Size (in MB)
wrapper.java.maxmemory=0

At this point, I restarted Mule and assumed that all would work well, but that didn’t turn out to be the case.  Unfortunately, Mule still defaults to the 32-bit wrapper and can only address 4GB of memory despite more being available.  To fix this, you need to disable the 32-bit wrapper by executing:

chmod a-x ./lib/boot/exec/wrapper-linux-x86-32
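
Before removing the execute bit, it may be worth confirming that the machine really reports a 64-bit x86 architecture, since a 32-bit machine would still need that wrapper.  A minimal sketch follows; is_x86_64 is a hypothetical helper name, not something provided by Mule or the wrapper.

```shell
#!/bin/sh
# Sketch: classify a machine name as reported by `uname -m`.
# is_x86_64 is a hypothetical helper, not part of Mule or the wrapper.
is_x86_64() {
    case "$1" in
        x86_64|amd64) return 0 ;;
        *)            return 1 ;;
    esac
}

arch=$(uname -m)
if is_x86_64 "$arch"; then
    echo "$arch: 64-bit x86, the 64-bit wrapper should be usable"
else
    echo "$arch: not recognized as 64-bit x86, leave the 32-bit wrapper alone"
fi
```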

Now restart Mule and confirm that the new changes have taken effect by executing:

ps aux | grep mule

In the results, confirm that the 64-bit wrapper (wrapper-linux-x86-64) is running and that -Xmx35840m appears on the java command line:

root     24917  0.2  0.0  17024   784 ?        Sl   18:00   0:09 
/usr/java/apps/mule-standalone-2.2.1/lib/boot/exec/wrapper-linux-x86-64
/usr/java/apps/mule-standalone-2.2.1/conf/wrapper.conf
wrapper.syslog.ident=mule wrapper.pidfile=...
root     24919 96.2 58.6 39278964 14500884 ?   Sl   18:00  60:53
java -Dmule.home=/usr/java/apps/mule-standalone-2.2.1
-Dmule.base=/usr/java/apps/mule-standalone-2.2.1 -server
-XX:MaxPermSize=512m -XX:+CMSPermGenSweepingEnabled
-XX:+CMSClassUnloadingEnabled -Xmx35840m -Djava.endorsed.dirs=...
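
As a quick sanity check on the heap figure, the -Xmx value can be pulled out of the command line and converted to gigabytes.  This is only a sketch: the sample string is abbreviated from the ps output above, and the variable names are made up for illustration.

```shell
#!/bin/sh
# Sketch: extract the -Xmx value (given in megabytes) from a java command
# line and report it in GB. The sample string is abbreviated from the ps
# output; the variable names are hypothetical.
cmdline='java -Dmule.home=/usr/java/apps/mule-standalone-2.2.1 -server -XX:MaxPermSize=512m -Xmx35840m'

xmx_mb=""
for arg in $cmdline; do
    case "$arg" in
        -Xmx*m)
            xmx_mb=${arg#-Xmx}   # strip the "-Xmx" prefix: 35840m
            xmx_mb=${xmx_mb%m}   # strip the "m" suffix:    35840
            ;;
    esac
done

if [ -n "$xmx_mb" ]; then
    echo "max heap: ${xmx_mb} MB ($((xmx_mb / 1024)) GB)"
else
    echo "no -Xmx flag in megabytes found"
fi
```

For the configuration in this post that works out to 35840 / 1024 = 35 GB, well past the 4GB ceiling of the 32-bit wrapper.
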
Posted in Technology | Leave a comment

College of Charleston Alumni Symposium

Last week I had the opportunity to speak to current CofC CS students about lessons that I have learned through my experience at various startups as part of an alumni symposium.  I primarily spoke about the lessons learned at BookSurge and BiblioLabs since those two experiences were, and continue to be, the most instructive.  The session was very informal, designed for the benefit of the students, and well executed.

One of the main points of my presentation was that a primary determinant of success is an individual’s level of engagement and genuine interest in a given field, rather than the ability to list an alphabet soup of technologies on a resume.  I believe this point was effectively made, as I had three students come to the Java Users Group later that night, a few internship applications, and a resume submitted for a full-time position.  It’s very encouraging to see that level of interest and engagement from students and equally encouraging to see the College taking the initiative to create connections between students and industry.  Thanks to all who were involved.  Here is a copy of the College of Charleston Computer Science Alumni Symposium 2011 Presentation.

Posted in Technology | 1 Comment

UML Editor for Mac OS X – astah

After trying many poor UML editors for Mac OS, I finally found one that is excellent and also has a community edition.  The editor is Astah; it has a wide variety of useful features, and the interface is pretty intuitive.  The paid version supports code generation in several languages and many other advanced features.  However, the community version allows design of the entire model and export to an image file for distribution and feedback.  Once I finish the design, I intend to purchase the full version to generate the Java code and will post the results soon.

Posted in Technology | Leave a comment

BookSurge Lessons Learned – Part 1

During the formation of BookSurge, I learned many valuable lessons that I now feel should be incorporated into the fabric of any startup from the first day.  While these may seem basic to any mature organization, they can be overlooked very easily when a startup consists of one or two people writing code and very little communication overhead exists.  The essentials are to:

  • Standardize development and deployment process
    • Standardizing these two processes is of utmost importance because without it you cannot reliably release your product.
  • Set up centralized documentation and issue management systems
    • Even for a team of one person, these tools are invaluable for tracking the plethora of issues that will inevitably arise over time.  There are many open-source options, as well as commercial products that are often worth the money.
  • Employ rapid development cycles
    • This isn’t always the right fit, but in most cases it will be appropriate for a startup.  The key is to improve the code base using rapid development cycles and correct bugs as you go.
  • Choose the right tools for the job
    • This can be difficult for some because the right tool instinctively seems to be the one you know best.  Try to perform some basic research to see if the technology stack you intend to use is well suited for the development tasks at hand.  Also, make sure the technology stack has abstractions at the proper level of granularity to make the tasks as natural as possible.
  • Have a continuous build process in place
    • Use Hudson, CruiseControl, or whatever is available for your development environment; the point is that you should have an automated build process that runs regression tests as you incrementally develop code.
  • Commit to metrics-driven testing
    • This applies to both unit and integration testing.  Almost every language has a unit testing framework, and there are also many frameworks for writing integration tests.

These are a few of the lessons that I learned through my BookSurge startup experience.  More lessons to follow soon.

Posted in Technology | 1 Comment