Saturday, October 26, 2013

Remote Debugging using Eclipse

   Your Java application is running on a remote machine, something goes haywire, and the only way you can debug it is by changing pieces of code at random and redeploying multiple times, rather than actually stepping through the code?
   Well, here is a better solution to that problem: remote debugging using Eclipse.

1. Install the Remote System Explorer

2. Establish an SSH connection to the remote server using the Remote System Explorer plugin


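3. Start the Java application on the remote machine (say hostname.company.com) with the -Xdebug and -Xrunjdwp VM arguments. With server=y the JVM listens for a debugger on the given address (port 8001 here), and with suspend=y it waits for the debugger to attach before running the application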
java -Xdebug -Xrunjdwp:transport=dt_socket,address=8001,server=y,suspend=y -jar

4. Right-click the Eclipse project that you would like to hook up to the remote debugger and open the debug configurations

5. Right-click 'Remote Java Application' and select 'New'

6. Configure the connection details: the host (hostname.company.com in this example) and the port given in the address option (8001)

7. Run the debug!
Happy bugging :)
To understand more about how it works, head over to
Java Debug Architecture
and
http://docs.oracle.com/javase/1.5.0/docs/guide/jpda/jdwp-spec.html
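For a feel of what happens on the wire: according to the JDWP specification linked above, a debugger opens the connection by sending the 14 ASCII bytes JDWP-Handshake, and the target VM answers with the same 14 bytes. Here is a rough sketch of that first step in plain Java. The class name, the method names, the localhost default and the 5 second timeout are my own assumptions for illustration; Eclipse does all of this for you.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class JdwpHandshake {
    // The 14 ASCII bytes both sides exchange before any JDWP packets flow.
    static byte[] handshakeBytes() {
        return "JDWP-Handshake".getBytes(StandardCharsets.US_ASCII);
    }

    // Connects to a JVM started with server=y and returns true if it
    // answers the handshake with the same 14 bytes.
    static boolean probe(String host, int port) throws IOException {
        byte[] hello = handshakeBytes();
        try (Socket socket = new Socket(host, port)) {
            socket.setSoTimeout(5000); // assumed timeout, in milliseconds
            OutputStream out = socket.getOutputStream();
            out.write(hello);
            out.flush();
            byte[] reply = new byte[hello.length];
            new DataInputStream(socket.getInputStream()).readFully(reply);
            return Arrays.equals(hello, reply);
        }
    }

    public static void main(String[] args) {
        // Hypothetical defaults; 8001 matches address=8001 in step 3.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8001;
        try {
            System.out.println(probe(host, port) ? "JDWP handshake OK" : "unexpected reply");
        } catch (IOException e) {
            System.out.println("could not reach a debug server: " + e.getMessage());
        }
    }
}
```

Run it as java JdwpHandshake hostname.company.com 8001. Keep in mind that closing the connection looks like a debugger detaching, so a VM started with suspend=y may begin running after the probe.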

Thursday, October 24, 2013

Aspiring Software Architect!??

A good way to start is by reading about how large systems are built. So where do you start? Right here: http://aosabook.org/en/index.html

Dozer: Bean Mapper Toolkit!

Dozer is a Java Bean to Java Bean mapper.
A neat tool for copying data from one bean to another, with the mappings defined in XML!
http://dozer.sourceforge.net/

Wednesday, October 23, 2013

Flume-ng, Hello World Quickstarter Guide!

For a bare-minimum Flume implementation we need the following components:
1. Client : Java Program : Generates the event(s), based on a changing source of data

2. Source : Java Program : Must extend 'AbstractSource'. It is the interface to the client: the source acts as a server that listens for the events generated by the client. Depending on the behaviour expected, a custom source can implement or extend one of the Flume sources in the 'org.apache.flume.source.*' packages
 Out-of-the-box sources: Avro, Exec, NetCat, Sequence Generator, Syslog, Scribe

3. Channel : Java Program : Connects the Source and the Sink, acting as an event conduit between them. There are multiple implementations of the Channel that can be used out of the box, which can be found in 'org.apache.flume.channel'. The most common one is the 'memory' channel
 Out-of-the-box channels: Memory, JDBC, File

4. Sink : Java Program : Must extend 'AbstractSink'. It collects the events coming out of the channel and writes them to a destination such as the file system. Depending on the behaviour expected, a custom sink can implement or extend one of the Flume sinks in the 'org.apache.flume.sink.*' packages
 Out-of-the-box sinks: Avro, Logger, IRC, File, HBase
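To make the roles above a bit more concrete, here is a toy model of the client, source, channel and sink hand-off in plain Java. This is NOT the Flume API: there is no AbstractSource or AbstractSink here, every class and method name is made up for illustration, and real Flume channels are transactional, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ToyFlumePipeline {
    // Channel: a bounded buffer sitting between the source and the sink.
    static final class Channel {
        private final BlockingQueue<String> queue;
        Channel(int capacity) { queue = new ArrayBlockingQueue<>(capacity); }
        boolean put(String event) { return queue.offer(event); } // false when full
        String take() { return queue.poll(); }                   // null when empty
    }

    // Source: receives events from a client and puts them on the channel.
    static final class Source {
        private final Channel channel;
        Source(Channel channel) { this.channel = channel; }
        boolean receive(String event) { return channel.put(event); }
    }

    // Sink: drains the channel and writes the events out (to a list here).
    static final class Sink {
        private final Channel channel;
        private final List<String> written = new ArrayList<>();
        Sink(Channel channel) { this.channel = channel; }
        int drain() {
            int count = 0;
            for (String e = channel.take(); e != null; e = channel.take()) {
                written.add(e);
                count++;
            }
            return count;
        }
        List<String> written() { return written; }
    }

    public static void main(String[] args) {
        Channel channel = new Channel(1000);
        Source source = new Source(channel);
        Sink sink = new Sink(channel);
        // The loop plays the part of the client generating events.
        for (int i = 1; i <= 3; i++) {
            source.receive("event-" + i);
        }
        sink.drain();
        System.out.println(sink.written()); // prints [event-1, event-2, event-3]
    }
}
```

The point of the channel is visible even in the toy version: the source and the sink never talk to each other directly, so either side can be swapped out without touching the other.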

## To demonstrate how Flume works, here is the simplest example ##
##### flume-agent.conf #####
#Agent Definition
myagent.sources = mysource
myagent.channels = mychannel
myagent.sinks = mysink

#Channel Definition
myagent.channels.mychannel.type = memory
myagent.channels.mychannel.capacity = 1000
myagent.channels.mychannel.transactionCapacity = 100

#Source Definition
myagent.sources.mysource.type = exec
myagent.sources.mysource.command = tail -F /user/shashi/Somefile.txt
myagent.sources.mysource.channels = mychannel

#Sink Definition
myagent.sinks.mysink.type = hdfs
myagent.sinks.mysink.hdfs.path =  /shashi
myagent.sinks.mysink.hdfs.fileType = DataStream
myagent.sinks.mysink.channel = mychannel
Flume command to get the agent started
##### flume command #####
#Command to start flume agent
flume-ng agent --conf-file flume-agent.conf --name myagent

Shell script to append lines to the file being tailed by Flume
max=100
for i in `seq 1 $max`
do
    echo "$i" >> /user/shashi/Somefile.txt
done
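Since the client in the component list is described as a Java program, the same event generator could also be sketched in Java. This is only a rough equivalent of the shell loop above; the class name TailFeeder and the method appendNumbers are hypothetical, and the target file is passed as an argument rather than hard-coded.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class TailFeeder {
    // Appends the numbers 1..max to the file, one per line,
    // creating the file first if it does not exist.
    static void appendNumbers(Path file, int max) throws IOException {
        for (int i = 1; i <= max; i++) {
            String line = i + System.lineSeparator();
            Files.write(file, line.getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java TailFeeder <file>");
            return;
        }
        // 100 matches max=100 in the shell script.
        appendNumbers(Paths.get(args[0]), 100);
    }
}
```

To feed the agent configured above, run it as java TailFeeder /user/shashi/Somefile.txt while the agent is up.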

Want to know more about it? Head over to the Flume user guide! http://archive.cloudera.com/cdh4/cdh/4/flume-ng/FlumeUserGuide.html#configuration

Code link for custom flume components: http://goo.gl/u0M5Yj

Monday, October 21, 2013

Google Developer!

Every developer should compete in the dev challenge by Google at least once in a lifetime :)
https://developers.google.com/

https://developers.google.com/apis-explorer/#p/

Google Dictionary API

Power packed evening with the leaders of India Inc!


Apache TIKA: Content Analysis Toolkit

Apache TIKA is a very powerful toolkit for detecting and extracting metadata and text from documents of almost any format!
http://tika.apache.org/