Geek Languages

Wednesday 19 October 2016

Learn Hadoop

Apache Hadoop

Apache Hadoop is an open-source software framework, written in Java by Doug Cutting and Michael J. Cafarella, that supports data-intensive distributed applications and is licensed under the Apache v2 license. It supports the running of applications on large clusters of commodity hardware. Hadoop was derived from Google's MapReduce and Google File System (GFS) papers.

The name "Hadoop" was given by Doug Cutting, who named it after his son's toy elephant. Doug used the name for his open source project because it was easy to pronounce and to Google.

The Hadoop framework transparently provides both reliability and data motion to applications. Hadoop implements a computational paradigm named MapReduce, where the application is divided into many small fragments of work, each of which may be executed or re-executed on any node in the cluster. It also provides a distributed file system that stores data on the compute nodes, providing very high aggregate bandwidth across the cluster. Both MapReduce and the distributed file system are designed so that node failures are automatically handled by the framework. This enables applications to work with thousands of computation-independent computers and petabytes of data. The entire Apache Hadoop platform is commonly considered to consist of the Hadoop kernel, MapReduce and the Hadoop Distributed File System (HDFS), and a number of related projects including Apache Hive, Apache HBase, Apache Pig, ZooKeeper etc.
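To make the MapReduce idea a bit more concrete, here is a minimal sketch in plain Java that runs the map, shuffle and reduce steps in memory on a single machine. It does not use the Hadoop API at all; the class name MapReduceSketch and its method names are made up for this illustration, and a real cluster would spread these steps across many nodes.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-machine illustration of map -> shuffle -> reduce.
class MapReduceSketch {

    // Map step: emit a (word, 1) pair for every whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(token, 1));
            }
        }
        return pairs;
    }

    // Shuffle step: group all emitted values by their key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce step: sum the values collected for one key.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Runs the three steps in sequence over all input lines.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(emitted).entrySet()) {
            counts.put(e.getKey(), reduce(e.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {big=2, cluster=1, data=2, node=1}
        System.out.println(wordCount(Arrays.asList("big data big cluster", "data node")));
    }
}
```

Each call to map works on one line independently of the others, which is the property that lets a framework like Hadoop run (or re-run) those fragments on any node.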

Before you start with Hadoop, you should have prior exposure to Core Java, database concepts, and one of the Linux operating system flavors.

Monday 17 October 2016

Overview of Big Data

Ratul Arora 13:45:00 1 Comments

BIG DATA

Big data analytics is the process of examining large data sets containing a variety of data types -- i.e., big data -- to uncover hidden patterns, unknown correlations, market trends, customer preferences and other useful business information. The analytical findings can lead to more effective marketing, new revenue opportunities, better customer service, improved operational efficiency, competitive advantages over rival organizations and other business benefits.

90% of the world’s data was generated in the last few years.

What is Big Data?

Big data is a collection of large datasets that cannot be processed using traditional computing techniques. Big data is not merely data; it has become a complete subject, which involves various tools, techniques and frameworks.

What Comes Under Big Data?

Big data involves the data produced by different devices and applications. Given below are some of the fields that come under the umbrella of Big Data.

Black Box Data : The black box is a component of helicopters, airplanes, jets, etc. It captures the voices of the flight crew, recordings from microphones and earphones, and the performance information of the aircraft.
Social Media Data : Social media such as Facebook and Twitter hold information and the views posted by millions of people across the globe.
Stock Exchange Data : Stock exchange data holds information about the ‘buy’ and ‘sell’ decisions that customers make on shares of different companies.
Power Grid Data : Power grid data holds information about the power consumed by a particular node with respect to a base station.
Transport Data : Transport data includes model, capacity, distance and availability of a vehicle.
Search Engine Data : Search engines retrieve lots of data from different databases.

Benefits of Big Data

Using the information held in social networks such as Facebook, marketing agencies learn about the response to their campaigns, promotions, and other advertising mediums.
Using social media information such as the preferences and product perception of their consumers, product companies and retail organizations plan their production.
Using data about the previous medical history of patients, hospitals provide better and quicker service.

Big Data Challenges

The major challenges associated with big data are as follows:
Capturing data
Curation
Storage
Searching
Sharing
Transfer
Analysis
Presentation

Saturday 8 October 2016

What is final, finally and finalize?

Ratul Arora 12:04:00 0 Comments

final:

final is a keyword. A variable declared as final can be assigned only once and cannot be changed afterwards. Java classes declared as final cannot be extended. Methods declared as final cannot be overridden.
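Here is a small sketch of the three uses of final described above. The class and member names (FinalDemo, Constants, Base, Derived) are invented for the example. The lines that would break the rules are left as comments, because the compiler rejects them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; only the use of the final keyword matters here.
class FinalDemo {

    // A final class cannot be extended:
    // "class MoreConstants extends Constants {}" would not compile.
    static final class Constants {
        // A final variable is assigned exactly once.
        static final int MAX_RETRIES = 3;
    }

    static class Base {
        // A final method cannot be overridden in a subclass.
        final String name() {
            return "base";
        }
    }

    static class Derived extends Base {
        // String name() { return "derived"; }  // compile error: name() is final

        String describe() {
            return "derived from " + name();
        }
    }

    // final fixes the reference, not the object it points to.
    static List<String> finalReferenceStillMutable() {
        final List<String> items = new ArrayList<>();
        items.add("a");                 // allowed: the list itself is modified
        // items = new ArrayList<>();   // compile error: cannot reassign a final variable
        return items;
    }

    public static void main(String[] args) {
        System.out.println(Constants.MAX_RETRIES);
        System.out.println(new Derived().describe());
        System.out.println(finalReferenceStillMutable());
    }
}
```

Note that a final reference can still point to an object whose contents change; final only prevents the variable from being reassigned.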

finally:

finally is a block. The finally block always executes when the
try block exits. This ensures that the finally block is executed
even if an unexpected exception occurs. But finally is useful for
more than just exception handling - it allows the programmer to
avoid having cleanup code accidentally bypassed by a return,
continue, or break. Putting cleanup code in a finally block is
always a good practice, even when no exceptions are anticipated.
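The following sketch shows that the finally block runs both when the try block returns normally and when an exception is caught. The class FinallyDemo and the method parseOrDefault are hypothetical names for this example; the list passed in simply records which blocks were entered.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: records the order in which blocks execute.
class FinallyDemo {

    // Parses text as an int, returning -1 if it is not a valid number.
    // The finally block runs on both paths, even though each path returns.
    static int parseOrDefault(String text, List<String> log) {
        try {
            log.add("try");
            return Integer.parseInt(text);
        } catch (NumberFormatException e) {
            log.add("catch");
            return -1;
        } finally {
            // Cleanup code placed here is not bypassed by the returns above.
            log.add("finally");
        }
    }

    public static void main(String[] args) {
        List<String> ok = new ArrayList<>();
        System.out.println(parseOrDefault("42", ok) + " " + ok);    // prints 42 [try, finally]

        List<String> bad = new ArrayList<>();
        System.out.println(parseOrDefault("abc", bad) + " " + bad); // prints -1 [try, catch, finally]
    }
}
```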

finalize:

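finalize is a method. Before an object is garbage collected, the
runtime system may call its finalize() method, so code that releases
system resources can be placed there. The call is not guaranteed to
happen promptly, or at all, and finalize() has been deprecated since
Java 9, so cleanup is more reliably done in a finally block.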

Thursday 1 September 2016

Map Reduce Program of Wordcount

Ratul Arora 13:14:00 0 Comments

package wordcount;

import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

public class WordCount
{
    // Mapper: emits (word, 1) for every whitespace-separated token in a line.
    public static class Map extends MapReduceBase implements
            Mapper<LongWritable, Text, Text, IntWritable>
    {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        public void map(LongWritable key, Text value,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException
        {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);

            while (tokenizer.hasMoreTokens())
            {
                word.set(tokenizer.nextToken());
                output.collect(word, ONE);
            }
        }
    }

    // Reducer: sums the counts collected for each word.
    public static class Reduce extends MapReduceBase implements
            Reducer<Text, IntWritable, Text, IntWritable>
    {
        public void reduce(Text key, Iterator<IntWritable> values,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException
        {
            int sum = 0;
            while (values.hasNext())
            {
                sum += values.next().get();
            }

            output.collect(key, new IntWritable(sum));
        }
    }

    // Driver: args[0] is the input path, args[1] is the output path.
    public static void main(String[] args) throws Exception
    {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        conf.setMapperClass(Map.class);
        conf.setReducerClass(Reduce.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}

Tuesday 16 August 2016

HDFS Commands

Ratul Arora 15:17:00 1 Comments

<<<<<<COMMANDS>>>>>>>>>

hadoop fs ls:

The hadoop ls command is used to list out the directories and files. An example is shown below:

$./hadoop fs -ls input/
Found 1 items
drwxr-xr-x   - hadoop hadoop 0 2013-09-10 09:47 /input/abc.txt

-------------------------------------
hadoop fs lsr:

The hadoop lsr command recursively displays the directories, subdirectories and files in the specified directory. In newer Hadoop releases -lsr is deprecated in favor of -ls -R. The usage example is shown below:

$./hadoop fs -lsr /user/hadoop/dir
Found 2 items
drwxr-xr-x   - hadoop hadoop 0 2013-09-10 09:47 /user/hadoop/dir/products
-rw-r--r--   2 hadoop hadoop    1971684 2013-09-10 09:47 /user/hadoop/dir/products/products.dat

------------------------------------
hadoop fs cat:

Hadoop cat command is used to print the contents of the file on the terminal. The usage example of hadoop cat command is shown below:
EX:
hadoop fs -cat input/abc.txt

-------------------------------
hadoop fs chmod:

The hadoop chmod command is used to change the permissions of files. The usage is shown below:
SYNTAX:
hadoop fs -chmod <octal mode> <file or directory name>
EX:
$./hadoop fs -chmod 700 input/abc.txt
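
If the octal mode is unfamiliar, this small sketch decodes a three-digit mode into the rwx form that hadoop fs -ls prints. It is plain Java and does not call any Hadoop API; the class name OctalMode and the method toSymbolic are made up for the illustration.

```java
// Hypothetical helper: converts an octal mode such as "700" to "rwx------".
class OctalMode {

    static String toSymbolic(String octal) {
        if (!octal.matches("[0-7]{3}")) {
            throw new IllegalArgumentException("expected three octal digits: " + octal);
        }
        StringBuilder symbolic = new StringBuilder();
        // One digit each for owner, group and others.
        for (char digit : octal.toCharArray()) {
            int bits = digit - '0';
            symbolic.append((bits & 4) != 0 ? 'r' : '-'); // read
            symbolic.append((bits & 2) != 0 ? 'w' : '-'); // write
            symbolic.append((bits & 1) != 0 ? 'x' : '-'); // execute
        }
        return symbolic.toString();
    }

    public static void main(String[] args) {
        System.out.println(toSymbolic("700")); // prints rwx------
        System.out.println(toSymbolic("755")); // prints rwxr-xr-x
        System.out.println(toSymbolic("644")); // prints rw-r--r--
    }
}
```

So the mode 700 in the example gives the owner full access and removes all access for the group and for others.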

--------------------------------------------
hadoop fs chown:

The hadoop chown command is used to change the ownership of files. The usage is shown below:
SYNTAX
hadoop fs -chown <NewOwnerName> <file or directory name>
EX:
$./hadoop fs -chown hadoop input/abc.txt
---------------------------------------
hadoop fs mkdir:

The hadoop mkdir command is for creating directories in the hdfs. You can use the -p option for creating parent directories. This is similar to the unix mkdir command. The usage example is shown below:

$./hadoop fs -mkdir -p input/

The above command creates the input directory under /user/ratul, the user's home directory in HDFS.
-------------------------------
hadoop fs copyFromLocal:

The hadoop copyFromLocal command is used to copy a file from the local file system to the hadoop hdfs. The syntax and usage example are shown below:

Syntax:
hadoop fs -copyFromLocal <source> <destination>

Example:

Check the data in local file
> cat sales.txt
2000,iphone
2001, htc

Now copy this file to hdfs

$./hadoop fs -copyFromLocal /home/ratul/sales.txt input/

View the contents of the hdfs file.

$./hadoop fs -cat input/sales.txt
2000,iphone
2001, htc
-----------------------------------
hadoop fs copyToLocal:

The hadoop copyToLocal command is used to copy a file from the hdfs to the local file system. The syntax and usage example are shown below:
SYNTAX
hadoop fs -copyToLocal <source> <destination>
EX:
$./hadoop fs -copyToLocal input/sales.txt /home/ratul/

---------------------------
hadoop fs cp:

The hadoop cp command is for copying the source into the target. The cp command can also be used to copy multiple files into the target. In this case the target should be a directory. The syntax is shown below:
SYNTAX
>hadoop fs -cp <source> <destination>
EX:
$./hadoop fs -cp input/sales.txt new/

----------------------------
hadoop fs put:

Hadoop put command is used to copy one or more sources from the local file system to the destination in hdfs. Two examples of the put command are shown below:

Example 1: copy a single file to hdfs

>./hadoop fs -put /home/ratul/abc.txt input/

Example 2: copy multiple files to hdfs

>./hadoop fs -put /home/ratul/abc.txt /home/ratul/qwerty.txt /new_folder

---------------------------------
hadoop fs get:

Hadoop get command copies the files from hdfs to the local file system. The syntax of the get command is shown below:
SYNTAX:
hadoop fs -get <source_from_hdfs> <destination_to_local>
EX:
$./hadoop fs -get input/abc.txt /home/ratul/
-------------------------------------
hadoop fs moveFromLocal:

The hadoop moveFromLocal command moves a file from local file system to the hdfs directory. It removes the original source file. The usage example is shown below:
SYNTAX:
hadoop fs -moveFromLocal <source_from_local> <destination_to_hdfs>
EX:
$./hadoop fs -moveFromLocal /home/ratul/abc.txt input/

-------------------------------
hadoop fs mv:

It moves the files from source hdfs to destination hdfs. Hadoop mv command can also be used to move multiple source files into the target directory. The syntax is shown below:
SYNTAX:
hadoop fs -mv <SrcFile> <destinationFile>
EX:
$./hadoop fs -mv input/abc.txt input/a/

----------------------
hadoop fs du:

The du command displays the aggregate length of the files contained in a directory, or the length of the file if the path is just a file. A usage example is shown below:

$./hadoop fs -du abc.txt
------------------------------
hadoop fs rm:

Removes the specified list of files and empty directories. An example is shown below:

$./hadoop fs -rm input/file.txt
--------------------------------
hadoop fs rmr:

Recursively deletes the files and subdirectories. In newer Hadoop releases -rmr is deprecated in favor of -rm -r. The usage of rmr is shown below:
$./hadoop fs -rmr input/folder/
-------------------------------------
hadoop fs setrep:

Hadoop setrep is used to change the replication factor of a file. The -w flag makes the command wait until the replication completes.

Example:
$./hadoop fs -setrep -w 3 /input/abc.txt

---------------------------------
hadoop fs stat:

Hadoop stat returns the stats information on a path. The syntax of stat is shown below:
EX:
$./hadoop fs -stat /input/abc.txt
2013-09-24 07:53:04
----------------------------
hadoop fs tail:

Hadoop tail command prints the last kilobyte of the file to the terminal.
$./hadoop fs -tail /user/hadoop/abc.txt

12345 abc
2456 xyz
---------------------
hadoop fs text:

The hadoop text command displays the source file in text format. The syntax is shown below:
SYNTAX:
hadoop fs -text <src>
EX:
$./hadoop fs -text input/abc.txt

----------------------------------------
hadoop fs touchz:

The hadoop touchz command creates a zero byte file. This is similar to the touch command in unix. The syntax is shown below:
EX:
$./hadoop fs -touchz /input/aaa.txt

Wednesday 13 July 2016

Validation in Rails

Ratul Arora 14:43:00 0 Comments

class Person < ApplicationRecord
validates :name, presence: true
end

OR

class Person < ApplicationRecord
validates :name, :login, :email, presence: true
end

class Person < ApplicationRecord
validates :terms_of_service, acceptance: true
end

class Person < ApplicationRecord
validates :email, confirmation: true
end

class Product < ApplicationRecord
validates :legacy_code, format: { with: /\A[a-zA-Z]+\z/,
message: "only allows letters" }
# OR use /\A[a-zA-Z0-9]+\z/ to allow a-z, A-Z and 0-9
end

class Person < ApplicationRecord
validates :name, length: { minimum: 2 }
validates :bio, length: { maximum: 500 }
validates :password, length: { in: 6..20 }
validates :registration_number, length: { is: 6 }
end

class Player < ApplicationRecord
validates :points, numericality: true
end

class Account < ApplicationRecord
validates :email, uniqueness: true
end

Monday 4 July 2016

Rails Helper Tags

Ratul Arora 15:02:00 1 Comments

TAGS:

ERB tags               <%    %>
print ERB tags         <%= %>
ERB comment            <%# %>
if block               <% if %>...<% end %>
if / else              <% if %>...<% else %>...<% end %>
else tag               <% else %>
elsif tag              <% elsif %>
end block              <% end %>
link_to helper         <%= link_to ..., ... %>
form_for helper        <%= form_for(@) do %>

Helpers:

   Form Component         Output Code Snippet

   f.submit               <%= f.submit "Submit" %>
   f.password_field       <%= f.password_field :attribute %>
   f.text_area            <%= f.text_area :attribute %>
   f.check_box            <%= f.check_box :attribute %>
   f.label                <%= f.label :attribute, "Attribute" %>
   f.text_field           <%= f.text_field :attribute %>
   f.file_field           <%= f.file_field :attribute %>
   f.hidden_field         <%= f.hidden_field :attribute %>
