Thursday 21 September 2017

Hive Date Magic

In this post, I will share different date-handling scenarios in Hive.

1- Use the code below to convert a Julian date (yyyyDDD) to a calendar date

substr(from_unixtime(unix_timestamp('2017001', 'yyyyDDD')), 1, 10) AS trans_dt
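
For checking results outside Hive, the same conversion can be sketched in Python. This is only an illustration, not part of the Hive query; the function name julian_to_calendar is made up for this example, and the input is assumed to be a 7-character string of year plus day of year.

```python
from datetime import datetime


def julian_to_calendar(julian: str) -> str:
    """Convert a 'yyyyDDD' string such as '2017001' to 'yyyy-MM-dd'."""
    # %Y is the four-digit year, %j is the day of the year (001-366).
    return datetime.strptime(julian, "%Y%j").strftime("%Y-%m-%d")


print(julian_to_calendar("2017001"))  # 2017-01-01
```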

Hive Optimization - Multi insert

See the link below:

http://tech-terminologies.blogspot.com/2018/07/hive-optimization-using-multi-insert.html

Monday 1 May 2017

AWS DynamoDB

AWS DynamoDB is a key-value NoSQL database.

How to Run DynamoDB Locally

You can download the DynamoDB jar from the table below.

Region | Download Links | Checksums
Asia Pacific (Mumbai) Region
Asia Pacific (Singapore) Region
Asia Pacific (Tokyo) Region
EU (Frankfurt) Region
South America (São Paulo) Region
US West (Oregon) Region


Now you can run DynamoDB as a service on your local machine and start development.

Use this command to run DynamoDB locally:

java -Djava.library.path=./DynamoDBLocal_lib -jar DynamoDBLocal.jar -sharedDb

The DynamoDB service will be available at http://localhost:8000

How to Read Data from DynamoDB

  1. Download DB Browser for SQLite.
  2. Open "DB Browser for SQLite.exe".
  3. Browse to the "shared-local-instance.db" file in your DynamoDB directory.
  4. Now you can view the tables and their data.
  5. This works like any other DB client; the only difference is that the underlying database is NoSQL.
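
Because the local instance file is an ordinary SQLite database, it can also be opened from a script. The sketch below is a minimal example using Python's built-in sqlite3 module; list_tables is a hypothetical helper name, and the tables you see are DynamoDB Local's internal storage, so treat their names and layout as implementation details.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in a SQLite database file."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    # Each row is a 1-tuple holding the table name.
    return [name for (name,) in rows]
```

Calling list_tables("shared-local-instance.db") from the DynamoDB directory should list the same tables that DB Browser shows; the path is an assumption and depends on where you started the jar.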

DynamoDB Data Models

DynamoDB uses three basic data model units: Tables, Items, and Attributes. Tables are collections of Items, and Items are collections of Attributes.


Attributes are the basic units of information, like key-value pairs. Tables do not have fixed schemas associated with them. Items are like rows in an RDBMS table, except that DynamoDB requires every item to have a Primary Key. The Primary Key must be unique so that DynamoDB can find the exact item in the table. DynamoDB supports two kinds of Primary Keys:
  • Hash Type Primary Key: If a single attribute uniquely identifies an item, it can serve as the Primary Key. DynamoDB builds a hash index on that attribute to enforce uniqueness. A Hash Key is mandatory in every DynamoDB table.

  • Hash and Range Type Primary Key: This type of Primary Key is built from a hash key and a range key: a hash index on the hash key attribute, and a sorted range index on the range key attribute. This type of primary key enables AWS’s rich query capabilities, such as range conditions on the range key.

For database design, make sure you choose the hashKey and rangeKey to fit your schema.

The hash key is the search entry point, and under each hash key you can store many different values of a second attribute, called the range key.

A Query on DynamoDB is generally cheaper than a Scan in terms of cost and performance, because a Query reads only the items under one hash key while a Scan reads the entire table.
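
The difference between Query and Scan can be illustrated with a toy in-memory model. This is a sketch under simplifying assumptions, not DynamoDB's storage engine or API: ToyTable and its methods are made-up names, and the count of items read is only a rough stand-in for read cost.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for a table with a hash key and a range key."""

    def __init__(self):
        # hash key -> sorted list of range keys
        self._partitions = defaultdict(list)

    def put(self, hash_key, range_key):
        keys = self._partitions[hash_key]
        pos = bisect_left(keys, range_key)
        # Keep each (hash key, range key) pair only once.
        if pos == len(keys) or keys[pos] != range_key:
            keys.insert(pos, range_key)

    def query(self, hash_key, low, high):
        """Read one partition only, and only range keys in [low, high]."""
        keys = self._partitions.get(hash_key, [])
        hits = keys[bisect_left(keys, low):bisect_right(keys, high)]
        return [(hash_key, rk) for rk in hits], len(hits)

    def scan(self, predicate):
        """Read every item in the table and filter afterwards."""
        hits, items_read = [], 0
        for hk, keys in self._partitions.items():
            for rk in keys:
                items_read += 1
                if predicate(hk, rk):
                    hits.append((hk, rk))
        return hits, items_read


table = ToyTable()
for emp in range(100, 200):                   # 100 employees
    for ts in range(1490203400, 1490203450):  # 50 calls each
        table.put(emp, ts)

_, read_by_query = table.query(123, 1490203419, 1490203420)
_, read_by_scan = table.scan(
    lambda hk, rk: hk == 123 and 1490203419 <= rk <= 1490203420
)
print(read_by_query, read_by_scan)  # 2 5000
```

Both calls return the same two items, but the scan touches all 5000 items to find them.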

Example design: call center call details

1. employeeNo: HashKey
2. Call date: RangeKey (epoch seconds)

DynamoDB has no native date type, so store the call date as epoch seconds (a number) if you want to search against it.

sample data 
Empno: 123, calldate: 1490203419
Empno: 123, calldate: 1490203420
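
To search by date with epoch-second range keys, the date first has to be turned into a numeric range. The sketch below shows one way to do that in Python; the helper names are made up for this example, and it assumes call times are recorded in UTC.

```python
from datetime import datetime, timezone


def from_epoch_seconds(seconds):
    """Turn epoch seconds into a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def day_bounds(year, month, day):
    """Return the first and last epoch second of a UTC calendar day."""
    start = int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())
    return start, start + 86400 - 1


print(from_epoch_seconds(1490203419).isoformat())  # 2017-03-22T17:23:39+00:00
print(day_bounds(2017, 3, 22))                     # (1490140800, 1490227199)
```

Both sample call dates fall inside the bounds for 22 March 2017, so a range condition between those two numbers on the range key would return them.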

DynamoDB requires each combination of hash key and range key to be unique within a table.
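
The uniqueness rule can be pictured with a plain dictionary keyed by the (hash key, range key) pair. This is a toy sketch, not the DynamoDB API: put_item here is a hypothetical helper, and the "duration" attribute is invented for illustration. It mirrors the default behaviour where writing an item with an existing primary key replaces the old item.

```python
def put_item(table, item):
    """Store an item under its (Empno, calldate) key.

    Returns True if an item with the same key was replaced.
    """
    key = (item["Empno"], item["calldate"])
    replaced = key in table
    table[key] = item
    return replaced


calls = {}
put_item(calls, {"Empno": 123, "calldate": 1490203419, "duration": 60})
put_item(calls, {"Empno": 123, "calldate": 1490203420, "duration": 30})
# Same hash key and range key as the first item: it overwrites, not appends.
put_item(calls, {"Empno": 123, "calldate": 1490203419, "duration": 95})
print(len(calls))  # 2
```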