Monday, May 13, 2019

Azure Data Factory: Lookup and foreach example

From the data-side we are having view either in on premise SQL DB/ Azure SQL db.
From that view we'll be creating a SP say GetSalesInfo that is fetching the records.
===================================================
Alter proc demo.GetSalesInfo @ProdCat nvarchar(100)
AS
BEGIN
select h.OrderDate,h.PurchaseOrderNumber,MAX(h.SubTotal) Amount
from SalesLT.SalesOrderHeader h
inner join SalesLT.SalesOrderDetail d
on h.SalesOrderID = d.SalesOrderID
inner join SalesLT.Product p
on d.ProductID = p.ProductID
inner join SalesLT.ProductCategory pc
on pc.ProductCategoryID = p.ProductCategoryID
and pc.Name = @ProdCat
where YEAR(h.OrderDate)= 2008 and h.PurchaseOrderNumber is not null
group by h.OrderDate, h.PurchaseOrderNumber
end
go
=====================================================================
We also need to create a table and sp for logging
=====================================================================
create table demo.ETLLog
( id int identity (1,1) primary key,
DateProcessed datetime default(getdate()),
ProductCategory nvarchar(100) not null,
duration int not null)

Create or ALTER proc demo.updateETLLog
@ProdCat nvarchar(100), @duration int
as
begin
insert into demo.ETLLog(ProductCategory,duration)
values(@ProdCat ,@duration)
end
go


And we require one SP to fetch the lookup data as below

ALTER proc demo.GetProductCategry
AS
BEGIN
select productcategoryid, Name as Category
from SalesLT.ProductCategory
END

In ADF First you need to create a pipeline and then drag
a Lookup from the left pan of the designer.




See the highlighted part for the
Lookup where the category
is coming from the SP enabling
looking thru those category via
Foreach loop















To use from the Lookup value
by foreach loop, inside setting
of the foreach loop we need to
use the below formula inside
the setting tab under item.
Use Foreach as sequential mode

@activity('LookupCategories').output.value


if you double click the foreach loop then you'll be redirecting to the below screen
where you need to add a copy control first





























The input parameter if the SP to be supplied from the parameter value
@item().Category
Please note Category is one of the output of the SP demo.GetProductCategry



The sink here is destination.
When you'll go to the tab for
any data-set sink or source,
you'll be having two options
Choosing from the existing or
creation of a new.
I created the new data-set and
then I need to create the link
server
One is for source i.e. Azure SQL DB
For that you need to remember
the SQL credential i.e. UID and
pwd of Azure SQL.

For ADLS you need to add the ADF
pipeline name inside the access
tab
of ADLS data explorer and use the
manage identity option.
I'm not explaining the process for ADLS security not a great extent.
Anyways if you get any security error while testing connection for
ADLS please use the below link

For dynamic creation of files based on category inside folder of ADLS
Use the dynamic file creation
@concat('sales_',item().Category,'.txt')

Thursday, April 11, 2019

My study plan and experience for AWS Certified Solution Architect examination

First I apologies though I like to keep this post short but eventually it became a long post.

Before going into the details I like to tell I'm having 10+ years of experience and although before the preparation I've very little know-how about AWS but I had some working knowledge with Azure in data side i.e. Azure SQL DW, ADF ADLS hence I don't need to spend time to understand what is cloud why it's been used etc.
Lets come to my plan from the beginning - it took me around 4 month as my study  being interrupted.

First I created an account to Udemy and go thru couple of free courses given by Linux academy and one by backspace.

Then I searched and find the reviews for the below  course is decent AWS Certified Solutions Architect - Associate [New Exam] Created by DolfinEd .

The reason I took this course apart from feedback is that the content it covers and I don't want a short cut here. Getting a certificate is not my only target rather implement the knowledge in real life project is my aim (sorry for some GYAN :-)

Nevertheless I go thru the course religiously and find though it's lengthy but most of the topic are covered in depth. For the sheer length some of the topic I skipped like DW architecture, understandings or Database design or NoSQL introduction. I feel DAX is missing for Dynamo DB and ALB topic should be better. There should be a brief lecture for CodeDeploy, Code Pipeline etc. and as well as Athena what is it, what it does.

Another thing I didn't like about this course is instead of the quiz questions after each topic they should have small quiz section after each and remaining questions can be put into couple of practice tests. Let me explain why 'm  critical on this. Lets say a question comes after Dynamo DB in the quiz section so  your mind start guessing that the answer should be Dynamo DB. This guessing without understanding why the others can't be the best answer will not help.

Therefore I bought Whizlab test series where all the exam my score was 85%.

Then in group I saw people are recommending about Neil Davis question bank which is ful of tricky questions. There I got just 70 to 73% to all tests barring test 5 where I got fail :-(
(The pass mark for Neil Davis is 70%)

And another thing I like to share that due to my intense work pressure I can't repeat any of the examination but try to understand while reviewing why my answer is incorrect.
After the test i spend at east 1.5 to 2 hours for a particular set and raised question to the owner if I have any doubt about the answer.
Let me give an example from Neil Davis question
So according to me this kind of arguments should be in your mind as you'll hardly get any common questions in your exam. So in real exam you need to differentiate why the other options can't be correct.

Before the topics from where I got the questions in exam I'd like to add two more points.
1. Lots of people I know are complaining they don't have AWS real life experience hence they feared it'd be a road block to the exam. I have real life AWS experience but I worked mainly on server-less part Glue, Lambda, API Gateway. My DB work on majorly to build a data-warehouse in  Redshift and for real time streaming I worked on Kinesis . None of these topics are carrying high weightage in the actual exam, at least the question bank against which I've been tested! I got one simple Redshift question for which you only need to know the use cases of any Data warehouse like what kind of data they are storing. Same for Kinesis and 2 in lambda. So my work brings 5 sure points and without that also anyone can score those straight forward 5 questions! Believe me.
2. I'm an avid reader this helped me a lot. I finished the exam in less than 105 minutes. If you read faster it'll help. I know this is not an English exam but for me quite a few questions are lengthy and for some question which is 2 liner , each options were about 3 to 4 lines :-((

Now the topics what I recollected from exam.
1. ASC -- what's the minimum number of instance you need to have so even an AZ down you still can survive with 6 instances (simple arithmetic)
2. Few questions of ALB (host based , path based) and one for NLB
3. AWS Athena
4. SG vs NACL
5. DAX & other use cases differentiation between Dynamo DB and RDS
6 Cloud front
7. Elastic Cache
8. EFS (quite a few)
9. Route 53
10. AWS AD directory, Ad connector , simple AD use cases
11. SQS
12. Instance type
13. EBS type - which to use when kind of?
14. S3.
15. Redshift, Kinesis, Lambda

I didn't get any question which looked for choose 3 options- 20+ questions are about choosing 2 options though!

Please feel free to post in the comment section if anyone wants to know more. I'll answer as per my best knowledge. Good luck!

Tuesday, March 12, 2019

My experience for paying road tax and car re-registration in Bangalore RTO Kasturi Nagar

It's since a long time I wrote the last post. The intent of writing this post the way it's an utter frustration for me - hence I don't want any other person who is going to change the registration or , paying the road tax for outstation car , should be facing the same.
First if you only want to pay the road tax then you need to have the actual invoice copy, your proof of residence (RTO help-desk told they only accept voter ID, ADHAAR or BSNL bill or water tax) and insurance and pollution certificate.
If you want to manage any lack of documentation you can contact an agent but that'll cost additional money.
The first issue is that it's a government office and the help desk of RTO Kasturi Nagar is worst. If you ask any question they'd be looking as if you're asking some money from them :-)
First tip : try to go mid of weekdays like TUES, WED or THURS - the crowd will be little lesser.
Now for me the challenge was that I am changing or doing the re-registration of my vehicle hence NOC was also required. The required doc inclusive both re-registration and road tax:

  • NOC
  • Insurance paper
  • Pollution or emission certificate
  • Form 27 -CMV'
  • Form 27 -KMV
  • Form 33
  • Form 14

From the last form you need to collect SRMB report for tax calculation for your vehicle. The lady in counter 6/7 is so rude she can't help you anything - later we came to know for that you need to go to the Infantry road to collect. And to avoid the hassle you can give 500 rupee to an agent who can collect the report and give it to you.
After that RTO commissioner's signature is required and then the most dreadful part the inspection. That guy is a demon :-)
The whole RTO is willing to play a badminton game where you're the shuttle cork
As my office is on the other part of the city eventually I got hold of an agent to do the thing but if you have some time and patience you may try to do by yourself.

FEEDJIT Live Traffic Feed