Wednesday, April 30, 2014

Basic ETL Testing Interview Questions and Answers - Part 3


Basic Data Warehouse Testing Interview Questions and Answers - Part 3

Let's Continue discussing basic Conceptual ETL or Data Warehouse Interview Questions and answers...

=================================================================
Q43. How to delete one year older file in Unix?
A43. Find ./your_dir -mtime +365 -type f -delete
Note : f-> file only, -mtime +365 -> 365 days older
=================================================================
Q44. What's the difference between delete and truncate command?
A44 : Difference are given below -
a) Delete is dml statement (which requires commit after that) while Truncate is ddl statement (which is autocommit)
b) In Delete statement we can put conditions, so we can delete whatever required; while Truncate deletes all the data in tables ( we can't put conditions here)
c) Delete activates a trigger as the individual row deletions are logged, while truncate never activates a trigger (no logs maintained) that's why rollback is possible in Delete but not in Truncate.
d) Truncate tables always locks the table while Delete Statement uses rowlock before deleting the row.
e)Truncate is faster than delete as it doesn't keep the logs.
=================================================================
Q45.Write a query for the below -
To Find out the name of the employees whose salary is greater than their
manager salary.
A45 Query-
Select EMP.name from Employee Emp, Employee Mgr
where EMP.managerid = Mgr.empid
and Emp.sal > Mgr.sal ;
=================================================================
Q46. What is BPT (Business Process testing) ?
A46. BPT stands for business process testing. It enables SMEs to design test script early in the development cycle. It consist of reusable business components which got converted into business process test.
=================================================================
Q47. How can we add partitions in Teradata? Please tell me the query.
A47. Query -
Alter table table_name
modify primary index
Add range between date '2014-04-01' and date '2015-06-01' Each interval '1' Month;
=================================================================
Q48. What is multi-file system (MFS) ? How can we copy a mfs file?
A48. Multi-file system(MFS) is a file which is divided into different files . Using MFS file we can process a big input file in efficient manner as files are divided. So, we can process the divided files parallel and make the processing faster.
We can use the below commands to copy MFS file -
m_cp file1 file2
=================================================================
Q48. You worked on Informatica then tell me the difference between FILTER and ROUTER ?
A48. There are many differences between FILTER and Router, which are given below -
a) Filter transformation has a single i/p and single o/p while Router has single i/p and multiple o/p
b) Router is like Case statement in database while filter is like where clause in database
c) Router transformation doesn't block any row i.e no. of rows equal in i/p and o/p while filter block rows i.e no. of rows in o/p <= no. of rows in i/p
=================================================================
Q49. In Abinitio graph , how can we check DML and XFR for the transformation?
A49. Right Click on Transformation Properties -> go to file layout -> check path for dml
(or embedded DML)
Right Click on Transformation Properties -> check transformation logic -> check path for XFR(or embedded logic)
=================================================================
Q50. Are you aware of IBM Tivoli Scheduler..How can get the Job stream in plan ? How can we execute that job stream even though dependencies are there?
A50. IBM TIVOLI is used to schedule the job streams which calls the wrapper script/graph to perform the job.
Go to database of job stream and then submit the particular job stream from there; it will appear in the Plan. We can increase the priority to high(101) and release all the dependencies to run that stream.
=================================================================
Q51. What do you understand by full load and incremental load?
A52. Full load is basically dumping of entire data from source table to target table . Every time when full load is happening then we are truncating the table and loading the entire data again.

Incremental load is basically loading of delta data from source to target table in
regular interval so that source and target can be synchronized. while applying the delta
into target table, Normally we do capture data change (CDC) i.e what is getting inserted,
updated or deleted.

********************************************************************************************************