1. How can we run the graph? What is the procedure for that? How can we schedule the graph in UNIX?

To run the graph from the GDE, save it and press F5; it will run automatically. To run it from UNIX, deploy the graph as a shell script and execute that script on your UNIX box. The deployed script can then be scheduled like any other UNIX job, for example with cron or an enterprise scheduler.

2. What is a real-time data warehouse? How is it different from near to real-time data warehouse?

As the term suggests, a real-time data warehouse is a system that reflects all changes to its sources in real time. As simple as it sounds, this is still an area of active research in the field. In a traditional DWH, the operational system(s) are kept separate from the DWH for a good reason: operational systems are designed to accept inputs or changes to data regularly, and hence are regularly updated and queried. A DWH is supposed to do just the opposite; it is used only to query data for reports. No changes to data through user actions are expected (or designed for). The only inputs come from the ETL feed at stipulated times, and the ETL sources its data from the operational systems just described.

To create a real-time DWH we would have to merge both systems (several ways are being explored), a concept that goes against the very reason for creating a DWH. Bigger challenges arise in updating aggregated data in the facts in real time while still maintaining the surrogate keys. Besides, we would need lightning-fast hardware to attempt this.

A near-real-time DWH is a trade-off between the conventional design and the dream of all clients today. The frequency of ETL updates is higher in this case, e.g. once every 2 hours. We can also use selective refreshes at shorter time intervals, while complete refreshes are kept further apart. Selective refreshes look only at those tables that get updated regularly.

3. What is difference between drill & scope of analysis?

Drilling is the act of navigating between levels of detail, and can be done in four ways: drill down, drill up, drill through, and drill across. The scope of analysis is the overall view, the boundary within which the whole drill exercise takes place.
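A minimal Python sketch of drilling down and up (the sales data and level names here are invented for illustration): both operations amount to aggregating the same facts at a finer or coarser key level.

```python
from collections import defaultdict

# Hypothetical sales rows: (year, quarter, region, amount).
sales = [
    (2023, "Q1", "East", 100),
    (2023, "Q1", "West", 150),
    (2023, "Q2", "East", 120),
    (2024, "Q1", "East", 200),
]

def roll_up(rows, key_len):
    """Aggregate amounts over the first key_len levels of (year, quarter, region)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[:key_len]] += row[-1]
    return dict(totals)

# Drill down: totals at the finer (year, quarter) level.
by_quarter = roll_up(sales, 2)
# Drill up: totals at the coarser (year) level.
by_year = roll_up(sales, 1)

print(by_quarter[(2023, "Q1")])  # 250
print(by_year[(2023,)])          # 370
```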

5. For faster process, what we will do with the Universe?

For a faster process, create aggregate (summary) tables in the database and map them into the Universe, and write better SQL, so that queries read pre-summarized rows instead of scanning the detail tables.
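The aggregate-table idea can be sketched in a few lines of Python (the product data is made up): the summary is computed once, e.g. during the nightly load, so each report query becomes a single lookup instead of a scan.

```python
from collections import defaultdict

# Hypothetical detail rows: (product, region, amount).
detail = [
    ("pen", "East", 5), ("pen", "West", 7),
    ("ink", "East", 3), ("pen", "East", 4),
]

# Build the aggregate table once, during the ETL load.
agg_by_product = defaultdict(int)
for product, region, amount in detail:
    agg_by_product[product] += amount

# A report now reads one pre-summarized row per product
# instead of re-scanning every detail row.
print(agg_by_product["pen"])  # 16
```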

6. What is type 2 version dimension?

A Type 2 version dimension is an SCD Type II (slowly changing dimension) implementation: every change to a dimension attribute inserts a new version row instead of overwriting the old one. It is widely used in practice because it maintains both the current data and the full history.
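A minimal Python sketch of the SCD Type 2 mechanics (the customer data, column names, and dates are invented): the current version is closed out and a new version row is inserted with a fresh surrogate key.

```python
from datetime import date

# Hypothetical customer dimension with SCD Type 2 versioning.
dim_customer = [
    {"sk": 1, "cust_id": "XYZ", "city": "Bangalore",
     "eff_from": date(2019, 1, 1), "eff_to": None, "current": True},
]

def apply_scd2_change(dim, cust_id, new_city, change_date):
    """Close the current version and insert a new one with a new surrogate key."""
    for row in dim:
        if row["cust_id"] == cust_id and row["current"]:
            row["eff_to"] = change_date
            row["current"] = False
    next_sk = max(r["sk"] for r in dim) + 1
    dim.append({"sk": next_sk, "cust_id": cust_id, "city": new_city,
                "eff_from": change_date, "eff_to": None, "current": True})

apply_scd2_change(dim_customer, "XYZ", "Hyderabad", date(2024, 6, 1))

current = [r for r in dim_customer if r["current"]]
print(len(dim_customer), current[0]["city"])  # 2 Hyderabad
```

Both versions remain in the dimension, so history stays queryable while the current flag marks the latest values.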

7. What is unit testing?

Unit testing means that the mapping the developer created can be tested independently, by the developer individually, to verify that it produces the expected output for known input before it is integrated with the rest of the system.

8. What is Informatica Architecture?

The Informatica architecture contains the repository, the Repository Server, the Repository Server Administration Console, the sources and targets, and the Informatica Server, along with the client tools Designer, Workflow Manager, and Workflow Monitor. The combination of all these components is called the Informatica architecture.

9. What is data warehouse architecture?

A data warehouse is a repository of integrated information whose data is extracted from heterogeneous sources. The data warehouse architecture contains the different sources (e.g. Oracle, flat files, ERP), then the staging area, then the data warehouse itself; after that come the different data marts and finally the reports. It may also include an ODS (Operational Data Store). This complete stack is called the data warehouse architecture.

10. What is data analysis? Where it will be used?

Data analysis: suppose you are running a business and you store its data in some form, say in a register or on a computer. At year end, when you work out from that data whether you made a profit or a loss, that is data analysis. Where it is used: to find out which product sold the most, or, if the business is running at a loss, to analyze where we went wrong.

11. What are data modeling and data mining? Where it will be used?

Data modeling is the process of designing a database model. In a dimensional data model, data is stored in two types of table: fact tables and dimension tables.

A fact table contains the transaction data, and a dimension table contains the master data. Data mining is the process of finding hidden trends and patterns in data.
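The fact/dimension split can be sketched in Python (the customers and amounts are invented): the fact rows carry only surrogate keys and measures, and a report joins them to the dimension's master data.

```python
# Hypothetical star schema: a customer dimension (master data) and a
# sales fact table (transaction data) keyed by surrogate key.
dim_customer = {
    1: {"name": "Asha", "city": "Pune"},
    2: {"name": "Ravi", "city": "Delhi"},
}
fact_sales = [
    {"cust_sk": 1, "amount": 500},
    {"cust_sk": 2, "amount": 300},
    {"cust_sk": 1, "amount": 200},
]

# A report joins facts to the dimension to label the measures.
report = {}
for f in fact_sales:
    city = dim_customer[f["cust_sk"]]["city"]
    report[city] = report.get(city, 0) + f["amount"]

print(report)  # {'Pune': 700, 'Delhi': 300}
```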

12. What is "method/1"?

Method/1 is a systems development lifecycle methodology created by Arthur Andersen some years back.

13. After the generation of a report to whom we have to deploy or what we do after the completion of a report?

The generated report will be sent to the concerned business users through web or LAN.

14. After the complete generation of a report who will test the report and who will analyze it?

After the reports are complete, they are sent to the business analysts. They analyze the data from different points of view so that they can make proper business decisions.

15. Can you pass sql queries in filter transformation?

We cannot use SQL queries in a Filter transformation; it accepts only a filter condition expression. Unlike other transformations (Source Qualifier, Lookup), it does not allow you to override a default SQL query.

16. Where the Data cube technology is used?

Data cube technology is used in OLAP systems. A data cube is a multi-dimensional structure, a data abstraction that allows one to view aggregated data from a number of perspectives. Conceptually, the cube consists of a core, or base cuboid, surrounded by a collection of sub-cubes (cuboids) that represent the aggregation of the base cuboid along one or more dimensions. We refer to the dimension to be aggregated as the measure attribute, while the remaining dimensions are known as the feature attributes.
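A small Python sketch of the cuboid idea (the data and dimension names are invented): one aggregate is computed for every subset of the feature attributes, from the base cuboid down to the apex (grand total).

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical base data: feature attributes (product, region)
# and a measure attribute (amount).
rows = [("pen", "East", 5), ("pen", "West", 7), ("ink", "East", 3)]
dims = ("product", "region")

# Compute every cuboid: one aggregate per subset of the dimensions.
cube = {}
for r in range(len(dims) + 1):
    for keep in combinations(range(len(dims)), r):
        cuboid = defaultdict(int)
        for row in rows:
            key = tuple(row[i] for i in keep)
            cuboid[key] += row[-1]
        cube[tuple(dims[i] for i in keep)] = dict(cuboid)

print(cube[()][()])                                  # 15 (apex: grand total)
print(cube[("product",)][("pen",)])                  # 12
print(cube[("product", "region")][("pen", "East")])  # 5  (base cuboid)
```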

17. How can you implement many relations in star schema model?

Many-to-many relationships cannot be represented directly in a star schema; they are implemented by normalizing the model into a snowflake schema (for example via a bridge table between the dimensions), with as many dimension levels as needed.

18. What is critical column?

Let us take an example: suppose 'XYZ' is a customer in Bangalore who has been residing in the city for the last 5 years, during which he made purchases worth 3 lakhs. Now he moves to Hyderabad. If you simply update XYZ's city to 'HYD' in your warehouse, all of his past purchases will show under the city 'HYD'. This makes the warehouse inconsistent; here CITY is the critical column. The solution is to use a surrogate key, so that each version of the row keeps its own history.
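A minimal Python sketch of why the surrogate key fixes the problem (the rows and amounts follow the example above but are otherwise invented): because the fact rows reference the surrogate key of the version that was current at purchase time, the old purchases stay attributed to Bangalore even after the move.

```python
# Hypothetical warehouse: purchases reference the customer by surrogate
# key; the dimension keeps one row per version of the critical column.
dim_customer = [
    {"sk": 1, "cust_id": "XYZ", "city": "Bangalore"},  # old version
    {"sk": 2, "cust_id": "XYZ", "city": "HYD"},        # current version
]
fact_purchases = [
    {"cust_sk": 1, "amount": 300_000},  # the 5 years of Bangalore purchases
]

city_of = {row["sk"]: row["city"] for row in dim_customer}
sales_by_city = {}
for p in fact_purchases:
    city = city_of[p["cust_sk"]]
    sales_by_city[city] = sales_by_city.get(city, 0) + p["amount"]

# Historical purchases remain attributed to Bangalore, not HYD.
print(sales_by_city)  # {'Bangalore': 300000}
```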

19. What is the main difference between star and snowflake star schema? Which one is better and why?

Choose a snowflake schema only when you have one-to-many relationships (hierarchies) within the dimension data to normalize out; performance-wise, everyone goes for the star schema because it requires fewer joins. However, if the reporting requirement is oriented toward browsing dimension hierarchies, the snowflake schema is preferable because it provides more browsing capability than the star schema.

20. What is the difference between dependent data warehouse and independent data warehouse?

Dependent data marts are those that depend on a central data warehouse for their data. Independent data marts are those that get their data directly from the operational data sources in the organization.

22. What is Virtual Data Warehousing?

A virtual or point-to-point data warehousing strategy means that end users are allowed to access the operational databases directly, using whatever tools are enabled on the "data access network"; no separate physical warehouse is built.

23. What is the difference between metadata and data dictionary?

Metadata is nothing but data about data: it contains information about the graphs, their related files, Ab Initio commands, server information, and so on, i.e. all kinds of project-related information. A data dictionary is a specific catalogue of metadata about database objects (tables, columns, data types, constraints), usually maintained by the DBMS itself.

24. What is the difference between mapping parameter & mapping variable in data warehousing?

A mapping parameter defines a constant value; it cannot change during the session. A mapping variable defines a value that can change throughout the session, and its final value is saved to the repository for use in the next session run.

25. Explain the advantages of RAID 1, 1/0, and 5. what type of RAID setup would you put your TX logs.

The basic advantage of RAID is to speed up reading data from the permanent storage device (hard disk) and to protect against disk failure. RAID 1 (mirroring) gives redundancy and fast reads; RAID 1/0 (striped mirrors) adds the speed of striping to the redundancy of mirroring; RAID 5 (striping with distributed parity) gives redundancy with less disk overhead but slower writes. Because transaction logs are write-intensive and largely sequential, they are best placed on RAID 1 or RAID 1/0 rather than RAID 5.

26. What is a data profile?

Data profiling is a way to find out the profile of the information contained in the source. E.g., in a table a column may be defined as alphanumeric; however, the majority of the data may be numeric. Profiling tools provide statistical information about how many records have purely numeric values as against the number of records with alphanumeric data. Before a data migration exercise, these tools provide vital clues about whether the exercise is going to be a success or a failure. This can help in changing the target schema or applying cleansing at the source level, so that most of the records can get into the destination database. In DW projects these tools are used at the design stage for the same purpose. Some tool vendors who sell this as a product call it the data discovery phase.
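A minimal Python sketch of the example above (the column values are invented): a column declared alphanumeric is profiled to see what fraction of it is actually purely numeric.

```python
# Hypothetical values from a column declared alphanumeric in the source.
values = ["1001", "1002", "A-17", "1003", "1004", "B-22", "1005"]

# Count purely numeric values versus mixed alphanumeric ones.
profile = {"numeric": 0, "alphanumeric": 0}
for v in values:
    profile["numeric" if v.isdigit() else "alphanumeric"] += 1

pct_numeric = 100 * profile["numeric"] / len(values)
print(profile, round(pct_numeric, 1))  # {'numeric': 5, 'alphanumeric': 2} 71.4
```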

27. "A dimension table is wide but the fact table is deep," Explain the statement in your own words.

A dimension table is wide: it holds all the detailed information about its entity (e.g., a customer dimension table contains all the related information about customers), so it has many columns. A fact table is deep: each row contains only the surrogate keys of the dimensions along with the measures, but there are a very large number of rows, one per transaction.

28. What is the difference between aggregate table and materialized view?

Aggregate tables are pre-computed totals in the form of a hierarchical, multidimensional structure, whereas a materialized view is a database object that caches a query result in a concrete table and updates it from the original base tables from time to time. Aggregate tables are used to speed up query computation, whereas materialized views speed up data retrieval.
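The materialized-view idea can be sketched in Python (the table and query are invented): the query result is cached in a concrete structure, so reads are fast but may be stale until the next refresh.

```python
# Hypothetical base table of (product, amount) rows.
base_table = [("pen", 5), ("ink", 3)]

def run_query(table):
    """The expensive query whose result we want to cache."""
    return sum(amount for _, amount in table)

materialized = run_query(base_table)   # cached result, like a materialized view

base_table.append(("pen", 4))          # the base table changes...
stale = materialized                   # ...but the cache still holds the old total

materialized = run_query(base_table)   # periodic refresh brings it up to date
print(stale, materialized)  # 8 12
```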

29. What is the difference between OLAP and OLTP?

OLAP stands for online analytical processing. An OLAP system contains integrated, historical information to analyze; it supports multi-dimensional reports and time-based analysis and is ideal for applications with unpredictable, ad hoc query requirements. OLTP stands for online transaction processing. OLTP databases are fully normalized and are designed to consistently store operational data, one transaction at a time; they perform the day-to-day operations and do not keep historical data.

31. What is Execution Plan?

The combination of the steps the optimizer chooses to execute a statement is called an execution plan.

32. What is the function of Optimizer?

The goal of the optimizer is to choose the most efficient way to execute a SQL statement.

33. What is the effect of setting the value "CHOOSE" for OPTIMIZER_GOAL, parameter of the ALTER SESSION Command?

If statistics exist in the data dictionary for at least one of the tables accessed by the SQL statement, the optimizer chooses the cost-based approach and optimizes with the goal of best throughput; otherwise, the optimizer chooses the rule-based approach.

34. What does a Control file Contain?

A control file records the physical structure of the database. It contains the following information: the database name, the names and locations of the database's data files and redo log files, and the timestamp of database creation.

35. How do you define Data Block size?

A data block size is specified for each Oracle database when the database is created. The database uses and allocates free database space in units of Oracle data blocks. The block size is specified by the DB_BLOCK_SIZE parameter in the INIT.ORA file and cannot be changed later.