Differences Between Inmon and Kimball Data Warehousing Philosophies

Data Warehousing Concepts

What is the main difference between the Inmon and Kimball philosophies of data warehousing?
Kimball views data warehousing as a constituency of data marts. Data marts are focused on delivering business objectives for departments in the organization, and the data warehouse is a conformed dimension of the data marts. Hence a unified view of the enterprise can be obtained from dimensional modeling at a local, departmental level. Inmon believes in creating a data warehouse on a subject-by-subject basis. Hence the development of the data warehouse can start with data from the online store. Other subject areas can be added to the data warehouse as the need arises. Point-of-sale (POS) data can be added later if management decides it is necessary.

What is a junk dimension? What is the difference between a junk dimension and a degenerate dimension?
A "junk" dimension is a collection of random transactional codes, flags and/or text attributes that are unrelated to any particular dimension. The junk dimension is simply a structure that provides a convenient place to store the junk attributes. A degenerate dimension, by contrast, is data that is dimensional in nature but stored in a fact table.

What is the definition of a normalized and a denormalized view, and what are the differences between them?
Normalization is the process of removing redundancies. Denormalization is the process of allowing redundancies. OLTP uses the normalization process, while OLAP/DW uses the denormalized process to capture a greater level of detailed data (each and every transaction).

Why is a fact table in normal form?
A fact table consists of measurements of business requirements and foreign keys of dimension tables as per business rules. Basically, the fact table consists of the index keys of the dimension/lookup tables and the measures. Whenever we have the keys in a table, that itself implies that the table is in normal form.

What is the difference between E-R modeling and dimensional modeling?
The basic difference is that E-R modeling has both a logical and a physical model.
A dimensional model has only a physical model. E-R modeling is used for normalizing the OLTP database design. Dimensional modeling is used for denormalizing the ROLAP/MOLAP design.
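As a rough illustration of the normalized versus denormalized contrast discussed above, the following Python sketch stores the same orders both ways. The table and column names (customers, orders, city, amount) are hypothetical and chosen only for the example.

```python
# Normalized (OLTP-style): each customer attribute is stored exactly once.
customers = {
    1: {"name": "Alice", "city": "Austin"},
    2: {"name": "Bob", "city": "Boston"},
}
orders = [
    {"order_id": 100, "customer_id": 1, "amount": 25.0},
    {"order_id": 101, "customer_id": 1, "amount": 40.0},
    {"order_id": 102, "customer_id": 2, "amount": 15.0},
]


def denormalize(orders, customers):
    """Join each order with its customer, repeating customer attributes per row."""
    return [{**order, **customers[order["customer_id"]]} for order in orders]


# Denormalized (warehouse-style): redundancy is allowed so queries need no join.
wide_rows = denormalize(orders, customers)

# Total sales by city can now be read straight off the wide rows.
sales_by_city = {}
for row in wide_rows:
    sales_by_city[row["city"]] = sales_by_city.get(row["city"], 0.0) + row["amount"]
```

The wide rows repeat "Alice" and "Austin" for every order she placed, which is exactly the redundancy that normalization removes and that a denormalized design accepts in exchange for simpler queries.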
What is a conformed fact?
1. Conformed dimensions are dimensions that can be used across multiple data marts in combination with multiple fact tables.
2. Conformed facts are allowed to have the same name in separate tables and can be combined and compared mathematically.

What are the methodologies of data warehousing?
Every company has a methodology of its own. To name a few, the SDLC methodology and the AIM methodology are commonly used. Other methodologies include AMM, the World Class methodology, and many more. Most of the time we use Ralph Kimball's methodology for data warehouse design. There are two kinds of schema: star and snowflake.

What is a BUS schema?
A BUS schema is composed of a master suite of conformed dimensions and standardized definitions of facts.

A BUS schema or a BUS matrix?
A BUS matrix (in the Kimball approach) is used to identify common dimensions across business processes, i.e. it is a way of identifying conformed dimensions.

What is a data warehousing hierarchy?
Hierarchies
Hierarchies are logical structures that use ordered levels as a means of organizing data. A hierarchy can be used to define data aggregation. For example, in a time dimension, a hierarchy might aggregate data from the month level to the quarter level to the year level. A hierarchy can also be used to define a navigational drill path and to establish a family structure. Within a hierarchy, each level is logically connected to the levels above and below it. Data values at lower levels aggregate into the data values at higher levels. A dimension can be composed of more than one hierarchy. For example, in the product dimension, there might be two hierarchies: one for product categories and one for product suppliers. Dimension hierarchies also group levels from general to granular. Query tools use hierarchies to enable you to drill down into your data to view different levels of granularity. This is one of the key benefits of a data warehouse.
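The month-to-quarter-to-year aggregation described above can be sketched in a few lines of Python. This is only an illustration: the sales figures and the helper names `month_to_quarter` and `roll_up` are made up for the example.

```python
from collections import defaultdict

# Hypothetical monthly sales keyed by (year, month).
monthly_sales = {
    (2023, 1): 100, (2023, 2): 120, (2023, 3): 90,
    (2023, 4): 80, (2023, 11): 150, (2024, 1): 200,
}


def month_to_quarter(year, month):
    """Parent of a month in the time hierarchy: (year, quarter)."""
    return (year, (month - 1) // 3 + 1)


def roll_up(values, parent_of):
    """Aggregate values at one level into totals at the next higher level."""
    totals = defaultdict(int)
    for key, value in values.items():
        totals[parent_of(*key)] += value
    return dict(totals)


# Month level -> quarter level -> year level.
quarterly_sales = roll_up(monthly_sales, month_to_quarter)
yearly_sales = roll_up(quarterly_sales, lambda year, quarter: year)
```

Each call to `roll_up` moves one level up the hierarchy, so drilling down is simply a matter of looking at the dictionary for the lower level.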
When designing hierarchies, you must consider the relationships in business structures, for example a divisional, multilevel sales organization. Hierarchies impose a family structure on dimension values. For a particular level value, a value at the next higher level is its parent, and values at the next lower level are its children. These familial relationships enable analysts to access data quickly.

Levels
A level represents a position in a hierarchy. For example, a time dimension might have a hierarchy that represents data at the month, quarter, and year levels. Levels range from general to specific, with the root level as the highest or most general level. The levels in a dimension are organized into one or more hierarchies.

Level Relationships
Level relationships specify the top-to-bottom ordering of levels from most general (the root) to most specific information. They define the parent-child relationship between the levels in a hierarchy. Hierarchies are also essential components in enabling more complex query rewrites. For example, the database can aggregate existing sales revenue on a quarterly basis to a yearly aggregation when the dimensional dependencies between quarter and year are known.

What are data validation strategies for data mart validation after the loading process?
Data validation makes sure that the loaded data is accurate and meets the business requirements. Strategies are the different methods followed to meet the validation requirements.

What are the data types present in BO? And what happens if we implement a view in the designer and report?
There are three different data types: Dimension, Measure, and Detail. A view is nothing but an alias, and it can be used to resolve loops in the universe.

What is a surrogate key? Where do we use it? Explain with examples.
A surrogate key is a substitute for the natural primary key. It is just a unique identifier or number for each row that can be used as the primary key of the table. The only requirement for a surrogate primary key is that it is unique for each row in the table. Some tables have columns such as AIRPORT_NAME or CITY_NAME which are stated as the primary keys (according to the business users). Not only can these values change, but indexing on a numerical value is probably better, so you could consider creating a surrogate key called, say, AIRPORT_ID. This would be internal to the system, and as far as the client is concerned you may display only the AIRPORT_NAME.

What is a linked cube?
A cube can be stored on a single Analysis server and then defined as a linked cube on other Analysis servers. End users connected to any of these Analysis servers can then access the cube. This arrangement avoids the more costly alternative of storing and maintaining copies of a cube on multiple Analysis servers. Linked cubes can be connected using TCP/IP or HTTP. To end users a linked cube looks like a regular cube.
A cube can be partitioned in 3 ways: Replicated, Transparent, and Linked. In a linked cube the data cells can be linked to another analytical database. If an end user clicks on a data cell, they are actually linking through to another analytic database.
What is meant by metadata in the context of a data warehouse, and why is it important?
Metadata or Meta Data
Metadata is data about data. Examples of metadata include data element descriptions, data type descriptions, attribute/property descriptions, range/domain descriptions, and process/method descriptions. The repository environment encompasses all corporate metadata resources: database catalogs, data dictionaries, and navigation services. Metadata includes things like the name, length, valid values, and description of a data element. Metadata is stored in a data dictionary and repository. It insulates the data warehouse from changes in the schema of operational systems.
Metadata Synchronization
The process of consolidating, relating, and synchronizing data elements with the same or similar meaning from different systems. Metadata synchronization joins these differing elements together in the data warehouse to allow for easier access.

1. What is incremental loading? 2. What is batch processing? 3. What is a cross-reference table? 4. What is an aggregate fact table?
Incremental loading means loading the ongoing changes in the OLTP system. An aggregate table contains the [measure] values, aggregated/grouped/summed up to some level of the hierarchy.

What are the possible data marts in retail sales?
Product information, sales information.

What is the main difference between a schema in an RDBMS and schemas in a data warehouse?
RDBMS schema
- Used for OLTP systems
- Traditional and old schema
- Normalized
- Difficult to understand and navigate
- Cannot solve extract and complex problems
- Poorly modelled
DWH schema
- Used for OLAP systems
- New-generation schema
- Denormalized
- Easy to understand and navigate
- Extract and complex problems can be easily solved
- Very good model

What are the various ETL tools in the market?
1. Informatica PowerCenter
2. Ascential DataStage
3. Essbase Hyperion
4. Ab Initio
5. BO Data Integrator
6. SAS ETL
7. MS DTS
8. Oracle OWB
9. Pervasive Data Junction
10. Cognos DecisionStream

What is dimensional modelling?
In dimensional modeling, data is stored in two kinds of tables: fact tables and dimension tables. A fact table contains fact data, e.g. sales, revenue, profit. A dimension table contains dimensional data such as product ID, product name, product description.

What is a VLDB?
A VLDB is a very large database. The perception of what constitutes a VLDB continues to grow. A one-terabyte database would normally be considered to be a VLDB.

What is real-time data warehousing?
Real-time data warehousing is a combination of two things: 1) real-time activity and 2) data warehousing. Real-time activity is activity that is happening right now. The activity could be anything, such as the sale of widgets. Once the activity is complete, there is data about it. Data warehousing captures business activity data. Real-time data warehousing captures business activity data as it occurs. As soon as the business activity is complete and there is data about it, the completed activity data flows into the data warehouse and becomes available instantly. In other words, real-time data warehousing is a framework for deriving information from data as the data becomes available.
In real-time data warehousing, your warehouse contains completely up-to-date data and is synchronized with the source systems that provide the source data. In near-real-time data warehousing, there is a minimal delay between source data being generated and being available in the data warehouse. Therefore, if you want to achieve real-time or near-real-time updates to your data warehouse, you'll need to do three things:
1. Reduce or eliminate the time taken to get new and changed data out of your source systems.
2. Eliminate, or reduce as much as possible, the time required to cleanse, transform, and load your data.
3. Reduce as much as possible the time required to update your aggregates.
Starting with version 9i, and continuing with the latest 10g release, Oracle has gradually introduced features into the database to support real-time, and near-real-time, data warehousing. These features include:

- Change Data Capture
- External tables, table functions, pipelining, and the MERGE command
- Fast refresh materialized views
What is a lookup table?
When a table is used to check for the presence of some data prior to loading some other data, or the same data, into another table, the table is called a lookup table. It is also used when we want to get a related value from some other table based on a particular value. Suppose table A has two columns, emp_id and name, and table B has emp_id and address, and the target table should have emp_id, name, and address. We take table A as the source and table B as the lookup table; by matching on emp_id we get the result as three columns: emp_id, name, address.
A lookup table is nothing but a "lookup": it gives values to the referencing table (it is a reference), it is used at run time, and it saves joins and space in terms of transformations. For example, a lookup table called states provides the actual state name ("Texas") in place of TX in the output.

What is a general-purpose scheduling tool?
The general purpose of a scheduling tool may be cleansing and loading data at a specific given time. The basic purpose of the scheduling tool in a DW application is to streamline the flow of data from source to target at a specific time or based on some condition.

What type of indexing mechanism do we need to use for a typical data warehouse?
Bitmap index, function index, B-tree index, partition index, hash index, etc.
On the fact table it is best to use bitmap indexes. Dimension tables can use bitmap and/or the other types of clustered/non-clustered, unique/non-unique indexes. To my knowledge, SQL Server does not support bitmap indexes. Only Oracle supports bitmaps.

Explain the advantages of RAID 1, 1/0, and 5. What type of RAID setup would you put your TX logs on?
RAID 0 - Makes several physical hard drives look like one hard drive. No redundancy, but very fast. May be used for temporary spaces where loss of the files will not result in loss of committed data.
RAID 1 - Mirroring. Each hard drive in the drive array has a twin. Each twin has an exact copy of the other twin's data, so if one hard drive fails, the other is used to pull the data. RAID 1 is half the speed of RAID 0, and the read and write performance are good.
RAID 1/0 - Striped RAID 0, then mirrored RAID 1. Similar to RAID 1. Sometimes faster than RAID 1. Depends on vendor implementation.
RAID 5 - Great for read-only systems. Write performance is 1/3 that of RAID 1, but read is the same as RAID 1. RAID 5 is great for DW but not good for OLTP.
Hard drives are cheap now, so I always recommend RAID 1.

What does the level of granularity of a fact table signify?
It determines the amount of space required for the database. The level of granularity indicates the extent of aggregation that will be permitted to take place on the fact data. More granularity implies more aggregation potential, and vice versa.

What is data mining?
Data mining is a process of extracting hidden trends within a data warehouse. For example, an insurance data warehouse can be used to mine data for the highest-risk people to insure in a certain geographical area.

What is a degenerate dimension table?
Dimension values that are stored in the fact table are called degenerate dimensions. These dimensions do not have dimension tables of their own. A degenerate dimension is an attribute in the fact table that is neither a fact nor a key value.

How do you load the time dimension?
In a data warehouse we load the time dimension manually. Every data warehouse maintains a time dimension. It is kept at the most granular level at which the business runs (e.g. weekday, day of the month, and so on). Depending on the data loads, these time dimensions are updated: a weekly process gets updated every week and a monthly process every month.

What is an ER diagram?
The Entity-Relationship (ER) model was originally proposed by Peter Chen in 1976 [Chen76] as a way to unify the network and relational database views. Simply stated, the ER model is a conceptual data model that views the real world as entities and relationships. A basic component of the model is the Entity-Relationship diagram, which is used to visually represent data objects. Since Chen wrote his paper the model has been extended, and today it is commonly used for database design.
For the database designer, the utility of the ER model is:
It maps well to the relational model. The constructs used in the ER model can easily be transformed into relational tables.
It is simple and easy to understand with a minimum of training. Therefore, the model can be used by the database designer to communicate the design to the end user.
In addition, the model can be used as a design plan by the database developer to implement a data model in specific database management software.

What is the difference between snowflake and star schema? In what situations is a snowflake schema better than a star schema, and when is the opposite true?
Star schema and snowflake schema both serve the purpose of dimensional modeling when it comes to data warehouses. A star schema is a dimensional model with a fact table (large) and a set of dimension tables (small). The whole set-up is totally denormalized. However, in cases where the dimension tables are split into many tables, the schema is slightly inclined towards normalization (reduced redundancy and dependency), and that is where the snowflake schema comes in. The nature and purpose of the data that is to be fed to the model is the key to the question of which is better.
Star schema: It contains the dimension tables mapped around one or more fact tables. It is a denormalized model. There is no need to use complicated joins. Queries return results fast.
Snowflake schema: It is the normalized form of the star schema. It contains in-depth joins, because the tables are split into many pieces. We can easily make modifications directly in the tables. We have to use complicated joins, since we have more tables. There will be some delay in processing the query.

What is a cube in data warehousing?
Cubes are logical representations of multidimensional data. The edges of the cube contain dimension members and the body of the cube contains data values. A cube is a logical schema which contains facts and dimensions.
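To make the cube description above more concrete, here is a minimal Python sketch that treats a cube as a mapping from dimension members to data values. The dimension names, members, and sales figures are all hypothetical, and a real OLAP engine stores and indexes cubes very differently.

```python
# Hypothetical cube cells: (product, region, quarter) -> sales value.
DIMENSIONS = ("product", "region", "quarter")
cube = {
    ("widget", "east", "Q1"): 10,
    ("widget", "west", "Q1"): 5,
    ("gadget", "east", "Q1"): 7,
    ("widget", "east", "Q2"): 12,
}


def members(cube, dimension):
    """Dimension members found along one edge of the cube."""
    axis = DIMENSIONS.index(dimension)
    return sorted({key[axis] for key in cube})


def slice_total(cube, **fixed):
    """Sum the data values of all cells matching the fixed dimension members."""
    axes = {DIMENSIONS.index(name): member for name, member in fixed.items()}
    return sum(
        value
        for key, value in cube.items()
        if all(key[axis] == member for axis, member in axes.items())
    )
```

For example, `slice_total(cube, product="widget")` adds up every widget cell, while `slice_total(cube, region="east", quarter="Q1")` fixes two dimensions and sums over the remaining one.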
What is an ODS?
ODS stands for Operational Data Store. It is used to maintain and store current, up-to-date information and the transactions from the source databases taken from the OLTP system. It is directly connected to the source database systems instead of to the staging area. It is further connected to the data warehouse and, moreover, can be treated as a part of the data warehouse database.

Which columns go to the fact table and which columns go to the dimension table?
The aggregated or calculated value columns go to the fact table, and the detail information goes to the dimension table.

What are conformed dimensions?
They are dimension tables in a star schema data mart that adhere to a common structure, and therefore allow queries to be executed across star schemas. For example, the Calendar dimension is commonly needed in most data marts. By making this Calendar dimension adhere to a single structure, regardless of which data mart in your organization it is used in, you can query by date/time from one data mart to another.
Conformed dimensions are dimensions which are common to several cubes (cubes are the schemas that contain facts and dimension tables). Suppose Cube-1 contains F1, D1, D2, D3 and Cube-2 contains F2, D1, D2, D4, where F denotes facts and D denotes dimensions. Here D1 and D2 are the conformed dimensions.

What is normalization? First Normal Form, Second Normal Form, Third Normal Form?
1NF: The table should contain scalar or atomic values.
2NF: The table should be in 1NF + no partial functional dependencies.
3NF: The table should be in 2NF + no transitive dependencies.

What are non-additive facts?
Non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table. Examples: temperature, bill number, etc.
A fact table typically has two types of columns: those that contain numeric facts (often called measurements), and those that are foreign keys to dimension tables. A fact table contains either detail-level facts or facts that have been aggregated. Fact tables that contain aggregated facts are often called summary tables. A fact table usually contains facts with the same level of aggregation. Though most facts are additive, they can also be semi-additive or non-additive. Additive facts can be aggregated by simple arithmetical addition. A common example of this is sales. Non-additive facts cannot be added at all. An example of this is averages.
Semi-additive facts can be aggregated along some of the dimensions and not along others. An example of this is inventory levels, where you cannot tell what a level means simply by looking at it.

How are the dimension tables designed?
Most dimension tables are designed using normalization principles up to 2NF. In some instances they are further normalized to 3NF.
Find where data for this dimension are located. Figure out how to extract this data.
Determine how to maintain changes to this dimension. Change the fact table and DW population routines.

Why should you put your data warehouse on a different system than your OLTP system?
OLTP stands for on-line transaction processing. These systems are used to store only daily transactions, as the changes have to be made in as few places as possible. OLTP systems do not hold historical data of the organization. The data warehouse will contain the historical information about the organization.

What is a fact table?
A fact table holds the measurements of a business process together with foreign keys to the dimension tables. A dimension table, by contrast, is a table in a data warehouse whose entries describe data in a fact table. Dimension tables contain the data from which dimensions are created.

What are semi-additive and factless facts, and in which scenario will you use such kinds of fact tables?
Semi-additive: Semi-additive facts are facts that can be summed up for some of the dimensions in the fact table, but not the others. For example, suppose Current_Balance and Profit_Margin are the facts. Current_Balance is a semi-additive fact, as it makes sense to add it up for all accounts (what's the total current balance for all accounts in the bank?), but it does not make sense to add it up through time (adding up all current balances for a given account for each day of the month does not give us any useful information).

What is the level of granularity of a fact table?
Level of granularity means the level of detail that you put into the fact table in a data warehouse. For example, based on the design you can decide to store the sales data for each transaction. The level of granularity then means how much detail you are willing to store for each transactional fact: product sales recorded for each individual transaction, or aggregated up to the minute and stored at that level.

What are the different methods of loading dimension tables?
There are two types of operation: insert, if the row is not there in the dimension, and update, if it exists.
Conventional load: Before loading the data, all the table constraints will be checked against the data.
Direct load (faster loading): All the constraints will be disabled and the data will be loaded directly. Later the data will be checked against the table constraints, and the bad data won't be indexed.

What are aggregate tables?
Aggregate tables contain redundant data that is summarized from other data in the warehouse.
These are the tables which contain aggregated/summarized data, e.g. yearly or monthly sales information. These tables are used to reduce the query execution time.

What is a dimension table?
A dimension table in a data warehouse is one which contains a primary key and attributes. We call the primary keys DIMIDs (dimension IDs). A dimension table is a collection of hierarchies and categories along which the user can drill down and drill up. It contains only the textual attributes.

Why are OLTP database designs not generally a good idea for a data warehouse?
OLTP cannot store historical information about the organization. It is used for storing the details of daily transactions, while a data warehouse is a huge store of historical information obtained from different data marts for making intelligent decisions about the organization.

How to convert Java applets into an image file?

What is the difference between a hot key and a shortcut key?

Why do we override the execute method in Struts? Please give me the details.
As part of the Struts framework we can develop the Action servlet, ActionForm servlets, and other servlet classes (here "Action servlet" means a class which extends the Action class, and "ActionForm servlet" means a class which extends the ActionForm class).
In the ActionForm class we can implement validate(). This method returns an ActionErrors object, and in this method we can write the validation code. If this method returns null or an ActionErrors with size = 0, the web container will call execute() on the Action class. If it returns size > 0, execute() will not be called; instead the JSP, servlet, or HTML file given as the value of the input attribute of the <action-mapping> element in the struts-config.xml file will be executed.
In the Action class, the execute() method returns an ActionForward object. In execute() we can write return mapping.findForward("success"); where mapping is the ActionMapping object. After that it will forward the request to the "success" JSP file (here "success" is the logical name of a forward that maps to the path of the JSP file; it is defined in struts-config.xml).
What is a snapshot?
You can disconnect the report from the catalog to which it is attached by saving the report with a snapshot of the data. However, you must reconnect to the catalog if you want to refresh the data.

If the star index of a fact table gets corrupted, is it possible to load it? Then how do you load the fact table?

Are OLAP databases called decision support systems? True/false?

What is active data warehousing?
An active data warehouse provides information that enables decision-makers within an organization to manage customer relationships nimbly, efficiently, and proactively. Active data warehousing is all about integrating advanced decision support with day-to-day, even minute-to-minute, decision making in a way that increases the quality of those customer touches, which encourages customer loyalty and thus secures an organization's bottom line. The marketplace is coming of age as we progress from first-generation "passive" decision-support systems to current- and next-generation "active" data warehouse implementations.
Active data warehouse means every user can access the database at any time, 24/7.

Why is denormalization promoted in universe designing?
In a relational data model, for normalization purposes, some lookup tables are not merged into a single table. In dimensional data modeling (star schema), these tables would be merged into a single table called a DIMENSION table for performance and for slicing data. Because of this merging of tables into one large dimension table, complex intermediate joins are avoided. Dimension tables are directly joined to fact tables. Though redundancy of data occurs in the DIMENSION table, the size of the DIMENSION table is only about 15% of the FACT table. That is why denormalization is promoted in universe designing.

Explain in detail type 1, type 2 (SCD), and type 3.
Type 1: the data is overwritten, so only the current value is kept.
Type 2: current, recent, and history data are kept.
Type 3: current and recent data are kept.

What are non-additive facts, in detail?
Additive: Additive facts are facts that can be summed up through all of the dimensions in the fact table.
Semi-additive: Semi-additive facts are facts that can be summed up for some of the dimensions in the fact table, but not the others.
Non-additive: Non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table.
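The difference between summing a semi-additive fact across accounts and across time can be shown with a small Python sketch. The account names, balances, and helper functions below are hypothetical and serve only to illustrate the definitions above.

```python
# Hypothetical daily balance snapshots: (account, day) -> current balance.
balances = {
    ("acct_a", 1): 100, ("acct_a", 2): 120,
    ("acct_b", 1): 50, ("acct_b", 2): 50,
}


def total_across_accounts(balances, day):
    """Meaningful: sum a semi-additive fact over the account dimension."""
    return sum(value for (account, d), value in balances.items() if d == day)


def closing_balance(balances, account):
    """Over time, take the latest snapshot instead of summing the days."""
    latest_day = max(d for (a, d) in balances if a == account)
    return balances[(account, latest_day)]


# Summing one account over time gives a number with no business meaning.
naive_time_sum = sum(v for (a, d), v in balances.items() if a == "acct_a")
```

Here `total_across_accounts(balances, 2)` answers "how much money is in the bank on day 2", whereas `naive_time_sum` mixes two snapshots of the same money, which is why a balance is treated as semi-additive.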
DataStage

How do you do usage analysis in DataStage?

How do you remove duplicates in a server job?
1) Use a hashed file stage, or
2) if you use the sort command in UNIX (in a before-job subroutine), you can reject duplicate records using the -u parameter, or
3) use a Sort stage.

Will DataStage consider the second constraint in the transformer if the first constraint is satisfied (if link ordering is given)?
Yes.

What are constraints and derivations? Explain the process of taking a backup in DataStage. What are the different types of lookups available in DataStage?
Constraints are used to check for a condition and filter the data. Example: Cust_Id <> 0 is set as a constraint, which means that only those records meeting this condition will be processed further.
A derivation is a method of deriving the fields, for example if you need to compute a SUM, AVG, etc.

What are the difficulties faced in using DataStage? Or, what are the constraints in using DataStage?
1) What if the number of lookups is large?
2) What will happen if the job aborts for some reason while loading the data?

What is NLS in DataStage? How do we use NLS in DataStage, and what are its advantages? At the time of installation I did not choose the NLS option, and now I want to use it. What can I do? Reinstall DataStage, or first uninstall and then install once again?
Just reinstall; you will see the option to include NLS.

What is a project? Specify its various components.
You always enter DataStage through a DataStage project. When you start a DataStage client you are prompted to connect to a project. Each project contains:
DataStage jobs.
Built-in components. These are predefined components used in a job.
User-defined components. These are customized components created using the DataStage Manager or DataStage Designer.
Briefly describe the various client components.
There are four client components:
DataStage Designer. A design interface used to create DataStage applications (known as jobs). Each job specifies the data sources, the transforms required, and the destination of the data. Jobs are compiled to create executables that are scheduled by the Director and run by the Server.
DataStage Director. A user interface used to validate, schedule, run, and monitor DataStage jobs.
DataStage Manager. A user interface used to view and edit the contents of the Repository.
DataStage Administrator. A user interface used to configure DataStage projects and users.

What are the steps involved in the development of a job in DataStage?
The steps required are:
Select the data source stage depending upon the sources, e.g. flat file, database, XML, etc.
Select the required stages for the transformation logic, such as Transformer, Link Collector, Link Partitioner, Aggregator, Merge, etc.
Select the final target stage where you want to load the data, whether it is a data warehouse, data mart, ODS, staging area, etc.

How does DataStage handle user security?
We have to create users in the Administrator and give the necessary privileges to users.

What is the meaning of file extender in DataStage server jobs? Can we run a DataStage job from another job, where is that file data stored, and what is the file extender in DS jobs?
File extender means adding columns or records to an already existing file. In DataStage, we can run one DataStage job from another job.

What is the difference between the DRS and ODBC stages?
The DRS and ODBC stages are similar, as both use Open Database Connectivity to connect to a database. Performance-wise there is not much of a difference. We use the DRS stage in parallel jobs.

How do you use rank and update strategy in DataStage?

What is the max capacity of a hash file in DataStage?
Take a look at the uvconfig file:
# 64BIT_FILES - This sets the default mode used to
# create static hashed and dynamic files.
# A value of 0 results in the creation of 32-bit
# files. 32-bit files have a maximum file size of
# 2 gigabytes. A value of 1 results in the creation
# of 64-bit files (ONLY valid on 64-bit capable platforms).
# The maximum file size for 64-bit
# files is system dependent. The default behavior
# may be overridden by keywords on certain commands.
64BIT_FILES 0
How can I convert server jobs into parallel jobs?
You can't convert server to parallel! You have to rebuild the whole graph.
There is no mechanism to convert server jobs into parallel jobs. You need to redesign the jobs in the parallel environment using parallel job stages.

How big would the database be in DataStage? What is the difference between in-process and inter-process?
Regarding the database, it varies and depends upon the project. As for the second question, in-process is where the server transfers only one row at a time to the target, and inter-process means that the server sends a group of rows to the target table. Both are available on the Tunables tab page of the Administrator client component.
In-process
You can improve the performance of most DataStage jobs by turning in-process row buffering on and recompiling the job. This allows connected active stages to pass data via buffers rather than row by row.
Note: You cannot use in-process row buffering if your job uses COMMON blocks in transform functions to pass data between stages. This is not recommended practice, and it is advisable to redesign your job to use row buffering rather than COMMON blocks.
Inter-process
Use this if you are running server jobs on an SMP parallel system. This enables the job to run using a separate process for each active stage, which will run simultaneously on a separate processor.
Note: You cannot use inter-process row buffering if your job uses COMMON blocks in transform functions to pass data between stages. This is not recommended practice, and it is advisable to redesign your job to use row buffering rather than COMMON blocks.

Can you convert a snowflake schema into a star schema?

Is it possible to move the data from an Oracle warehouse to an SAP warehouse using the DataStage tool?
We can use the DataStage Extract Pack for SAP R/3 and the DataStage Load Pack for SAP BW to transfer the data from Oracle to the SAP warehouse. These plug-in packs are available with DataStage version 7.5.

How do you implement type 2 slowly changing dimensions in DataStage? Explain with an example.
We can handle SCD in the following ways.
Type 1: Just use "Insert rows Else Update rows" or "Update rows Else Insert rows" in the update action of the target.
Type 2: Use the steps as follows:
a) Use one hash file to look up the target.
b) Take 3 instances of the target.
c) Give different conditions depending on the process.
d) Give different update actions in the target.
e) Use system variables like Sysdate and Null.
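The difference between the two types can be sketched outside DataStage. The following is only an illustration in Python, not DataStage code: the function names, the row layout (`key`, `attrs`, `start_date`, `end_date`) and the in-memory containers are all assumptions made for the example. Type 1 overwrites the row; Type 2 closes the current row and inserts a new version, which is roughly what the lookup plus multiple target instances achieve in a job.

```python
from datetime import date


def apply_scd1(dim, key, attrs):
    """Type 1 (hypothetical helper): update the row if it exists, else insert it.

    No history is kept; the old attribute values are simply overwritten.
    """
    dim[key] = dict(attrs)


def apply_scd2(dim_rows, key, attrs, today):
    """Type 2 (hypothetical helper): keep history by versioning rows.

    The current row for a key is the one whose end_date is None.
    """
    current = next(
        (r for r in dim_rows if r["key"] == key and r["end_date"] is None),
        None,
    )
    if current is not None:
        if current["attrs"] == attrs:
            return  # nothing changed, so no new version is needed
        current["end_date"] = today  # expire the old version
    # insert the new current version
    dim_rows.append(
        {"key": key, "attrs": dict(attrs), "start_date": today, "end_date": None}
    )


if __name__ == "__main__":
    history = []
    apply_scd2(history, 1, {"city": "Pune"}, date(2024, 1, 1))
    apply_scd2(history, 1, {"city": "Delhi"}, date(2024, 6, 1))
    for row in history:
        print(row)
```

In a real job the "current row" test would be done through the hashed-file lookup, and the expire/insert actions would be the different update actions on the target instances.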
If a DataStage job aborts after, say, 1000 records, how to continue the job from the 1000th record after fixing?
If the error is fixed in the job where it failed, the job continues past the part that failed. If checkpointing is specified in the job sequence properties, then on restart the sequence skips what has already completed and resumes at the point of failure. This option is available in the 7.5 edition.

What is the NLS equivalent to the Oracle NLS code American_America.US7ASCII in DataStage NLS?

What is OCI?
If you mean Oracle Call Interface (OCI), it is a set of low-level APIs used to interact with Oracle databases. It allows one to use operations like logon, execute, parse etc. from a C or C++ program.

What is a hashing algorithm? Explain briefly how it works.
Hashing is key-to-address translation. This means the value of a key is transformed into a disk address by means of an algorithm, usually a relative block and anchor point within the block. How well the algorithms work is closely related to statistical probability. It sounds fancy, but these algorithms are usually quite simple and use division and remainder techniques. Any good book on database systems will have information on these techniques. It is interesting to note that these approaches are called "Monte Carlo techniques" because the behavior of the hashing or randomizing algorithms can be simulated by a roulette wheel where the slots represent the blocks and the balls represent the records (on this roulette wheel there are many balls, not just one).
A hashing algorithm takes a variable-length data message and creates a fixed-size message digest. When a one-way hashing algorithm is used to generate the message digest, the input cannot be determined from the output. In other words, it is a mathematical function coded into an algorithm that takes a variable-length string and changes it into a fixed-length string, or hash value.

Is it possible to call one job in another job in server jobs?
We cannot call one job within another in DataStage; however, we can write a wrapper to access the jobs in a stated sequence. We can also use a sequencer to sequence the series of jobs.
I think we can call a job from another job. In fact, "calling" doesn't sound right, because you attach/add the other job through job properties. You can attach zero or more jobs. The steps are: Edit -> Job Properties -> Job Control, click on Add Job and select the desired job.

What about System variables? How can we create Containers? How can we improve the performance of DataStage? What are the Job parameters? What is the difference between routine and transform and function? What are all the third party tools used in DataStage? How can we implement Lookup in DataStage Server jobs?
How can we implement Slowly Changing Dimensions in DataStage? How can we join one Oracle source and Sequential file? What are the Iconv and Oconv functions? Difference between Hashfile and Sequential File? Maximum how many characters can we give for a Job name in DataStage?
Answers to your questions in simple words:
2. Containers are nothing but a set of stages with links.
3. Turn on in-process buffering and tune the transaction size.
4. Nothing but parameters to pass at runtime.
5. (no answer)
6. There are lots of them.
7. Using a hashed file.
8. CDC stage.
9. (no answer)
10. Powerful functions for date transformation.
11. Access speed is slower in a sequential file than in a hashed file.
12. If you know, just tell me.

If I add a new environment variable in Windows, how can I access it in DataStage? Thanks in advance.
You can call it in the Designer window. Under the job properties you can add a new environment variable or use an existing one. You can view, add and access all the environment variables from Job properties in Designer.

What are the enhancements made in DataStage 7.5 compared with 7.0?
Many new stages were introduced compared to DataStage version 7.0. In server jobs we have the stored procedure stage and command stage, and a generate report option was added in the File tab. In job sequences many stages like start loop activity, end loop activity, terminate loop activity and user variables activity were introduced. In parallel jobs the surrogate key stage and stored procedure stage were introduced. For all other specifications, please refer to the manual.
To my knowledge the main enhancement is that we can generate reports in 7.5, which you can't in 7.0, and we can also import more plug-in stages in 7.5.
Complex file and Surrogate key generator stages were added in version 7.5.

What is a data set? And what is a file set?
I assume you are referring to the Lookup file set only. It is used for lookup stages only.
Dataset: DataStage Parallel Extender jobs use data sets to manage data within a job.
You can think of each link in a job as carrying a data set. The Data Set stage allows you to store data being operated on in a persistent form, which can then be used by other DataStage jobs.
FileSet: DataStage can generate and name exported files, write them to their destination, and list the files it has generated in a file whose extension is, by convention, .fs. The data files and the file that lists them are called a file set. This capability is useful because some operating systems impose a 2 GB limit on the size of a file and you need to distribute files among nodes to prevent overruns.
File Set stage: It allows you to read data from or write data to a file set. The stage can have a single input link, a single output link, and a single rejects link. It only executes in parallel mode.
Datasets are used to import the data in parallel jobs, like ODBC in server jobs.

How does the hash file do a lookup in server jobs? How does it compare the key values?
A hashed file is used for two purposes: 1. removing duplicate records, 2. reference lookups. Each record in the hashed file has 3 parts: hashed key, key header and data portion. By using the hashing algorithm and the key value, the lookup is faster.

What are the differences between DataStage 7.0 and 7.5 in server jobs?
There are a lot of differences. A lot of new stages are available in DS 7.5, for example the CDC stage, the Stored Procedure stage, etc.

Is it possible to run parallel jobs in server jobs?
No. We need a UNIX server to run parallel jobs, but we can create a job on a Windows PC.

How to handle the rejected rows in DataStage?
We can handle them by using constraints and store them in a file or DB. We can handle rejected rows in two ways with the help of constraints in a Transformer:
1) By ticking the Reject cell next to where we write our constraints in the properties of the Transformer.
2) By using REJECTED in the expression editor of the constraint.
Create a hash file as temporary storage for rejected rows. Create a link and use it as one of the outputs of the Transformer. Apply either of the two steps above on that link. All the rows which are rejected by all the constraints will go to the hash file.

What are orabulk and bcp stages?
These are plug-in stages. ORABULK is used to load bulk data into a single table of a target Oracle database. For databases other than Oracle we go for BCP, which is used to load bulk data into a single table for Microsoft SQL Server and Sybase.

How is DataStage 4.0 functionally different from the Enterprise Edition now? What are the exact changes?
There are a lot of changes in DS EE: CDC stage, Procedure stage, etc.
What is the difference between DataStage and Informatica?
The main difference between DataStage and Informatica is scalability: Informatica is more scalable than DataStage.
In my view, DataStage has fewer transformers compared to Informatica, which makes things difficult for the user while working.

DataStage from Staging to MDW is only running at 1 row per second! What do we do to remedy?
I am assuming that there are too many stages, which is causing the problem, and am providing a solution on that basis. In general, if you have too many stages (especially transformers and hash lookups), there will be a lot of overhead and the performance will degrade drastically. I would suggest writing a query instead of doing several lookups. It may seem embarrassing to have a tool and still write a query, but that is best at times. If there are too many lookups being done, ensure that you have appropriate indexes while querying. If you do not want to write the query and prefer to use intermediate stages, ensure that you eliminate data properly between stages so that data volumes do not cause overhead. So a reordering of stages might be needed for good performance.
Other things in general that could be looked into:
1) For massive transactions, set the hashing size and buffer size to appropriate values so that as much as possible is done in memory and there is no I/O overhead to disk.
2) Enable row buffering and set an appropriate size for row buffering.
3) It is important to use appropriate objects between stages for performance.

What is the User Variables activity, when is it used, how is it used and where is it used? Give a real example.
By using the User Variables activity we can create some variables in the job sequence; these variables are available to all the activities in that sequence. Most probably this activity is at the start of the job sequence.

What is the difference between buildops and subroutines?

There are three different types of user-created stages available for PX. What are they? Which would you use? What are the disadvantages of using each type?
These are the three different stages: i) Custom ii) Build iii) Wrapped.

What is the exact difference between Join, Merge and Lookup Stage?
The three stages differ mainly in the memory they use. DataStage doesn't know how large your data is, so it cannot make an informed choice whether to combine data using a Join stage or a Lookup stage. Here's how to decide which to use: if the reference datasets are big enough to cause trouble, use a join. A join does a high-speed sort on the driving and reference datasets. This can involve I/O if the data is big enough, but the I/O is all highly optimized and sequential. Once the sort is over, the join processing is very fast and never involves paging or other I/O. Unlike Join stages and Lookup stages, the Merge stage allows you to specify several reject links, as many as there are input links.
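To make the memory trade-off concrete, here is a small Python sketch (not DataStage code) of the two strategies. The function names and the dictionary-based rows are assumptions for illustration. The lookup version holds the entire reference set in memory; the join version only needs both inputs in sorted order and then walks them in step. The sort here is done in memory for brevity, whereas a real engine would spill sorted runs to disk.

```python
def lookup_join(stream, reference, key):
    """Lookup-style: load the whole reference set into a dict, then probe it."""
    table = {row[key]: row for row in reference}  # memory grows with reference size
    for row in stream:
        match = table.get(row[key])
        if match is not None:
            yield {**match, **row}


def sort_merge_join(left, right, key):
    """Join-style: sort both inputs on the key, then advance two cursors."""
    left = sorted(left, key=lambda r: r[key])
    right = sorted(right, key=lambda r: r[key])
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # emit the left row against every right row sharing this key
            k = j
            while k < len(right) and right[k][key] == lk:
                yield {**right[k], **left[i]}
                k += 1
            i += 1


if __name__ == "__main__":
    emp = [{"dept": 20, "name": "B"}, {"dept": 10, "name": "A"}]
    dept = [{"dept": 10, "dname": "HR"}, {"dept": 20, "dname": "IT"}]
    print(list(lookup_join(emp, dept, "dept")))
    print(list(sort_merge_join(emp, dept, "dept")))
```

Both functions produce the same matched rows for a reference set with unique keys; what differs is where the cost goes (one big in-memory table versus two sorts).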
To my knowledge, join and merge are both used to join two files of the same structure, whereas lookup is mainly used to compare the previous data with the current data.

Can anyone tell me how to extract data from more than one heterogeneous source, for example a sequential file, Sybase and Oracle, in a single job?
Yes, you can extract data from heterogeneous sources in DataStage using the Transformer stage. It's simple: you just need to form a link between the sources in the Transformer stage, that's it.

Can we use a shared container as a lookup in DataStage server jobs?
Yes, we can use a shared container as a lookup in server jobs. Wherever the same lookup is needed in multiple places, we develop the lookup in a shared container and then use the shared container as the lookup.

How can I specify a filter command for processing data while defining sequential file output data?
We have the after-job subroutine and the before-job subroutine, with which we can execute Unix commands. Here we can use the sort command or the filter command.

What validations do you perform after creating jobs in Designer? What are the different types of errors?
Check the parameters, check whether the input files exist, check whether the input tables exist, and also check usernames, data source names, passwords and the like.

What is the difference between DataStage and DataStage TX?
It's a critical question to answer, but one thing I can tell you is that DataStage TX is not an ETL tool and it is not a new version of DataStage 7.5. TX is used for ODS sources; this much I know.
Is the BibhudataStage Oracle plug-in better than the OCI plug-in coming from DataStage? What is BibhudataStage?

If data is partitioned in your job on key 1 and then you aggregate on key 2, what issues could arise?
The data will end up partitioned on both keys; it will hardly take any longer to execute.

If you are running 4 ways parallel and you have 10 stages on the canvas, how many processes does DataStage create?
The answer is 40. You have 10 stages and each stage can be partitioned and run on 4 nodes, which makes the total number of processes generated 40.
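Going back to the question about partitioning on key 1 and aggregating on key 2, one issue the reply does not spell out is that rows sharing the same key 2 value can land in different partitions, so each partition produces only a partial result for that value. The sketch below illustrates this in plain Python under simplifying assumptions: keys are integers, partitioning is a simple modulo, and the function and column names (`partition`, `aggregate_per_partition`, `k1`, `k2`, `amt`) are made up for the example.

```python
from collections import defaultdict


def partition(rows, key, n):
    """Assign each row to one of n partitions by key modulo n (integer keys)."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[row[key] % n].append(row)
    return parts


def aggregate_per_partition(rows, part_key, agg_key, value, n):
    """Partition on part_key, then sum `value` per agg_key inside each partition."""
    results = []
    for part in partition(rows, part_key, n):
        totals = defaultdict(int)
        for row in part:
            totals[row[agg_key]] += row[value]
        results.append(dict(totals))
    return results


if __name__ == "__main__":
    rows = [
        {"k1": 0, "k2": 7, "amt": 10},
        {"k1": 1, "k2": 7, "amt": 5},
        {"k1": 2, "k2": 8, "amt": 1},
    ]
    # Partitioned on k1: key 7 shows up in both partitions with partial sums.
    print(aggregate_per_partition(rows, "k1", "k2", "amt", 2))
    # Partitioned on k2: every k2 value is summed in exactly one partition.
    print(aggregate_per_partition(rows, "k2", "k2", "amt", 2))
```

The usual remedy is to repartition on the aggregation key before aggregating, at the cost of moving data between nodes.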
How can you do an incremental load in DataStage?
You can create a table where you store the last successful refresh time for each table/dimension. Then, in the source query, taking the delta between the last successful refresh and sysdate should give you the incremental load.
Incremental load means daily load. Whenever you select data from the source, select the records which were loaded or updated between the timestamp of the last successful load and today's load start date and time. For this you have to pass parameters for those two dates. Store the last run date and time in a file, read it through job parameters, and supply the current date and time as the second argument.

Does Enterprise Edition only add parallel processing for better performance? Are any stages/transformations available in the Enterprise Edition only?
• DataStage Standard Edition was previously called DataStage and DataStage Server Edition.
• DataStage Enterprise Edition was originally called Orchestrate, then renamed to Parallel Extender when purchased by Ascential.
• DataStage Enterprise: server jobs, sequence jobs, parallel jobs. The Enterprise Edition offers parallel processing features for scalable high-volume solutions. Designed originally for Unix, it now supports Windows, Linux and Unix System Services on mainframes.
• DataStage Enterprise MVS: server jobs, sequence jobs, parallel jobs, MVS jobs. MVS jobs are designed using an alternative set of stages that are generated into COBOL/JCL code. Jobs are developed on a Unix or Windows server and transferred to the mainframe to be compiled and run.
The first two versions share the same Designer interface but have a different set of design stages depending on the type of job you are working on. Parallel jobs have parallel stages but also accept some server stages via a container. Server jobs only accept server stages; MVS jobs only accept MVS stages.
There are some stages that are common to all types (such as aggregation) but they tend to have different fields and options within that stage. Row Merger and Row Splitter are only present in parallel stages.

How can you implement complex jobs in DataStage?
What do you mean by complex jobs? If you use more than 15 stages in a job and 10 lookup tables in a job, then you can call it a complex job.
A complex design means having more joins and more lookups; such a job design is called a complex job. We can easily implement any complex design in DataStage by following simple tips that also improve performance. There is no limit on the number of stages in a job. For better performance, use at most 20 stages in each job; if it exceeds 20 stages, go for another job. Use no more than 7 lookups per transformer; otherwise include one more transformer. Have I answered your abstract question?
How can you implement slowly changing dimensions in DataStage? Explain. Can you join a flat file and a database in DataStage? How?
Yes, we can join a flat file and a database in an indirect way. First create a job which populates the data from the database into a Sequential file and name it Seq_First. Take the flat file which you have and use a Merge stage to join these two files. The Merge stage has various join types such as Pure Inner Join, Left Outer Join, Right Outer Join etc.; you can use whichever suits your requirements.
You can implement SCDs in DataStage. SCD type 1: just use "insert rows else update rows" or "update rows else insert rows" in the update action of the target. SCD type 2: use one hash file to look up the target, take 3 instances of the target, give different conditions depending on the process, give different update actions in the target, and use system variables like sysdate and null.

What is troubleshooting in server jobs? What are the different kinds of errors encountered while running any job?

How to implement routines in DataStage? If anyone has any material for DataStage, please send it to me.
Write the routine in C or C++, create the object file and place the object in the lib directory. Now open Designer, go to Routines and configure the path and routine names.

What is the meaning of the following?
1) If an input file has an excessive number of rows and can be split up, then use standard logic to run jobs in parallel.
2) Tuning should occur on a job-by-job basis.
3) Use the power of the DBMS.
The question is not clear, but I will try to answer something. If you have SMP
machines you can use IPC, Link Collector and Link Partitioner for performance tuning. If you have cluster or MPP machines you can use parallel jobs.
What is the meaning of "Try to have the constraints in the 'Selection' criteria of the jobs itself. This will eliminate the unnecessary records even getting in before joins are made"?
It probably means that you can put the selection criteria in the WHERE clause, i.e. whatever data you need to filter, filter it out in the SQL rather than carrying it forward and then filtering it out.
Constraints are nothing but restrictions on data. Here it is a restriction on data at entry itself; as he said, it will avoid unnecessary data entering the job.

How can we ETL an Excel file to a data mart?
Take the source file (Excel file) in .csv format and apply the conditions which satisfy the data mart.

What are DataStage multi-byte and single-byte file conversions? How do we use those conversions in DataStage?

What is the difference between server jobs and parallel jobs?
• Server jobs. These are available if you have installed DataStage Server. They run on the DataStage Server, connecting to other data sources as necessary.
• Parallel jobs. These are only available if you have installed Enterprise Edition. These run on DataStage servers that are SMP, MPP, or cluster systems. They can also run on a separate z/OS (USS) machine if required.
Server jobs: These are compiled and run on DataStage Server.
Parallel jobs: These are available only if you have Enterprise Edition installed. These are compiled and run on a DataStage Unix server, and can be run in parallel on SMP, MPP, and cluster systems.

What is merge, and how to use merge?
Merge is a stage that is available in both parallel and server jobs. The Merge stage is used to join two tables (server/parallel) or two tables/datasets (parallel). Merge requires that the master table/dataset and the update table/dataset be sorted. Merge is performed on a key field, and the key field is mandatory in the master and update dataset/table.

How do we use the NLS function in DataStage? What are the advantages of the NLS function? Where can we use it? Explain briefly.
Dear User, as per the manuals and documents, we have different levels of interfaces. Can you be more specific? Like Teradata interface operators, DB2 interface operators, Oracle interface operators and SAS interface operators.
Orchestrate National Language Support (NLS) makes it possible for you to process data in international languages using Unicode character sets. International Components for Unicode (ICU) libraries support NLS functionality in Orchestrate.
Operator NLS functionality:
• Teradata Interface Operators
• switch Operator
• filter Operator
• The DB2 Interface Operators
• The Oracle Interface Operators
• The SAS-Interface Operators
• transform Operator
• modify Operator
• import and export Operators
• generator Operator
Should you need any further assistance, please let me know.

What is APT_CONFIG in DataStage?
Please do read the manuals supplied with DataStage. Anyway, APT_CONFIG_FILE (not just APT_CONFIG) refers to the configuration file that defines the nodes (including the scratch area and temp area) for the specific project. DataStage understands the architecture of the system through this file. For example, this file contains node names, disk storage information, etc.

What is OCI? And how to use the ETL tools?
OCI doesn't mean the orabulk data. It actually uses the "Oracle Call Interface" of Oracle to load the data. It is more or less the lowest level of Oracle being used for loading the data.

How can I connect my DB2 database on AS400 to DataStage? Do I need to use ODBC first to open the database connectivity and then use an adapter just for connecting between the two? Thanks a lot for any replies.
You need to configure the ODBC connectivity for the database (DB2 or AS400) in DataStage.

How can I extract data from DB2 (on IBM iSeries) to the data warehouse via DataStage as the ETL tool? I mean, do I first need to use ODBC to create connectivity and use an adapter for the extraction and transformation of data? Thanks so much if anybody
could provide an answer.
From the DB2 stage, we can extract the data in ETL.
You would need to install ODBC drivers to connect to the DB2 instance (they do not come with the regular drivers that we usually install; use the CD provided for the DB2 installation, which has the ODBC drivers to connect to DB2) and then try it out.

What is merge and how can it be done? Please explain with a simple example taking 2 tables.
Merge is used to join two tables. It takes the key columns and sorts them in ascending or descending order. Let us consider two tables, Emp and Dept. If we want to join these two tables, we have DeptNo as a common key, so we can give that column name as the key, sort DeptNo in ascending order and join the two tables.
The Merge stage is used only for flat files in the server edition.

Please list the versions of DataStage parallel and server editions and the year in which they were released.
Please do look for this kind of info on the net.

What happens if the output of a hash file is connected to a Transformer? What error does it throw?
If you connect the output of a hash file to a Transformer, it will act like a reference. There are no errors at all! It can be used in implementing SCDs.
What is Version Control?
Version Control stores different versions of DS jobs, runs different versions of the same job, reverts to a previous version of a job, and lets you view version histories.

What are the Repository Tables in DataStage and what are they?
A data warehouse is a repository (centralized as well as distributed) of data, able to answer any ad hoc, analytical, historical or complex queries. Metadata is data about data. Examples of metadata include data element descriptions, data type descriptions, attribute/property descriptions, range/domain descriptions, and process/method descriptions. The repository environment encompasses all corporate metadata resources: database catalogs, data dictionaries, and navigation services. Metadata includes things like the name, length, valid values, and description of a data element. Metadata is stored in a data dictionary and repository. It insulates the data warehouse from changes in the schema of operational systems.
In DataStage, under the Interface tab there are the Input, Output and Transfer pages. You will have 4 tabs and the last one is Build; under that you can find the TABLE NAME.
The DataStage client components are:
Administrator: administers DataStage projects and conducts housekeeping on the server.
Designer: creates DataStage jobs that are compiled into executable programs.
Director: used to run and monitor the DataStage jobs.
Manager: allows you to view and edit the contents of the repository.

What is "insert for update" in DataStage?
The question is not clear, but I think "insert to update" means the updated value is inserted as a new row to maintain history.

How can we pass parameters to a job by using a file?
You can do this by reading the parameters from a Unix file and then calling the execution of a DataStage job. The DS job has the parameters defined (which are passed from Unix).

Where does a Unix script of DataStage execute, on the client machine or on the server? Suppose it executes on the server, then how will it execute?
DataStage jobs are executed on the server machines only.
There is nothing that is stored on the client machine.

What are the default nodes for DataStage Parallel Edition?
The number of nodes depends on the number of processors in your system. If your system has two processors we will get two nodes by default.

What happens if RCP is disabled?
Runtime column propagation (RCP): If RCP is enabled for a job, and specifically for those stages whose output connects to the shared container input, then metadata will be propagated at run time, so there is no need to map it at design time.
If RCP is disabled for the job, OSH has to perform import and export every time the job runs, and the processing time of the job also increases.

I want to process 3 files sequentially, one by one. How can I do that? While processing, it should fetch the files automatically.
If the metadata for all the files is the same, then create a job with the file name as a parameter, then use the same job in a routine and call the job with different file names. Or you can create a sequencer to use the job.

Scenario-based question: Suppose 4 jobs are controlled by a sequencer (job 1, job 2, job 3, job 4). Job 1 has 10,000 rows, but after running the job only 5,000 rows have been loaded into the target table; the remaining rows are not loaded and the job aborts. How can you sort out the problem?
Suppose the job sequencer synchronizes or controls 4 jobs but job 1 has a problem. In this situation, go to Director and check what type of problem it is showing: a data type problem, a warning message, job failed or job aborted. If the job failed, it usually means a data type problem or a missing column action. So go to the Run window -> Click -> Tracing -> Performance, or in your target table -> General -> Action, where there are two options: (i) On Fail: Commit, Continue; (ii) On Skip: Commit, Continue. First check how many rows have already been loaded, then select the On Skip option and continue; for the remaining rows that were not loaded, select On Fail, Continue. Run the job again and you should get a success message.

Importance of Surrogate Key in data warehousing
A surrogate key is a primary key for a dimension table. Its main importance is that it is independent of the underlying database, i.e. a surrogate key is not affected by the changes going on in the database.
The concept of a surrogate key comes into play when there is a slowly changing dimension in a table. In such a condition there is a need for a key by which we can identify the changes made in the dimensions. These slowly changing dimensions can be of three types, namely SCD1, SCD2 and SCD3.
These are system-generated keys. Mainly they are just a sequence of numbers, or they can be alphanumeric values.

What's the difference between DataStage Developers and DataStage Designers? What are the skills required for this?
A DataStage developer is the one who codes the jobs. A DataStage designer is the one who designs the job, i.e. he deals with the blueprints and decides which stages are required in developing the code.

How do we do the automation of dsjobs?
"dsjobs" can be automated by using shell scripts on a UNIX system.
We can call a DataStage batch job from the command prompt using 'dsjob'. We can also pass all the parameters from the command prompt. Then call this shell script from any of the schedulers available on the market.
The second option is to schedule these jobs using DataStage Director.

What is DS Director used for? Did you use it?
DataStage Director is used to run the jobs and validate the jobs. We can go to DataStage Director from DataStage Designer itself.

What is DS Manager used for? Did you use it?
DataStage Manager is used for export and import. The main use of export and import is sharing jobs and projects from one project to another.
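For the dsjob automation mentioned above, a wrapper might look roughly like the following. This is a minimal sketch in Python rather than shell; the helper names, project name and parameter names are hypothetical, and the flags shown (`-run`, `-jobstatus`, `-param name=value`) are given as I recall them from the dsjob command line, so check them against the documentation for your installed version before relying on them.

```python
import subprocess


def build_dsjob_command(project, job, params=None, wait=True):
    """Build the argument list for a dsjob run (hypothetical helper).

    Assumes the `dsjob` executable is on PATH on the DataStage server.
    """
    cmd = ["dsjob", "-run"]
    if wait:
        # assumed flag: wait for the job and reflect its status in the exit code
        cmd.append("-jobstatus")
    for name, value in (params or {}).items():
        cmd += ["-param", f"{name}={value}"]
    cmd += [project, job]
    return cmd


def run_job(project, job, params=None):
    """Run the job and return the process exit code for the scheduler to inspect."""
    return subprocess.run(build_dsjob_command(project, job, params)).returncode


if __name__ == "__main__":
    # Only prints the command; nothing is executed here.
    print(build_dsjob_command("MyProject", "LoadCustomers", {"FILE_NAME": "/data/in.csv"}))
```

A scheduler would then call a script like this (or the equivalent shell script) and act on the exit code.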
What are the types of Hashed File?
Hashed files are classified broadly into 2 types:
a) Static: subdivided into 17 types based on primary key pattern.
b) Dynamic: subdivided into 2 types, i) Generic and ii) Specific.
The default hashed file is "Dynamic - Type Random 30 D".

How do you eliminate duplicate rows?
delete from table_name where rowid not in (select max(rowid) from table_name group by column_name)
(min(rowid) can be used instead of max(rowid).)
DataStage provides a Remove Duplicates stage in the Enterprise Edition. Using that stage we can eliminate the duplicates based on a key column.

What about System variables?
DataStage provides a set of variables containing useful system information that you can access from a transform or routine. System variables are read-only.
@DATE The internal date when the program started. See the Date function.
@DAY The day of the month extracted from the value in @DATE.
@FALSE The compiler replaces the value with 0.
@FM A field mark, Char(254).
@IM An item mark, Char(255).
@INROWNUM Input row counter. For use in constraints and derivations in Transformer stages.
@OUTROWNUM Output row counter (per link). For use in derivations in Transformer stages.
@LOGNAME The user login name.
@MONTH The current month extracted from the value in @DATE.
@NULL The null value.
@NULL.STR The internal representation of the null value, Char(128).
@PATH The pathname of the current DataStage project.
@SCHEMA The schema name of the current DataStage project.
@SM A subvalue mark (a delimiter used in UniVerse files), Char(252).
@SYSTEM.RETURN.CODE Status codes returned by system processes or commands.
@TIME The internal time when the program started. See the Time function.
@TM A text mark (a delimiter used in UniVerse files), Char(251).
@TRUE The compiler replaces the value with 1.
@USERNO The user number.
@VM A value mark (a delimiter used in UniVerse files), Char(253).
@WHO The name of the current DataStage project directory.
@YEAR The current year extracted from @DATE.
REJECTED Can be used in the constraint expression of a Transformer stage of an output link. REJECTED is initially TRUE, but is set to FALSE whenever an output link is successfully written.

What is DS Designer used for? Did you use it?
You use the Designer to build jobs by creating a visual design that models the flow and transformation of data from the data source through to the target warehouse. The Designer graphical interface lets you select stage icons, drop them onto the Designer work area, and add links.

What is DS Administrator used for? Did you use it?
The Administrator enables you to set up DataStage users, control the purging of the Repository, and, if National Language Support (NLS) is enabled, install and manage maps and locales.

How will you call an external function or subroutine from DataStage?
There is a DataStage option to call external programs: ExecSH.

How do you pass a filename as the parameter for a job?
During job development we can create a parameter 'FILE_NAME', and the value can be passed while running the job.
1. Go to DataStage Administrator -> Projects -> Properties -> Environment -> UserDefined. Here you can see a grid, where you can enter your parameter name and the corresponding path of the file.
2. Go to the Stage tab of the job, select the NLS tab, click on "Use Job Parameter" and select the parameter name which you entered above. The selected parameter name appears in the text box beside the "Use Job Parameter" button. Copy the parameter name from the text box and use it in your job. Keep the project default in the text box.
How to handle Date conversions in DataStage? Convert a mm/dd/yyyy format to yyyy-dd-mm?
We use:
a) the "Iconv" function - internal conversion.
b) the "Oconv" function - external conversion.
The function first offered to convert mm/dd/yyyy format to yyyy-dd-mm is Oconv(Iconv(Filedname,"D/MDY[2,2,4]"),"D-MDY[2,2,4]"). Note that the output code D-MDY[2,2,4] produces mm-dd-yyyy, not the requested yyyy-dd-mm.
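The same two-step idea (parse into an internal value, then format it out) can be sketched outside DataStage. This is a Python illustration using the standard datetime module, not DataStage BASIC; the function name is made up for the example.

```python
from datetime import datetime


def mmddyyyy_to_yyyyddmm(text):
    """Parse a mm/dd/yyyy string and re-emit it as yyyy-dd-mm."""
    # Step 1 (the Iconv role): external string -> internal date value.
    parsed = datetime.strptime(text, "%m/%d/%Y")
    # Step 2 (the Oconv role): internal date value -> output format.
    return parsed.strftime("%Y-%d-%m")


print(mmddyyyy_to_yyyyddmm("12/31/2004"))  # 2004-31-12
```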
The corrected DataStage conversion from mm/dd/yyyy format to yyyy-dd-mm is Oconv(Iconv(Filedname,"D/MDY[2,2,4]"),"D-YDM[4,2,2]").

What is the difference between an operational data store (ODS) and a data warehouse?
Data that is volatile is ODS data; data that is nonvolatile, historical and time-variant is data warehouse data. In simple terms, the ODS holds dynamic data.

How can we create Containers?
There are two types of containers:
1. Local Container
2. Shared Container
A local container is available to that particular job only, whereas shared containers can be used anywhere in the project.
Local container:
Step 1: Select the stages required.
Step 2: Edit > ConstructContainer > Local
Shared container:
Step 1: Select the stages required.
Step 2: Edit > ConstructContainer > Shared
Shared containers are stored in the SharedContainers branch of the tree structure.

How can we improve the performance of DataStage jobs?
Performance and tuning of DS jobs:
1. Establish baselines.
2. Avoid using only one flow for tuning/performance testing.
3. Work in increments.
4. Evaluate data skew.
5. Isolate and solve.
6. Distribute file systems to eliminate bottlenecks.
7. Do not involve the RDBMS in initial testing.
8. Understand and evaluate the tuning knobs available.

What are the Job parameters?
These parameters are used to provide administrative access and to change run-time values of the job. Go to Edit > Job Parameters; in the Parameters tab we can define the name, prompt, type and value.

What are all the third party tools used in DataStage?
Autosys, TNG and Event Coordinator are some of those that I know and have worked with.

How can we implement Lookup in DataStage Server jobs?
We can use a Hash File as a lookup in server jobs. The hash file needs at least one key column to be created. Hashed files store data based on a hashing algorithm and key values. The DB2 stage can also be used for lookups. In the Enterprise Edition, the Lookup stage can be used for doing lookups.

How can we join one Oracle source and a Sequential file?
Join and Lookup are used to join an Oracle source and a sequential file.

What are the Iconv and Oconv functions?
Iconv and Oconv are conversion functions, commonly used for dates. See the help for more information.
Iconv( ) - converts a string to internal storage format.
Oconv( ) - converts an expression to an output format.

What are Sequencers?
Sequencers are job control programs that execute other jobs with preset job parameters.
A sequencer allows you to synchronize the control flow of multiple activities in a job sequence. It can have multiple input triggers as well as multiple output triggers. The sequencer operates in two modes:
ALL mode. In this mode all of the inputs to the sequencer must be TRUE for any of the sequencer outputs to fire.
ANY mode. In this mode, output triggers can be fired if any of the sequencer inputs are TRUE.

What are Routines, where/how are they written, and have you written any routines before?
Routines are stored in the Routines branch of the DataStage Repository, where you can create, view or edit them. The following are the different types of routines:
1) Transform functions
2) Before-after job subroutines
3) Job Control routines

Do you know about METASTAGE?
In simple terms, metadata is data about data, and it can describe anything such as a DataStage dataset or a sequential file. MetaStage is used to handle the metadata, which is very useful for data lineage and data analysis later on. Metadata defines the type of data we are handling. These data definitions are stored in the repository and can be accessed with MetaStage.

Do you know about the INTEGRITY/QUALITY stage?
Integrity/Quality Stage is a data integration tool from Ascential which is used to standardize/integrate the data from different sources.

Explain the differences between Oracle8i/9i?
Multiprocessing, databases, and more dimensional modeling.

Did you work in a UNIX environment?
Sometimes you need to write UNIX programs that run in the background, such as batch programs, because DataStage can invoke batch processing every 24 hours. So UNIX is a must, so that we can run the UNIX programs in the background at intervals of minutes or hours.

What are Static Hash files and Dynamic Hash files?
The names themselves suggest what they mean. In general we use Type-30 dynamic Hash files. The data file has a default size of 2 GB, and the overflow file is used if the data exceeds the 2 GB size.
Hashed files have a default size established by their modulus and separation when you create them, and this can be static or dynamic. Overflow space is only used when data grows over the reserved size for one of the groups (sectors) within the file. There are as many groups as specified by the modulus.

Did you parameterize the job or hard-code the values in the jobs?
Always parameterize the job. The values come either from Job Properties or from a "Parameter Manager", a third party tool. There is no way you should hard-code parameters in your jobs. The most often parameterized variables in a job are: DB DSN name, username, password, and the dates against which the data is to be looked up.
Tell me the environment in your last projects
Give the OS of the server and the OS of the client of your most recent project.

How many jobs have you created in your last project?
100+ jobs every 6 months if you are in development; if you are in testing, 40 jobs every 6 months, although it need not be the same number for everybody.

What are XML files, how do you read data from XML files, and what stage is to be used?
In the palette there are Real Time stages such as xml-input, xml-output and xml-transformer.

Why do you use SQL LOADER or OCI STAGE?
When the source data is enormous, or for bulk data, we can use OCI or SQL Loader depending upon the source. Data transfers very quickly to the data warehouse using SQL Loader.

Suppose there are a million records: did you use OCI? If not, what stage do you prefer?
Orabulk.

How do you populate source files?
There are many ways to populate them; writing an SQL statement in Oracle is one way.

How do you pass the parameter to the job sequence if the job is running at night?
Two ways:
1. Set the default values of the parameters in the Job Sequencer and map these parameters to the job.
2. Run the job in the sequencer using the dsjob utility, where we can specify the values to be taken for each parameter.

What happens if the job fails at night?
The job sequence aborts.

What is SQL tuning? How do you do it?
In the database, using hints. SQL tuning can be done using cost-based optimization. These pfile parameters are very important: sort_area_size, sort_area_retained_size, db_file_multiblock_read_count, open_cursors, cursor_sharing, and optimizer_mode = choose/rule.

How do you track performance statistics and enhance it?
Through the Monitor we can view the performance statistics.

What is the order of execution done internally in the Transformer, with the stage editor having input links on the left hand side and output links on the right?
Stage variables, then constraints, then column derivations or expressions.
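The evaluation order in that last answer can be pictured with a small simulation. This is a minimal Python sketch of the order only (stage variables, then each link's constraint, then that link's derivations); it is not how DataStage is implemented, and every name in it is hypothetical.

```python
def run_transformer(rows, stage_variables, links):
    """Process rows in Transformer order and return the rows written per link."""
    outputs = {link["name"]: [] for link in links}
    sv = {}  # stage variables keep their values from one row to the next
    for row in rows:
        # 1. Stage variables are evaluated first, top to bottom.
        for name, expression in stage_variables:
            sv[name] = expression(row, sv)
        for link in links:
            # 2. The link constraint decides whether the row goes down this link.
            if not link["constraint"](row, sv):
                continue
            # 3. Column derivations are evaluated only for rows that passed.
            derived = {col: expr(row, sv) for col, expr in link["derivations"].items()}
            outputs[link["name"]].append(derived)
    return outputs


rows = [{"amount": 50}, {"amount": 150}]
stage_variables = [("is_large", lambda row, sv: row["amount"] > 100)]
links = [
    {
        "name": "large_orders",
        "constraint": lambda row, sv: sv["is_large"],
        "derivations": {"amount": lambda row, sv: row["amount"]},
    },
    {
        "name": "all_orders",
        "constraint": lambda row, sv: True,
        "derivations": {"amount_doubled": lambda row, sv: row["amount"] * 2},
    },
]
result = run_transformer(rows, stage_variables, links)
print(result)
```

Because the constraint reads the stage variable, it can only work if stage variables are computed before constraints, which is the point of the ordering.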
What are the difficulties faced in using DataStage, or what are the constraints in using DataStage?
1) What happens if the number of lookups is large?
2) What happens if the job aborts for some reason while loading the data?

Differentiate Database data and Data warehouse data?
Data in a database is:
a) Detailed or transactional.
b) Both readable and writable.
c) Current.

Dimension Modelling types along with their significance
Data modelling is broadly classified into 2 types:
a) E-R diagrams (Entity - Relationships).
b) Dimensional modelling.
Put another way, data modeling covers 1) E-R diagrams and 2) dimensional modeling, which has 2.a) logical modeling and 2.b) physical modeling.

What is the flow of loading data into fact & dimensional tables?
Fact table - a table with a collection of foreign keys corresponding to the primary keys in the dimension tables. It consists of fields with numeric values.
Dimension table - a table with a unique primary key.
Load - data should first be loaded into the dimension tables. Based on the primary key values in the dimension tables, the data should then be loaded into the fact table.
Here is the sequence of loading a data warehouse:
1. The source data is first loaded into the staging area, where data cleansing takes place.
2. The data from the staging area is then loaded into dimensions/lookups.
3. Finally the fact tables are loaded from the corresponding source tables in the staging area.

What are Stage Variables, Derivations and Constraints?
Stage Variable - an intermediate processing variable that retains its value during read and does not pass the value into a target column.
Derivation - an expression that specifies the value to be passed on to the target column.
Constraint - a condition that is either true or false and that controls the flow of data down a link.
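The load sequence described above (staging, then dimensions, then facts) can be sketched in a few lines. This is a minimal Python illustration with in-memory lists standing in for tables; the column names, the cleansing rule and the surrogate key scheme are all assumptions made for the example.

```python
def cleanse(raw_rows):
    """Step 1: staging. Trim text and drop rows that have no amount."""
    staged = []
    for row in raw_rows:
        if row["amount"].strip() == "":
            continue
        staged.append({"customer": row["customer"].strip(), "amount": float(row["amount"])})
    return staged


def load_warehouse(staged_rows):
    """Steps 2 and 3: load the dimension first, then the fact table."""
    dim_customer = {}  # natural key -> surrogate primary key
    for row in staged_rows:
        if row["customer"] not in dim_customer:
            dim_customer[row["customer"]] = len(dim_customer) + 1
    # Facts are loaded last so every foreign key can be resolved.
    fact_sales = [
        {"customer_key": dim_customer[row["customer"]], "amount": row["amount"]}
        for row in staged_rows
    ]
    return dim_customer, fact_sales


raw = [
    {"customer": " acme ", "amount": "10"},
    {"customer": "beta", "amount": ""},
    {"customer": "acme", "amount": "5"},
]
dim_customer, fact_sales = load_warehouse(cleanse(raw))
print(dim_customer, fact_sales)
```

Loading facts before dimensions would fail at the key lookup, which is why the dimension pass comes first.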
What is the default cache size? How do you change the cache size if needed?
One answer: the default cache size is 256 MB. We can increase it by going into DataStage Administrator, selecting the Tunables tab and specifying the cache size there.
Another answer: the default read cache size is 128 MB. It is changed in the same place, the Tunables tab of DataStage Administrator.

Containers: Usage and Types?
A container is a collection of stages used for the purpose of reusability. There are 2 types of containers:
a) Local Container: job specific.
b) Shared Container: used in any job within a project.
There are two types of shared container:
1. Server shared container. Used in server jobs (can also be used in parallel jobs).
2. Parallel shared container. Used in parallel jobs.
You can also include server shared containers in parallel jobs as a way of incorporating server job functionality into a parallel stage (for example, you could use one to make a server plug-in stage available to a parallel job).

Compare and Contrast ODBC and Plug-In stages?
ODBC:
a) Poor performance.
b) Can be used for a variety of databases.
c) Can handle stored procedures.
Plug-In:
a) Good performance.
b) Database specific (only one database).
c) Cannot handle stored procedures.

Types of Parallel Processing?
Parallel processing is broadly classified into 2 types:
a) SMP - Symmetrical Multi Processing.
b) MPP - Massive Parallel Processing.
Types of parallelism: data parallelism and pipeline parallelism. Round robin is one of the partitioning methods used for data parallelism.

What does a Config File in Parallel Extender consist of?
The config file consists of the following:
a) Number of processes or nodes.
b) Actual disk storage location.
The config file is read by the DataStage engine before running the job in PX.
It consists of configuration information about your server, for example the nodes.

Functionality of Link Partitioner and Link Collector?
Link Partitioner: it splits data into various partitions or data flows using various partition methods.
Link Collector: it collects the data coming from the partitions, merges it into a single data flow and loads it to the target.

What is Modulus and Splitting in a Dynamic Hashed File?
In a dynamic Hashed File, the size of the file keeps changing as data is added and removed. The modulus is the number of groups in the file. When the file grows, groups are split and the modulus increases (this is "splitting"); when the file shrinks, groups are merged and the modulus decreases. The modulus size can be increased by contacting your Unix admin.

Types of views in DataStage Director?
There are 3 types of views in DataStage Director:
a) Job View - dates of jobs compiled.
b) Log View - warning messages, event messages, program generated messages.
c) Status View - status of the last job run.

COGNOS:

What is Cognos PowerHouse and what is it used for?
Cognos PowerHouse High-Productivity Application Development Solutions equips you with high-productivity development environments for creating your data-driven business solutions faster, whether for Web-based, client/server, or traditional terminal-based access. PowerHouse has gained a worldwide reputation for productivity, reliability, performance, and flexibility.

Give me any example of semi and non additive measures.
Semi-Additive: semi-additive facts are facts that can be summed up for some of the dimensions in the fact table, but not the others.
Non-Additive: non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table.
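A minimal Python sketch of the semi-additive case, assuming a hypothetical table of daily account balance snapshots: the balance can be summed across accounts for one day, but across time the meaningful value is the latest balance, not a sum.

```python
from collections import defaultdict

# Hypothetical (day, account, balance) snapshot rows.
snapshots = [
    ("2024-01-01", "ACC-1", 100.0),
    ("2024-01-01", "ACC-2", 50.0),
    ("2024-01-02", "ACC-1", 120.0),
    ("2024-01-02", "ACC-2", 40.0),
]


def total_balance_by_day(rows):
    """Summing across the account dimension is valid for a balance."""
    totals = defaultdict(float)
    for day, _account, balance in rows:
        totals[day] += balance
    return dict(totals)


def closing_balance_by_account(rows):
    """Across the time dimension, keep the latest balance instead of summing."""
    latest = {}
    for _day, account, balance in sorted(rows):  # sorted by day, oldest first
        latest[account] = balance
    return latest


print(total_balance_by_day(snapshots))
print(closing_balance_by_account(snapshots))
```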
Take Current Balance and Profit Margin as the facts. Current Balance is a semi-additive fact: it makes sense to add it up across all accounts (what is the total current balance for all accounts in the bank?), but it does not make sense to add it up through time (adding up all current balances for a given account for each day of the month does not give us any useful information). Profit Margin is a non-additive fact, for it does not make sense to add it up at the account level or the day level.

What was the actual purpose of Portfolio in Cognos?
Portfolio is like a summary page. Using Portfolio you can demonstrate a generated report or survey to your clients. On one single portfolio you can place links to the report, some video clips, some images and many more items. Try using it and you will learn more.
The Cognos Portfolio is ideal for presenting and packaging Impromptu reports, and for combining them with information from documents produced by other products. As an application developer, you can create briefing books that contain links to Impromptu reports, PowerPlay reports, Excel spreadsheets, or any other OLE client application. Use Portfolio to set up briefing books that let users view reports in a presentation-style format.

What is IQD? What is contained in IQD? How do you create an IQD in ReportNet Framework?
IQD = Impromptu Query Definition. An IQD contains the query that is run against the database to get the data. It also contains the database information.
To create one in Framework Manager:
1. Create a new namespace and give it a name, say, IQDReportNet.
2. Create a new query subject inside IQDReportNet.
3. In the query subject definition window, pull in all the data you need.
4. After creating it, click on the new query subject and, from the properties pane, select "IQD" from the drop-down menu of externalizeMethod.
5. Next, publish it. To publish, you need to create a package:
1. Create a new package. A wizard will open.
2. Name the new package; select the IQDReportNet object from the project.
3. Add security, select the language, select the function list (DB), and select the location from the wizard.
4. Publish the package.
An .iqd file is typically created by Impromptu and contains the definition for a database query. PowerPlay Transformer can access relational data sources via Impromptu Query Definition files. If Impromptu is not available, you can manually create an .iqd file to provide access to relational data sources. An Impromptu Query Definition describes a source table from a data source. When you create a model in Transformer, the contents of the .iqd file are stored in the model, and the embedded .iqd contents are refreshed when the data source is opened (that is, when you generate categories or create cubes). The .iqd contents are also refreshed when you click OK on the Data Source property sheet. Input values are retrieved from a supported database by executing an SQL query that is stored in the .iqd file.

How do you drill from PowerPlay to Impromptu? Explain all steps.
Go to Transformer, build the cube, right-click on the cube and open Properties, go to the Drill Through tab and select your report using the Browse option.
How do you create measures and dimensions?
You can create measures once you import all the data into the data source. You create measures and dimensions by dragging the required source from the data source into the dimension map and the measures tab. (You need to find the scope of measures for all the dimensions.)

What are the ways to import data into a catalog? (2 ways)
One is through the Database option; the other is through the HotFile option.

How do you add dynamic titles in PP?
In PowerPlay: from the Format menu --> Title, Header & Footer --> Title --> Insert Report Filename.

What is the difference between group and associate?
When you group data, Impromptu sorts the data item you are grouping and removes duplicate values. You can organize your report by grouping on one or more data items. By associating a grouped data item to another data item, you can eliminate duplicate information from your report. You can associate one or more data items with a grouped data item. An associated data item should have a one-to-one relationship with the grouped data item. For example, Order No. is a grouped data item. For each order number, there is an Order Date. Order No. and Order Date have a one-to-one relationship.

What are the limitations of Cognos ReportNet? What are the enhancements in ReportNet?
Limitations: when ReportNet concatenates strings locally and any of the involved strings contain null values, the result of the concatenation is an empty cell or a null value. This occurs because ReportNet requires that an expression that involves a null value return a null value.
Enhancements:
- Creates a two-layer best practices Framework Manager model
- Supports BI Series 7 single sign-on
- Does not require installation and configuration of the Cognos ReportNet SDK
- Provides an update capability to the Framework Manager model
- Interoperability with either ReportNet 1.1 or Cognos 8 by installing and configuring the appropriate SDK components
- Includes regular and measure dimensions that can be used with Cognos 8 Studios

Please explain the different stages in creating a report in Cognos ReportNet
Before creating any report in ReportNet, make sure you have the planning information you need. Creating a report involves the following steps:
Specifying a package: choose a package through Cognos Connection > Report Studio > select a module from the list.
Choosing a report template: create an empty report first; click one of the predefined report templates, then click OK.
Adding query items: in the Insertable Objects pane, select the query item that you want to add to the report and drag it to the desired location.
Saving the report: save the report from the File menu.
Running the report: open the report that you want to run. From the Tools menu, click Validate Report. A message box appears indicating whether any errors were found in the report. If you want to enable Design Mode Only filters defined in the package, from the File menu, click Use Model Design Filters. Use these filters to limit the amount of data. If you want to set run options, from the Run menu, click Run Options. From the Run menu, click one of the options to produce the report in the format you want. You can produce a report in HTML, PDF, CSV, XLS, or XML.

What is Cognos Visualizer and Cognos Scripting?
Visualizer is a representation of data cubes in a dashboard format. We can drill through to the ground level of a hierarchy as in a PowerPlay report, but cannot add or remove fields dynamically. It is geared toward showing trends in reports.
Cognos Script Editor: we can write Cognos macros or programs in this tool and can fine-tune or process some execution.
Cognos Visualizer: Cognos Visualizer provides powerful visual analysis to communicate complex business data quickly and intuitively. By going beyond simple pie or bar graphs, you can view and combine data in a way that lets you immediately see areas of concern, trends, and other indicators.
Cognos scripting: you can use the Cognos Script Editor, or any other OLE-compliant editor, to create, modify, compile, and run macros. A macro is a set of commands that automates the running of an application. You can automate most Cognos Business Intelligence applications by using the Cognos Script language (a BASIC-like language), the Cognos Script Editor, and OLE automation.

What are the different data sources to develop models?
Work with source data. Transformer supports a wide range of local data sources, including:
- Impromptu Query Definition files (.iqd), which can query local or server-based databases
- delimited ASCII files (.asc) and comma-separated variable files (.csv)
- fixed-field text files
- local databases, including Microsoft Access (which can specify SQL queries against local or server-based databases), Clipper, dBase, FoxPro, and Paradox
- PowerHouse portable subfiles
- spreadsheet crosstabs and databases, including Excel and Lotus 1-2-3
What are the versions of Cognos from the starting release to the latest in the market?
I don't know the starting release, but it may be version 5; the latest version is 8.3, and ReportNet 1.1.

What are the products of Cognos?
Cognos 6.6, 7.0, 7.3 (PowerPlay, Impromptu) -- ReportNet 1.0, 1.1 MR1, 1.1 MR2 -- ReportNet 8.0 (latest). IWR is used by Impromptu to publish reports, PPES is used by PowerPlay, and Cognos Connection is used by ReportNet. There are many other tools, but these are the main ones.

Explain how to create PowerPlay reports
To create a PowerPlay report:
1) Create an IQD file from Impromptu by saving the IMR report as IQD, or
2) Use a Framework Manager model and externalise the query subject you want to use in PowerPlay Transformer and PowerPlay.
When you have the IQD file published to a location such as a UNC path, go to the PowerPlay Transformer model and select New, then select the data source name and the IQD file location, either published from FM or from Impromptu (saved as IQD).
You will see all the query items in the explorer; then you can specify the dimensions, such as time, and the measures.
Then generate categories for the dimensions and measures.
Right-click the cube and select Create PowerCube; after that, view the cube in Cognos PowerPlay Explorer.
We can also publish the cube to PPES (PowerPlay Enterprise Server), publish the cube to Upfront, and use PowerPlay Web Explorer to view the cube.

What is a model, what is the process to create a model, and how do you test a model?
Model Definition
A model in PowerPlay Transformer is a combination of five windows: Dimension Map, Data Source, Measures, PowerCube, Sign On.
Models specify the data sources and then define the dimensions, levels, and measures. They may also contain dimension views, user classes, user class views, and other authentication-related information. The information for a model is stored in a model file, with the extension .pyi (binary, meaning platform dependent) or .mdl (text, platform independent).
Process To Create A Model
Steps:
1. From the File menu, click New to open the New Model wizard.
2. In the Model Name box, enter a name for your model.
3. To control data access by means of user class authentication, select the Include Security in this Model check box.
4. Click Next and enter the name of the initial data source in the Data Source Name box.
5. Select a Source Type from the list, and click Next.
6. Depending on the selected type, Transformer shows different options. Enter the parameters that correspond to your data source.
7. On the last page of the New Model wizard, select the Run AutoDesign option to have Transformer create a preliminary structure, or clear this check box to create your own design.
8. Then click Finish. Transformer opens a new model based on the specified source and shows you information about the source data.
Testing A Model
A model can be tested with the Check Model option in the Tools menu.
What type of problems can we face in general at report running time?
The most common problems are:
1. No data appears in the report (to fix this, check the data source or package).
2. Unexpected or incorrect values appear in reports; the report may be running with limited data.
3. The report filter does not work; values do not show up in the filter option.
4. The report cannot be opened in Excel, CSV or XML.

How can we create users and permissions in Cognos?
In the Impromptu menu go to Catalog -> User Profiles -> User Class tab -> click on Creator. There you can give folder access, table access, filters, governor settings, etc.

How can I test reports in Cognos?
In Cognos ReportNet a report can be tested with the Validate Report option. If there is an error, it will specify the error; otherwise it will give the message 'report specification is valid'.

How can I schedule reports in Cognos?
Using Cognos Scheduler, one can schedule reports in Impromptu to execute and be saved in the desired format.
Using the Cognos Macro script language, the reports can be executed and distributed to recipients through mail applications.
Compiled Cognos macros can be scheduled using Cognos Scheduler.
What is a catalog, what are the types of catalogs in Cognos, and which is better?
A catalog is a file containing the information (database tables) that Impromptu users need to create reports.
The types of catalogs we have in Cognos:
1. Personal Catalog: only one user (the creator) can create/modify the catalog and reports.
2. Shared Catalog: only one user (the creator) can create/modify the catalog, but anybody can create their own reports using this catalog.
3. Distributed Catalog: anybody can change their own personal distributed catalog and create their own reports, but no one can change the master distributed catalog. Any changes made to the master distributed catalog are propagated to the personal distributed catalogs.
4. Secured Catalog: no one can change the catalog or reports; it is fully secured.
Of these, the Distributed Catalog is the better choice.

How to perform Single Sign-on in ReportNet by using a URL?
How many cubes can we create from a single model? How can we navigate between those cubes?
What is the difference between a cascading report and a drill-through report? Why do we go for a drill-through report?
A cascading report works based on a condition, but a drill-through works based on the data item we select as the drill-through option.

What is meant by Junk Dimension?
A dimension which does not change the grain level is called a junk dimension (grain: the lowest level of reporting). "The junk dimension is simply a structure that provides a convenient place to store the junk attributes." A junk dimension is a convenient grouping of flags and indicators.

How do you perform while running the report?
Where will you see the running time of the report?
Size of the cube?
What are the types of prompts in ReportNet?
What is a macro and how does it work?
What is the difference between Cognos and Cognos ReportNet?
How do you create a cube in ReportNet?
There are 10 facts. How will you connect them all?
The main difference between Cognos and ReportNet is that Cognos connects to a single database, while in ReportNet multiple databases can be connected. ReportNet is a web-based tool.

Do LOOP CONSTRAINTS occur in COGNOS? If yes, how do you resolve them, and how do loops occur in Cognos?
1)
Problem Description
What are looping joins and how do you resolve them?
Solution Description
Looping joins could potentially return incorrect data. An example of a looping join:
A -> B -> C -> D
A -> E -> F -> D
When you select an item from tables A and D, Impromptu will try to choose the shortest path; e.g. if A -> D existed, then this is the path Impromptu would take. But in the above situation the two paths are equal, so Impromptu has to make a choice between "A -> B -> C -> D" and "A -> E -> F -> D". Impromptu makes its choice based on how the catalog was constructed (the order of the tables in the catalog), which cannot be altered once it is created.
The two paths could return different results depending on the relationships between the tables in the path.
The report would be fine if Impromptu chose the expected path, but the choice is not always right. Eliminating looping joins prevents Impromptu from making the wrong choice. To eliminate looping joins, you can break unnecessary joins; e.g. if reports do not need a join between tables F and D:
A -> B -> C -> D
A -> E -> F
But if you need all the joins, use ALIAS tables to break the looping join. Add an alias table for table A and break the join between table A and E, e.g.
A -> B -> C -> D
Alias A -> E -> F -> D
Both solutions could affect existing reports.
2)
Title: Looped joins
Created: Nov 04, 1999
Applies To: Impromptu 2.0, 3.01, 3.03, 3.04, 3.5, 4.0, 5.0, 7.1
Problem Description
Under the Joins dialog, on the Analyze tab, it states that a Loop Join is present. What does this mean and how can it be resolved?
Solution Description
A Loop Join occurs when there are multiple paths between database tables. An example of this is A joins to B, B joins to C, and C joins to A.
The proper definition of join strategies in an Impromptu catalog is crucial to the success of an ad-hoc reporting environment. Impromptu shelters the user from having to know any of the technical information about the database, including name, location, table and column names, and join strategies. The Impromptu Administrator must be very thorough in the definition and testing of the join strategies. Impromptu provides an ability to analyze the joins and determine any anomalies. The most common is the Loop Join.
The implication of the loop join is that there is no way to predetermine which of the various join paths will be used by Impromptu when creating the SQL. SQL is dynamically generated for each report as it is created and before it executes. For example, to create a report using columns from tables A and C, we could join from A=>B=>C or directly from A=>C. In some cases, both of these joins would result in the same data being retrieved. However, in other cases they may result in different data. Impromptu will always try to use the shortest route in joining multiple tables. It will also try to use the tables that are already included in the query, rather than including an additional table.
There is no hard and fast rule for resolving Loop Joins. There are four basic resolutions:
1. Break the join
2. Create alias tables with different join strategies
3. Use the join expression editor to specify the join
4. Modify the SQL
Each of these resolutions is done for a different reason and may have some issues associated with it. Determine the best resolution for your situation by analyzing the data with regard to the results required from the join structure.
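To make "multiple paths between tables" concrete, a loop can be found by treating tables as nodes and joins as undirected edges. The following is a minimal Python sketch using a union-find structure; it illustrates the idea only and is not how Impromptu's join analysis works. The table names, including the alias name, are hypothetical.

```python
def find_loop_joins(joins):
    """Return the joins that close a loop, given (table, table) join pairs."""
    parent = {}

    def find(table):
        # Follow parent links to the representative of the table's group.
        parent.setdefault(table, table)
        while parent[table] != table:
            parent[table] = parent[parent[table]]  # shorten the path as we go
            table = parent[table]
        return table

    loops = []
    for left, right in joins:
        root_left, root_right = find(left), find(right)
        if root_left == root_right:
            # Both tables are already connected, so this join adds a second path.
            loops.append((left, right))
        else:
            parent[root_left] = root_right
    return loops


print(find_loop_joins([("A", "B"), ("B", "C"), ("C", "A")]))  # [('C', 'A')]
print(find_loop_joins([("A", "B"), ("A", "C_alias"), ("B", "C")]))  # []
```

The second call models the alias resolution: once one side of the loop points at an alias table, no join connects two tables that were already linked.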
0)ample6 *he join structure loo&s li&e this6 AQ/ AQ4 /Q4 *his is producing incorrect results. *o resolve this issue' ma&e table 4 an alias to omit the
loop in the join structure and this will result in data displaying correctly. 4orrect Ioin #tructure6 AQ/ A Q 4 alias /Q4 use this Euer! i am retri ing all !ears 9Ct data from 5+#+5#-55/ to .5#+5#-55G i need to restrect this Euer! to current date and current !ear OgosalesFgoretailersP,O9rdersP,O9rder monthPbetween 5+ and toFnumber=toFchar=currentFdate3I%%I>>and OgosalesFgoretailersP,O9rdersP,O9rder monthPQtoFnumber=toFchar=currentFdate3I%%I>> pass polar ID >ou have a function called Me)tractM in cognos. 0)6- e)tract"month'the dte field$. by giving li&e this you will get month. so you can &eep a filter to restrict the rows only for october. How to show the data reported horizontall!: =;or e*ample:> emplo!ee skill +a +b +c -d -e -f $eport result: + abc - def Assuming ? records per grouped item6 3. Lroup on employee 5. 4reate a running count based on the s&ill field. ?. 4reate ? calculated columns based on the count field. 4all them s&ill3' s&ill5' s&ill?6 if "count Q 3$ then "s&ill$ else null if "count Q 5$ then "s&ill$ else null if "count Q ?$ then "s&ill$ else null C. 4reate ? more calculated columns using the ma)imum function. 4all them ma)s&ill3' ma)s&ill5' ma)s&ill? ma)imum "s&ill3$ ma)imum "s&ill5$
ma)imum "s&ill?$ D. Lroup on employee on ma)s&ill3 on ma)s&ill5 on ma)s&ill? E. report employee a)s&ill3 ma)s&ill5 ma)s&ill? How to pass multiple alues from picklist prompt to sub report filter *he sub-report only includes the first value. %+hen the sub-report .uery runs' it chec&s for the first row in the 4ustomer ,ame column and shows only information for that customer. If you want a sub-report to show information for another row in the column' place the main report in a form frame that shows only one row at a time. +hen you insert the sub-report into the form frame as well' it changes as you clic& through the rows in the main report. :or e)ample' the main and sub-report above are both in a form frame that shows only one row of the 4ustomer ,ame column at a time. 0ach time you scroll to another customer name' the sub-report shows only information for that customer.% How can I create a d!namic column name in Cognos 3.4reate a calculated column which contains the information that the header is to contain' such as %1eport for year 3HHH% "concatenated te)t and date to string sub string e)traction$. 5.Highlight the report' and then right-clic&. ?.#elect !roperties' and then clic& the Headers(:ooters tab. C.4lear the 4olumn *itle Header chec& bo). *his will remove the headers from your columns. D.1einsert the rest of the column headers9 insert te)t will wor&. :or the dynamic column' from the Insert menu' clic& Data and select the calculated column you created and insert it into the report. As you said .#elect !roperties' and then clic& the Headers(:ooters tab. I didnt find the option Headers(:ooters tab. +hich version you are tal&ing about 88 use nested if(else or case structure in calculated columns or use conditional formating to show dynamically the column names C98:9( $"<9$0 :"0: What is parameter mapping e*plain with an e*ample, what is difference b4w modelApackage? $eport (tudio allows !ou to create the data model from a Euer! and thereb! 
skipping Framework Manager. Given that this feature is available, in what scenarios would one want to skip Framework Manager? Is Framework Manager an un-needed overhead?
Actually, it is not possible to make reports without packages, and packages are built in FM. So there is no way one can skip FM and create reports. Adding to it, we need metadata (created in FM) to create relations among tables or views, and also to create objects which satisfy the reporting need.

What are the advantages and disadvantages of reporting directly against the database?
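Returning to the horizontal-report recipe given earlier (running count, one calculated column per position, then the maximum per employee), the same logic can be sketched outside Cognos. This is a minimal Python illustration, not Cognos syntax; the function name `pivot_skills` and the `slots` parameter are hypothetical, and it assumes the rows arrive already in report order.

```python
def pivot_skills(rows, slots=3):
    """Turn (employee, skill) rows into one row per employee.

    Mirrors the recipe: running count -> skill1..skillN -> maximum per group.
    """
    running = {}   # running count per employee
    detail = []    # per input row: (employee, [skill1, ..., skillN])
    for employee, skill in rows:
        count = running.get(employee, 0) + 1
        running[employee] = count
        # skillN = skill when count == N, otherwise null (None)
        cols = [skill if count == n else None for n in range(1, slots + 1)]
        detail.append((employee, cols))

    grouped = {}   # employee -> [maxskill1, ..., maxskillN]
    for employee, cols in detail:
        best = grouped.setdefault(employee, [None] * slots)
        for i, value in enumerate(cols):
            # maximum() ignoring nulls: keep the largest non-null value seen
            if value is not None and (best[i] is None or value > best[i]):
                best[i] = value
    return [(employee, *best) for employee, best in grouped.items()]


rows = [(1, "a"), (1, "b"), (1, "c"), (2, "d"), (2, "e"), (2, "f")]
result = pivot_skills(rows)
```

Each position column holds at most one non-null value per employee, which is why taking the maximum collapses the three detail rows into a single report row. Employees with fewer than `slots` records simply keep null in the unused positions.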
Do you always need to copy the data before reporting on it? (Example: real-time & on-demand reporting is a requirement.)

What function should you use to display the value entered or selected by a user in response to a prompt?
We have a function called ParamDisplayValue. By using this function we can display the parameter value selected in the prompt.

Your client wants to ensure only specific users can create reports in Query Studio with other users creating reports using Report Studio. How can this be accomplished?
It isn't possible to create users for different studios; it is possible only at Cognos Connection.

What step(s) must be completed to allow a single instance of Framework Manager to connect to multiple ReportNet dispatchers?

How to generate an IQD file from Framework Manager?
Create a Query Subject and, from the properties pane, select externalise. There we have 4 options; select IQD.

What are the various file formats involved in ReportNet?
ReportNet has six (6) formats. They are HTML, PDF, Excel 2000, Excel 2002, CSV, and XML format. We can see the types of formats in the report viewer on the right side.

How can I test a report in ReportNet?
We can test the data in the database by using queries, and we can use tabular SQL in ReportNet for the data and validate the report. If we want to test the report in ReportNet, first we can check it by validating it in the report page. After that we can test the output of the report using a SQL analyzer and a SQL query. So here we will be comparing the SQL analyzer output with the output of the report viewer.
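The comparison step described above can also be automated. The following is a rough sketch, assuming the report has been exported as CSV with one header row and that the same SQL can be run against the source database; an in-memory SQLite table stands in for the real database, and the function name `compare_report_to_query`, the `orders` table, and its columns are all hypothetical.

```python
import csv
import io
import sqlite3
from collections import Counter


def compare_report_to_query(report_csv_text, connection, sql):
    """Return (missing_from_report, unexpected_in_report) as sorted row lists."""
    reader = csv.reader(io.StringIO(report_csv_text))
    next(reader, None)  # skip the header row of the exported report
    report_rows = Counter(tuple(row) for row in reader)
    # CSV cells are text, so query values are converted to text before comparing
    query_rows = Counter(
        tuple(str(value) for value in row) for row in connection.execute(sql)
    )
    missing = sorted((query_rows - report_rows).elements())
    unexpected = sorted((report_rows - query_rows).elements())
    return missing, unexpected


# Stand-in for the source database (hypothetical table and data)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_month INTEGER, revenue INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(9, 100), (10, 250), (10, 50)]
)
sql = (
    "SELECT order_month, SUM(revenue) FROM orders "
    "GROUP BY order_month ORDER BY order_month"
)

# Stand-in for the CSV exported from the report viewer
report_csv = "Order month,Revenue\n9,100\n10,300\n"
missing, unexpected = compare_report_to_query(report_csv, conn, sql)
```

Both lists come back empty when the report and the query agree. Rows are compared as text, so number or date formatting applied by the report would need to be normalised first.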
