- Ab Initio is a Business Intelligence platform comprising six data processing products: Co>Operating System, The Component Library, Graphical Development Environment, Enterprise Meta>Environment, Data Profiler, and Conduct>It. It is a powerful GUI-based parallel processing tool for ETL data management and analysis.
- Ab Initio vs. Informatica: In terms of general functionality, that is, the ability to do transformations, access sources and targets, work in real time, etc., Ab Initio and Informatica are equivalent. Ab Initio has an edge in raw processing efficiency, providing very high levels of throughput.
- Sep 24, 2018: Ab Initio specializes in high-volume data processing applications and enterprise application integration. The Ab Initio products run on a user-friendly platform, homogeneous or heterogeneous, for parallel data processing applications.
Ab initio Interview Questions And Answers
So you have finally found your dream job in Ab Initio but are wondering how to crack the interview and what the probable Ab Initio interview questions for 2018 could be. Every interview is different, and so is the scope of every job. Keeping this in mind, we have compiled the most common Ab Initio interview questions and answers for 2018 to help you succeed in your interview.
Below are the top Ab Initio interview questions that are frequently asked. They are divided into two parts:
Part 1 – Ab initio Interview Questions (Basic)
This first part covers basic Ab initio Interview Questions and Answers.
1. What are the components or functions available in ab initio?
Answer:
The main components in Ab Initio are listed below.
Component | Purpose |
Dedup | Removes duplicate records. |
Join | Joins multiple input datasets on a common key value. |
Sort | Reorders the data according to a key and collation order, holding records in memory while it sorts. |
Filter | Removes records based on a condition. |
Replicate | Copies the input flow to multiple output flows, so that several downstream branches can work on the same data in parallel. |
Merge | Combines multiple input flows into one. |
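The behaviour of these components can be sketched in plain Python. This is only an illustration of the ideas in the table, not Ab Initio code: the record layout, the field names (`cust_id`, `amount`), and every function name are assumptions made for the example, and the merge step assumes its inputs are already sorted on the key.

```python
import heapq
from itertools import groupby
from operator import itemgetter

# Hypothetical input records; field names are illustrative only.
records = [
    {"cust_id": 2, "amount": 75},
    {"cust_id": 1, "amount": 50},
    {"cust_id": 2, "amount": 0},
]

def sort_records(rows, key):
    """Sort: reorder the records by a key field (stable sort)."""
    return sorted(rows, key=itemgetter(key))

def dedup_sorted(rows, key):
    """Dedup: keep the first record of each key group; input must be sorted."""
    return [next(group) for _, group in groupby(rows, key=itemgetter(key))]

def filter_records(rows, predicate):
    """Filter: keep only the records that satisfy a condition."""
    return [row for row in rows if predicate(row)]

def replicate(rows, copies):
    """Replicate: produce several identical copies of one flow."""
    return [list(rows) for _ in range(copies)]

def merge_sorted(flows, key):
    """Merge: combine flows that are already sorted on the key into one sorted flow."""
    return list(heapq.merge(*flows, key=itemgetter(key)))

sorted_rows = sort_records(records, "cust_id")
unique_rows = dedup_sorted(sorted_rows, "cust_id")
non_zero = filter_records(records, lambda row: row["amount"] > 0)
flow_a, flow_b = replicate(unique_rows, 2)
merged = merge_sorted([flow_a, flow_b], "cust_id")
```

Dedup here depends on Sort having run first, which mirrors the common pattern of sorting a flow before removing duplicates from it.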
2. What are the types of parallel processing?
Answer:
This is one of the common Ab Initio interview questions. The different types of parallel processing are:
- Component parallelism
- Data parallelism
- Pipeline parallelism
Component parallelism: An application has multiple components running simultaneously on the system, each working on separate data.
Data parallelism: The data is split into segments (partitions), and the same operation runs on every segment simultaneously.
Pipeline parallelism: An application has multiple components running simultaneously on the same dataset; each downstream component starts processing records as soon as the upstream component emits them.
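The three kinds of parallelism can be illustrated with a small Python sketch. It is a loose analogy rather than a model of the Co>Operating System: the function names and sample data are made up, thread pools stand in for separate processes, and the generator pipeline only interleaves its stages record by record instead of running them truly concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

def parse(lines):
    """Upstream stage: emit one parsed record at a time."""
    for line in lines:
        yield int(line)

def double(numbers):
    """Downstream stage: consumes each record as soon as it is emitted."""
    for number in numbers:
        yield number * 2

# Pipeline parallelism (analogy): records stream through both stages
# without the first stage having to finish before the second starts.
pipeline_result = list(double(parse(["1", "2", "3"])))

# Data parallelism: the same operation runs on every partition at once.
partitions = [[1, 2], [3, 4], [5, 6]]
with ThreadPoolExecutor(max_workers=3) as pool:
    partition_sums = list(pool.map(sum, partitions))

# Component parallelism: different operations run at once on separate data.
with ThreadPoolExecutor(max_workers=2) as pool:
    sorted_future = pool.submit(sorted, [3, 1, 2])
    total_future = pool.submit(sum, [10, 20])
    component_results = (sorted_future.result(), total_future.result())
```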
3. What are the different ways to partition data?
Answer:
There are multiple ways to partition data.
Partition | Description |
Expression | Splits the data according to a Data Manipulation Language (DML) expression. |
Key | Groups the data by specific keys, so records with the same key value land in the same partition. |
Load balance | Distributes the data dynamically according to load. |
Percentage | Distributes the data so that the output sizes follow specified percentages of the total (fractions of 100). |
Range | Splits the data among the nodes based on a key and a set of ranges. |
Round robin | Distributes the data evenly, in block-size chunks, across the output partitions. |
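Three of these methods (expression, range, and percentage) are easy to sketch in Python. The functions below are hypothetical helpers written for this illustration, with a plain Python callable standing in for a DML expression; they show the idea behind each method, not how the corresponding Ab Initio components are implemented.

```python
import bisect

def partition_by_expression(records, expression, n_partitions):
    """Route each record to the partition index computed by an expression."""
    outputs = [[] for _ in range(n_partitions)]
    for record in records:
        outputs[expression(record) % n_partitions].append(record)
    return outputs

def partition_by_range(values, boundaries):
    """Route each value by range: boundaries [10, 20] give <10, 10..19, >=20."""
    outputs = [[] for _ in range(len(boundaries) + 1)]
    for value in values:
        outputs[bisect.bisect_right(boundaries, value)].append(value)
    return outputs

def partition_by_percentage(records, percentages):
    """Cut the input so output sizes follow the given percentages (sum to 100)."""
    total = len(records)
    outputs, start, cumulative = [], 0, 0
    for percentage in percentages:
        cumulative += percentage
        end = round(total * cumulative / 100)
        outputs.append(records[start:end])
        start = end
    return outputs

numbers = list(range(10))
even_odd = partition_by_expression(numbers, lambda n: n % 2, 2)
by_range = partition_by_range([5, 10, 19, 20, 42], [10, 20])
by_percentage = partition_by_percentage(numbers, [50, 30, 20])
```

Range partitioning keeps each output internally contiguous in key order, which is why it pairs naturally with sorted data; how evenly it splits depends entirely on how well the boundaries match the key distribution.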
Let us move to the next Ab initio interview Questions.
4. What is a multifile system?
Answer:
A multifile system is a set of directories on different nodes in a cluster, all with an identical directory structure. It leads to better performance because the data resides on multiple disks and can be processed in parallel.
It is created with a control partition on one node and data partitions on the other nodes, which distributes the processing and improves performance.
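The layout can be mimicked on a single machine with ordinary directories. The sketch below is an assumption-laden toy: the directory names, the `partitions.txt` control file, and all function names are invented for illustration, and a real multifile system is created and managed by the Co>Operating System rather than by code like this.

```python
import tempfile
from pathlib import Path

def create_multifile_layout(root, n_partitions):
    """Create one control directory and n data directories with the same structure."""
    control = root / "control"
    control.mkdir()
    data_dirs = []
    for index in range(n_partitions):
        data_dir = root / f"node{index}" / "data"
        data_dir.mkdir(parents=True)
        data_dirs.append(data_dir)
    # The control partition records where the data partitions live.
    (control / "partitions.txt").write_text("\n".join(str(d) for d in data_dirs))
    return control, data_dirs

def write_multifile(data_dirs, name, records):
    """Spread the records of one logical file across all data partitions."""
    for index, data_dir in enumerate(data_dirs):
        chunk = records[index::len(data_dirs)]
        (data_dir / name).write_text("\n".join(chunk))

def read_multifile(data_dirs, name):
    """Read the logical file back by visiting every data partition."""
    records = []
    for data_dir in data_dirs:
        records.extend((data_dir / name).read_text().splitlines())
    return records

with tempfile.TemporaryDirectory() as tmp:
    control_dir, data_dirs = create_multifile_layout(Path(tmp), 3)
    write_multifile(data_dirs, "customers.dat", ["a", "b", "c", "d", "e"])
    listed_partitions = (control_dir / "partitions.txt").read_text().splitlines()
    partition_sizes = [
        len((d / "customers.dat").read_text().splitlines()) for d in data_dirs
    ]
    all_records = sorted(read_multifile(data_dirs, "customers.dat"))
```

The same file name exists in every data directory, which is what lets one logical file be read or written by several processes at once, one per partition.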
5. What is the difference between Hadoop and Ab Initio?
Answer:
Hadoop | Ab Initio |
Open source | Proprietary software |
Parallel processing through mappers and reducers | Built-in parallel processing architecture |
Well suited to any variety of data | Best for traditional EDW implementations |
Fault tolerance is achieved | Fault tolerance is not achieved |
MapReduce logic has to be written for each component or function | Components such as join, group, and sort are readily available |
Cheap because it is open source, so any business use case can be tried out | Expensive, so it is justified only for high-value business cases |
Loosely coupled components, where custom functions are built | Tightly coupled components, chosen according to the business use case |
Part 2 – Ab initio Interview Questions (Advanced)
Let us now have a look at the advanced Ab initio Interview Questions.
6. What kind of layouts does Ab initio support?
Answer:
- Ab Initio supports serial and parallel layouts.
- A graph can use both serial and parallel layouts at the same time.
- The parallel layout depends on the degree of data parallelism: a multifile system with four data partitions is a 4-way parallel system.
- A component in a graph that uses such a layout runs 4-way parallel.
7. What is the relation between the Enterprise Meta>Environment (EME), the Graphical Development Environment (GDE), and the Co>Operating System?
Answer:
Co>Operating System: It is provided by Ab Initio, operates on top of the host operating system, and is the base for all Ab Initio processes. It can be installed on different operating systems such as UNIX, Linux, and IBM platforms, and air commands are one of its features.
It provides the following features:
– Manages and runs Ab Initio graphs and controls the ETL processes
– Provides extensions to the operating system
– Monitors and debugs ETL processes
– Manages metadata and interacts with the EME
GDE: It is the design component, used to build and run Ab Initio graphs.
Graphs are formed from components (predefined or user-defined), flows, and parameters. An ETL process in Ab Initio is represented by a graph.
The GDE also provides the ability to run and debug jobs and to trace execution logs.
Enterprise Meta>Environment (EME): It is an environment for storage and metadata management (both business and technical metadata). The metadata can be accessed from the GDE, from a web browser, or from the Co>Operating System command line. It serves as the Ab Initio repository.
Let us move to the next Ab initio interview questions.
8. How is data processed, and what are the fundamentals of this approach?
Answer:
Certain activities require data to be collected, and in many cases the processing depends largely on that collection. Before data can be processed, it has to reside in well-defined storage. The approach rests on the following fundamentals:
1. Collection of data
2. Presentation
3. Final outcomes
4. Analysis
5. Sorting
9. What is the difference between partitioning with key and round robin?
Answer :
This is one of the advanced Ab Initio interview questions.
Partition by key: Here we specify the key on which the partitioning occurs. Records with the same key value always go to the same partition, so the data is well balanced only when the key values are evenly distributed. It is useful for key-dependent parallelism.
Partition by round robin: Here the records are distributed sequentially, in block-size chunks, across the output partitions. It is not key based and results in well-balanced data, especially with a block size of 1. It is useful for record-independent parallelism.
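The contrast between the two methods shows up clearly on skewed data. The following Python sketch uses made-up records and hypothetical function names, and it picks a CRC32 hash purely so that the example is deterministic; it makes no claim about the hash Ab Initio itself uses.

```python
import zlib

def partition_by_key(records, key, n_partitions):
    """Send every record with the same key value to the same partition."""
    outputs = [[] for _ in range(n_partitions)]
    for record in records:
        index = zlib.crc32(str(record[key]).encode("utf-8")) % n_partitions
        outputs[index].append(record)
    return outputs

def partition_round_robin(records, n_partitions, block_size=1):
    """Deal records out in blocks, cycling through the partitions in order."""
    outputs = [[] for _ in range(n_partitions)]
    for position, record in enumerate(records):
        outputs[(position // block_size) % n_partitions].append(record)
    return outputs

# Skewed sample: six of the nine records share one key value.
countries = ["US"] * 6 + ["FR"] * 2 + ["JP"]
records = [{"id": i, "country": c} for i, c in enumerate(countries)]

by_key = partition_by_key(records, "country", 3)
round_robin = partition_round_robin(records, 3)
blocks_of_two = partition_round_robin(list(range(6)), 3, block_size=2)
```

Round robin yields partitions of equal size here, while the key-based split keeps all "US" records together and is therefore lopsided. The key-based split is still the one to choose when a downstream step, such as a join or rollup, needs all records for a key in one place.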
10. How do you improve the performance of a graph?
Answer:
There are many ways to improve the performance of a graph:
1) Reduce the number of components in any one phase.
2) Use well-tuned max-core values for Sort and Join components.
3) Minimize the use of regular-expression functions such as re_index in transform functions.
4) Minimize the use of sorted Join components and, where possible, replace them with in-memory (hash) joins.
5) Carry only the required fields through Sort, Reformat, and Join components.
6) Use phasing or flow buffers in the case of merges or sorted joins.
7) Use a hash join if the two inputs are small; otherwise choose a sorted join for huge inputs.
8) For large datasets, avoid using Broadcast as the partitioner.
9) Reduce the number of Sort components.
10) Avoid repartitioning data unnecessarily.
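Points 4 and 7 turn on the difference between an in-memory (hash) join and a sorted join. The Python sketch below shows the two generic algorithms under simplifying assumptions: both are inner joins on a single key, the function names and sample records are hypothetical, and nothing here reflects the internals of Ab Initio's Join component.

```python
def hash_join(small, large, key):
    """Build a lookup table from the small input, then stream the large one."""
    lookup = {}
    for record in small:
        lookup.setdefault(record[key], []).append(record)
    return [
        {**match, **record}
        for record in large
        for match in lookup.get(record[key], [])
    ]

def sorted_merge_join(left, right, key):
    """Join two inputs already sorted on the key, holding only one key run at a time."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][key], right[j][key]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Find the run of right-hand records that share this key.
            run_end = j
            while run_end < len(right) and right[run_end][key] == left_key:
                run_end += 1
            # Pair every left-hand record of this key with that run.
            while i < len(left) and left[i][key] == left_key:
                result.extend({**left[i], **r} for r in right[j:run_end])
                i += 1
            j = run_end
    return result

names = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 4, "name": "d"}]
quantities = [
    {"id": 2, "qty": 5},
    {"id": 2, "qty": 7},
    {"id": 3, "qty": 1},
    {"id": 4, "qty": 9},
]

hash_result = hash_join(names, quantities, "id")
merge_result = sorted_merge_join(names, quantities, "id")
```

The hash join needs no sorting but must hold one whole input in memory, which is why it suits small inputs. The merge join needs sorted inputs but only a small working set, which is why it suits inputs too large to fit in memory.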
Recommended Article
This has been a guide to the list of Ab Initio interview questions and answers, so that candidates can crack these interview questions easily. In this post, we have studied the top Ab Initio interview questions that are often asked in interviews.
Ab Initio is a suite of applications containing various components, but generally when people say Ab Initio they mean the “Ab Initio Co>Operating System”, which is primarily a GUI-based ETL application. It gives the user the ability to drag and drop different components and connect them, much like drawing.
The strength of Ab Initio ETL is massively parallel processing, which gives it the capability to handle large volumes of data.
Let’s componentise Ab Initio:
1. Co>Operating System
2. EME (Enterprise Meta>Environment)
3. Additional Tools
a. Data profiler
b. Plan-IT, etc.
The Co>Operating System is the ETL application; it comes packaged with the EME (described below). It is a GUI-based application, quite simple in design thanks to its drag-and-drop features. Most of the features are quite basic, so the basics are quick to learn. It comes in two flavours, or subclasses:
1. Batch Mode
2. Continuous Flow
Both primarily do similar things but, as the names suggest, differ in their mode of processing. Batch mode is used by most customers and offers the benefit of moving bulk data (daily or multiple times a day).
Continuous mode is more “click/trigger” driven: say, when you click on a web page, the data flow starts. Some very large web-based applications run on Ab Initio servers using Continuous Flow.
The EME is more like source control for Ab Initio, but it has many additional features, such as:
1. Metadata management
a. Business Metadata management
b. Process metadata management
2. Impact Analysis
3. Documentation tools
4. Run History Tracking
5. And, of course, check-in and check-out
Ab Initio has come up with certain other applications to complement the ETL suite. I will not cover these in detail, just a one-liner each:
Data profiler – A data profiling tool with features for data quality analysis.
Plan-IT – Primarily a scheduler built by Ab Initio to run Ab Initio jobs. It can be integrated with Ab Initio jobs.
Below I categorise Ab Initio and label its strengths and weaknesses against the following criteria. I give each criterion a score from 1 to 10 (10 being best), based on my own experience and some references from the web.
1. Cost to purchase(4)
2. Total Cost of Ownership(4)
3. Platform (OS and DBMS)(8)
4. Ease of use (wizards, drag & drop, etc)(8)
5. Learning curve(6)
6. Performance(9)
7. Available expertise(7)
8. Ab Initio Support and Other Resources(5)
Cost of Purchase – It is one of the costliest ETL tools on the market, with costs ranging from 500k to 5M. The price depends on the number of servers Ab Initio is installed on, the number of developer licenses, and the type of license; batch flow is comparatively cheaper than continuous flow.
Compared with other major ETL tools of similar functionality, such as Informatica, the pricing difference is clearly evident.
Total Cost of Ownership – The cost of ownership comes in three parts:
a) Annual maintenance charges
b) Cost of employing/training Ab initio resources
c) Development cost
Annual maintenance charges – These are generally a percentage of the initial cost, and they are significant because the initial cost is high. The number may differ based on the NDA and the initial investment. Even a rough 10% maintenance charge is a significant outflow.
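To put rough numbers on that, the short Python calculation below applies the 10% figure to the 500k to 5M purchase range quoted earlier. The five-year horizon and the function names are assumptions made for this illustration, the currency is left unspecified as in the text, and training and development costs are deliberately left out.

```python
MAINTENANCE_PERCENT = 10  # rough rate from the text; real contracts differ

def annual_maintenance(purchase_cost, percent=MAINTENANCE_PERCENT):
    """Yearly maintenance charge as a percentage of the initial cost."""
    return purchase_cost * percent // 100

def licence_cost_over(years, purchase_cost, percent=MAINTENANCE_PERCENT):
    """Purchase price plus maintenance for the given number of years."""
    return purchase_cost + years * annual_maintenance(purchase_cost, percent)

low_purchase, high_purchase = 500_000, 5_000_000

low_yearly = annual_maintenance(low_purchase)
high_yearly = annual_maintenance(high_purchase)
low_five_year = licence_cost_over(5, low_purchase)
high_five_year = licence_cost_over(5, high_purchase)
```

On these assumptions, maintenance alone adds half of the purchase price again over five years.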
Development cost – covered under training and resources
Available Expertise/Ab Initio Resources (training/employing) –
It is a high-end tool, so the developer community is not as massive as that of many open source applications, and employing these resources comes at a premium price.
Additionally, Ab Initio is such a closed community that if you are an ETL developer who wants to explore or learn Ab Initio, you will generally hit a wall. As I recall, there are really only two options:
1. Work for an organisation that owns Ab Initio
2. Pay a premium to join the club, since only a handful of organisations train in Ab Initio
Platform – Like most other ETL tools, it can work on various platforms.
On the database front, it can connect to all the major databases on the market, so there is nothing to choose between this tool and others. It allows connection to a database either through an ODBC client or in native mode; I believe some other ETL tools may not support native mode.
Ease of use – Being GUI based, it is easy to use: simple components, drag and drop, and various indicators when connections are not completely made. Compared with other tools, there is not much to choose between them on that front.
Creating custom components and reusing them is one feature I really liked in Ab Initio.
There is a certain set of components that are difficult to use and may require a bit of scripting experience, but that is okay.
Learning Curve – The learning curve is gentle to start with; the difficult components can take some time. Ab Initio has designed certain components very cleverly, and it takes a bit of experience and time to use them optimally. I guess about 15 man-days of learning will give a programmer with 2-3 years of experience enough fluency in designing applications.
Part of the learning curve is covered in the next section.
Ab Initio Support and other resources –
Covering both topics in one go: Ab Initio treats its application like a fort or sacred book, with little information and literature available on the market. As a consequence:
1. There are not enough resources on the web
2. Team productivity takes a hit when the team is stuck on a technical or design issue
3. The unavailability of training material hurts the training of new employees
Just as with any application or scripting language, there is the potential of not using the application optimally. I really believe the lack of proper training and open discussion has hit Ab Initio developers hard; they are still groping in the dark, with the following problems:
1. No access to a standard set of best practices. Organisations tend to have their own best practices, if any, and under tough external review these generally fall well short of best
2. Missing input from other communities with parallel functionality; the smaller developer pool restricts better ideas and input
Ab Initio does provide training for users (customers), but it does not cover every aspect of Ab Initio, and advanced training comes at a cost.
Ab Initio provides support to its customers; it is of decent quality, but turnaround time is generally long.
Performance –
I left this topic until the end, as it is the reason Ab Initio is used.
For massive data processing, where time is of the essence and performance and throughput are critical, Ab Initio stands head and shoulders above the others.
1. Massively parallel architecture
a. Available data can be split and processed in parallel, giving it a huge processing advantage.
b. Theoretically, it is possible to design a system with the Ab Initio architecture in which additional processing power is gained by adding resources in parallel, which makes scaling up easy.
2. Innovative components
a. Ab Initio components such as compressed indexed files give Ab Initio an edge when dealing with huge datasets. The concept is not new, but Ab Initio implemented it successfully.
b. Scripting features known as PDL (Parameter Definition Language) allow flexibility that is well received by Ab Initio developers and not easily available in other ETL tools.
c. Personal perception: Ab Initio has put effort into component design, taking care of small issues such as memory management and memory footprint. These are not critical in themselves, but in a time-critical system they provide an edge.
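The general idea behind a compressed indexed file can be sketched in a few lines of Python. This is a generic illustration and says nothing about Ab Initio's actual file format: the block layout, the JSON encoding, the use of zlib, and both function names are assumptions, and the sketch assumes key values are unique.

```python
import bisect
import json
import zlib

def build_compressed_index(records, key, block_size):
    """Sort the records, compress them in blocks, and index each block's first key."""
    ordered = sorted(records, key=lambda record: record[key])
    first_keys, blocks = [], []
    for start in range(0, len(ordered), block_size):
        block = ordered[start:start + block_size]
        first_keys.append(block[0][key])
        blocks.append(zlib.compress(json.dumps(block).encode("utf-8")))
    return first_keys, blocks

def lookup(first_keys, blocks, key, value):
    """Decompress only the one block that could contain the value."""
    position = bisect.bisect_right(first_keys, value) - 1
    if position < 0:
        return []  # value sorts before the first block
    block = json.loads(zlib.decompress(blocks[position]).decode("utf-8"))
    return [record for record in block if record[key] == value]

records = [{"id": i, "value": i * i} for i in range(100)]
first_keys, blocks = build_compressed_index(records, "id", block_size=10)

found = lookup(first_keys, blocks, "id", 42)
missing = lookup(first_keys, blocks, "id", 1000)
```

The data stays compressed on disk, yet a single lookup touches one small block instead of the whole file, which is where the advantage on huge datasets comes from.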