Persistent data in the field of data processing denotes information that is infrequently accessed and not likely to be modified. Static data is information, for example a record, that does not change and may be intended to be permanent. It may have previously been categorized as persistent or dynamic.
2. Explain the terms: Data, Database, Database Server, and Database Management System.
Database
A database (DB), in the most general sense, is an organized collection of data. More specifically, a database is an electronic system that allows data to be easily accessed, manipulated and updated.
In other words, a database is used by an organization as a method of storing, managing and retrieving information. Modern databases are managed using a database management system (DBMS).
Database Server
A database server is similar to a data warehouse in that it is where a website stores and maintains its data and information. A database server is a computer in a LAN that is dedicated to database storage and retrieval. The database server holds the Database Management System (DBMS) and the databases. Upon requests from the client machines, it searches the database for selected records and passes them back over the network.
Database Management System
A database management system (DBMS) is system software for creating and managing databases. The DBMS provides users and programmers with a systematic way to create, retrieve, update and manage data.
3. Compare Files and Databases, discussing pros and cons of them.
Although database management systems all perform the same basic task, which is to enable users to create, edit and access information in databases, how they accomplish this can vary. Additionally, the features, functionality, and support associated with each management system can differ significantly.
When comparing different popular databases, you should consider how user-friendly and scalable each DBMS is as well as how well it will integrate with other products you’re using. Additionally, you may want to take into account the cost of the management system and the support available for it.
Pros of Oracle
- You’ll find the latest innovations and features coming from Oracle's products, since Oracle tends to set the bar for other database management tools.
- Oracle database management tools are also incredibly robust, and you can find one that can do just about anything you can possibly think of.
Cons of Oracle
- The cost of Oracle can be prohibitive, especially for smaller organizations.
- The system can require significant resources once installed, so hardware upgrades may be required to even implement Oracle.
The term data access arrangement (DAA) has the following meanings: In public switched telephone networks, a single item or group of items at the customer side of the network interface for data transmission purposes, including all equipment that may affect the characteristics of the interface.
Three types of data arrangement
Unstructured
Unstructured data is information that either does not have a pre-defined data model or is not organized in a pre-defined manner. Unstructured information is typically text-heavy, but may contain data such as dates, numbers, and facts as well.
Semi-structured
Semi-structured data is a form of structured data that does not conform with the formal structure of data models associated with relational databases or other forms of data tables, but nonetheless contains tags or other markers to separate semantic elements and enforce hierarchies of records and fields within the data. Therefore, it is also known as self-describing structure.
Structured
Structured data includes numbers, dates, and groups of words and numbers called strings. Most experts agree that this kind of data accounts for about 20 percent of the data that is out there. Structured data is the data you're probably used to dealing with. It's usually stored in a database.
5. Explain different types of databases, providing examples for their use
- Connection
- Statement
- Reader
- Result set
6. Compare and contrast data warehouse with Big data
3Vs (volume, variety and velocity) are three defining properties or dimensions of big data. Volume refers to the amount of data, variety refers to the number of types of data and velocity refers to the speed of data processing. According to the 3Vs model, the challenges of big data management result from the expansion of all three properties, rather than just the volume alone -- the sheer amount of data to be managed.
7. Explain how the application components communicate with files and databases
About Connecting BRM Components
To allow BRM components to communicate with each other, you use entries in configuration or properties files. The basic connection entries in the files identify the host names and port numbers of each component.
These connection entries are set when you install BRM and when you install each client application. You can change them if you change your configuration. Depending on how you install BRM, you might have to change some entries to connect BRM components.
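As a rough illustration of host-and-port connection entries kept in a properties file, the Java sketch below parses such entries with java.util.Properties. The key names (cm.host, cm.port) and the values are invented for this example; they are assumptions, not BRM's actual entry names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical example: the keys "cm.host" and "cm.port" are invented for
// illustration and are not taken from any real BRM configuration file.
class ConnectionEntries {
    final String host;
    final int port;

    ConnectionEntries(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Parses host/port entries from text written in .properties format.
    static ConnectionEntries parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        String host = props.getProperty("cm.host", "localhost");
        int port = Integer.parseInt(props.getProperty("cm.port", "0").trim());
        return new ConnectionEntries(host, port);
    }
}
```

Changing the configuration then means editing the file rather than the code: the component rereads the entries and connects to the new host and port.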
8. Differentiate the SQL statements, Prepared statements, and Callable statements
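The JDBC Statement, PreparedStatement, and CallableStatement interfaces define the methods and properties that enable you to send SQL or PL/SQL commands and receive data from your database.
SQL statements
The Statement interface is useful when you are using static SQL statements at runtime. It cannot accept parameters.
Prepared statements
The PreparedStatement interface is useful when the same SQL statement is run many times. It accepts input parameters at runtime.
Callable statements
The CallableStatement interface is useful for calling database stored procedures. It can also accept input parameters at runtime.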
Example
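The following is a minimal sketch of how the three JDBC interfaces are typically used. The table name (books), its columns, and the stored procedure (count_books_by) are hypothetical, and the methods assume the caller already holds an open java.sql.Connection to a database that defines them.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.sql.Types;

// Sketch only: the table "books", its columns, and the stored procedure
// "count_books_by" are hypothetical names chosen for this example.
class StatementKinds {
    static final String STATIC_SQL = "SELECT COUNT(*) FROM books";
    static final String PARAM_SQL = "SELECT title FROM books WHERE author = ?";
    static final String CALL_SQL = "{call count_books_by(?, ?)}";

    // Statement: fixed SQL text, no parameters.
    static int countAll(Connection con) throws SQLException {
        try (Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(STATIC_SQL)) {
            return rs.next() ? rs.getInt(1) : 0;
        }
    }

    // PreparedStatement: the "?" placeholder is bound at runtime.
    static void printTitles(Connection con, String author) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(PARAM_SQL)) {
            ps.setString(1, author);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("title"));
                }
            }
        }
    }

    // CallableStatement: invokes a stored procedure with IN and OUT parameters.
    static int countBy(Connection con, String author) throws SQLException {
        try (CallableStatement cs = con.prepareCall(CALL_SQL)) {
            cs.setString(1, author);
            cs.registerOutParameter(2, Types.INTEGER);
            cs.execute();
            return cs.getInt(2);
        }
    }
}
```

Binding the author through a placeholder rather than concatenating it into the SQL string also keeps user input from being interpreted as SQL.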
9. Argue the need for ORM, explaining the development with and without ORM
Object-Relational Mapping (ORM) is a technique that lets you query and manipulate data from a database using an object-oriented paradigm. When talking about ORM, most people are referring to a library that implements the Object-Relational Mapping technique, hence the phrase "an ORM".
An ORM library is a completely ordinary library written in your language of choice that encapsulates the code needed to manipulate the data, so you don't use SQL anymore; you interact directly with an object in the same language you're using.
For example, here is a completely imaginary case with a pseudo language:
You have a book class, you want to retrieve all the books of which the author is "Linus". Manually, you would do something like that:
book_list = new List();
sql = "SELECT author FROM library WHERE author = 'Linus'";
data = query(sql); // I oversimplify ...
while (row = data.next())
{
    book = new Book();
    book.setAuthor(row.get('author'));
    book_list.add(book);
}
With an ORM library, it would look like this:
book_list = BookTable.query(author="Linus");
The mechanical part is taken care of automatically via the ORM library.
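To make "the mechanical part" concrete, here is a toy Java sketch of what an ORM library automates: copying each column of a result row into the matching setter of an object. It is not how any particular ORM is implemented; the Book class, the TinyMapper helper, and the use of in-memory maps in place of database rows are all assumptions made for illustration.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical domain class for the example.
class Book {
    private String author;
    private String title;

    public Book() {}

    public String getAuthor() { return author; }
    public void setAuthor(String author) { this.author = author; }
    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }
}

// A toy mapper, not a real ORM: it copies each column of a row into the
// setter with the matching name (column "author" -> setAuthor).
class TinyMapper {
    static <T> List<T> mapRows(List<Map<String, String>> rows, Class<T> type) {
        List<T> result = new ArrayList<>();
        try {
            for (Map<String, String> row : rows) {
                T obj = type.getDeclaredConstructor().newInstance();
                for (Map.Entry<String, String> col : row.entrySet()) {
                    String name = col.getKey();
                    String setter = "set" + Character.toUpperCase(name.charAt(0))
                            + name.substring(1);
                    Method m = type.getMethod(setter, String.class);
                    m.invoke(obj, col.getValue());
                }
                result.add(obj);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot map row to " + type.getName(), e);
        }
        return result;
    }
}
```

With this helper, the hand-written loop from the manual version collapses to a single call such as TinyMapper.mapRows(rows, Book.class), which is the kind of saving a real ORM provides on a much larger scale.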
10. Discuss the POJO, Java Beans, and JPA, indicating their similarities and differences
POJO (Plain Old Java Object): A Plain Old Java Object, or POJO, is a term initially introduced to designate a simple, lightweight Java object that does not implement any javax.ejb interface, as opposed to heavyweight EJB 2.x components (especially Entity Beans). Today, the term is used for any simple object with no extra stuff.
JavaBeans: JavaBeans are reusable software components for Java that can be manipulated visually in a builder tool. Practically, they are classes written in the Java programming language conforming to a particular convention. They are used to encapsulate many objects into a single object (the bean), so that they can be passed around as a single bean object instead of as multiple individual objects. A JavaBean is a Java Object that is serializable, has a nullary constructor, and allows access to properties using getter and setter methods.
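The convention described above (serializable, nullary constructor, getter and setter access) can be sketched as follows. The class name and its two properties are hypothetical.

```java
import java.io.Serializable;

// Hypothetical bean; the class and property names are illustrative only.
// It is declared without "public" only so the sketch fits in any source
// file; a real bean is normally a public class.
class CustomerBean implements Serializable {
    private static final long serialVersionUID = 1L;

    private String name;
    private int age;

    // Nullary (no-argument) constructor required by the convention.
    public CustomerBean() {}

    // Properties are exposed only through getters and setters.
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }
}
```

Without the Serializable interface and with no naming obligations, the same class would simply be a POJO; the bean conventions are what let tools discover and set its properties.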
Enterprise JavaBeans (EJB) is one of several Java APIs for modular construction of enterprise software. An EJB is a managed, server-side software component that encapsulates the business logic of an application.
Hibernate
Hibernate is an object-relational mapping (ORM) library for the Java language, providing a framework for mapping an object-oriented domain model to a traditional relational database. Hibernate solves object-relational impedance mismatch problems by replacing direct persistence-related database accesses with high-level object handling functions.
Features of Hibernate:
- Transparent persistence without byte code processing
- Object-oriented query language
- Object / Relational mappings
- Automatic primary key generation
iBATIS / MyBatis
iBATIS is a persistence framework which automates the mapping between SQL databases and objects in Java, .NET, and Ruby on Rails. In Java, the objects are POJOs (Plain Old Java Objects). The mappings are decoupled from the application logic by packaging the SQL statements in XML configuration files. The result is a significant reduction in the amount of code that a developer needs to access a relational database using lower-level APIs like JDBC and ODBC.
Features of iBATIS:
- Support for Unit of work / object level transactions
- In memory object filtering
- Providing an ODMG compliant API and/or OCL and/or OPath
- Supports multiservers (clustering) and simultaneous access by other applications without loss of transaction integrity
TopLink
In computing, TopLink is an object-relational mapping (ORM) package for Java developers. It provides a framework for storing Java objects in a relational database or for converting Java objects to XML documents.
Features of TopLink:
- Query framework that supports an object-oriented expression framework, Query by Example (QBE), EJB QL, SQL, and stored procedures
- Object-level transaction framework
- Caching to ensure object identity
- Set of direct and relational mappings
Benefits
- Large volumes of structured, semi-structured, and unstructured data.
- Agile sprints, quick iteration, and frequent code pushes.
- Object-oriented programming that is easy to use and flexible.
- Efficient, scale-out architecture instead of expensive, monolithic architecture.
13. Discuss what Hadoop is, explaining the core concepts of it
The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.
- Hadoop Distributed File System
The two most important modules are the Distributed File System, which allows data to be stored in an easily accessible format across a large number of linked storage devices, and MapReduce, which provides the basic tools for poking around in the data.
- Hadoop YARN
The final module is YARN, which manages resources of the systems storing the data and running the analysis.
Various other procedures, libraries or features have come to be considered part of the Hadoop "framework" over recent years, but Hadoop Distributed File System, Hadoop MapReduce, Hadoop Common and Hadoop YARN are the principal four.
- Hadoop Map Reduce
MapReduce is named after the two basic operations this module carries out: reading data from the database and putting it into a format suitable for analysis (map), and performing mathematical operations on it (reduce), e.g. counting the number of males aged 30+ in a customer database.
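The customer-count example can be sketched in plain Java to show the two steps separately. This does not use the Hadoop API and runs on a single machine; the Customer type and the "M" gender code are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration of the map and reduce steps; it does not use the
// Hadoop API, and the Customer type is hypothetical.
class Customer {
    final String gender;
    final int age;

    Customer(String gender, int age) {
        this.gender = gender;
        this.age = age;
    }
}

class MaleOver30Count {
    // Map step: emit 1 for each record that matches, 0 otherwise.
    static List<Integer> map(List<Customer> customers) {
        List<Integer> emitted = new ArrayList<>();
        for (Customer c : customers) {
            emitted.add("M".equals(c.gender) && c.age >= 30 ? 1 : 0);
        }
        return emitted;
    }

    // Reduce step: combine the emitted values into a single total.
    static int reduce(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }
}
```

In Hadoop the map step would run in parallel on the machines that hold each block of data, and only the small emitted values would travel over the network to the reduce step.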
14. Explain the concept of IR, identifying tools for IR
Information retrieval (IR) is the activity of obtaining information system resources relevant to an information need from a collection. Searches can be based on full-text or other content-based indexing. Information retrieval is the science of searching for information in a document, searching for documents themselves, and also searching for metadata that describe data, and for databases of texts, images or sounds.
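Full-text indexing is commonly built on an inverted index, which maps each word to the documents that contain it. The Java sketch below is a minimal illustration under that assumption, with a hypothetical class name; real IR tools add ranking, stemming, and much more.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Minimal full-text index sketch; the class name is hypothetical.
class InvertedIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Splits the text into lower-case words and records the document id per word.
    void add(int docId, String text) {
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                postings.computeIfAbsent(word, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Returns the ids of documents containing the word (empty set if none).
    Set<Integer> search(String word) {
        return postings.getOrDefault(word.toLowerCase(), Collections.emptySet());
    }
}
```

A search then becomes a dictionary lookup instead of a scan over every document, which is why the index is built ahead of time.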
https://docs.oracle.com/cd/E51000_01/doc.120/e51029/adm_config_connect.htm#BRMSA96803
https://softwareengineering.stackexchange.com/questions/330135/crud-without-an-orm