Explain denormalization. Why is it used?
Denormalized data is not the same as unnormalized data: denormalization is a deliberate step taken after normalization. Keeping redundant data can improve the performance of specific database searches for a particular item.

Advantages of Denormalization

Database managers use denormalization to increase the performance of a database. Some of its advantages are:

- Minimizing the need for joins
- Reducing the number of tables
- Simpler retrieval queries
- Queries that are less likely to have bugs
- Precomputing derived values
- Reducing the number of relations
- Reducing the number of foreign keys in relations
- Applying data modifications at write time rather than at select time
- Faster data retrieval, thanks to fewer joins

Disadvantages of Denormalization

Although denormalization can avoid some anomalies that lead to mismatched results, it may:

- Slow down updates, even while speeding up retrievals
- Make some operations more complex, even while simplifying others
- Be inconsistent
- Sacrifice flexibility
- Increase the size of relations
- Make the update and insert code harder to write
- Introduce data redundancy, which necessitates more storage

Because the same data now lives in many places, we have to be careful when modifying it to avoid data anomalies. We can use triggers, transactions, or stored procedures to guard against such inconsistencies, as sketched below.
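As one concrete (and entirely hypothetical) illustration of that last point, the sketch below uses Python's built-in sqlite3 module to keep a redundant copy of a customer name in sync with a trigger; the customers/orders schema and all names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );

    -- Denormalized: customer_name is copied into orders
    -- so that order listings need no join.
    CREATE TABLE orders (
        id            INTEGER PRIMARY KEY,
        customer_id   INTEGER NOT NULL REFERENCES customers(id),
        customer_name TEXT NOT NULL
    );

    -- Trigger: when a customer is renamed, propagate the change
    -- to every redundant copy to avoid inconsistency.
    CREATE TRIGGER sync_customer_name
    AFTER UPDATE OF name ON customers
    BEGIN
        UPDATE orders
        SET customer_name = NEW.name
        WHERE customer_id = NEW.id;
    END;
""")

conn.execute("INSERT INTO customers VALUES (1, 'Acme Ltd')")
conn.execute("INSERT INTO orders VALUES (10, 1, 'Acme Ltd')")
conn.execute("UPDATE customers SET name = 'Acme Inc' WHERE id = 1")

# The redundant copy was updated by the trigger, not by us.
print(conn.execute("SELECT customer_name FROM orders").fetchone())
# -> ('Acme Inc',)
```

In a production system the same guarantee is often enforced in a transaction or a stored procedure instead, but the tradeoff is identical: every redundant copy adds write-time work.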
Data Denormalization vs Normalization

For all the benefits that normalizing data brings, there are, as with anything else in information technology, tradeoffs and costs. A normalized relational database for even a small business can comprise hundreds of tables. For transactional work such as purchases, inventory maintenance, and personal data, this presents few issues as long as data management is handled through a front-end application.
While normalized data is optimized for entity-level transactions, denormalized data is optimized for answering business questions and driving decision making. Unlike normalization, denormalization imposes far fewer rules on structure. Reporting and decision support are simplified by maintaining a small number of pre-aggregated tables rather than extracting data in real time through multiple table joins. The plain fact is that data cannot be placed arbitrarily in a database system, so the relational schema must be designed deliberately. A sketch of such a pre-aggregated reporting table follows.
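This minimal example (using Python's sqlite3 module; the sales schema and all names are invented) aggregates raw rows once into a small summary table that report queries can then read without joins or GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (
        id        INTEGER PRIMARY KEY,
        region    TEXT NOT NULL,
        sale_date TEXT NOT NULL,
        amount    REAL NOT NULL
    );
    INSERT INTO sales VALUES
        (1, 'North', '2023-01-05', 120.0),
        (2, 'North', '2023-01-17', 80.0),
        (3, 'South', '2023-01-09', 200.0);

    -- Reporting table: totals are aggregated once, ahead of time,
    -- instead of being recomputed by every report query.
    CREATE TABLE sales_by_region AS
    SELECT region, SUM(amount) AS total_amount, COUNT(*) AS sales_count
    FROM sales
    GROUP BY region;
""")

# Decision-support queries read the small summary table directly.
print(conn.execute("SELECT * FROM sales_by_region").fetchall())
# -> [('North', 200.0, 2), ('South', 200.0, 1)]
```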
One intuitive way to construct a relational database is normalization; indeed, normalization can be treated as the target of the relational database design phase. Sometimes, however, it is necessary to depart from this rule and carry out so-called database denormalization. In short, database denormalization is the combination of normalized tables into one: the introduction of controlled redundancy into the database to speed up operations on it.
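A minimal sketch of that combination, again with sqlite3 and an invented products/categories schema, materializes the join between two normalized tables into a single denormalized one:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized: category data lives in its own table.
    CREATE TABLE categories (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE products (
        id          INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        category_id INTEGER NOT NULL REFERENCES categories(id)
    );

    INSERT INTO categories VALUES (1, 'Hardware'), (2, 'Software');
    INSERT INTO products VALUES (1, 'Keyboard', 1), (2, 'IDE licence', 2);

    -- Denormalized: one table with controlled redundancy,
    -- built by materializing the join once.
    CREATE TABLE products_denorm AS
    SELECT p.id, p.name, c.name AS category_name
    FROM products p
    JOIN categories c ON c.id = p.category_id;
""")

# Reads now need no join at all.
for row in conn.execute("SELECT name, category_name FROM products_denorm"):
    print(row)
```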
When relational tables hold a large amount of data, joining them to obtain the information your business needs can become too expensive. One solution is therefore to duplicate keys or columns between tables that are joined frequently, so that the target table contains not only the data relevant to it but also information from other tables. Of course, this solution introduces the possibility of data redundancy within the tables, which in turn leads to a rapid increase in their size.
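One hypothetical way to retrofit such a duplicated column onto an existing table is sketched below; the suppliers/parts schema is invented, and the pattern simply adds the frequently joined column and backfills it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (
        id   INTEGER PRIMARY KEY,
        city TEXT NOT NULL
    );
    CREATE TABLE parts (
        id          INTEGER PRIMARY KEY,
        supplier_id INTEGER NOT NULL REFERENCES suppliers(id)
    );
    INSERT INTO suppliers VALUES (1, 'Oslo'), (2, 'Lyon');
    INSERT INTO parts VALUES (100, 1), (101, 2);

    -- Retrofit a redundant copy of the frequently joined column,
    -- then backfill it from the source table.
    ALTER TABLE parts ADD COLUMN supplier_city TEXT;
    UPDATE parts
    SET supplier_city = (SELECT city FROM suppliers
                         WHERE suppliers.id = parts.supplier_id);
""")

# The join is no longer needed for this common lookup,
# at the cost of duplicated city values.
print(conn.execute("SELECT id, supplier_city FROM parts").fetchall())
# -> [(100, 'Oslo'), (101, 'Lyon')]
```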
The most common symptom of this is data duplication. From this point of view, database denormalization is a kind of compromise. While normalization aims to decompose the schema into tables that are each independently responsible for a single subject area, denormalization is tailored to how that data is actually browsed, and this cannot be defined universally for every case; what counts as efficient must be defined by the business's needs.
When the amount of data in a database grows rapidly, denormalization brings tangible benefits, but it also has several disadvantages.
Since there is no need to join tables, the necessary information can be extracted from a single table, which automatically speeds up query execution.
Additionally, this solution saves memory. If the table is reorganized around the most common needs, you can extract data from a single table and waste no time looking for join keys. However, you must keep the data redundancy in mind and update queries accordingly, and you need to be sure that the gain from denormalizing outweighs any harm. There are a few situations when you should definitely consider denormalization. Before going down that path, though, consider other options, such as query optimization and proper indexing; a quick sketch of checking whether an index already solves the problem follows.
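As a quick sanity check of the indexing alternative, SQLite's EXPLAIN QUERY PLAN shows whether a filter runs as a full-table scan or an index search; the schema below is, again, invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")

# Without an index, this filter scans the whole table.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42"
).fetchall()
print(plan)  # detail column typically reads: SCAN orders

# Often an index, not denormalization, is the right first fix.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42"
).fetchall()
print(plan)  # typically: SEARCH orders USING INDEX idx_orders_customer
```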
Obviously, the biggest advantage of the denormalization process is increased performance, but we have to pay a price for it. In the model below, I applied some of the aforementioned denormalization rules: the pink tables have been modified, while the light-blue table is completely new.
In a normalized model, we could compute this figure as units ordered - units sold - units offered - units written off, and we would repeat the calculation each time a client asks about that product, which would be extremely time-consuming. Storing the precomputed value instead simplifies the select query a lot. Both of the new attributes store the values as they were when the task was created; the reason is that both of these values can change over time. A sketch of the precomputed-stock pattern follows.
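A minimal sketch of that pattern, with invented table and column names, might look like this: the derived value is stored once, and every write has to maintain it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_stock (
        product_id        INTEGER PRIMARY KEY,
        units_ordered     INTEGER NOT NULL,
        units_sold        INTEGER NOT NULL,
        units_offered     INTEGER NOT NULL,
        units_written_off INTEGER NOT NULL,
        -- Denormalized: the derived value is stored, not recomputed
        -- on every read. stock = ordered - sold - offered - written off
        stock             INTEGER NOT NULL
    );
    INSERT INTO product_stock VALUES (1, 500, 120, 30, 5, 345);
""")

# The select is now trivial...
print(conn.execute(
    "SELECT stock FROM product_stock WHERE product_id = 1").fetchone())
# -> (345,)

# ...but every write must keep the redundant value consistent.
conn.execute("""
    UPDATE product_stock
    SET units_sold = units_sold + 10,
        stock      = stock - 10
    WHERE product_id = 1
""")
```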