Partitioning Database - A Divide and Rule Strategy

June 02, 2021

Priya Kar

Application Architect

https://github.com/priyakartalentica/postgresqlPartitioningTable.git

Introduction

Imagine a scrapyard full of junked cars. If you had to find a particular Ford Mustang there, you might spend days before locating the right one. Now think of a trip to Walmart. If you had to find a needle there, it would take only a few minutes. Why? Because the store is properly partitioned: items are segregated into aisles and shelves, and that segregation is what makes the search efficient.

The same is true for data and databases. A high volume of data in a single table leads to slower reads and writes, and well-implemented partitioning improves both.

What is Partitioning?

Partitioning divides a table that has grown large into smaller physical pieces. Depending on whether the table has grown wide (too many columns) or tall (too many rows), it can be partitioned vertically or horizontally.

When to Partition a table

The exact point at which a table benefits from partitioning depends on the application design. General guidelines for making the decision are as follows:

The table has grown very large.
As a rule of thumb, the size of the table should exceed the database server’s physical memory.
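The rule of thumb above boils down to one comparison. The sketch below is only an illustration: the table name `items_normal`, the helper name, and the sizes are hypothetical, and the SQL string is shown for reference rather than executed. `pg_total_relation_size` is the PostgreSQL function that reports a table's size including its indexes.

```python
# Hypothetical table name; in PostgreSQL this query returns the size in bytes
# of the table together with its indexes and TOAST data.
SIZE_QUERY = "SELECT pg_total_relation_size('items_normal');"

GIB = 1024 ** 3


def worth_considering_partitioning(table_bytes: int, server_ram_bytes: int) -> bool:
    """Rule of thumb: the table should be larger than the server's physical memory."""
    return table_bytes > server_ram_bytes


# Assumed sizes, for illustration only.
big_table = worth_considering_partitioning(table_bytes=48 * GIB, server_ram_bytes=32 * GIB)
small_table = worth_considering_partitioning(table_bytes=4 * GIB, server_ram_bytes=32 * GIB)
print(big_table, small_table)
```

This is a starting point for the decision, not a substitute for measuring the actual workload.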

Types of Partitioning

Vertical Partitioning

If your table has grown fat, i.e., it has so many columns that writes slow down, consider whether all the columns are needed in a single table or whether some can be normalized into separate tables. This partitioning process is also known as “row splitting”.
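As a minimal sketch of row splitting, the example below uses Python's built-in `sqlite3` module purely for convenience (the benchmark later in this article uses PostgreSQL). The table names `item_core` and `item_details` and the columns `description` and `image_blob` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Frequently used columns stay in a narrow "core" table.
    CREATE TABLE item_core (
        item_id   INTEGER PRIMARY KEY,
        item_name TEXT NOT NULL,
        score     REAL
    );
    -- Rarely used, bulky columns move to a side table that shares the key.
    CREATE TABLE item_details (
        item_id     INTEGER PRIMARY KEY REFERENCES item_core(item_id),
        description TEXT,
        image_blob  BLOB
    );
""")
conn.execute("INSERT INTO item_core VALUES (1, 'needle', 4.5)")
conn.execute("INSERT INTO item_details VALUES (1, 'steel sewing needle', NULL)")

# Hot path: reads and writes touch only the narrow table.
hot = conn.execute(
    "SELECT item_name, score FROM item_core WHERE item_id = 1"
).fetchone()

# Cold path: join back on the shared key when the full record is needed.
full = conn.execute("""
    SELECT c.item_name, c.score, d.description
    FROM item_core AS c
    JOIN item_details AS d USING (item_id)
    WHERE c.item_id = 1
""").fetchone()

print(hot)
print(full)
```

The trade-off is an extra join whenever the full record is required, so the split pays off only when most queries need just the core columns.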

Horizontal Partitioning

If your table has grown tall, with a huge number of records, fetching records through a full table scan takes a long time. In such cases, indexing might be a good solution. An index stores pointers to the rows that hold each indexed value so that you can reach the data quickly, just like the index section of a book, where you look up a keyword to find its page number. But as the number of records keeps growing, operations slow down again. Consider a hypothetical situation where the book's index grows so huge that searching it starts to consume much of your time. A possible way out would be to divide the book into logical volumes. Similarly, when your table grows massive, think of sharding, which is a part of the horizontal partitioning strategy.

Sharding replicates the schema and then divides the data across the partitions based on keys. The load should be distributed evenly across shards based on data-access patterns and space considerations. Horizontal partitioning assigns different rows to different tables. It can be further classified as follows:

If you have user data like Facebook or LinkedIn, you might prefer to partition it by region or by a list of cities in a region, which is a List-based partitioning strategy.

In the case of a table storing sales data, you can partition it “month-wise,” which is Range-based partitioning. Range-based partitioning maps the data to different partitions based on ranges of values of the partitioning key that you establish for each partition.
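The two strategies differ only in how a key is mapped to a partition. The sketch below models that mapping in plain Python; the partition names, cities, and month boundaries are made up for illustration, and a real database performs this routing internally.

```python
from bisect import bisect_right
from datetime import date

# List-based: each partition owns an explicit set of key values.
LIST_PARTITIONS = {
    "users_west": {"Mumbai", "Pune"},
    "users_south": {"Bangalore", "Chennai"},
}


def list_partition(city: str) -> str:
    for name, cities in LIST_PARTITIONS.items():
        if city in cities:
            return name
    raise KeyError(f"no partition accepts {city!r}")


# Range-based: partition i covers BOUNDS[i] (inclusive) up to BOUNDS[i + 1] (exclusive).
BOUNDS = [date(2021, 1, 1), date(2021, 2, 1), date(2021, 3, 1), date(2021, 4, 1)]
RANGE_NAMES = ["sales_2021_01", "sales_2021_02", "sales_2021_03"]


def range_partition(day: date) -> str:
    idx = bisect_right(BOUNDS, day) - 1
    if idx < 0 or idx >= len(RANGE_NAMES):
        raise KeyError(f"no partition covers {day}")
    return RANGE_NAMES[idx]


print(list_partition("Pune"))
print(range_partition(date(2021, 2, 14)))
```

A key that matches no list entry or falls outside every range is rejected here, which mirrors why a catch-all partition is often worth planning for.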

Benefits of Partitioning

1. You can radically improve query performance when frequently accessed rows or groups of rows are stored in a single partition or a small number of partitions.
2. When queries or updates access a large percentage of a single partition, a sequential scan of that partition can perform better than using an index and random-access reads scattered across the whole table.
3. If your use-cases require bulk loads and deletes (based on the partitioning criteria), they can be done efficiently by adding or removing partitions.
4. Less-used data can be migrated to cheaper and slower storage media, thus saving cost.
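Benefits 3 and 4 correspond to a handful of PostgreSQL statements. The sketch below only builds the SQL text and does not connect to a database. The helper names and the table, partition, and tablespace names are hypothetical, while `DETACH PARTITION`, `DROP TABLE`, and `SET TABLESPACE` are standard PostgreSQL commands. The names are interpolated directly, so this is for illustration and not for untrusted input.

```python
from typing import List


def bulk_delete_by_partition(parent: str, partition: str) -> List[str]:
    """Remove a whole partition instead of deleting its rows one by one."""
    return [
        f"ALTER TABLE {parent} DETACH PARTITION {partition};",
        f"DROP TABLE {partition};",
    ]


def move_to_cheaper_storage(partition: str, tablespace: str) -> str:
    """Relocate a rarely used partition to a tablespace on slower media."""
    return f"ALTER TABLE {partition} SET TABLESPACE {tablespace};"


retire = bulk_delete_by_partition("sales", "sales_2020_01")
archive = move_to_cheaper_storage("sales_2020_02", "slow_disk")
for statement in retire + [archive]:
    print(statement)
```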

Problem Statement / POC for Horizontal List based partitioning

Recently, we were in a situation where reads and writes were extremely slow for a PostgreSQL DB. On examining it, we found that the table had around 10 million rows. We replicated the scenario and carried out a benchmarking exercise to see how each use-case behaves when each table has 10 million records. The use-cases we tested are as follows:

1. Normal table without indexes
2. Table with indexes
3. Partitioned table without Indexes

Steps to Carry out the Benchmarking Exercise

1. We started a Docker instance running PostgreSQL and pgAdmin. A docker-compose.yml file is present in the GitHub repo to spawn a similar instance for you.
2. We created 3 tables (normal, indexed, and partitioned), each with 10 million records. Scripts to create the tables and insert 10 million records are present in the GitHub repo linked at the top of this article.
3. Each table had the following fields:
  1. item_id
  2. item_name
  3. store_id
  4. category_id
  5. country_id
  6. retailor_id
  7. score
4. Data distribution in the tables: 10 million items were evenly distributed across 10 categories, 50 stores, and 20 retailers across 5 countries.
5. Background of the client – Our client had a chain of stores in which customers can ask about items in a store's inventory.
6. Partitioning strategy – We observed that the data was fairly evenly distributed across stores, and customers visiting a store would query the data for that particular store. So, in our scenario, the store id was chosen as the partition key.
7. Ran the following queries against each table using pgAdmin:
  1. select avg(score) where store_id=?
  2. insert 100 records for a store_id
  3. update 100 records for a store_id
  4. select avg(score) where category_id=?
  5. insert 1000 records for a store_id
  6. update 1000 records for a store_id
  7. insert 10000 records for a store_id
  8. update 10000 records for a store_id

Benchmarking Observations

From the graph and the readings in the table, it is clear that the partitioned table performed better than the indexed table under higher data loads. An eye-opening observation is that indexing can slow down your system as the data load increases, and a query might end up triggering a full table scan anyway.

Read Operations

For instance, when every store has around 20K records and every category has around 2 lakh (200,000) records, performing an aggregate function on the normal table takes 655 ms and on the indexed table takes 125 ms. The partitioned table outperforms both, completing the task in 97 ms.

While performing an aggregate function over 2 lakh records, the normal table takes 279 ms and the indexed table fetches the result in 62 ms. Here, the partitioned table does marginally better, returning the output within 57 ms. These numbers will grow with the data load.

So for reads, in the given scenario, a partitioned table is the best choice, followed by an indexed table; a normal table without indexes is the least preferred.
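One way to picture why a query that filters on the partition key reads less data is the toy model below. It is a plain-Python sketch, not PostgreSQL's planner, and it does not reproduce the timings above; the rows are made up and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows: (store_id, category_id, score).
rows = [
    (1, 10, 4.0), (1, 20, 2.0),
    (2, 10, 5.0), (2, 20, 3.0),
    (3, 10, 3.0),
]

# One bucket per store, mimicking a table list-partitioned on store_id.
partitions = defaultdict(list)
for store_id, category_id, score in rows:
    partitions[store_id].append((category_id, score))


def avg_score_for_store(store_id):
    """Filter on the partition key: only one partition is read."""
    part = partitions[store_id]
    return mean(score for _, score in part), len(part)


def avg_score_for_category(category_id):
    """Filter on another column: every partition has to be read."""
    scanned, scores = 0, []
    for part in partitions.values():
        for cat, score in part:
            scanned += 1
            if cat == category_id:
                scores.append(score)
    return mean(scores), scanned


print(avg_score_for_store(1))      # (average, rows read)
print(avg_score_for_category(10))  # (average, rows read)
```

In this model the store query reads 2 of the 5 rows, while the category query reads all 5.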

Inserts and Updates (Write Operations)

We carried out write operations, consisting of inserts and updates, in batches of 100, 1000, and 10000 items. Each insert/update used random categories and random scores so that PostgreSQL could not apply its own optimizations.
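A batch of this kind could be generated along the following lines. This is a sketch under assumptions, not the script from the repo: the helper name, the 1-based category numbering, the 0 to 100 score range, and the choice of store 7 are all hypothetical.

```python
import random

NUM_CATEGORIES = 10  # matches the data distribution described in the steps


def make_batch(store_id, size, seed=None):
    """Build `size` rows for one store with random categories and scores."""
    rng = random.Random(seed)
    return [
        {
            "store_id": store_id,
            "category_id": rng.randint(1, NUM_CATEGORIES),  # assumed 1-based ids
            "score": round(rng.uniform(0, 100), 2),         # assumed score range
        }
        for _ in range(size)
    ]


# One batch per size used in the benchmark, all for the same (arbitrary) store.
batches = {size: make_batch(store_id=7, size=size, seed=size) for size in (100, 1000, 10000)}
for size, batch in batches.items():
    print(size, len(batch), batch[0])
```

Seeding the generator keeps each batch reproducible, so the same rows can be replayed against all three tables.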

After carrying out the inserts, we observed that the normal table performed better than the indexed table, and that the partitioned table outperformed both, with lower response times for bulk or higher loads. Writes were faster because the partitioning was based on the store: all the data in a batch is written for a single store, i.e., into one partition, and without the overhead of maintaining indexes.

In short, partitioning performed far better than the indexed table in terms of both reads and writes as the data load grew.
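To make the single-partition write path concrete, here is a sketch of how a table list-partitioned on the store id might be declared with PostgreSQL's declarative partitioning. It only builds the DDL text. The table name `items_partitioned`, the column types, the one-partition-per-store layout, and the 1 to 50 store numbering are assumptions; the scripts in the repo may differ.

```python
from typing import List

# Column names follow the field list in the benchmarking steps; types are assumed.
COLUMNS = (
    "item_id bigint, item_name text, store_id integer NOT NULL, "
    "category_id integer, country_id integer, retailor_id integer, score numeric"
)


def partition_name(parent: str, store_id: int) -> str:
    return f"{parent}_store_{store_id}"


def partitioned_table_ddl(parent: str, store_ids: range) -> List[str]:
    """Parent table plus one list partition per store id."""
    statements = [f"CREATE TABLE {parent} ({COLUMNS}) PARTITION BY LIST (store_id);"]
    for sid in store_ids:
        statements.append(
            f"CREATE TABLE {partition_name(parent, sid)} "
            f"PARTITION OF {parent} FOR VALUES IN ({sid});"
        )
    return statements


ddl = partitioned_table_ddl("items_partitioned", range(1, 51))

# Every row of a batch for store 7 is routed to the same partition.
batch_store_ids = [7] * 1000
targets = {partition_name("items_partitioned", sid) for sid in batch_store_ids}

print(ddl[0])
print(ddl[7])
print(targets)
```

With this layout, an insert into the parent table is routed by PostgreSQL to the matching store partition, so a batch for one store touches one partition only.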

Conclusion

Partitioning divides logical data elements into multiple entities to improve performance, availability, and maintainability. Whether to partition depends on the application type. Consider it when your tables grow too large and other optimization techniques no longer help. Under heavy data load, partitioning will improve read response time and perform better for writes.